Over the last few weeks, EventStore has flat out stopped working. The client side shows little beyond the usual dropped connections. When we tried to log in, we couldn’t: the EventStore login page loaded, but the sign-in button did nothing. No error, no progress, nothing. So I tried to log into one of the other nodes and was shown this message:
System.IO.IOException: Too many open files
at System.IO.FileStream..ctor (System.String path, System.IO.FileMode mode, System.IO.FileAccess access, System.IO.FileShare share, System.Int32 bufferSize, System.Boolean anonymous, System.IO.FileOptions options) [0x0025f] in <8f2c484307284b51944a1a13a14c0266>:0
at System.IO.FileStream..ctor (System.String path, System.IO.FileMode mode, System.IO.FileAccess access, System.IO.FileShare share) [0x00000] in <8f2c484307284b51944a1a13a14c0266>:0
at (wrapper remoting-invoke-with-check) System.IO.FileStream:.ctor (string,System.IO.FileMode,System.IO.FileAccess,System.IO.FileShare)
at System.IO.File.OpenRead (System.String path) [0x00000] in <8f2c484307284b51944a1a13a14c0266>:0
at System.IO.File.ReadAllBytes (System.String path) [0x00000] in <8f2c484307284b51944a1a13a14c0266>:0
at EventStore.Core.Util.MiniWeb.ReplyWithContent (EventStore.Transport.Http.EntityManagement.HttpEntityManager http, System.String contentLocalPath) [0x001a6] in <82591026fe824176b191ced935e5b4b0>:0
This doesn’t sound like something we would have done, and we’ve done minimal configuration.
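A quick way to see whether a node really is hitting its descriptor limit is to compare the number of open descriptors with the limit the process is running under. The sketch below assumes a Linux host with `/proc` mounted and enough permission to read the EventStore process's entries; the function names are made up for illustration and are not part of EventStore.

```python
import os


def parse_max_open_files(limits_text):
    """Return (soft, hard) from the text of /proc/<pid>/limits.

    A value of None means the limit is reported as 'unlimited'.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Columns: "Max open files  <soft>  <hard>  files"
            soft, hard = line.split()[3:5]
            return tuple(None if v == "unlimited" else int(v) for v in (soft, hard))
    raise ValueError("no 'Max open files' line found")


def open_fd_count(pid):
    """Count descriptors currently open in the given process."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def report(pid):
    """Print open descriptor count next to the soft and hard limits."""
    with open(f"/proc/{pid}/limits") as f:
        soft, hard = parse_max_open_files(f.read())
    print(f"pid {pid}: {open_fd_count(pid)} open, soft limit {soft}, hard limit {hard}")
```

Calling `report(pid)` with the PID of the running EventStore process shows how close it is to the soft limit. If the count sits at or near that limit, the "Too many open files" errors would be expected.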
I was able to SSH into one of the nodes and pull the error log. Here are some of the errors.
[PID:91259:006 2018.05.15 13:00:37.953 ERROR Application ] Exiting with exit code: 1.
Exit reason: Verification of chunk #143-143 (chunk-000143.000000) failed, terminating server…
[PID:91679:008 2018.05.15 13:09:25.681 ERROR Application ] Exiting with exit code: 1.
Exit reason: Verification of chunk #172-172 (chunk-000172.000000) failed, terminating server…
[PID:91751:016 2018.05.15 13:09:31.727 ERROR TableIndex ] ReadIndex is corrupted…
EventStore.Core.Exceptions.CorruptIndexException: Error while loading IndexMap. ---> System.IO.IOException: Too many open files
at System.IO.FileStream..ctor (System.String path, System.IO.FileMode mode, System.IO.FileAccess access, System.IO.FileShare share, System.Int32 bufferSize, System.Boolean anonymous, System.IO.FileOptions options) [0x0025f] in <8f2c484307284b51944a1a13a14c0266>:0
at System.IO.FileStream..ctor (System.String path, System.IO.FileMode mode, System.IO.FileAccess access, System.IO.FileShare share, System.Int32 bufferSize, System.IO.FileOptions options) [0x00000] in <8f2c484307284b51944a1a13a14c0266>:0
at (wrapper remoting-invoke-with-check) System.IO.FileStream:.ctor (string,System.IO.FileMode,System.IO.FileAccess,System.IO.FileShare,int,System.IO.FileOptions)
at EventStore.Core.Index.PTable+WorkItem..ctor (System.String filename, System.Int32 bufferSize) [0x00006] in <82591026fe824176b191ced935e5b4b0>:0
at EventStore.Core.Index.PTable+c__AnonStorey1.<>m__0 () [0x00000] in <82591026fe824176b191ced935e5b4b0>:0
at EventStore.Core.DataStructures.ObjectPool`1[T]..ctor (System.String objectPoolName, System.Int32 initialCount, System.Int32 maxCount, System.Func`1[TResult] factory, System.Action`1[T] dispose, System.Action`1[T] onPoolDisposed) [0x000d1] in <82591026fe824176b191ced935e5b4b0>:0
at EventStore.Core.Index.PTable..ctor (System.String filename, System.Guid id, System.Int32 initialReaders, System.Int32 maxReaders, System.Int32 depth, System.Boolean skipIndexVerify) [0x00122] in <82591026fe824176b191ced935e5b4b0>:0
at EventStore.Core.Index.PTable.FromFile (System.String filename, System.Int32 cacheDepth, System.Boolean skipIndexVerify) [0x0000c] in <82591026fe824176b191ced935e5b4b0>:0
at EventStore.Core.Index.IndexMap.LoadPTables (System.IO.StreamReader reader, System.String indexmapFilename, EventStore.Core.Data.TFPos checkpoints, System.Int32 cacheDepth, System.Boolean skipIndexVerify) [0x0007d] in <82591026fe824176b191ced935e5b4b0>:0
--- End of inner exception stack trace ---
at EventStore.Core.Index.IndexMap.LoadPTables (System.IO.StreamReader reader, System.String indexmapFilename, EventStore.Core.Data.TFPos checkpoints, System.Int32 cacheDepth, System.Boolean skipIndexVerify) [0x00110] in <82591026fe824176b191ced935e5b4b0>:0
at EventStore.Core.Index.IndexMap.FromFile (System.String filename, System.Int32 maxTablesPerLevel, System.Boolean loadPTables, System.Int32 cacheDepth, System.Boolean skipIndexVerify) [0x00066] in <82591026fe824176b191ced935e5b4b0>:0
at EventStore.Core.Index.TableIndex.Initialize (System.Int64 chaserCheckpoint) [0x000a2] in <82591026fe824176b191ced935e5b4b0>:0
System.IO.IOException: Too many open files
at System.IO.FileStream..ctor (System.String path, System.IO.FileMode mode, System.IO.FileAccess access, System.IO.FileShare share, System.Int32 bufferSize, System.Boolean anonymous, System.IO.FileOptions options) [0x0025f] in <8f2c484307284b51944a1a13a14c0266>:0
at System.IO.FileStream..ctor (System.String path, System.IO.FileMode mode, System.IO.FileAccess access, System.IO.FileShare share, System.Int32 bufferSize, System.IO.FileOptions options) [0x00000] in <8f2c484307284b51944a1a13a14c0266>:0
at (wrapper remoting-invoke-with-check) System.IO.FileStream:.ctor (string,System.IO.FileMode,System.IO.FileAccess,System.IO.FileShare,int,System.IO.FileOptions)
at EventStore.Core.Index.PTable+WorkItem..ctor (System.String filename, System.Int32 bufferSize) [0x00006] in <82591026fe824176b191ced935e5b4b0>:0
at EventStore.Core.Index.PTable+c__AnonStorey1.<>m__0 () [0x00000] in <82591026fe824176b191ced935e5b4b0>:0
at EventStore.Core.DataStructures.ObjectPool`1[T]..ctor (System.String objectPoolName, System.Int32 initialCount, System.Int32 maxCount, System.Func`1[TResult] factory, System.Action`1[T] dispose, System.Action`1[T] onPoolDisposed) [0x000d1] in <82591026fe824176b191ced935e5b4b0>:0
at EventStore.Core.Index.PTable..ctor (System.String filename, System.Guid id, System.Int32 initialReaders, System.Int32 maxReaders, System.Int32 depth, System.Boolean skipIndexVerify) [0x00122] in <82591026fe824176b191ced935e5b4b0>:0
at EventStore.Core.Index.PTable.FromFile (System.String filename, System.Int32 cacheDepth, System.Boolean skipIndexVerify) [0x0000c] in <82591026fe824176b191ced935e5b4b0>:0
at EventStore.Core.Index.IndexMap.LoadPTables (System.IO.StreamReader reader, System.String indexmapFilename, EventStore.Core.Data.TFPos checkpoints, System.Int32 cacheDepth, System.Boolean skipIndexVerify) [0x0007d] in <82591026fe824176b191ced935e5b4b0>:0
[PID:91751:016 2018.05.15 13:09:31.734 ERROR TableIndex ] IndexMap '/var/lib/eventstore/db/index/indexmap' content:
000000: 35 34 43 30 34 36 45 30 42 41 37 36 39 37 32 43 | 54C046E0BA76972C
000016: 41 31 44 36 37 32 32 45 30 45 34 39 35 33 33 38 | A1D6722E0E495338
000032: 0A 31 0A 38 36 35 39 33 32 35 31 37 37 2F 38 36 | .1.8659325177/86
000048: 35 39 33 32 35 31 37 37 0A 30 2C 30 2C 65 39 64 | 59325177.0,0,e9d
000064: 36 36 30 61 38 2D 34 31 61 62 2D 34 36 32 38 2D | 660a8-41ab-4628-
000080: 39 33 61 38 2D 30 61 62 34 30 31 31 62 31 63 64 | 93a8-0ab4011b1cd
000096: 63 0A 31 2C 30 2C 36 32 63 62 34 39 63 32 2D 39 | c.1,0,62cb49c2-9
000112: 39 32 36 2D 34 39 63 38 2D 38 39 63 34 2D 34 64 | 926-49c8-89c4-4d
000128: 66 31 64 35 32 34 33 34 61 65 0A | f1d52434ae.
There are others like it but the hex dumps are different.
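To read what the logged indexmap actually contains, the hex dump can be turned back into text. This is a minimal sketch that assumes every dump line has the `offset: hex bytes | ascii` shape shown above; `decode_hex_dump` is a hypothetical helper name, not an EventStore tool.

```python
DUMP = """
000000: 35 34 43 30 34 36 45 30 42 41 37 36 39 37 32 43 | 54C046E0BA76972C
000016: 41 31 44 36 37 32 32 45 30 45 34 39 35 33 33 38 | A1D6722E0E495338
000032: 0A 31 0A 38 36 35 39 33 32 35 31 37 37 2F 38 36 | .1.8659325177/86
000048: 35 39 33 32 35 31 37 37 0A 30 2C 30 2C 65 39 64 | 59325177.0,0,e9d
000064: 36 36 30 61 38 2D 34 31 61 62 2D 34 36 32 38 2D | 660a8-41ab-4628-
000080: 39 33 61 38 2D 30 61 62 34 30 31 31 62 31 63 64 | 93a8-0ab4011b1cd
000096: 63 0A 31 2C 30 2C 36 32 63 62 34 39 63 32 2D 39 | c.1,0,62cb49c2-9
000112: 39 32 36 2D 34 39 63 38 2D 38 39 63 34 2D 34 64 | 926-49c8-89c4-4d
000128: 66 31 64 35 32 34 33 34 61 65 0A | f1d52434ae.
"""


def decode_hex_dump(dump):
    """Rebuild the original bytes from 'offset: hex bytes | ascii' lines."""
    data = bytearray()
    for line in dump.strip().splitlines():
        # Keep only the hex column between the offset and the ASCII preview.
        hex_part = line.split(":", 1)[1].split("|", 1)[0]
        data.extend(bytes.fromhex(hex_part))
    return bytes(data)


if __name__ == "__main__":
    print(decode_hex_dump(DUMP).decode("ascii"), end="")
```

For this dump the result is five lines of plain text: a 32-character hex string, `1`, `8659325177/8659325177`, and two `n,0,<GUID>` entries. My guess is that the last two lines name the index files it was trying to open. That would suggest the indexmap file itself is readable and the failure came from opening those files, but I can't confirm it from the log alone.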