Database suddenly stopped working

We use EventStore V5 in a k8s cluster on AWS (EFS file system). This worked well in production until this morning, when ES failed on startup with this error:

[00001,01,06:19:50.193] Opened ongoing "/var/lib/eventstore/chunk-000115.000000" as version 3
[00001,08,06:19:50.987] Verifying hash for TFChunk '"/var/lib/eventstore/chunk-000114.000000"'...
[00001,01,06:19:50.990] CACHED TFChunk #115-115 (chunk-000115.000000) in 00:00:00.0026162.
[00001,01,06:19:51.011] Unhandled exception while starting application:
EXCEPTION OCCURRED
Log record at actual pos 0 has non-positive length: 0. in chunk #115-115 (chunk-000115.000000).
[00001,01,06:19:51.034] "Log record at actual pos 0 has non-positive length: 0. in chunk #115-115 (chunk-000115.000000)."
EXCEPTION OCCURRED
Log record at actual pos 0 has non-positive length: 0. in chunk #115-115 (chunk-000115.000000).

How can we start up EventStore again?

This used to work in production

Don’t confuse “luck it hasn’t broken yet” with “working”! Given that EFS is just NFS, this was somewhat inevitable. NFS presents a file-system-like model, but it does not respect the semantics of local file systems, so it has no place in production for a database.

Your database will need manual intervention to recover unless you have a recent backup from which you can restore, and there may be data loss. If you need help recovering the original database files, Event Store Ltd can likely assist as a (paid) professional service. If this is of interest, you can use the “contact us” link on eventstore.com.
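
If you want to see for yourself what the error message is describing before deciding how to proceed, here is a minimal read-only sketch in Python. It rests on assumptions that are not stated anywhere in this thread: that a chunk file starts with a 128-byte header, and that each log record begins with a 4-byte little-endian length prefix, so "actual pos 0" would be the first bytes after the header. The function names and the `HEADER_SIZE` constant are made up for illustration. This only inspects the file; it does not repair anything.

```python
import struct

# Assumption: size of the chunk header in bytes. Adjust if your version differs.
HEADER_SIZE = 128


def first_record_length(path, header_size=HEADER_SIZE):
    """Return the 4-byte little-endian integer stored right after the header.

    Returns None if the file is too short to contain it.
    """
    with open(path, "rb") as f:
        f.seek(header_size)
        raw = f.read(4)
    if len(raw) < 4:
        return None
    return struct.unpack("<i", raw)[0]


def looks_zeroed(path, header_size=HEADER_SIZE, probe=4096):
    """Return True if the first `probe` bytes after the header are all zero."""
    with open(path, "rb") as f:
        f.seek(header_size)
        data = f.read(probe)
    return len(data) > 0 and not any(data)


if __name__ == "__main__":
    import sys

    chunk_path = sys.argv[1]  # e.g. a COPY of chunk-000115.000000
    print("length prefix at pos 0:", first_record_length(chunk_path))
    print("data area starts with zeros:", looks_zeroed(chunk_path))
```

Run it against a copy of the chunk file rather than the live one. A length of 0 at position 0 would match what the exception reports, but it does not by itself tell you how much data is missing or how to recover it.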