We ran into an issue with our eventstore deployment yesterday: it started reporting warnings about extremely slow queues, then eventually stopped accepting requests and had to be restarted.
When it came back online, it had to rebuild the read index, which took a considerable amount of time (roughly 60 minutes for about 30 million items).
I noticed that the eventstore process started consuming a large amount of memory and eventually began paging aggressively, which I assume is what caused the performance issues.
We are running version 3.9.3. We have an automated task that runs a scavenge every evening at 9 PM. The issue happened at 5:30 PM.
I have attached the error log and stats. Any idea what could have caused this?
10.10.0.101-2113-cluster-node-err.log (50.2 KB)
10.10.0.101-2113-cluster-node-stats.csv (6.96 MB)