EventStore 3.9.2 - Projections have stopped having events added to them

Hi,

We’ve got a very old version of EventStore running (3.9.2) and our projections have stopped processing events. We can see events being added to some projections but not all.

Any tips for debugging this to work out what might have gone wrong?

Totally understand we should have perhaps upgraded to a more recent version…

yes :smile:

System or custom projections?
I remember having much the same issue on a 3.x version and needing to restart the cluster (node by node) to get it working again, though in that case there was an error in the UI and the logs.

It was a custom projection. We haven’t tried restarting yet, as a restart usually takes 1-2 hours and we’re worried it could cause more issues, but it is likely to be our next step since we can’t work out the cause at the moment.

We can’t see any issues in the UI or in the logs, which is what is confusing us.

@yves.lorphelin

We have found some errors from around the time actually:

[PID:01125:023 2023.03.19 22:35:56.798 ERROR QueuedHandlerAutoRes] ---!!! VERY SLOW QUEUE MSG [StorageWriterQueue]: WritePrepares - 18654ms. Q: 0/0.
[PID:01125:4631 2023.03.19 22:36:12.879 ERROR QueuedHandlerThreadP] ---!!! VERY SLOW QUEUE MSG [StorageReaderQueue #2]: ReadStreamEventsForward - 16058ms. Q: 0/10.
[PID:01125:4632 2023.03.19 22:36:12.938 ERROR QueuedHandlerThreadP] ---!!! VERY SLOW QUEUE MSG [StorageReaderQueue #3]: ReadStreamEventsForward - 19638ms. Q: 0/19.
[PID:01125:4580 2023.03.19 22:36:12.950 ERROR QueuedHandlerThreadP] ---!!! VERY SLOW QUEUE MSG [StorageReaderQueue #1]: ReadStreamEventsForward - 14452ms. Q: 0/4.
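
For anyone wanting to dig through a larger log file for lines like these, here is a rough Python sketch that groups the VERY SLOW QUEUE MSG entries by queue and reports the worst duration for each. The regex is only modelled on the four lines quoted above, so it is an assumption about the log format rather than anything official, and the names `SLOW_MSG`, `summarize` and `SAMPLE` are made up for illustration.

```python
import re
from collections import defaultdict

# Pattern modelled on the log excerpt above (an assumption, not an official format).
SLOW_MSG = re.compile(
    r"\[PID:\d+:\d+ (?P<ts>\d{4}\.\d{2}\.\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) \w+\s+\S+\]"
    r".*?VERY SLOW QUEUE MSG \[(?P<queue>[^\]]+)\]: (?P<msg>\w+) - (?P<ms>\d+)ms\. "
    r"Q: (?P<q>\d+/\d+)\."
)


def summarize(lines):
    """Group slow-queue entries by queue name and track count and worst duration."""
    summary = defaultdict(lambda: {"count": 0, "max_ms": 0, "messages": set()})
    for line in lines:
        match = SLOW_MSG.search(line)
        if match is None:
            continue  # not a slow-queue line, skip it
        entry = summary[match["queue"]]
        entry["count"] += 1
        entry["max_ms"] = max(entry["max_ms"], int(match["ms"]))
        entry["messages"].add(match["msg"])
    return dict(summary)


# Sample input based on the log excerpt above.
SAMPLE = [
    "[PID:01125:023 2023.03.19 22:35:56.798 ERROR QueuedHandlerAutoRes] ---!!! VERY SLOW QUEUE MSG [StorageWriterQueue]: WritePrepares - 18654ms. Q: 0/0.",
    "[PID:01125:4631 2023.03.19 22:36:12.879 ERROR QueuedHandlerThreadP] ---!!! VERY SLOW QUEUE MSG [StorageReaderQueue #2]: ReadStreamEventsForward - 16058ms. Q: 0/10.",
    "[PID:01125:4632 2023.03.19 22:36:12.938 ERROR QueuedHandlerThreadP] ---!!! VERY SLOW QUEUE MSG [StorageReaderQueue #3]: ReadStreamEventsForward - 19638ms. Q: 0/19.",
    "[PID:01125:4580 2023.03.19 22:36:12.950 ERROR QueuedHandlerThreadP] ---!!! VERY SLOW QUEUE MSG [StorageReaderQueue #1]: ReadStreamEventsForward - 14452ms. Q: 0/4.",
]

if __name__ == "__main__":
    for queue, info in sorted(summarize(SAMPLE).items()):
        print(queue, info["count"], info["max_ms"], sorted(info["messages"]))
```

To run it against a real log, pass an open file instead of `SAMPLE`, e.g. `summarize(open(path, encoding="utf-8"))`, where `path` is wherever your node writes its logs.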

We’re only seeing the issue on some projections, where certain events are involved, so we’re trying to identify whether particular events might be causing them to break, but we’re not getting far with debugging right now.

The slow queue messages indicate increased pressure on the server.
Are you running a single node or a cluster?

We are running a single node.

Also, when we restarted a copy of our EventStore, we’re now seeing that the projections won’t start up. We’re trying to get some logs for this, but there isn’t much to go on other than SLOW QUEUE MSG and SLOW BUS MSG entries. We’ve also quadrupled the size of the box it is deployed on, which makes it start up quicker but still doesn’t let the projections enable successfully.

The screenshot does not show any obvious problem (looking at the status),
except that the projections seem to take quite a long time to start.
That indicates the projections have a large backlog of work to get through.

I would really suggest trying a newer version of the DB (21.10.x or 22.10.x).
The database has had quite a few improvements in terms of memory, threading, IO and general performance, as well as a new JavaScript engine (no longer V8, but a fully managed one).

The TCP protocol for clients is still available in those versions (it needs to be enabled, though).