Competing Consumers with $by_category projection not working if ES needs to restart

Hi!

I am working on a persistent subscription from a $ce-aggregate stream (created by the $by_category projection).

For now I am running ES on a single node and I have one consuming service.
When the client goes down, everything works fine once I restart it.

When the ES node goes down I’m screwed. The once persistent subscription restarts at event 0 and my client consumes all events once more.
I am aware that I should not run ES on just one node, but this is the first of a few stress tests, with the possibility of a data center going down.

Here is the output of the ES node starting:
17:29:18 eventstore.1 | Subscription $ce-aggregate::MyService: read no checksum.
17:29:18 eventstore.1 | strtfrom = 0
17:29:18 eventstore.1 | Subscription $ce-aggregate::MyService: read no checksum.
17:29:18 eventstore.1 | strtfrom = 0

Any ideas?

What are your settings on the subscription? It looks like no checksum
exists, so one has never been saved.

PersistentSubscriptionSettings
  .Create()
  .StartFromBeginning()
  .ResolveLinkTos()
  .MinimumCheckPointCountOf(1)
  .MaximumCheckPointCountOf(5)

In the beginning I tried with just the first three lines. I ended up adding the last two when I couldn’t get the results I wanted.

Is this reproducible at your end?

See under configuration:
http://docs.geteventstore.com/server/3.1.0-pre/competing-consumers-config/
There is a timing setting as well.

Yes, I tried that one too, but it didn’t make any difference. I figured the two I have set were enough?
I haven’t tried setting the buffer sizes yet, since the docs mention defaults.

Buffer sizes are not important for checkpointing. If you want to send
repro code I would be happy to run it.

The way checkpointing works is:

There is a timer (let's say every n seconds). When the timer fires,
it will checkpoint if the minimum has been reached. It will also
checkpoint whenever the maximum gets hit.
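
To make the rule above concrete, here is a rough Python model of it. This is only a sketch of the behaviour as described in this thread, not Event Store's actual implementation: the class and method names (CheckpointPolicy, on_ack, on_timer) are made up, and it assumes the counts refer to messages acknowledged since the last checkpoint.

```python
from dataclasses import dataclass, field


@dataclass
class CheckpointPolicy:
    """Toy model of the checkpoint rule; all names here are hypothetical."""
    min_count: int
    max_count: int
    written: list = field(default_factory=list)  # positions checkpointed so far
    pending: int = 0      # acks since the last checkpoint (assumed meaning)
    last_acked: int = -1  # position of the most recently acknowledged message

    def on_ack(self, position: int) -> None:
        self.pending += 1
        self.last_acked = position
        # Reaching the maximum forces a checkpoint without waiting for the timer.
        if self.pending >= self.max_count:
            self._write()

    def on_timer(self) -> None:
        # The timer only writes if at least the minimum has accumulated.
        if self.pending >= self.min_count:
            self._write()

    def _write(self) -> None:
        self.written.append(self.last_acked)
        self.pending = 0


policy = CheckpointPolicy(min_count=2, max_count=5)
policy.on_ack(0)
policy.on_timer()        # 1 pending, below the minimum of 2: nothing written
policy.on_ack(1)
policy.on_timer()        # 2 pending, minimum reached: checkpoint at position 1
for pos in range(2, 7):  # five more acks: maximum reached at position 6
    policy.on_ack(pos)
print(policy.written)    # [1, 6]
```

In this model, setting both the minimum and the maximum to 1 writes a checkpoint on every acknowledgement, without waiting for the timer.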

Cheers,

Greg

All right, so the timing is mandatory for the other two configurations to work?

If I have autoAck = true, the acknowledgement should also trigger a checkpoint, right?
There are no errors thrown and the subscription isn’t going down at any time other than when I manually shut it down.

I’ll get the code to you asap.

Thanks

The checkpoints will be written in the manner described in the
document above. The configuration points describe when checkpoints
should be written. It is quite unlikely that you want a checkpoint
written on every acknowledgement.

Thanks for the clarifications!
Yes, that would be time consuming, but perhaps necessary for us.

I changed my MaximumCheckPointCountOf to 1 and finally I’m getting some checkpoints.
PersistentSubscriptionSettings
  .Create()
  .StartFromBeginning()
  .ResolveLinkTos()
  .CheckPointAfter(TimeSpan(0,0,10))
  .MinimumCheckPointCountOf(1)
  .MaximumCheckPointCountOf(1)

14:22:55 eventstore.1 | [43779,04,12:22:55.256] publishing checkpoint 1
14:22:55 eventstore.1 | [43779,24,12:22:55.259] state write successful

When I restart the node, the client always receives the last event in the subscription stream even if a checkpoint has been set there.
Is that intended behaviour?

So what about the last message being re-sent by the subscription on node startup?
Is this intended behaviour?
Looking at the timing in my logs, it almost looks like a race condition between my subscriber and the stream checkpoint being loaded.

Best regards,
Mikko