How do the live buffer and paging work in Persistent Subscriptions?

We’re looking at the configuration of a persistent subscription and we see these 3 settings:

  • Live Buffer Size
  • Buffer Size
  • Read Batch Size

It feels like these settings are quite important for busy streams that need to handle a lot of events.

Are we correct that the “live buffer” is an in-memory buffer for events in the subscription? When your client reads 100 of them, the buffer fetches a new set of events from memory and places them in the buffer. When your client consumes events too fast, it drains the buffer and you resort to paging.

I think paging here means that the events are read from disk?

But then things become more difficult to understand.

Paging also has a buffer. Is it also in memory, and how does it compare to the “live buffer”? And what does the “read batch size” mean?

Basically we’d like to find out what these things are and how to find good values for them.

Also, how do we check that we have proper values? Can we see how often it resorts to paging?

So to understand these three values we need to look at how a subscription works in terms of IO. The subscription “buffers” at most a certain number of messages that are yet to be processed. Consider the case where 86 messages occur within 100 ms. These messages are held in the live buffer. If the live buffer fills (or the subscription is in general too far behind), the subscription falls back to reading from the stream. Once caught up, it switches back to a live subscription.

The correct value for the buffer size depends heavily on the usage patterns. Are events written one at a time and very occasionally? A small buffer size is likely OK. Are events written in chunks (say 15 events at a time)? With a buffer size of 10, such a write would likely overflow the buffer, causing the subscription to issue read operations instead of working off the live messages internally (this causes higher latency and possibly more disk-based operations).
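To make the overflow case concrete, here is a toy model in Python of a live buffer that falls back to reading from the stream when a chunked write does not fit. It is only a sketch: the class `ToySubscription`, its fields, and the exact switching rule are assumptions made for illustration, not the server's actual implementation or API.

```python
from collections import deque


class ToySubscription:
    """Toy model of a live buffer with fall-back to catch-up reads.

    Illustration only: names and switching rules are assumptions,
    not the real server implementation.
    """

    def __init__(self, live_buffer_size, read_batch_size):
        self.live_buffer_size = live_buffer_size
        self.read_batch_size = read_batch_size
        self.live_buffer = deque()
        self.stream = []          # stands in for the persisted stream
        self.next_to_deliver = 0  # position of the next undelivered event
        self.is_live = True
        self.catch_up_reads = 0   # how many fall-back reads were issued

    def write(self, events):
        """Append a chunk of events to the stream."""
        for event in events:
            self.stream.append(event)
            if not self.is_live:
                continue  # already catching up; events are read back later
            if len(self.live_buffer) < self.live_buffer_size:
                self.live_buffer.append(event)
            else:
                # Overflow: drop the live buffer and fall back to reads.
                self.live_buffer.clear()
                self.is_live = False

    def deliver(self, max_events):
        """Hand up to max_events to clients and return them."""
        if self.is_live:
            count = min(max_events, len(self.live_buffer))
            delivered = [self.live_buffer.popleft() for _ in range(count)]
        else:
            # One read operation against the stream, capped by the batch size.
            count = min(max_events, self.read_batch_size)
            start = self.next_to_deliver
            delivered = self.stream[start:start + count]
            self.catch_up_reads += 1
        self.next_to_deliver += len(delivered)
        if not self.is_live and self.next_to_deliver == len(self.stream):
            self.is_live = True  # caught up, switch back to live
        return delivered


def run(live_buffer_size, chunk):
    """Write one chunk, deliver it, and report (delivered, catch_up_reads)."""
    sub = ToySubscription(live_buffer_size, read_batch_size=20)
    sub.write(list(range(chunk)))
    delivered = sub.deliver(max_events=chunk)
    return len(delivered), sub.catch_up_reads


print(run(10, 15))   # (15, 1): the chunk overflowed, one read was needed
print(run(100, 15))  # (15, 0): the chunk fit, served from the live buffer
```

In this model both configurations deliver all 15 events; the difference is that the smaller buffer pays for an extra read against the stream, which is where the added latency comes from.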

A secondary use of the “buffer” is that the subscription can have its buffer filled and begin pushing messages to its client(s). While this is happening, it is asynchronously getting more messages. The subscription itself is pushing messages buffer->clients while the buffer is filling. In other words, the reads and the processing happen asynchronously.
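The "filling while pushing" behaviour is the classic bounded producer/consumer pattern. The sketch below shows it with Python's `asyncio.Queue`; the function names `fill_buffer` and `push_to_client` are hypothetical stand-ins, and nothing here reflects how the server is actually written.

```python
import asyncio


async def fill_buffer(buffer, events):
    """Stand-in for the subscription reading events into its buffer."""
    for event in events:
        await buffer.put(event)  # waits while the buffer is full
    await buffer.put(None)       # sentinel: no more events


async def push_to_client(buffer, received):
    """Stand-in for the subscription pushing buffered events to a client."""
    while True:
        event = await buffer.get()
        if event is None:
            break
        received.append(event)
        await asyncio.sleep(0)   # yield control, as a slow client would


async def main(buffer_size=5, total=20):
    buffer = asyncio.Queue(maxsize=buffer_size)
    received = []
    # Both tasks run concurrently: the buffer refills while it is drained.
    await asyncio.gather(
        fill_buffer(buffer, range(total)),
        push_to_client(buffer, received),
    )
    return received


received = asyncio.run(main())
print(len(received))  # 20
```

The bounded queue is what couples the two sides: a larger `buffer_size` lets the reader run further ahead of a slow client, at the cost of holding more events in memory.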

Under most circumstances the defaults (100 IIRC for the buffer) should be reasonable. For very fast streams it might be desirable to increase it a bit. For streams with very large events (say 4 MB per event) it might be reasonable to decrease it a bit, but overall the defaults should be fairly close to what most want, with a preference towards being too large rather than too small.
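A rough back-of-the-envelope check shows why large events argue for a smaller buffer. The helper below is hypothetical and assumes the buffer size is counted in events and that every buffered event is held in memory in full; the figures 100 and 4 MB are the ones quoted above.

```python
MB = 1024 * 1024


def worst_case_buffer_bytes(buffer_size, event_size_bytes):
    """Upper bound on memory held by one full buffer (assumed model)."""
    return buffer_size * event_size_bytes


print(worst_case_buffer_bytes(100, 4 * MB) // MB)  # 400
print(worst_case_buffer_bytes(50, 4 * MB) // MB)   # 200
```

Under these assumptions a single full buffer of 100 events at 4 MB each could hold about 400 MB, and that is per subscription.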

Cheers,

Greg

Thank you very much for the information.