Persistent Subscription Limits

I’m new to EventStore and have been playing around with some features, mainly persistent subscriptions.

I’ve been experimenting with creating large numbers of persistent subscriptions - I have a figure of 20,000 in mind. I’m experimenting with this feature to explore server-side check-pointing for stream consumers (removing the need for clients to do this themselves). At this stage I am mainly experimenting with the creation of subscription groups, rather than actually using them. What I’ve found so far is that maintenance of the $persistentSubscriptionConfig stream is the limiting performance factor. Every time a persistent subscription is created, a snapshot of the configurations for all subscriptions is saved to that stream (I believe this stream is read from the tail when ES starts up to get the system configuration for persistent subscriptions). Since each snapshot contains every subscription’s configuration, the snapshots grow with the number of subscriptions, so creation gets slower and slower as large numbers are created.
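
For reference, the experiment itself is nothing fancy - roughly the sketch below, written against the HTTP API. The stream and group names are made up, and the settings body follows the 3.x-era docs, so adjust for your version.

```python
# Create a large number of persistent subscription groups over the HTTP API
# and watch the creation rate degrade. Stream/group names are illustrative;
# the settings fields ("resolveLinktos", "startFrom") follow the 3.x docs.
import time
import requests

BASE = "http://127.0.0.1:2113"
AUTH = ("admin", "changeit")  # default admin credentials
SETTINGS = {"resolveLinktos": False, "startFrom": 0}

start = time.time()
for i in range(20_000):
    resp = requests.put(
        f"{BASE}/subscriptions/aggregate-{i}/group-{i}",
        json=SETTINGS,
        auth=AUTH,
    )
    resp.raise_for_status()
    if i and i % 1000 == 0:
        print(f"{i} groups created after {time.time() - start:.1f}s")
```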

Just wanted to know if I’m barking up the wrong tree and trying to create far more subscriptions than is sensible. I have been looking at ways of optimizing things.

Thanks, Tim.

It was never intended to have 20,000 concurrent subscriptions. Maybe you can discuss your use case a bit?

Hi Greg,

This is probably a cruel and unusual use of Event Store, but I’m experimenting with different technologies and designs.

The use case I’m playing with is command distribution to processing nodes (aggregates). The basic idea is to create a stream per aggregate, then select a node and give it the stream details so it can start processing it. The persistent subscription comes into play as a way of check-pointing how far a stream has been processed, so that the stream can be moved to a different process - a rebalancing and failover algorithm. The large number represents a large number of aggregates. The message rate per aggregate would be low, and they would not all be processing concurrently.
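
Roughly, the hand-off I have in mind looks like the sketch below. It uses the competing-consumers HTTP feed; the media type and "ack" link relation follow the 3.x docs, and the names are hypothetical, so treat it as an illustration rather than working code.

```python
# A processing node drains one aggregate's command stream through its
# persistent subscription group. The server tracks the checkpoint, so a
# different node can later resume from the same point.
import requests

BASE = "http://127.0.0.1:2113"
AUTH = ("admin", "changeit")

def drain(stream: str, group: str, handle) -> None:
    feed = requests.get(
        f"{BASE}/subscriptions/{stream}/{group}/10",  # up to 10 messages
        auth=AUTH,
        headers={"accept": "application/vnd.eventstore.competingatom+json"},
    ).json()
    for entry in feed.get("entries", []):
        handle(entry)
        # Acking advances the server-side checkpoint for this group, which
        # is what lets another node take over cleanly.
        ack = next(l["uri"] for l in entry["links"] if l["relation"] == "ack")
        requests.post(ack, auth=AUTH).raise_for_status()
```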

Just playing with persistent subscriptions for now.

Thanks.

Maybe I am missing something, but I can't come up with any reason why I would want a persistent subscription (e.g. a queue in RabbitMQ) per aggregate. This just sounds like a bad idea. What would I gain by having this?

"The persistent stream comes into play as a way of check-pointing
where a stream has been processed up until so that the stream can be
moved to a different process"

I can kind of get a queue per process... I just don't see any advantage to a queue per aggregate.

A queue per aggregate does run into the large-numbers problem, which probably makes it untenable.

The reason for experimenting with this over a queue per process is that it makes moving the processing of individual aggregates to another process easier. The aggregate stream is check-pointed and in a known state, so it’s easy for something else to pick it up. A queue per process (which may hold commands for many aggregates) seems much more difficult to deal with, but the large-numbers problem probably rules out the simpler per-aggregate queue.

The real intent is to maintain many checkpoints, rather than many concurrent subscriptions.

Actually, I guess with a check-pointed queue per process it’s not too difficult to shred the queue on node failure and copy the commands to other queues, as I’d know what had been processed - something like the sketch below.
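
Something along these lines, purely illustrative - the queues here are plain lists and all the names are hypothetical:

```python
# "Shred" a failed process's queue: replay it from the last server-side
# checkpoint and copy each unprocessed command onto a surviving process's
# queue, routed by aggregate id so per-aggregate ordering is preserved.
from typing import Iterable, List

def redistribute(commands: Iterable[dict], checkpoint: int,
                 surviving: List[list]) -> None:
    for position, cmd in enumerate(commands):
        if position <= checkpoint:
            continue  # already processed before the failure
        # Consistent routing keeps one aggregate's commands in order on a
        # single surviving queue.
        surviving[hash(cmd["aggregate_id"]) % len(surviving)].append(cmd)
```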

One remaining thing the per-aggregate queue gives is independence: the aggregates aren’t susceptible to the situation where a process-level queue holds a long run of commands for one aggregate instance and blocks command processing for the other aggregates (though I am making assumptions here about how per-aggregate queues would work out in reality).
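
A toy illustration of that blocking concern (hypothetical data):

```python
# In a single process-level queue, a run of commands for aggregate "A" sits
# in front of the lone command for "B", so "B" waits. Splitting the same
# commands out by aggregate removes that coupling.
process_queue = [("A", "cmd1"), ("A", "cmd2"), ("A", "cmd3"), ("B", "cmd1")]

per_aggregate: dict = {}
for aggregate_id, command in process_queue:
    per_aggregate.setdefault(aggregate_id, []).append(command)

print(per_aggregate)  # {'A': ['cmd1', 'cmd2', 'cmd3'], 'B': ['cmd1']}
```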