Max subscriptions

Sebastian_Stehle · August 16, 2018, 3:13pm

Hi,

We have already used Event Store in a small CQRS application and are now considering it for an IoT project. Is there a limit on how many subscriptions we can create per consuming server? The idea is to open one subscription per customer. We have around 2000 customers at the moment. Is there a limit we have to be aware of?

Greg_Young1 · August 16, 2018, 3:54pm

I am guessing that with 2000 customers these notifications are quite rare and spread out. Why not use HTTP at this point?

Sebastian_Stehle · August 16, 2018, 4:47pm

What do you mean by notifications?

Each customer gets at most 1 message per second from the sensor, very often less than that. We use a rule engine (Drools) to evaluate the sensor events (not a fan of it), and the knowledge base is relatively expensive to create (sometimes several seconds). Therefore we use an actor framework (Orbit) to distribute the knowledge bases across multiple servers. We also have to query the events to recreate the knowledge base after a deployment. So the idea was to keep each customer independent, subscribe to the customer's event stream inside the actor, and store the position somewhere. Catch-up subscriptions seem to match our use case very well. Of course we could also use a single subscription per server or use polling over HTTP to get the events.

Austin_Salgat · August 17, 2018, 4:56pm

Have you considered subscribing to a projection (such as the Category projection, where the category is Customer) and routing the customer events to the appropriate actor once you receive them? Then you only need 1 catch-up subscription for all customer events.

Sebastian_Stehle · August 20, 2018, 11:08am

Sure, that is why I am asking whether I have to do it…

Greg_Young1 · August 20, 2018, 11:19am

You can also do subscriptions over HTTP in a case like yours, which will also use fewer resources (especially when you have a rate of < 1 message/second). This is what I was alluding to.

I am sure there is also some domain-specific logic where you might be able to trade off the amount of time for a notification to arrive vs overall resources used. Given the low volume of messages I would seriously consider the possibility of using HTTP as opposed to opening say 2000 subscriptions here. This option also has some other benefits, such as other things being able to read the events over HTTP as well.
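To make the latency-versus-resources trade-off concrete, here is a minimal polling sketch. It is not the Event Store HTTP API: `readPage` is a hypothetical stand-in for an HTTP read of a customer's stream that returns the events at or after a given position, and the loop is synchronous only to keep the example short.

```javascript
// Hypothetical polling step: one cheap read per interval instead of a live subscription.
// readPage(stream, from, max) is an assumed helper standing in for an HTTP GET of the stream;
// it returns an array of { position, data } with position >= from, oldest first.
function pollOnce(readPage, stream, fromPosition, handle, pageSize = 20) {
  let next = fromPosition;
  for (;;) {
    const page = readPage(stream, next, pageSize);
    if (page.length === 0) {
      return next; // caught up; the caller stores this as its checkpoint
    }
    for (const event of page) {
      handle(event);
      next = event.position + 1;
    }
  }
}

// Usage with an in-memory stand-in for the server.
const stored = [0, 1, 2, 3, 4].map((n) => ({ position: n, data: { reading: n * 10 } }));
const fakeReadPage = (stream, from, max) =>
  stored.filter((e) => e.position >= from).slice(0, max);

const seen = [];
const nextPosition = pollOnce(fakeReadPage, "customer-123", 2, (e) => seen.push(e.position), 2);
console.log(seen, nextPosition);
```

Calling `pollOnce` on a timer bounds how stale an actor can get: a longer interval costs latency but means fewer requests.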

I would also consider a single subscription that then dispatches the messages internally. This again will be far more efficient (and easier to manage! in terms of checkpointing etc) than 1000s of subscriptions.

Does this make sense?

I would lean heavily away from

Us

Greg_Young1 · August 20, 2018, 11:20am

DOH “I would lean heavily away from making 1000s of subscriptions over the TCP API”

Sebastian_Stehle · August 20, 2018, 2:10pm

The code would be simpler if I created one subscription per customer, because I can just use catch-up subscriptions to recreate the customer’s state after a deployment. But if it can cause performance issues I can also create a single subscription per server. But the question is: why? What resources are allocated per subscription?

Greg_Young1 · August 20, 2018, 2:23pm

To start with, a socket, as well as internal resources tracking what is occurring internally. If it's 10 this is obviously no issue, but creating 2000 sounds like quite a few. There are likely better ways of handling this. Have you considered, as an example, creating an equivalent of $all per customer via a simple projection, where all of a customer’s events end up in a single stream that you can subscribe to?

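As an example:

fromAll().when({
    // $any runs for every event read from $all
    $any: function (s, e) {
        // assumes customerId is a field of the event's JSON data
        if (e.data && e.data.customerId) {
            linkTo("customer-" + e.data.customerId, e);
        }
    }
});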

This will create a stream customer-1234 (etc.) containing all events associated with customer 1234, allowing for much easier replays of all data for a given customer. It can also be useful for lots of other purposes.

Does this make sense?

Cheers,

Greg

Sebastian_Stehle · August 20, 2018, 4:11pm

I think I do not need projections, because I would have a single stream per customer anyway.

I had a look at the code, and my understanding is that when a subscription is created the client sends a message to the server saying “I am interested in the new events for stream X, please send them to me and use correlation id Y”. When a new event is added to a stream, the server sends the event over a single socket to the client that declared interest. So there is only a single socket per node, not one per subscription, and a subscription only requires a few entries in dictionaries here and there.

Or are we talking about different subscriptions?
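The model described in this post can be sketched as a lookup table keyed by correlation id. This is a toy illustration of that reading only: `SharedConnection`, `subscribe` and `onFrame` are hypothetical names, not the client library's API.

```javascript
// Toy model of many subscriptions multiplexed over one connection.
// All names here are hypothetical and chosen for illustration.
class SharedConnection {
  constructor() {
    this.nextId = 1;
    this.byCorrelationId = new Map(); // correlationId -> { stream, onEvent }
  }

  // Registering a subscription adds a dictionary entry; no new socket is opened.
  subscribe(stream, onEvent) {
    const correlationId = this.nextId++;
    this.byCorrelationId.set(correlationId, { stream, onEvent });
    return correlationId;
  }

  unsubscribe(correlationId) {
    this.byCorrelationId.delete(correlationId);
  }

  // Every frame arriving on the one socket carries the correlation id it belongs to.
  onFrame(frame) {
    const sub = this.byCorrelationId.get(frame.correlationId);
    if (sub) {
      sub.onEvent(frame.event);
    }
  }
}

const connection = new SharedConnection();
const log = [];
const id123 = connection.subscribe("customer-123", (e) => log.push("123:" + e.type));
const id456 = connection.subscribe("customer-456", (e) => log.push("456:" + e.type));

connection.onFrame({ correlationId: id456, event: { type: "SensorRead" } });
connection.onFrame({ correlationId: id123, event: { type: "SensorRead" } });
console.log(log);
```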

Benjamin_Boyle · August 21, 2018, 3:00am

Is there a misunderstanding happening here, whereby Sebastian is suggesting 2000 subscriptions on a single socket connection, and others are interpreting that as 2000 socket connections?

I can see many advantages to Sebastian for having 2000 subscriptions instead of one. For example, it’s easier to keep checkpoints while also concurrently handling the events. If there was just one subscription to an aggregated stream such as $ce-customer, each event would have to be handled by Sebastian’s application one at a time before the checkpoint can be updated, or more complicated checkpoint logic would have to be used.
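The "more complicated checkpoint logic" for a single aggregated subscription could look roughly like the sketch below: events are handled concurrently, but the checkpoint only advances past a position once every earlier position has also finished. `CheckpointTracker` is a hypothetical helper and it assumes positions are started in stream order.

```javascript
// Hypothetical helper: tracks in-flight events so the checkpoint never skips unfinished work.
class CheckpointTracker {
  constructor() {
    this.inFlight = [];     // positions in arrival order
    this.done = new Set();  // positions finished out of order, waiting on earlier ones
    this.checkpoint = null; // highest position with no unfinished predecessor
  }

  // Call when an event is handed to a worker or actor.
  start(position) {
    this.inFlight.push(position);
  }

  // Call when handling finishes; returns the checkpoint that is now safe to persist.
  complete(position) {
    this.done.add(position);
    while (this.inFlight.length > 0 && this.done.has(this.inFlight[0])) {
      const finished = this.inFlight.shift();
      this.done.delete(finished);
      this.checkpoint = finished;
    }
    return this.checkpoint;
  }
}

const tracker = new CheckpointTracker();
[0, 1, 2].forEach((p) => tracker.start(p));
const afterOne = tracker.complete(1);  // position 0 is still in flight
const afterZero = tracker.complete(0); // 0 and 1 are both finished now
const afterTwo = tracker.complete(2);
console.log(afterOne, afterZero, afterTwo);
```

With one subscription per customer stream this bookkeeping is unnecessary, since each actor simply stores the position of the last event it handled.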

Sebastian_Stehle · August 22, 2018, 6:12am

I am not suggesting anything, I am just trying to understand it.

I also think that multiple subscriptions are easier to manage for me, because our actors are very independent. We use orbit (https://github.com/orbit/orbit) and actors are managed by the runtime and can be moved from one node to another in some scenarios.

Sebastian_Stehle · August 28, 2018, 7:54am

I ran a simple test and created 10,000 subscriptions. With Sysinternals I can see only a single socket for my process.

Greg_Young1 · August 28, 2018, 8:37am

Subscriptions created over the same connection operate over the same connection.

Sebastian_Stehle · August 28, 2018, 11:17am

Then it should be fine to create one subscription per actor.

Greg_Young1 · August 28, 2018, 11:38am

If you have lots, I would probably introduce a single subscription and then have that subscription dispatch internally to your system (there are a whole ton of reasons this becomes simpler!).
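A minimal sketch of that single-subscription approach, assuming each event carries a `customerId` and a `position`. `CustomerDispatcher` and its methods are hypothetical names, and the events are fed in by hand rather than by a real subscription.

```javascript
// Hypothetical dispatcher: one upstream subscription feeds many per-customer handlers.
class CustomerDispatcher {
  constructor() {
    this.handlers = new Map(); // customerId -> handler function
    this.checkpoint = null;    // position of the last event seen on the shared subscription
  }

  register(customerId, handler) {
    this.handlers.set(customerId, handler);
  }

  unregister(customerId) {
    this.handlers.delete(customerId);
  }

  // Called once per event by the single subscription.
  onEvent(event) {
    const handler = this.handlers.get(event.customerId);
    if (handler) {
      handler(event);
    }
    // One checkpoint covers every customer, because events are handled in order.
    this.checkpoint = event.position;
  }
}

const dispatcher = new CustomerDispatcher();
const received = { a: [], b: [] };
dispatcher.register("a", (e) => received.a.push(e.position));
dispatcher.register("b", (e) => received.b.push(e.position));

[
  { position: 0, customerId: "a" },
  { position: 1, customerId: "b" },
  { position: 2, customerId: "c" }, // no local handler: skipped, checkpoint still advances
  { position: 3, customerId: "a" },
].forEach((e) => dispatcher.onEvent(e));

console.log(received, dispatcher.checkpoint);
```

Because handling here is sequential, a single stored position is enough to resume after a restart.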

Peter_Hageus · August 28, 2018, 2:34pm

Even though there is only one socket open (one per connection), you will receive the same event 10,000 times in your test.

/Peter

Greg_Young1 · August 28, 2018, 4:25pm

There is a reason for this. We don’t know how your internal dispatching etc. works, so we dispatch a subscribed event on every subscription. If you have 10k subscriptions on the same connection we will send it 10k times.

As a counter-example of why we do this, imagine you have a quick little plugin system and 5 plugins which each subscribe to a few events. Would you prefer to write your own internal dispatching or have the network cost of the same event delivered multiple times? You can obviously build the internal dispatching, but for many systems this is overkill, no?
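The delivery rule described here can be put in numbers with a toy count. It models only the stated behaviour (one copy per matching subscription), not the server implementation, and it assumes the 10,000-subscription test pointed every subscription at the same stream, which the thread does not state. `framesFor` is a hypothetical helper.

```javascript
// Toy model of per-subscription delivery; framesFor is a hypothetical helper.
function framesFor(subscriptions, appendedStream) {
  // One frame is sent for each subscription whose stream matches, even on a shared connection.
  return subscriptions.filter((s) => s.stream === appendedStream).length;
}

// 10,000 subscriptions to the same stream: each appended event is sent 10,000 times.
const sameStream = Array.from({ length: 10000 }, () => ({ stream: "test-stream" }));

// One subscription per customer stream: each appended event is sent once.
const perCustomer = Array.from({ length: 10000 }, (_, i) => ({ stream: "customer-" + i }));

const copiesSameStream = framesFor(sameStream, "test-stream");
const copiesPerCustomer = framesFor(perCustomer, "customer-123");
console.log(copiesSameStream, copiesPerCustomer);
```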

Sebastian_Stehle · August 28, 2018, 6:01pm

I am so confused, probably I was not clear enough. I have one stream per customer, and the events of a customer are totally separated from the events of other customers. So basically 1 stream + 1 actor + 1 subscription per customer. So the actor for customer “123” would create a catch-up subscription for stream “customer-123” using a single shared connection per server.

If I create a single subscription per server I also have network costs, because server A could receive an event for a customer whose actor sits on server B, so it also has to forward the event to the other server.

Greg_Young1 · August 29, 2018, 3:38am

You’d be amazed how cheap those costs are vs maintaining 10000 subscriptions.