Large number of streams concern

Hey

I would like to store documents (saga state) in ES and maintain indexes on several of their properties so that I can look up a saga instance. I’d like to validate my approach with you, as I am a bit concerned about the number of streams it will generate.

For each Saga type I would have a projection like this one:

    fromCategory('SomeSaga')
        .when({
            SagaSaved: function (s, e) {
                linkTo('SomeSaga_ByOrderNumber-' + e.data.orderNumber, e);
                return s;
            }
        })

For each property I’d like to index, and for each active saga instance, this would create a stream whose name contains the value of the property. So let’s say I have 1M active sagas (I guess this is a very high upper bound) and 5 indexed properties: I would end up with 5M streams just for indexing sagas. Wouldn’t that kill ES performance?

Szymon

These would definitely amplify your writes. That said, I would guess you would still be at an acceptable level of performance, especially if it is done in a single projection. It is worth testing/benchmarking.

Greg
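
Editor's sketch: one way to read "done in a single projection" is a single handler that links each saved saga into every index stream, rather than one projection per property. The property list, the `capitalize` helper, and the `customerId` property are hypothetical. The local `linkTo` is only a stand-in for the projection built-in so the handler can be run outside the server.

```javascript
// Stand-in for the projection built-in linkTo, so this sketch runs
// outside Event Store. It only records which index stream was targeted.
const links = [];
function linkTo(streamId, event) {
  links.push({ streamId: streamId, eventId: event.eventId });
}

// Hypothetical list of saga properties to index.
const indexedProperties = ['orderNumber', 'customerId'];

// Helper (assumption): 'orderNumber' -> 'OrderNumber' for the stream name.
function capitalize(name) {
  return name.charAt(0).toUpperCase() + name.slice(1);
}

const handlers = {
  SagaSaved: function (s, e) {
    // One pass over the event links it into every index stream.
    indexedProperties.forEach(function (prop) {
      const value = e.data[prop];
      if (value !== undefined && value !== null) {
        linkTo('SomeSaga_By' + capitalize(prop) + '-' + value, e);
      }
    });
    return s;
  }
};

// Inside Event Store this would be registered as:
//   fromCategory('SomeSaga').when(handlers);
```

The write amplification is unchanged (one link per indexed property per save), but all of it comes from a single projection reading the category once.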

Thanks. I think the number of unnecessary writes can be decreased by adding a ‘hasChanged’ flag to modified properties when saving. This way I will only emit new index events if the indexed property has actually changed. Since saga routing is usually based on IDs that do not change frequently, there will probably be one index event per property, emitted only when a new saga instance is stored.

Szymon
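
Editor's sketch of the flag-based idea above. The event shape is an assumption: each indexed property is taken to be saved as `{ value, hasChanged }`, which is not something the thread specifies. The local `linkTo` is again only a stand-in for the projection built-in.

```javascript
// Stand-in for the projection built-in linkTo (records target streams only).
const links = [];
function linkTo(streamId, event) {
  links.push(streamId);
}

const handlers = {
  SagaSaved: function (s, e) {
    // Assumed shape: e.data.orderNumber = { value: ..., hasChanged: true|false }
    const prop = e.data.orderNumber;
    if (prop && prop.hasChanged) {
      // Emit an index entry only when the writer flagged a change.
      linkTo('SomeSaga_ByOrderNumber-' + prop.value, e);
    }
    return s;
  }
};

// Inside Event Store this would be registered as:
//   fromCategory('SomeSaga').when(handlers);
```

With this shape the projection stays stateless, and the responsibility for detecting changes sits with the code that saves the saga.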

You can do that using your state as well instead of storing it on the message.

Wouldn’t storing this in state for each saga instance impact performance more than just emitting events every time a saga is saved? Or, to rephrase the question: are stateful projections much more demanding than stateless ones?

State does not need to be written out on every change (it’s checkpointed).

State is persisted only as checkpoints. When a projection is recovering after a system restart, it makes sure not to write already emitted events twice.

-yuriy
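
Editor's sketch of the state-based variant discussed above: the projection remembers the last indexed value and links only when it differs. It assumes the projection is partitioned with `foreachStream()` so that each saga stream gets its own small state. The state shape is hypothetical, and the local `linkTo` is only a stand-in for the projection built-in.

```javascript
// Stand-in for the projection built-in linkTo (records target streams only).
const links = [];
function linkTo(streamId, event) {
  links.push(streamId);
}

const handlers = {
  // Initial state for each saga stream (hypothetical shape).
  $init: function () {
    return { orderNumber: null };
  },
  SagaSaved: function (s, e) {
    const current = e.data.orderNumber;
    // Link only when the indexed value is present and differs from the last one seen.
    if (current !== undefined && current !== null && current !== s.orderNumber) {
      linkTo('SomeSaga_ByOrderNumber-' + current, e);
      s.orderNumber = current;
    }
    return s;
  }
};

// Inside Event Store this would be registered as (assumed partitioning):
//   fromCategory('SomeSaga').foreachStream().when(handlers);
```

Here the saved event needs no extra flag. The trade-off is that the projection carries state, which per the replies above is written out only at checkpoints.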