Stream partitioning guidance

Kasey_Speakman1 · March 11, 2014, 8:02pm

Hey y’all,

I know that streams can be partitioned multiple ways using projections, but I am wondering about the tradeoffs. In general, is it better to have a large stream that is split with projections, or many streams that are joined with projections? Is ordering more expensive when you join multiple streams as opposed to splitting one stream? What circumstance would lead me to use emit instead of linkTo on a projected stream? The hardware is planned to be a 3-node failover.

I’m looking at 3 general types of streams so far:

Command streams - one stream for the active command handler (shared storage to record the stream position)

Events streams - stream per aggregate, joined streams (maybe all events) for denormalizers, a few specific projected streams for integrators

System streams (things like command failed/succeeded) - one stream for notification purposes, probably projected a couple of different ways from message properties

I’ll also want to make command to event causation projections for regression testing.

Spot anything wrong or inefficient there?

Kasey

Greg_Young1 · March 11, 2014, 8:25pm

So …

emit: emit writes a new event, not a link to an existing event, e.g. emit('foo', {hello: "goodbye"})
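To make the difference concrete, here is a minimal in-memory sketch. It is not the Event Store API: `store`, `append`, `resolve`, and the `emit`/`linkTo` functions below are hypothetical stand-ins that only model the idea, and the three-argument `emit` shape and the stream names are assumptions for illustration. The `$>` type and the `number@stream` body mimic how a link entry points at an existing event.

```javascript
// In-memory model only. These functions are hypothetical stand-ins,
// not the functions a real projection runtime provides.
const store = new Map(); // streamId -> array of stored entries

function append(streamId, entry) {
  if (!store.has(streamId)) store.set(streamId, []);
  const stream = store.get(streamId);
  stream.push(entry);
  return stream.length - 1; // event number within the stream
}

// emit: writes a brand-new event with its own type and body.
function emit(streamId, eventType, body) {
  return append(streamId, { eventType, body });
}

// linkTo: writes only a pointer ("number@stream") to an existing event.
function linkTo(streamId, source) {
  const pointer = `${source.number}@${source.streamId}`;
  return append(streamId, { eventType: "$>", body: pointer });
}

// Resolve a stored entry to the event a reader would actually see.
function resolve(entry) {
  if (entry.eventType !== "$>") return entry;
  const at = entry.body.indexOf("@");
  const number = Number(entry.body.slice(0, at));
  const streamId = entry.body.slice(at + 1);
  return store.get(streamId)[number];
}

// An original event, a link to it, and an unrelated emitted event.
const placed = append("order-1", { eventType: "OrderPlaced", body: { total: 40 } });
linkTo("all-orders", { streamId: "order-1", number: placed });
emit("foo", "Greeting", { hello: "goodbye" });
```

In this model the link in `all-orders` resolves to the very same event stored in `order-1`, while the entry in `foo` is a new event that exists nowhere else.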

In most systems you are better off with many streams of few events each, which you then join into larger streams. There are, however, some that are better off with one large stream split into many small ones (the audit queue for NServiceBus, for example). I would need more details.

In general, when querying with projections you want many streams and a stream per result, as you can use .foreachStream on those queries, which will automatically parallelize them.
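As a rough illustration of why a stream per result parallelizes well, the sketch below folds events into one independent state per stream. It is a plain JavaScript model, not Event Store's implementation: `foldPerStream`, the `cart-*` stream names, and the event types are made up for the example. The handler object follows the `$init` plus per-event-type shape that projection handlers use.

```javascript
// Hypothetical model of a per-stream fold: each stream gets its own state,
// so no state is shared between streams.
function foldPerStream(events, handlers) {
  const states = new Map(); // streamId -> state for that stream only
  for (const event of events) {
    if (!states.has(event.streamId)) {
      states.set(event.streamId, handlers.$init());
    }
    const handler = handlers[event.eventType];
    if (handler) handler(states.get(event.streamId), event); // mutates that stream's state
  }
  return states;
}

const handlers = {
  $init: () => ({ count: 0 }),
  ItemAdded: (state) => { state.count += 1; },
};

const events = [
  { streamId: "cart-1", eventType: "ItemAdded" },
  { streamId: "cart-2", eventType: "ItemAdded" },
  { streamId: "cart-1", eventType: "ItemAdded" },
  { streamId: "cart-1", eventType: "CheckedOut" }, // no handler, ignored
];

const results = foldPerStream(events, handlers);
```

Because the state for `cart-1` never reads or writes the state for `cart-2`, the two folds could run on separate workers without coordination.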

Kasey_Speakman · March 11, 2014, 8:30pm

Regarding emit, what circumstance would cause me to use that versus linkTo? It seems like linkTo would be better in general.

Kasey_Speakman · March 11, 2014, 8:31pm

Never mind, emit would be for state projections.

Greg_Young1 · March 11, 2014, 8:33pm

Or a process manager or a feature-detecting projection.

Here is an example for you

http://geteventstore.com/blog/20130218/projections-4-event-matching/

another

http://geteventstore.com/blog/20130217/projections-intermission/

Greg

Kasey_Speakman1 · March 17, 2014, 6:12pm

I ended up with stream per aggregate for command streams also so I could query for commands run against a specific aggregate for debugging purposes. That’s about as fine-grained as I could think of for now.
“System streams” will be more for notifications, so I’m currently looking at a stream per client ID. The client will listen for things like CommandFailed for commands that it sent.
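The stream-per-client idea could be sketched roughly like this. It is a plain JavaScript model, not a real projection: `routeByClient`, the `notifications-<clientId>` naming scheme, and the `metadata.clientId` field are assumptions for illustration, not something the thread specifies.

```javascript
// Hypothetical routing model: group system events into one notification
// stream per client, keyed by a clientId carried in the event metadata.
function routeByClient(events) {
  const streams = new Map(); // target stream name -> events for that client
  for (const event of events) {
    const clientId = event.metadata && event.metadata.clientId;
    if (!clientId) continue; // no originating client, nothing to notify
    const target = `notifications-${clientId}`;
    if (!streams.has(target)) streams.set(target, []);
    streams.get(target).push(event);
  }
  return streams;
}

const systemEvents = [
  { eventType: "CommandSucceeded", metadata: { clientId: "a" } },
  { eventType: "CommandFailed", metadata: { clientId: "b" } },
  { eventType: "CommandFailed", metadata: { clientId: "a" } },
  { eventType: "CommandFailed", metadata: {} }, // skipped: no clientId
];

const perClient = routeByClient(systemEvents);
```

Each client would then subscribe only to its own stream and filter for the event types it cares about, such as CommandFailed.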