Event Store Scaling

I’m very new to Event Store, and event sourcing in general. I was hoping to get some questions answered about design patterns for system architecture and scaling.

First off, I am writing a social app which (I hope) will take off and require massive amounts of storage. I was hoping to host Event Store on Azure (possibly in Service Fabric), but I'm not entirely certain what the scaling ramifications are. In a clustered deployment, does adding more server instances add to the quorum count, or add more partitions of data? I don't want to scale up the instance count only to add latency to writes because all it's doing is adding more noise to the quorum. My primary concerns are data scale and latency. I don't have many benchmarks to share yet, but I was hoping for some general guidance on good production deployments.

Second, I am building microservices to facilitate the application architecture. Should I host a separate instance of Event Store per microservice? Furthermore, should I front the event subscriptions with service endpoints so that each microservice encapsulates the details of the underlying store?

Finally, I would imagine that, given a domain X, I would have a stream for each aggregate entity in X (X-123). Is there a way to consume all events for X instead of consuming each X-123 individually?

Hope this makes sense; I'm a bit lost/starstruck by event sourcing and need to be pointed in the right direction. Sorry if these are asked in other places; I just wanted a concise place to get answers.

Adding nodes to the cluster adds availability to the replica set. We have discussed adding sharding on top. At a basic level it's not that hard to implement (client side only will get you to ~16 partitions with some devops). There are trade-offs to sharding that I will mention later.
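A minimal sketch of what client-side sharding could look like, assuming a hypothetical list of per-partition cluster endpoints (the hostnames and port below are made up for the example). The key point is using a stable hash, so every client maps a given stream to the same partition:

```python
import hashlib

# Hypothetical endpoints -- in practice each entry would be a separate
# Event Store cluster (or connection) managed by your devops tooling.
PARTITION_COUNT = 16
NODES = [f"esdb-shard-{i}.internal:1113" for i in range(PARTITION_COUNT)]

def partition_for(stream_id: str) -> int:
    """Stable hash of the stream name -> partition index.

    Uses md5 rather than Python's built-in hash(), which is salted
    per-process and therefore not stable across clients.
    """
    digest = hashlib.md5(stream_id.encode("utf-8")).digest()
    return digest[0] % PARTITION_COUNT

def node_for(stream_id: str) -> str:
    """All events for a given aggregate land on the same partition."""
    return NODES[partition_for(stream_id)]
```

Because the stream name determines the partition, a single aggregate's stream stays on one node; the manual part is standing up the clusters and rebalancing if you ever change PARTITION_COUNT.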

"Second, I am building microservices to facilitate the application
architecture. Should I host a separate instance of Event Store per
microservice? Furthermore, should I front end the event subscriptions
with service endpoints so that each microservice encapsulates the
details of the underlying store? "

This is a series of trade-offs, not a yes-or-no answer. I feel less bad about sharing events between services than sharing structure. There are many things you can do in between as well, such as using stream prefixes/suffixes and/or adding ACLs on a single node. It really depends on things like scalability requirements, availability requirements, operational complexity, etc.
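As an illustration of the stream-prefix approach on a shared cluster, here is a sketch of one possible naming convention (the `service.aggregate-id` scheme is an assumption for the example, not anything Event Store mandates):

```python
# Hypothetical convention: each service writes only to streams carrying
# its own prefix, e.g. "billing.invoice-42". A single cluster can then
# be shared while ACLs (or plain discipline) keep services out of each
# other's streams.

def stream_name(service: str, aggregate: str, aggregate_id: str) -> str:
    """Build a per-service stream name under the assumed convention."""
    return f"{service}.{aggregate}-{aggregate_id}"

def owner_service(stream: str) -> str:
    """Extract the owning service from a prefixed stream name."""
    return stream.split(".", 1)[0]
```

A convention like this keeps the option open of later splitting one service's streams out to its own cluster, since ownership is recoverable from the stream name alone.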

"Finally, I would imagine that, given a domain X, I would have a stream for each aggregate entity in X (X-123). Is there a way to consume all events for X instead of consuming each X-123 individually?"

Yes: enable the $by_category projection and a stream $ce-X will be created containing all the events from all the Xs. Those events will also come in perfect order.
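To make that behaviour concrete, here is a toy model of what the $by_category projection does (a simulation only, assuming the default first-dash separator; the real projection emits link events into $ce-X rather than copying event data):

```python
from collections import defaultdict

def categorize(streams_log):
    """Toy model of the $by_category system projection.

    streams_log is the global event log: a list of (stream, event)
    pairs in write order. Events from every stream whose name matches
    "<category>-<id>" are linked into a "$ce-<category>" stream,
    preserving the order in which they were written.
    """
    category_streams = defaultdict(list)
    for stream, event in streams_log:
        if "-" in stream:
            category = stream.split("-", 1)[0]  # text before first dash
            category_streams[f"$ce-{category}"].append(event)
    return dict(category_streams)
```

So writes to X-123 and X-456 both surface in $ce-X, interleaved in global write order, which is exactly what a single subscription over the whole domain X gives you.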

That last part is one of the trade-offs to consider with sharding / Event Store per service, etc. The moment you try either, you will have multiple subscriptions to think about and you will lose the concept of a total ordering in the system. Ordering is a nice thing to be able to assume until you can't for other reasons.

Thanks Greg, this helps tremendously. I am curious: how many upfront decisions are 'unrecoverable from' or 'hard to recover from'? The heart of this question is about building software on top of a fairly new concept for a product. I'm not certain of the load characteristics yet, other than that I don't want to be unable to scale up/out.

I'm reading all the documentation, and I don't see a ton of literature on production deployments. I'm basically a two-dev shop, so offloading a lot of infrastructure management is super important to me. Are there any resources out there that walk through deployment scenarios?

On the commercial side we provide AWS-based Terraform scripts that set up a full cluster as well as the VPC, scaling group, bastion node, etc. But they are a starting point. Most of the setup is provider-specific stuff (like building private networks, handling DNS, availability groups, etc.).

We are looking into providing managed clusters shortly, though they would be in AWS to start.

Can you elaborate on the 16-partition strategy?

And as a second question: why 16 and not 256 or 4096?

Shard at the client. Remember that the management is manual at this point. Going to 256 or 4096 partitions you would really want more automation involved.

Hi Greg,

Do we need to go to the commercial version to have the sharding feature?

We are currently on Azure; is it possible to have it there?

Thanks,

Johnny