Sharding/multiple replica sets

There has been a lot of talk leading up to sharding/multi-tenancy with Event Store to support, say, up to 300 servers. If anyone has a particular need for this, could you please email me with some details? I am trying to prioritize when and how the work should be done. As of now, most systems can be relatively easily linearized on a single replica set, so it has been a low priority.

Do you need it for data size or throughput?

Do you need multiple replica sets for tenant isolation (e.g. multiple DBs), or do you need the ability to shard a single tenant across many servers?

What is the timeline you are looking at?

Are there any other requirements you may have?

Cheers,

Greg

We definitely have some important use cases that would require sharding. I will do some back-of-the-envelope calculations on Monday and get back to you.

We are looking at writing/storing 50,000 small events per second for about 20 years.

The events would not have to be stored in a single stream, but can be partitioned naturally into a few thousand streams.

We imagine the number of consumers being below a hundred.
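
A rough back-of-envelope on that rate (treating "a few thousand streams" as roughly 3,000, which is an assumption, not a figure from the thread):

    EVENTS_PER_SEC = 50_000
    SECONDS_PER_DAY = 86_400
    STREAM_COUNT = 3_000   # assumed stand-in for "a few thousand streams"
    YEARS = 20

    events_per_day = EVENTS_PER_SEC * SECONDS_PER_DAY    # ~4.3 billion events/day
    events_total = events_per_day * 365 * YEARS          # ~31.5 trillion events over 20 years
    events_per_stream = events_total / STREAM_COUNT      # ~10 billion events per stream

    print(f"{events_per_day:,} per day, {events_total:,} total, {events_per_stream:,.0f} per stream")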

What about elasticity: how do you want to scale out? Is a set number from the beginning good enough? Do you want to grow easily to, say, 100-500 machines without thought?

Along with this, do you need custom hash functions?

Cheers,

Greg

A set number would be fine.

Can you elaborate a bit on the custom hash function part?

E.g. do you want to handle hashing to a node yourself?

I could imagine that being very useful, yes. Then we could control the distribution of fast-growing streams.

You could do that either way; it's more along the lines of domain-specific sharding, e.g. all data for this month should have locality.
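
To make the two options concrete, here is a minimal sketch in Python of what pluggable routing could look like, assuming a fixed shard count; the function names and node count are illustrative only and not part of any Event Store API:

    import hashlib
    from datetime import datetime

    NODE_COUNT = 16  # hypothetical shard count

    def shard_by_stream(stream_id: str) -> int:
        # Generic routing: hash the stream id so a given stream always
        # lands on the same node (the kind of function you would override
        # to spread fast-growing streams yourself).
        digest = hashlib.sha1(stream_id.encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big") % NODE_COUNT

    def shard_by_month(event_time: datetime) -> int:
        # Domain-specific routing: every event from the same calendar month
        # maps to the same shard, so that month's data has locality.
        digest = hashlib.sha1(event_time.strftime("%Y-%m").encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big") % NODE_COUNT

    print(shard_by_stream("orders-4711"), shard_by_month(datetime(2013, 5, 17)))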

Say an event is 300 bytes; the sum is about 1 TB per day.
Did you implement it with EventStore?
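
Working that sum through (300-byte events at a sustained 50,000 per second), it comes out closer to 1.3 TB a day and roughly 9.5 PB over 20 years, before replication or indexes:

    EVENTS_PER_SEC = 50_000
    EVENT_BYTES = 300
    SECONDS_PER_DAY = 86_400

    bytes_per_day = EVENTS_PER_SEC * EVENT_BYTES * SECONDS_PER_DAY  # ~1.3 TB/day
    bytes_total = bytes_per_day * 365 * 20                          # ~9.5 PB over 20 years

    print(f"{bytes_per_day / 1e12:.2f} TB/day, {bytes_total / 1e15:.2f} PB over 20 years")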