Hello, I’m not working on anything in particular and I’m just going down the rabbit hole of understanding distributed systems and event sourcing for potential future projects.
I saw that there was a benchmark boasting the 15k writes per second and 50k reads per second (I think that’s right :P)
I was curious what happens in the case where you might scale internationally and may end up needing more than 15k writes per second. How do you scale eventstoredb past this? or was this just a benchmark on average hardware? I find it’s quite unclear.
Any discussion around this would be much appreciated!
You need a fast disk ESDB only scales up atm as it’s a replica set with a single leader.
On the other hand:
this is a benchmark.
Most Line of busnieses system don’t generate that much data.
Some quick maths:
15K / sec means 54 millions per hour.
That meand , with let’s say a 1Kb messages around half a TB per 8 hours
The problem then is one of storage
but more importantly consumptions: can your downstream processing ingest data fast enough to keep up with the ingress ?
your conusmerr need to always be able to consume faster than the rate of new events coming in
I do wonder what type of system you have that would need more than 15K / sec ?
I work in the gambling industry and as you can imagine, I’m certain markets which are highly regulated, traceability is a huge concern. If you consider a bet placed as an event which would need to be stored, you would be capped at 15k online players hitting “spin” on a slot game, because each spin would be classed as 2 events really, some sort of bet placed event and some sort of bet settled event. So if we hit over 15k active players at any time (we’ll really 7.5k since there would be 2 events), which from experience I know happens, then it would mean latency issues which to my knowledge here would be unresolvable without some hacky solutions in the front. At least that is my understanding of this anyway, please correct me if I’m wrong. I would love to have an excuse to try eventstoredb.
without going into to much details , for obvious reasons
We do have customers & users in this type of industry.
the type of information they track seems to be slightly different to what you explained and realtime , for big events, realtime : on the ingestion side and on the comsumption side . They also have multiple clusters for different parts/boundaries of the systems
Some insight into this would be useful in general, just how to achieve this at a much bigger scale than 15k writes per second. Like what type of trade-offs you could make.
One thing you can consider: sharding your data. You will lose the global ordering guarantee, but your domain may be tolerant to that.
1 Like
That’s the way. In betting that’s a totally fine approach. Think football vs basketball.
The most obvious way to get over 15K (which is an indicative figure anyway) is to get a bigger box with faster disks.
It will change for sure during the next 18 months.