Replicating events across datacenters

Hello,

I’m looking to run Event Store HA in multiple datacenters.

The requirement is that all events should be published to all datacenters.

Here are some approaches:

  1. Create an Event Store cluster that spans datacenters.

Con: too easy to lose half the cluster and the quorum with it.

  2. Create 1 cluster with multiple replication groups, 1 per datacenter.

Con: I can’t find in the documentation whether Event Store can do that.

  3. Put the initial event through a globally distributed queue. Then, in each datacenter, extract from the queue, publish to the local Event Store cluster, and do all other processing locally.

Cons:

  • more technologies involved

  • (aesthetic?) the command processor that generates the original events has to read from the local Event Store, but write to the queue.

  4. Write an event listener that copies missing events between datacenters.

a. Listen for all events in the local datacenter and publish them to others.

b. Listen for all events in other datacenters and publish them locally.

Cons:

  • have to be very careful not to get into infinite loops

  • may have to worry about message ordering

  5. Anything else?

Any recommendations or documentation I can read about this?

Thank you,

Igor.

Why is it too easy to lose your quorum?

If you have 2 datacenters and the link between them is not reliable.

That’s not how the replication works. Have you read http://geteventstore.com/blog/20130301/ensuring-writes-multi-node-replication/ ?

Run with three data centers; so long as any two can talk, you will have a quorum.
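The arithmetic behind that advice can be sketched as follows. This is a minimal illustration that assumes a simple majority quorum and one node per datacenter; `has_quorum` is a hypothetical helper, not an Event Store API.

```python
def has_quorum(reachable_nodes: int, cluster_size: int) -> bool:
    """A partition can accept writes only if it holds a strict majority."""
    return reachable_nodes > cluster_size // 2


# Two datacenters, one node each (assumed layout): a link failure
# leaves each side with 1 of 2 nodes, so neither side has a majority.
two_dc_after_split = has_quorum(reachable_nodes=1, cluster_size=2)

# Three datacenters, one node each: if any two can still talk,
# that pair holds 2 of 3 nodes and keeps the quorum.
three_dc_pair = has_quorum(reachable_nodes=2, cluster_size=3)
three_dc_isolated = has_quorum(reachable_nodes=1, cluster_size=3)

print(two_dc_after_split, three_dc_pair, three_dc_isolated)
```

With only two sites, any split leaves both halves without a majority, which is why a third location (even a small tie-breaker node) changes the picture.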

Yes, I have read it before and just read it again. I'm familiar with the model.

Unfortunately, there are only 2 data centers. :frowning:

Amazon node?

This is in-house. Can't go to Amazon. I wish I could.

Since I don't really need immediate consistency, even if I had more data centers, I would want each to stay in read/write mode even when it's disconnected from the rest of the world. I don't have any conflicting events.

Run a single node in each. Write a small program called a shovel that does a subscribe to all and pushes to the other node. Obviously there is no consistency at this point.

I’d run a cluster in each for reliability.

But basically, this is approach 4a above. Then I’d have to add the datacenter name to the metadata for each event, so that the shovel would only distribute locally generated events.

Thank you.

Or put the data center in metadata and check it in the shovel before sending, so you can have the same stream names to make other code simpler. The drawback of this style, of course, is that you will not have the same ordering and consistency.
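The shovel idea can be sketched roughly like this. It is an in-memory illustration only: `Store`, `Event`, `shovel` and the `origin_dc` metadata key are hypothetical names, and a real shovel would read from a subscribe-to-all on the actual Event Store client rather than from a Python list.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    stream: str
    data: str
    metadata: dict = field(default_factory=dict)


@dataclass
class Store:
    """In-memory stand-in for one datacenter's Event Store (hypothetical)."""
    datacenter: str
    events: list = field(default_factory=list)

    def append(self, event: Event) -> None:
        self.events.append(event)

    def write_local(self, stream: str, data: str) -> None:
        # Locally generated events are tagged with their origin datacenter.
        self.append(Event(stream, data, {"origin_dc": self.datacenter}))


def shovel(source: Store, target: Store, position: int = 0) -> int:
    """Copy events that originated in `source` to `target`.

    Returns the new read position so the caller can resume later.
    """
    for event in source.events[position:]:
        # Skip events that were themselves shovelled in from elsewhere;
        # forwarding them again is what would cause an infinite loop.
        if event.metadata.get("origin_dc") == source.datacenter:
            target.append(event)
    return len(source.events)


dc1, dc2 = Store("dc1"), Store("dc2")
dc1.write_local("orders", "order-created")
dc2.write_local("orders", "order-shipped")

pos_1_to_2 = shovel(dc1, dc2)
pos_2_to_1 = shovel(dc2, dc1)
# Running the shovels again from the saved positions copies nothing new.
pos_1_to_2 = shovel(dc1, dc2, pos_1_to_2)
pos_2_to_1 = shovel(dc2, dc1, pos_2_to_1)

print([e.data for e in dc1.events])  # ['order-created', 'order-shipped']
print([e.data for e in dc2.events])  # ['order-shipped', 'order-created']
```

Both stores end up with the same set of events under the same stream name, but in a different order, which is the ordering drawback mentioned above.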

Right. That’s exactly what I was saying. Thanks again.

Sorry to dig up an old thread, but it’s an interesting topic that I’m considering right now (running an EventStore across multiple regions). But compared to Igor, I would prefer to keep the ordering of events consistent across all DCs, although eventual consistency is somewhat acceptable.

My understanding of ES’s HA/multi-node is that all writes are processed by a single master to ensure consistency. I’m not sure this is sensible in a multi-region setup with high throughput, as each write would be routed to a node that, most of the time, is in a different region, inducing several hundred ms of latency, not to mention the latency to reach the quorum with distant nodes -> Q1: Am I right here?

I think that Igor’s option 3 (using something like Azure Service Bus Queues) sounds like a good way to get separate EventStore clusters eventually in the same state, but I’ve just realised that this would induce a similar write latency, as clients would need to push the log to an SB queue that is probably distant. The write would still be faster though, as there would not be an attempt to reach a cross-region quorum -> Q2: has anyone implemented such a setup? Any feedback if so?

Lastly, I was wondering if an implementation of an eventually consistent event store is something feasible. By that I mean a platform where, upon synchronisation between nodes, logs could be inserted before the last one if they happened earlier on a different node. I understand that the key here is temporal sync, but is this something that vector clocks would permit? -> Q3: please comment! :slight_smile:
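For context on Q3, here is a minimal sketch of what comparing vector clocks gives you. It assumes each event carries a per-datacenter counter map; `compare` and the example clocks are hypothetical and not part of Event Store.

```python
def compare(a: dict, b: dict) -> str:
    """Compare two vector clocks (maps of node name -> counter)."""
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    # Neither clock dominates: the events were written independently.
    return "concurrent"


e1 = {"dc1": 1}            # first write in dc1
e2 = {"dc1": 1, "dc2": 1}  # dc2 writes after having seen e1
e3 = {"dc1": 2}            # dc1 writes again without having seen e2

print(compare(e1, e2))  # before
print(compare(e2, e3))  # concurrent
```

Vector clocks yield only a partial order: events such as `e2` and `e3` are reported as concurrent, so producing one identical log in every datacenter would still need some agreed tie-break rule on top.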

Cheers

Thomas