Manually replicating EventStore

We've a scenario where we'd like to replicate a non-clustered EventStore node to a cluster. We've a component that reads the $all stream and writes those events to the replica. The specific scenario is that we'd like to replicate data to a cluster and then switch to using the cluster with minimal downtime. Some questions:

  • Currently, the replicator filters out event types starting with $. Does this make sense, or should all events be replicated? On the one hand, stats events seem superfluous; on the other, metadata events could be important? For example, deleted streams don't come through.

  • The replicator copies the event in its entirety, and provides an expected version when writing. It orders writes to individual streams, but parallelizes across streams. Is this acceptable? It seems to be working, but I’m not sure if it could cause issues.

  • We noticed that when using the built-in IEventStoreConnection.SubscribeToAllFrom function, when the replicator catches up and switches to push mode, we start running into WrongExpectedVersionException. If we leave it in pull/poll mode, we don’t see those issues. Could be an issue with our code, but I’m wondering if you’ve seen this before?

  • In using this process to switch to a cluster, we're looking to see if we can eliminate downtime. One option is to create a layer on top of the EventStore client that, in a similar vein to the EventStore client itself, takes care of switching to the async replica upon a trigger. In this case, the trigger could be an explicit request indicating the replica is caught up, rather than a heartbeat timeout. Is there a better way to do this with EventStore?

Thanks!

Let Event Store replicate. Configure 2 more nodes to point to the first one; they should catch up and become clones. Then just restart the nodes with cluster config. I have tested this approach going from a 3-node to a 5-node cluster.
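For concreteness, a rough sketch of what that could look like with EventStore 3.x-era command-line options (the IPs, ports, and paths here are illustrative, and exact flag names vary by version, so check against your release):

    # Phase 1 (illustrative): start two fresh nodes that gossip with the
    # existing single node, so they join and catch up as clones.
    EventStore.ClusterNode.exe --db=./node2 --discover-via-dns=false --gossip-seed=10.0.0.1:2112
    EventStore.ClusterNode.exe --db=./node3 --discover-via-dns=false --gossip-seed=10.0.0.1:2112

    # Phase 2: once they are caught up, restart all three nodes with the
    # full cluster configuration, e.g. on the original node:
    EventStore.ClusterNode.exe --db=./node1 --cluster-size=3 --discover-via-dns=false --gossip-seed=10.0.0.2:2112,10.0.0.3:2112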

  • Currently, the replicator filters out event types starting with $. Does this make sense, or should all events be replicated? On the one hand, stats events seem superfluous; on the other, metadata events could be important? For example, deleted streams don't come through.

For soft deletes, yes, this is important. Hard deletes are not implemented through an event at this time. Stats can be scavenged on the target node anyway.
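To make the soft-delete point concrete, a minimal sketch (assuming the EventStore.ClientAPI .NET client; the stream name is made up). A soft delete produces no ordinary event; it surfaces as a $metadata write to the stream's $$-prefixed metadata stream, which a blanket $-filter would skip:

    using System;
    using System.Threading.Tasks;
    using EventStore.ClientAPI;

    public static class SoftDeleteDemo
    {
        public static async Task Run(IEventStoreConnection conn)
        {
            // Soft delete: no tombstone event, just a $metadata write to "$$orders-42".
            await conn.DeleteStreamAsync("orders-42", ExpectedVersion.Any, hardDelete: false);

            // The delete is only visible through the stream's metadata
            // ($tb / truncate-before), which lives in "$$orders-42".
            var meta = await conn.GetStreamMetadataAsync("orders-42");
            Console.WriteLine(meta.StreamMetadata.TruncateBefore);
        }
    }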

  • The replicator copies the event in its entirety, and provides an expected version when writing. It orders writes to individual streams, but parallelizes across streams. Is this acceptable? It seems to be working, but I’m not sure if it could cause issues.

Given a stop/restart of the writer, you should fall back on idempotent writes and it should be OK.
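A minimal sketch of the copy step that makes this work (assuming the EventStore.ClientAPI .NET client; the class and method names are made up). Keeping the original EventId and pinning the expected version is what lets a restarted replicator safely re-append events it already wrote:

    using System.Threading.Tasks;
    using EventStore.ClientAPI;

    public static class Replicator
    {
        public static Task CopyEvent(IEventStoreConnection target, ResolvedEvent resolved)
        {
            var source = resolved.Event;

            // Preserve the original EventId: Event Store treats a re-append of
            // the same id at the same expected version as an idempotent no-op.
            var copy = new EventData(
                source.EventId,
                source.EventType,
                source.IsJson,
                source.Data,
                source.Metadata);

            // To write event number N, the target stream must currently be at N - 1.
            return target.AppendToStreamAsync(
                source.EventStreamId,
                source.EventNumber - 1,
                copy);
        }
    }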

  • We noticed that when using the built-in IEventStoreConnection.SubscribeToAllFrom function, when the replicator catches up and switches to push mode, we start running into WrongExpectedVersionException. If we leave it in pull/poll mode, we don’t see those issues. Could be an issue with our code, but I’m wondering if you’ve seen this before?

Without anything more than “we see wrong expected version” I can’t say anything intelligent on this. I would need more information.

  • In using this process to switch to a cluster, we’re looking to see if we can eliminate downtime. One option is to create a layer on top of the EventStore client, and in a similar vein to the EventStore client itself, take care of switching to the async replica upon a trigger. In this case, the trigger could be an explicit request indicating the replica is caught up, rather than a heartbeat timeout. Is there a better way to do this with EventStore?

I don't understand what you are asking. Are you looking for "how to tell if a replica is caught up"?


The overall question is how to transition a single node EventStore to a cluster without downtime. One issue is that projections don't work in a cluster, so as part of this I would like to also replicate to a single node that will host projections. So the goal is to transition a single node to a cluster of 3 plus a projection node.

I've a replicator as described above and was planning to use it to replicate from the single node to the cluster, and then from the cluster to the projection node. Once everything is caught up, all services can point to the cluster and the projection node.

The question is about the best way to make this transition. Would @Phil's suggestion work if we start the cluster and point it at the single node? If that gets the cluster going, we can get the projection node going with the replicator, and downtime won't be an issue there since services dependent on projections are usually a bit behind anyway.
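To be concrete about the switching layer mentioned above, something like this is what I have in mind (a sketch assuming the EventStore.ClientAPI .NET client; SwitchableConnection and SwitchTo are made-up names):

    using System.Threading.Tasks;
    using EventStore.ClientAPI;

    // Wrap the subset of IEventStoreConnection the services use and swap the
    // underlying connection on an explicit trigger. In-flight calls keep the
    // connection they started with; new calls go to the replacement.
    public sealed class SwitchableConnection
    {
        private volatile IEventStoreConnection _current;

        public SwitchableConnection(IEventStoreConnection initial) => _current = initial;

        // The explicit trigger: called by the cut-over process once the
        // cluster/replica is confirmed caught up.
        public void SwitchTo(IEventStoreConnection replacement) => _current = replacement;

        public Task<WriteResult> AppendToStreamAsync(string stream, long expectedVersion, params EventData[] events)
            => _current.AppendToStreamAsync(stream, expectedVersion, events);

        // ...mirror the other members the services actually use.
    }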

"The overall question is how to transition a single node EventStore to
a cluster without downtime."

Can't be done theoretically or in practice. There will always be some period of downtime. This is due to needing to reach consensus. There is at least an election required to determine what the size of the cluster is.

"One issue is that projections don't work in a cluster, so as part of
this I would like to also replicate to a single node that will host
projections."

You mean occasionally there is an issue during a failover from one node to another, or? Projections are intended to switch between nodes, and did the last time I checked.

There is also a change in dev that will likely help here: the ability to create, through replication, a node that is purely a clone.

If you made a node non-promotable (clone only) and only ran projections on it, I think it would give you most of what you discuss above (projections running on a warm replica), and it would do so without any replicators etc.

"Can't be done theoretically or in practice. There will always be some period of downtime. This is due to needing to reach consensus. There is at least an election required to determine what the size of the cluster is."

I meant downtime after taking this into account. Switching between leaders will result in downtime, but it can be masked by an automated retry.
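Something like this, for example (a sketch; the helper name and retry parameters are made up, and WrongExpectedVersionException is deliberately not retried since it signals a real conflict rather than an election in progress):

    using System;
    using System.Threading.Tasks;
    using EventStore.ClientAPI.Exceptions;

    public static class Retry
    {
        // Mask the brief unavailability of a leader election by retrying the
        // failed operation; the client connection itself reconnects to the
        // new leader in the meantime.
        public static async Task<T> WithRetry<T>(Func<Task<T>> operation, int maxAttempts = 5)
        {
            for (var attempt = 1; ; attempt++)
            {
                try
                {
                    return await operation();
                }
                catch (Exception ex) when (attempt < maxAttempts && !(ex is WrongExpectedVersionException))
                {
                    // Treat the failure as transient; back off and retry.
                    await Task.Delay(TimeSpan.FromMilliseconds(200 * attempt));
                }
            }
        }
    }

Usage would be e.g. await Retry.WithRetry(() => conn.AppendToStreamAsync(stream, expectedVersion, events)); retrying an append is safe here because writes with the same event ids are idempotent.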

You mean occasionally there is an issue during a failover from one

node to another or? Projections are intended to switch between nodes
and do the last time I checked.

The behavior that we observed was that projections would start to work in a cluster, but (it seems) when a switch occurred they wouldn’t recover properly in most cases.

The change I mention is not in dev; it's on a branch and needs testing:
https://github.com/eventstore/eventstore/tree/nonpromotable

Is there a list of "system" streams that should always be excluded from the replication?

Any stream that starts with a $, and any metadata stream for a system stream (so those starting with $$$). There is no explicit list, as it's configuration dependent (you end up with a stats stream per node, with the IP address and port as part of the stream name).

Also, if you run custom projections, you may want to exclude their output.

Yes, but you may not want to exclude everything with $ and $$$, as soft deletes, for example, live in $$ streams (the metadata streams of user streams).
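Putting the above together, the filter could look something like this (a sketch; the names are made up): drop system streams and the metadata of system streams, but keep the metadata of user streams so soft deletes still replicate.

    public static class ReplicationFilter
    {
        public static bool ShouldReplicate(string streamId)
        {
            if (!streamId.StartsWith("$")) return true;   // ordinary user stream
            if (streamId.StartsWith("$$$")) return false; // metadata of a system stream
            if (streamId.StartsWith("$$")) return true;   // metadata of a user stream (soft deletes, ACLs)
            return false;                                 // system stream ($stats-..., projections, etc.)
        }
    }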