EventStore copying and event ordering

Hi!

We are using a get-all subscriber to one ES cluster to get all events,
and then write them into another one. However, there seems to be
reordering going on when writing to the new cluster, which is bad. Is
there a way to ensure that the order in which events are written to a
connection is the same order in which they are stored? I could wait
for each event to be individually written, but that would be crazy
slow. So basically, batched ordered writes are what I'm asking for.

/Rickard

For writes from a single connection, see the "better ordering"
command-line option. The reordering can happen because some operations
are asynchronous internally.

Your subscription will have perfect ordering, and if you read and
write one event at a time you will also have perfect ordering. Once
you issue async writes with many running concurrently, ordering
between them is not assured unless you run with better-ordering.

Hi Greg!

Thanks! Unfortunately I'm using the JVM client. What I would do with
other databases is to open a transaction, write 1k of updates, then
commit and wait on that to finish, before doing another batch. Is
there any similar concept I can use here? I know there's a batch write
in the HTTP API. Anything in the JVM client?

/Rickard
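For illustration, the "write a batch, wait for it, then send the next batch" pattern described above could be sketched in Java roughly like this. This is a minimal sketch, not the EventStore JVM client API: `BatchedOrderedWriter`, `writeInOrder` and the `writeBatch` function are hypothetical names, and `writeBatch` stands in for whatever call sends one batch and completes when the server acknowledges it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Hypothetical sketch; none of these names come from the EventStore JVM client.
final class BatchedOrderedWriter {

    // Splits events into batches of at most batchSize and writes them one
    // batch at a time. writeBatch is an assumed stand-in for a call that
    // sends one batch and completes once the server has acknowledged it.
    static <E> int writeInOrder(List<E> events, int batchSize,
                                Function<List<E>, CompletableFuture<Void>> writeBatch) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        for (int from = 0; from < events.size(); from += batchSize) {
            int to = Math.min(from + batchSize, events.size());
            List<E> batch = new ArrayList<>(events.subList(from, to));
            // Block until this batch is acknowledged, so only one request
            // is ever outstanding on the connection.
            writeBatch.apply(batch).join();
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> events = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            events.add(i);
        }
        // Fake "server" that appends each batch on another thread.
        List<Integer> stored = new ArrayList<>();
        int batches = writeInOrder(events, 1000, batch ->
            CompletableFuture.runAsync(() -> {
                synchronized (stored) {
                    stored.addAll(batch);
                }
            }));
        System.out.println(batches + " batches, order preserved: " + stored.equals(events));
    }
}
```

The cost is one round trip per batch rather than per event, so the batch size is the knob for trading throughput against how much has to be retried if a batch fails.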

It is a setting on the cluster node.

Ok, I read the docs for that setting, but not sure if that will help
me. Isn't it the client side that needs to ensure that events are
written in the correct order? I'm currently using the ask(connection,
WriteEvents, timeout) method to do the write. Are you saying I can
invoke that and the write order will be exactly the same as the order
in which I invoke that method?

Again, what about transactions? Would that be of any help, or does it
not help with ensuring write order?

Also, I only need write ordering when I'm setting up the cluster and
seeding it from another one (blue/green deployment). During normal
operations I don't care about write order, so a cluster node setting,
if it indeed does the ordering, is not ideal.

regards, Rickard

I would need to test the semantics of the client, but I believe it
just queues them internally and keeps them in order, the same as the
.NET client. The server-side setting has to do with authentication and
trying to keep things in order.

Ok, that is interesting. Then I can test that setting.

So basically:
* turn on better-ordering on the server
* have the client just do the ask() call, without waiting for it to finish
* the client library will send events to the server in the same order as
  ask() was invoked, and the server will try to preserve that order

Sound about right?

regards, Rickard

If you only have 1 outstanding request at a time it will always be in
order. The setting has to do with multiple concurrent requests from
the same connection.
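As a rough illustration of "only one outstanding request at a time" without blocking the caller, writes can be chained so that each one is sent only after the previous one has completed. This is a sketch under assumptions: `SerializedWriter` and the `send` function are hypothetical and stand in for the real client call; nothing here is taken from the EventStore JVM client or Akka.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Hypothetical sketch: serializes async writes so that at most one is in flight.
final class SerializedWriter<E> {
    // Assumed stand-in for the real write call; completes on acknowledgement.
    private final Function<E, CompletableFuture<Void>> send;
    // Completion of the most recently enqueued write.
    private CompletableFuture<Void> tail = CompletableFuture.completedFuture(null);

    SerializedWriter(Function<E, CompletableFuture<Void>> send) {
        this.send = send;
    }

    // Enqueues a write without blocking. The event is only sent once the
    // previous write has completed. If a write fails, later writes in the
    // chain fail too instead of overtaking it.
    synchronized CompletableFuture<Void> write(E event) {
        tail = tail.thenCompose(ignored -> send.apply(event));
        return tail;
    }

    public static void main(String[] args) {
        List<Integer> stored = Collections.synchronizedList(new ArrayList<>());
        SerializedWriter<Integer> writer = new SerializedWriter<>(e ->
            CompletableFuture.runAsync(() -> stored.add(e)));
        CompletableFuture<Void> last = CompletableFuture.completedFuture(null);
        for (int i = 0; i < 1000; i++) {
            last = writer.write(i);
        }
        last.join(); // wait only once, at the end
        System.out.println("stored " + stored.size() + " events");
    }
}
```

This keeps the ordering guarantee of a single outstanding request, but it still pays one round trip per write; it only avoids blocking the calling thread.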

That's weird. We just did a blue/green deployment, where one new
cluster was seeded with events from our main production cluster. So a
client that subscribes to the main one, and then saves to the new one,
and then we switch over when it's all done. We got all the events, but
they were out of order. That's why I was thinking that the issue would
be that my code does ask() on write but doesn't wait for it to return
with write confirmation, and so either on client or server there was
some reordering going on. Today I've had to do the same thing again
(from the now broken production cluster) into a new environment, this
time reordering the events back into their original order (thankfully
each event has the original write position, which I could use for a
sorting sliding window) and writing only one event at a time, waiting
for confirmation before sending the next one. Super slow, but it seems
to work. There is a reordering in that pipeline somewhere, and it
would be good to know exactly where, so I can make it faster while
being safe at the same time. Any ideas at all would be appreciated.

/Rickard
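A sorting sliding window of the kind mentioned above could look something like the following. This is only a sketch: `SlidingWindowSorter` is a hypothetical name, the `position` function stands in for however the original write position is read from an event, and the result is only correct if no event has been displaced by more than the window size.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.function.ToLongFunction;

// Hypothetical sketch: restores order by original position using a bounded buffer.
final class SlidingWindowSorter<E> {
    private final int windowSize;
    private final PriorityQueue<E> window;

    // position is an assumed accessor for the event's original write position.
    SlidingWindowSorter(int windowSize, ToLongFunction<E> position) {
        if (windowSize < 1) {
            throw new IllegalArgumentException("windowSize must be at least 1");
        }
        this.windowSize = windowSize;
        this.window = new PriorityQueue<>(Comparator.comparingLong(position));
    }

    // Buffers the event; once the window overflows, releases the buffered
    // event(s) with the lowest position.
    List<E> offer(E event) {
        window.add(event);
        List<E> released = new ArrayList<>();
        while (window.size() > windowSize) {
            released.add(window.poll());
        }
        return released;
    }

    // Releases everything still buffered, lowest position first.
    List<E> flush() {
        List<E> released = new ArrayList<>();
        while (!window.isEmpty()) {
            released.add(window.poll());
        }
        return released;
    }

    public static void main(String[] args) {
        SlidingWindowSorter<Long> sorter = new SlidingWindowSorter<Long>(3, Long::longValue);
        List<Long> out = new ArrayList<>();
        for (long p : new long[] {3, 1, 2, 5, 4, 6}) {
            out.addAll(sorter.offer(p));
        }
        out.addAll(sorter.flush());
        System.out.println(out); // prints [1, 2, 3, 4, 5, 6]
    }
}
```

A larger window tolerates larger displacements at the cost of more buffering; an event that arrives later than the window allows will still come out in the wrong place.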

So there is reordering that can happen in the server via
authentication. The --better-ordering option takes away most of the
cases for this. There could also be some in the JVM client, but I have
not tested it. Note that this still does not assure perfect ordering,
as you can get things like a timeout on a request (you put in a bunch
of requests after it ...)

So what about the transaction thing? If I can bunch up 1K events as a
tx, send it to server, synchronously wait, and then do the next batch,
that should be safe.

/Rickard

I think just setting --better-ordering will solve it for you (except
for retries) but transactions likely will end up with an issue there
as well.

So, I tried setting BetterOrdering: true and changed my client to not
wait for write confirmation on the ask() call. I'm still seeing
reordering of events, so no luck :(

/Rickard

That must be in the JVM client then (unless you are getting timeouts).

Yes, I agree, I think it's an Akka thing. Incredibly annoying. I can
replicate it locally, with the only change being whether I wait for
the write ack or not. No ack: reordering. Ack: no reordering. So Akka
must have some internal randomness in how it processes the ask()
calls. Are there any conf settings for this?

/Rickard

ps. Before this I absolutely detested Scala and anything built on it,
and this doesn't exactly help.

Created a minimal test (still using internal code, so can't post it on
GH), and yeah, seems like the client does reordering:
https://github.com/EventStore/EventStore.JVM/issues/68

/Rickard
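For anyone wanting to reproduce this kind of check, the verification half of such a test can be very small: write events carrying an increasing sequence number, read them back, and look for the first place where the sequence goes backwards. The sketch below covers only that check; `OrderCheck` and `firstOutOfOrder` are hypothetical names, and how the sequence numbers are written and read back through the client is left out.

```java
import java.util.List;

// Hypothetical sketch: detects reordering in a list of sequence numbers
// read back from the target cluster.
final class OrderCheck {

    // Returns the index of the first element that is not greater than its
    // predecessor, or -1 if the sequence is strictly increasing.
    static int firstOutOfOrder(List<Long> sequenceNumbers) {
        for (int i = 1; i < sequenceNumbers.size(); i++) {
            if (sequenceNumbers.get(i) <= sequenceNumbers.get(i - 1)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Events 2 and 3 were swapped somewhere in the pipeline.
        System.out.println(firstOutOfOrder(List.of(0L, 1L, 3L, 2L, 4L))); // prints 3
    }
}
```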