Write performance of ordered event batches over multiple streams

We are generating large batches of events where each event belongs to a different stream. What is the most performant way to write the events in the order they were generated, while ensuring the $all stream reflects that ordering? I could not find a client API to write batches of events across multiple streams.

James,

How are you able to write a batch of events to multiple streams?

Yes.

  • A large number of events are produced in many “batches”
  • One or more events in each “batch” may belong to any given stream, but which stream each event belongs to is effectively random
  • The objectives are to maximise write throughput and to ensure $all reflects the order the events were created across all “batches”
  • I’m wondering if it is possible to write large “batches” of events
  • From what I can find, the client API supports appending 1-n events in the context of a single stream (see the sketch just after this list)
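
For reference, here is a minimal sketch of that per-stream append using the Node.js gRPC client; the connection string, stream name and event shape are illustrative, not taken from the original question:

```typescript
import { EventStoreDBClient, jsonEvent } from "@eventstore/db-client";

// Illustrative connection string; adjust to your deployment.
const client = EventStoreDBClient.connectionString(
  "esdb://localhost:2113?tls=false"
);

// The client lets you append 1-n events in one round trip,
// but only to a single stream per call.
const events = [
  jsonEvent({ type: "SomethingHappened", data: { value: 1 } }),
  jsonEvent({ type: "SomethingHappened", data: { value: 2 } }),
];

await client.appendToStream("some-stream", events);
```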

Have thought about using multiple writers, but this wouldn’t preserve the ordering of $all.

James,

Struggling to get my head round this.
How are your batches initially created? Are they being read from an EventStore to start with, then batched up, then rewritten?

  • New events are created en masse by a process
  • There is no domain concept of “batch” per se; it’s just an expression I use to mean writing a large block of the new events in one round trip
  • The event store may already have a stream that any given event may logically belong to

In many ways, what I want to do is append to the $all stream (which you cannot do) or append an EventRecord to the transaction file.

What is the business goal of what you are doing?

The goal is copying events from one ES to another.
I see BatchAppend has been added, which looks to be what I was looking for.

Have a look at https://replicator.eventstore.org/

BatchAppend is only implemented on the server side and in the proto contract, but hasn’t been included in any of the client APIs. In any case, it will behave similarly to the current append to stream with multiple events provided. The correlation id, however, seems to be a single value per batch, and it moved from the proposed event to the batch level itself.

However, you still can only append events to a single stream, batched or not batched.

When I was building the Replicator, I considered optimising for same-stream sequences. You don’t explicitly need a batch to append multiple events in one network call, and the network round trip is what takes the time; on the server, writes to disk are quite fast.
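
To make that concrete, here is a hypothetical sketch of what such coalescing could look like. This is not Replicator code; the `Sourced` shape and the helper are my own illustration:

```typescript
import { EventStoreDBClient, jsonEvent } from "@eventstore/db-client";

// Hypothetical shape of an event read from the source store.
interface Sourced {
  stream: string;
  type: string;
  data: any; // JSON-serialisable payload
}

// Coalesce consecutive events targeting the same stream into a single
// appendToStream call, so each run costs one network round trip.
async function writeCoalesced(client: EventStoreDBClient, events: Sourced[]) {
  let i = 0;
  while (i < events.length) {
    const stream = events[i].stream;
    let j = i;
    while (j < events.length && events[j].stream === stream) j++;

    const run = events
      .slice(i, j)
      .map((e) => jsonEvent({ type: e.type, data: e.data }));
    await client.appendToStream(stream, run);
    i = j;
  }
}
```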

Still, across the majority of use cases it is quite rare for multiple events to be produced for one stream at once so that they appear one after another in $all. Therefore, this optimisation is not included. Unless the client system is designed in a topic-subscription broker style, it won’t produce any visible effect.

That’s why Replicator has more advanced partitioning, filtering and transformation features. Filtering out unnecessary events, and partitioning writes so we can have multiple concurrent writers, are the main performance winners.
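
As an illustration of the partitioned-writer idea (my own sketch, not the Replicator source; the hash choice and partition count are assumptions): partitioning by stream name preserves per-stream order while letting partitions write concurrently, at the cost of global $all order across partitions.

```typescript
import { createHash } from "node:crypto";
import { EventStoreDBClient, jsonEvent } from "@eventstore/db-client";

type Evt = { stream: string; type: string; data: any };

// Derive a stable partition from the stream name, so all events of a
// given stream always go through the same writer.
function partitionOf(stream: string, partitions: number): number {
  const digest = createHash("md5").update(stream).digest();
  return digest.readUInt32BE(0) % partitions;
}

async function writePartitioned(
  client: EventStoreDBClient,
  events: Evt[],
  partitions = 4
) {
  // Route each event to its partition's queue, keeping source order
  // within every queue.
  const queues: Evt[][] = Array.from({ length: partitions }, () => []);
  for (const e of events) queues[partitionOf(e.stream, partitions)].push(e);

  // One sequential writer per partition; partitions run concurrently.
  await Promise.all(
    queues.map(async (queue) => {
      for (const e of queue) {
        await client.appendToStream(
          e.stream,
          jsonEvent({ type: e.type, data: e.data })
        );
      }
    })
  );
}
```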