Tombstoning and recreating deleted streams

Bryan_Watts · January 11, 2014, 4:45pm

I am writing a feature that allows our development and QA teams to restore an EventStore instance to a known state. This is highly desired for both functional and performance testing.

We have a stream per aggregate, process manager, and projection. A “state” of the domain is simply this set of streams, with their events, at a certain point in time. I already have all the mechanisms in place to record the stream states, which was easy.

My plan was, for each stored stream, to delete the existing one and recreate it by appending the stored events. However, I quickly encountered the “Event stream ‘…’ is deleted” error.

In reading up, I understand the issues with HTTP caching and tombstoning. I also understand the recommended solutions of creating new streams with different names.

However, the IDs we are using are present in message headers and other places, and it would be a nightmare to try and migrate them. That simply isn’t an option, nor is it semantically valuable.

I would like to manipulate EventStore streams at a mechanical level. Is this possible with the current binaries or any available build?

Thanks for any help!

Bryan

Greg_Young1 · January 11, 2014, 4:50pm

Have you seen truncate before? It works as you describe; the only difference is that your new stream will start at, say, 37 instead of zero. You enable it by setting

$tb : number

in the stream metadata. It also gets around all the caching issues.
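
For illustration, here is a minimal Python sketch of a metadata document carrying this setting. Only the `$tb` key comes from the post above. The event envelope (`eventId`, `eventType`, `data`) and the `$metadata` type name are assumptions about how a metadata write is packaged, and the helper names are hypothetical, so check them against your EventStore version before relying on them.

```python
import json
import uuid


def truncate_before_metadata(position: int) -> dict:
    """Return stream metadata asking reads to start at `position`."""
    if position < 0:
        raise ValueError("truncate-before position must be non-negative")
    return {"$tb": position}


def metadata_event(position: int) -> dict:
    """Wrap the metadata in an event envelope (envelope shape is an assumption)."""
    return {
        "eventId": str(uuid.uuid4()),
        "eventType": "$metadata",  # assumed type name for metadata events
        "data": truncate_before_metadata(position),
    }


# Serialize a one-event batch that sets truncate-before to 37.
payload = json.dumps([metadata_event(37)])
```

The sketch only builds the payload; how it is sent (client API or HTTP) is left out on purpose.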

Greg

Bryan_Watts · January 11, 2014, 5:01pm

I hadn’t seen that, though I think I read a suggestion along those same lines in prior posts.

I infer that this results in a stream with the “old” and “new” events, and “$tb” is a marker telling EventStore when “new” starts.

Is that right?

jen20 · January 11, 2014, 5:15pm

Yes. What it means, though, is that your streams might not start at event number 0.

Greg_Young1 · January 11, 2014, 5:16pm

Exactly. Truncate before says that events prior to the given position are eligible for scavenging.

Yuri_Solodkyy · January 11, 2014, 5:22pm

And any reads will not return events before $tb
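
A toy in-memory model of that read behaviour, as a sketch only: it is not EventStore code, and `read_stream_forward` is a hypothetical helper. It assumes event numbers are simply list indexes.

```python
from typing import List, Tuple


def read_stream_forward(
    events: List[str], truncate_before: int = 0
) -> List[Tuple[int, str]]:
    """Return (event_number, event) pairs, skipping numbers below truncate_before."""
    return [
        (number, event)
        for number, event in enumerate(events)
        if number >= truncate_before
    ]


# Physical stream: two "old" events followed by two re-appended "new" ones.
physical = ["old-a", "old-b", "new-a", "new-b"]
visible = read_stream_forward(physical, truncate_before=2)
```

Here `visible` holds only the events numbered 2 and 3, and they keep their original event numbers rather than being renumbered from 0.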

Bryan_Watts · January 11, 2014, 5:27pm

That’s perfect. It allows me to maintain a logical stream within the physical stream, as long as I incorporate the value of $tb into any reads for event replay.
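
Roughly what I have in mind, sketched in Python. The class and function names are hypothetical, and it assumes the logical stream always starts exactly at $tb.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalStream:
    """Maps logical event numbers (from 0) onto a truncated physical stream."""

    truncate_before: int  # current $tb of the physical stream

    def to_physical(self, logical_number: int) -> int:
        if logical_number < 0:
            raise ValueError("logical event numbers start at 0")
        return self.truncate_before + logical_number

    def to_logical(self, physical_number: int) -> int:
        if physical_number < self.truncate_before:
            raise ValueError("event lies before $tb and is no longer readable")
        return physical_number - self.truncate_before


def truncate_before_for_restore(last_event_number: int) -> int:
    """$tb that hides everything written so far: one past the last event."""
    return last_event_number + 1


# A stream whose last event is number 36 restarts at 37 after a restore.
restored = LogicalStream(truncate_before_for_restore(36))
first_physical = restored.to_physical(0)
```

Replay code would then read from `restored.to_physical(0)` onward and subtract $tb wherever it needs a zero-based version.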

Side question, but relevant in context: has anyone run into issues with event positions being capped at ~2 billion (int32)? I decided it wasn’t a concern with normal system load, but advancing the stream N positions on every “restore” could make that untenable.

Thanks again.

Greg_Young1 · January 11, 2014, 5:58pm

How many events per restore? How many restores/day?
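
Those two numbers bound the problem. A back-of-the-envelope Python sketch, assuming positions really are capped at int32 and that every restore re-appends the same number of events (normal appends between restores are ignored, so this is an upper bound). The function names are hypothetical.

```python
INT32_MAX = 2**31 - 1  # highest event number if positions are int32


def restores_before_cap(events_per_restore: int, next_event_number: int = 0) -> int:
    """How many full restores fit before event numbers run out."""
    if events_per_restore <= 0:
        raise ValueError("events_per_restore must be positive")
    remaining_numbers = INT32_MAX - next_event_number + 1
    return max(remaining_numbers, 0) // events_per_restore


def years_before_cap(
    events_per_restore: int, restores_per_day: float, next_event_number: int = 0
) -> float:
    """Rough lifetime of one stream at a steady restore rate."""
    restores = restores_before_cap(events_per_restore, next_event_number)
    return restores / restores_per_day / 365
```

For example, `years_before_cap(1000, 10)` estimates how long a stream lasts when it is restored ten times a day with 1,000 events each time.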

Bryan_Watts · January 13, 2014, 3:08pm

There are three scenarios in which we restore:

1. We instance EventStore per developer, so they may choose their own arbitrary restore points (extremely useful during development)
2. QA deployments
3. Performance testing

2 is a non-issue, as we always start from scratch on those deployments. 3 is a massive set of events, but again we can always start from scratch on those deployments.

1 is the most unknown of the three; developers can do whatever they want, and I wasn’t sure how to bound it. The only constraint is the event number cap, which is why it received my attention.

I have a scenario which is further down the line, but will put much more pressure on this particular aspect. I’m curious to hear everyone’s thoughts.

I envision a DVCS-like system, with the ability to “branch” and “merge” streams. This would all be part of the runtime of the system, so we don’t have the luxury of starting from scratch. EventStore’s model is absolutely perfect for this, sans that feeling of barreling toward a brick wall of event positions.