Snapshotting - sensible defaults

Yves_Reynhout · June 5, 2013, 5:15pm

Hi,

I’ve read the thread on snapshotting and gathered that the EventStore way of handling snapshots is to use an extra stream (that is, if I want to use the EventStore as my snapshot storage facility). Now, I’m trying to incorporate this guidance in AggregateSource [AS] (a little pet project that integrates with EventStore). I’ve come up with a couple of abstractions that will probably serve X% of the use cases out there, but I have some holes that need filling, which is why I’m asking here.

For one I should be able to rehydrate an aggregate’s root from a snapshot. So I added an ISnapshottable { void RestoreSnapshot(object snapshot); object TakeSnapshot(); } - one of many approaches I could have chosen - that an aggregate’s root must implement. No sweat.
When the time comes to load up an aggregate, I resolve the snapshot stream name using an IStreamNameResolver { string Resolve(string identifier); }, which is just an easy way for people to impose their own convention. Now, as I start reading the snapshot stream, it could be empty, deleted, not found, or found. I’m using the EventStoreConnection’s ReadStreamEventsBackward since I presume people will only want to read the most recent snapshot. [Q1] What do I pass in as a value for the start parameter? Int32.MaxValue or 1?
Assuming I get a slice of the stream back, I’ll have a snapshot (the sole event in that slice) that I’ll be able to deserialize using my IResolvedEventDeserializer { object Deserialize(ResolvedEvent resolvedEvent); }. [Q2] Do people ever upgrade their snapshots the way one has an upconverter for events? Intuitively, I’m inclined to say “just store a new snapshot” and not provide any support for such a thing. OTOH, if you were to deploy newer and older code side-by-side, that could prove useful, but I doubt people would be using AS for that. I’d expect their integration with the EventStore to be much tighter than what I’m offering. Kind of a rhetorical question.
Now, I’ve gotten my snapshot and restored it into the aggregate’s root. [Q3] Where does one usually put ‘version’ information? By version I don’t mean the version of the snapshot’s schema; I mean the ‘version’ of the stream/aggregate this snapshot represents, from which point onwards events should be read. I would assume that it would go either in the payload itself or in the metadata of the event. Alternatively, I could provide a wrapper around the actual snapshot in the form of SnapshotInfo { Int32 Version { get; } Object Snapshot { get; } }. As it is, I’m inclined to provide a SnapshotVersionSelector { Int32 Select(ResolvedEvent resolvedEvent); }, which will allow a decent amount of flexibility (I may have to tune it a bit if deserialization were to happen twice, resulting in a SnapshotInfoDeserializer/-Reader/-Whatever).
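
For illustration only, the abstractions described above could be sketched in C# roughly as follows. ISnapshottable, IStreamNameResolver and SnapshotInfo follow the shapes given in the text; SuffixSnapshotStreamNameResolver, its "-snapshots" suffix and the Counter aggregate are hypothetical names made up for this sketch, not part of AggregateSource or the EventStore client.

```csharp
using System;

// Contract from the post: an aggregate root that can take and restore snapshots.
public interface ISnapshottable
{
    void RestoreSnapshot(object snapshot);
    object TakeSnapshot();
}

// Contract from the post: maps an aggregate identifier to a snapshot stream name.
public interface IStreamNameResolver
{
    string Resolve(string identifier);
}

// Hypothetical convention (an assumption): "<identifier>-snapshots".
public sealed class SuffixSnapshotStreamNameResolver : IStreamNameResolver
{
    public string Resolve(string identifier)
    {
        if (string.IsNullOrEmpty(identifier))
            throw new ArgumentException("An identifier is required.", "identifier");
        return identifier + "-snapshots";
    }
}

// Wrapper option from the post: pairs the snapshot with the stream version it reflects.
public sealed class SnapshotInfo
{
    public SnapshotInfo(int version, object snapshot)
    {
        if (version < 0) throw new ArgumentOutOfRangeException("version");
        if (snapshot == null) throw new ArgumentNullException("snapshot");
        Version = version;
        Snapshot = snapshot;
    }

    public int Version { get; private set; }
    public object Snapshot { get; private set; }
}

// Hypothetical aggregate root whose whole state is a running total.
public sealed class Counter : ISnapshottable
{
    public int Total { get; private set; }

    public void Add(int amount) { Total += amount; }

    public object TakeSnapshot() { return Total; }

    public void RestoreSnapshot(object snapshot) { Total = (int)snapshot; }
}
```

Taking a snapshot of one Counter and restoring it into a fresh instance should yield the same Total; the wrapper simply carries the version alongside, so the repository knows where to resume reading events.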

I’m fully aware that this kind of integration is highly subjective. Not trying to ‘framework’ it, just trying to provide sane defaults (examples if you will) with a certain degree of pluggability (which is beyond the scope of the questions I’m asking really).

Thanks for reading,

Thanks a million for replying,

Yves.

jen20 · June 5, 2013, 5:32pm

Yves,

[Q1] I can answer quickly, I’ll have a think about the rest.

What you need there is StreamPosition.End, so your call would look like this:

var slice = connection.ReadStreamEventsBackward("mysnapshotstream", StreamPosition.End, 5, true);

Cheers,

James

Joao_Braganca · June 5, 2013, 7:14pm

What about putting a ‘Snapshotted’ event in the aggregate’s own stream (instead of a separate one)? Then you would always read this stream’s events backwards. If for any reason that snapshot can’t be deserialized (wrong version etc.), then oh well, you have to read backwards through the entire stream. Your repo can detect this, e.g. ‘100,000 events since last snapshot’, and take a new snapshot on save. Self-correcting.
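
A rough sketch of that detection check, assuming zero-based stream versions. SnapshotPolicy and ShouldTakeSnapshot are hypothetical names for illustration, not an existing API.

```csharp
using System;

public static class SnapshotPolicy
{
    // Returns true when the number of events written since the last usable
    // snapshot has reached the threshold. lastSnapshotVersion is null when
    // no snapshot could be read or deserialized.
    public static bool ShouldTakeSnapshot(int currentVersion, int? lastSnapshotVersion, int threshold)
    {
        if (threshold <= 0) throw new ArgumentOutOfRangeException("threshold");

        // Assumption: versions are zero-based, so version 0 means one event exists.
        int eventsSinceSnapshot = lastSnapshotVersion.HasValue
            ? currentVersion - lastSnapshotVersion.Value
            : currentVersion + 1;

        return eventsSinceSnapshot >= threshold;
    }
}
```

The repository would call this on save and, when it returns true, take a fresh snapshot.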

Yves_Reynhout · June 5, 2013, 9:06pm

Well, concurrency for one. Snapshotting and appending would be competing for no reason. This may be less of a concern with the EventStore; I can’t judge that. But true, it’s a viable option.

Greg_Young1 · June 5, 2013, 10:06pm

Generally you want it in a separate stream. There are a few reasons for this. First, you may have many snapshots for a stream (there may be more than one projection for a stream). Secondly, keeping it in the same stream introduces a coupling based on sequence number: you may never be able to asynchronously snapshot a fast-moving stream, as every time you snapshot, new events have been written (a snapshot taken at version 4 is no less valid if the stream is at version 6 now). Also, you quite likely don’t want to keep all of your snapshots, and can configure maxlength/manage on the other stream separately from your stream with events in it.
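
To illustrate the point about a stale snapshot still being valid, here is a minimal in-memory sketch (no EventStore calls). SnapshotReplay and Rehydrate are hypothetical names, and the "aggregate" is assumed to be just a running total with zero-based event versions.

```csharp
using System;
using System.Collections.Generic;

public static class SnapshotReplay
{
    // Rebuilds a running total from a snapshot plus the events recorded after it.
    // events[i] is the amount added by the event at stream version i (zero-based).
    // snapshotVersion is the version of the last event reflected in snapshotTotal,
    // or -1 when there is no snapshot and everything must be replayed.
    public static int Rehydrate(IList<int> events, int snapshotVersion, int snapshotTotal)
    {
        if (events == null) throw new ArgumentNullException("events");
        if (snapshotVersion < -1 || snapshotVersion >= events.Count)
            throw new ArgumentOutOfRangeException("snapshotVersion");

        int total = snapshotTotal;

        // Resume reading at the first event the snapshot has not seen.
        for (int version = snapshotVersion + 1; version < events.Count; version++)
        {
            total += events[version];
        }

        return total;
    }
}
```

With a stream at version 6, restoring a snapshot taken at version 4 and then applying events 5 and 6 gives the same state as replaying all seven events from the start.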

Cheers,

Greg

Greg_Young1 · June 5, 2013, 10:20pm

Sorry, autocorrect: it should read maxage, not manage, in my previous post.