Aggregate Snapshots in EventStoreDB

Hi there,

I was wondering how, and whether, snapshotting is implemented in EventStoreDB. Essentially, we have aggregates that reach thousands of events, so reconstitution now takes tens of seconds. So we figured it's time to introduce snapshots.

I did find this link: https://stackoverflow.com/questions/16359330/are-snapshots-supported-in-eventstoredb

and it makes sense, but it seems kind of hacky. I was wondering if maybe there's something cleaner and/or more transparent (for example, keeping the same stream name would be great).

Thanks
Vlad

@Vlad_Andronache if you'd like to keep everything inside EventStoreDB, I think the solution you posted is the best option. You can set the $maxCount option on the "snapshot stream" to 1, so you only ever keep the single, latest snapshot. See more details in: https://developers.eventstore.com/server/20.6/server/streams/metadata-and-reserved-names.html#stream-metadata.
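To make that concrete, here's a minimal sketch with the gRPC .NET client (the stream names, event type, snapshot shape and connection string are just illustrative, adjust them to your conventions):

```csharp
using System.Text.Json;
using EventStore.Client;

var client = new EventStoreClient(
    EventStoreClientSettings.Create("esdb://localhost:2113?tls=false"));

// Illustrative names: "order-123" is the entity stream, "order-123-snapshot" the snapshot stream.
const string snapshotStream = "order-123-snapshot";

// $maxCount = 1 keeps only the latest event in the snapshot stream (older ones get scavenged).
await client.SetStreamMetadataAsync(
    snapshotStream,
    StreamState.Any,
    new StreamMetadata(maxCount: 1));

// Placeholder state and revision - in real code this comes from your aggregate.
var snapshot = new
{
    State = new { ProductItemsCount = 3 },   // whatever your snapshot schema is
    Revision = 120UL                         // entity stream revision the snapshot was taken at
};

await client.AppendToStreamAsync(
    snapshotStream,
    StreamState.Any,
    new[]
    {
        new EventData(
            Uuid.NewUuid(),
            "snapshot",
            JsonSerializer.SerializeToUtf8Bytes(snapshot))
    });
```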

The other solution I can think of is to publish a Summary Event into the same stream and read backwards (https://developers.eventstore.com/clients/dotnet/generated/v20.6.1/reading-events/reading-from-a-stream.html#reading-backwards) until you find that exact event type. Read more in: https://verraes.net/2019/05/patterns-for-decoupling-distsys-summary-event/.
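A rough sketch of that read-backwards approach (again, the event type and aggregate are made up for illustration):

```csharp
using System.Collections.Generic;
using System.Text.Json;
using EventStore.Client;

var client = new EventStoreClient(
    EventStoreClientSettings.Create("esdb://localhost:2113?tls=false"));

var order = new OrderState();
var newerEvents = new List<ResolvedEvent>();

// Read the entity stream backwards until the first summary/snapshot event is found.
await foreach (var resolved in client.ReadStreamAsync(
    Direction.Backwards, "order-123", StreamPosition.End))
{
    if (resolved.Event.EventType == "order-summary")   // the summary event type you chose
    {
        order = JsonSerializer.Deserialize<OrderState>(resolved.Event.Data.Span)!;
        break;                                          // older events are already folded into it
    }
    newerEvents.Add(resolved);
}

// Apply the events that happened after the summary, oldest first.
newerEvents.Reverse();
foreach (var resolved in newerEvents)
    order.Apply(resolved);

public class OrderState
{
    public int ProductItemsCount { get; set; }
    public void Apply(ResolvedEvent e) { /* dispatch on e.Event.EventType */ }
}
```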

I'd also recommend rethinking the aggregate structure - maybe you could split the aggregate into smaller portions. E.g., using an e-commerce example: instead of having a long-living Order aggregate, you could split it into Cart, Order, Payment, Shipment. You could then have each of those aggregates as a separate stream.

They could share the id but have a different prefix, "{prefix}-{id}", e.g. "Cart-123", "Order-123", "Payment-123". You could also use link events if you'd like to keep more correlation and, if needed, get all the events.

I'd also recommend thinking about an archiving strategy - e.g. for a bank account you need to keep track of the transactions that occurred in the last year. The rest could be archived (moved to a backup or another stream, e.g. "Archived-123").

I actually don't think that using a different stream name is "hacky". Quite the opposite: entity streams are supposed to hold the entity behaviour only. Snapshots are a technical concern; they have nothing to do with the domain model. Therefore, keeping them outside of the entity stream makes perfect sense. You can then even delete the snapshots stream and still be able to restore the entity state solely from events, even if it takes more time. This approach, by the way, makes total sense if you think about changes in your domain model. When your events are immutable and you want to keep the state-rehydration logic for old events as it was, so you can understand the previous behaviour of the model, versioning snapshots is meaningless from the domain model perspective. It is much easier to remove those snapshots, restore the entity from all the events when it first loads, and produce a new snapshot with the new schema.

In addition, Oskar is completely right. Under usual circumstances, long aggregate streams are a smell of wrong aggregate boundaries. There are legitimate cases (a limited number of them) when streams become literally endless. But even in such situations, there should be some reconciliation process. In warehouses, for example, they count the stored products from time to time, just to bring the inventory back in line with reality. Then they record the actuals and write off the losses. That is the "new start", so the old stuff can be archived. This pattern is also known as "closing the books", by analogy with accounting, where it happens every financial year. The same goes for banking, as accounts aren't aggregates with unlimited history. Hence, you can't get a list of transactions in your internet bank from many years back. At some point you can only get a PDF file with the bank statement, as all the rest gets archived.


I may be subjective with "hacky", because if I use separate streams then reading and reconstituting my aggregates gets a bit more complex, in that I also have to search for the correct stream first.

At the moment we keep it simple: one stream per aggregate, with {aggregateName}-{Id} as the naming convention. That's why I'm more inclined to - and actually will, for the time being - implement @oskar.dudycz 's suggestion with the "Snapshot" event and reading the stream backwards; right now that is the most time-effective and easiest-to-implement fix for our current issue.

As for both remarks that aggregates should be short-lived and not have an infinite number of events: I agree 100%, but this is out of my hands at the moment.

Thanks for the feedback!

Vlad

Just remember that you'll have issues with versioning your snapshots, in addition to normal event versioning, when your model changes.

@alexey.zimarev I'm actually having a bit of a problem understanding your point about versioning. Would you mind if I put together a sample GitHub repo with my current implementation of reconstitution and pointed out some of the concrete problems I'm having with versioning (of events as well)? Then we could discuss it over that, with some concrete questions I have.

Versioning of events (and now, of snapshots) is something we've neglected up until now. The last time I researched this was about a year ago, but it kind of slid away in the meantime.

@Vlad_Andronache when you start to use snapshots, then whenever you make changes to your model structure the snapshots may become obsolete. You end up needing to rebuild the snapshots, which is not trivial (especially if you keep them in the same stream). When you keep them in a separate stream, you "just" need to re-read all the events and publish the new snapshot to that second stream (as I proposed in my first suggestion). The old snapshot will then be replaced with the newer one.
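A sketch of such a rebuild (illustrative names again; it assumes the snapshot stream has $maxCount set to 1, as above):

```csharp
using System.Collections.Generic;
using System.Text.Json;
using EventStore.Client;

var client = new EventStoreClient(
    EventStoreClientSettings.Create("esdb://localhost:2113?tls=false"));

// Ignore the obsolete snapshot and fold the whole entity stream into the new state shape.
var state = new OrderStateV2();
ulong lastRevision = 0;

await foreach (var resolved in client.ReadStreamAsync(
    Direction.Forwards, "order-123", StreamPosition.Start))
{
    state.Apply(resolved);
    lastRevision = resolved.Event.EventNumber.ToUInt64();
}

// With $maxCount = 1 on the snapshot stream, this write effectively replaces the old snapshot.
await client.AppendToStreamAsync(
    "order-123-snapshot",
    StreamState.Any,
    new[]
    {
        new EventData(
            Uuid.NewUuid(),
            "snapshot-v2",   // bump the type to mark the new schema
            JsonSerializer.SerializeToUtf8Bytes(new { State = state, Revision = lastRevision }))
    });

public class OrderStateV2
{
    public List<string> ProductItems { get; set; } = new();
    public void Apply(ResolvedEvent e) { /* dispatch on e.Event.EventType */ }
}
```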

That also leads us back to the lifetime of the aggregate/stream. The shorter-living aggregates you have, the fewer versioning issues you have (as you usually don't need to care so much about the streams of records that are already deleted).

The other thing is what @alexey.zimarev mentioned - strategies for reconciliation/merging events. You could consider making those "snapshot events" more of a business-related checkpoint (e.g. end of the business month). Then you could archive the older events, as long as those checkpoint events contain all the needed summaries.

Wow, the UX for this is terrible; seeing what I am typing is next to impossible - I see only 3 lines at a time.

You almost always want to use a separate stream. Using the same stream for snapshots leads to weird complexity. Let me explain.

If you were to use the same stream what happens when your snapshot changes? Certainly over time your concept of the snapshot will change, no? If your snapshots are in the same stream how will you deal with the changes over time?

If they are in another stream you can easily just … change the stream. I can say that we have fubar_snapshot_v1 and I can have fubar_snapshot_v2. If I am forced to keep all of them in the same stream how will I handle versioning?
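For example, something as simple as this (made-up names) makes the switch trivial:

```csharp
using System;

// Illustrative only: derive the snapshot stream name from the current snapshot schema version,
// so a schema change just means writing to (and reading from) a new stream.
const int snapshotSchemaVersion = 2;

string SnapshotStreamFor(string entityStream) =>
    $"{entityStream}_snapshot_v{snapshotSchemaVersion}";

Console.WriteLine(SnapshotStreamFor("fubar-123"));   // fubar-123_snapshot_v2
```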


OK, but what difference does it make? Say today I create a snapshot by appending fubar_snapshot_v1Event to the original stream… then one month passes by and I have:

  1. Historic event 1
  2. Historic event 2
  3. fubar_snapshot_v1Event – this is basically the serialized aggregate as it was on the snapshot date
  4. event_after_snapshot1
  5. event_after_snapshot2

From the date #3 happened onward, my reconstitution logic reads backwards starting from the last event (#5) and stops at #3 because it detects a snapshot event. Everything is fine. Then say tomorrow I have a new version… I append #6 to the list with fubar_snapshot_v2Event, and the cycle starts again: I read from the end and stop at #6, because it's a snapshot event.

Technically, my old model that is concerned with what's between #3 and #5 no longer exists, and neither does the one concerned with #1 and #2. But I also don't care, because I stop my reconstitution at #6, so it's kind of the same thing as if I had 3 streams - the old events are "archived" in the sense that I never read them back anyway. The old model may actually still exist and be used for new aggregates, which would start their lifetime with the same historic events 1 and 2 (say OrderCreated and OrderUpdated).
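To make it concrete, my read-backwards reconstitution is roughly this (a simplified sketch; the names match the made-up example above, and it assumes the newest snapshot event in the stream always matches my current schema):

```csharp
using System.Collections.Generic;
using System.Text.Json;
using EventStore.Client;

var client = new EventStoreClient(
    EventStoreClientSettings.Create("esdb://localhost:2113?tls=false"));

var state = new Fubar();
var eventsAfterSnapshot = new List<ResolvedEvent>();

// Walk the single entity stream backwards from the end...
await foreach (var resolved in client.ReadStreamAsync(
    Direction.Backwards, "fubar-123", StreamPosition.End))
{
    // ...and stop at the first (i.e. newest) snapshot event, whatever its version suffix.
    // Assumes the newest snapshot in the stream was produced by the current model version.
    if (resolved.Event.EventType.StartsWith("fubar_snapshot_"))
    {
        state = JsonSerializer.Deserialize<Fubar>(resolved.Event.Data.Span)!;
        break;
    }
    eventsAfterSnapshot.Add(resolved);
}

// Replay the events that happened after the snapshot, oldest first.
eventsAfterSnapshot.Reverse();
foreach (var resolved in eventsAfterSnapshot)
    state.Apply(resolved);

public class Fubar
{
    public void Apply(ResolvedEvent e) { /* dispatch on e.Event.EventType */ }
}
```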

On the other hand, if I had 3 streams I would have to either mark somewhere in my model that I am now supposed to read stream #3, or keep reading streams until I find the "last one".

Am I missing something else? I am not trying to be stubborn here, just trying to understand 100%.

I'm thinking another reason for having separate, short-lived streams would be the inner workings of EventStore - supposedly we would hit issues in the future if we keep using long streams, say if we start using different functionality that we're not using at the moment?

Vlad

Snapshot structures tend to change quite often. Each new event, or a change to how an event is applied or interpreted, may change the snapshot schema - e.g. initially, in the Order snapshot, you kept just the product items count, as that was enough for the business logic, but now you'd like to keep the collection of product items. Now you need to reapply the events, and the old snapshot becomes obsolete.

Of course, you can keep a "schema version" in the snapshot and ignore snapshot events with an older "schema version" than the current one. Still, in my opinion, this is much more complicated than just storing a single, most recent snapshot in a separate stream. If you put the $maxCount condition on the snapshot stream, then all you need to do is get the snapshot and read the events that happened after the revision stored in the snapshot. If you store a new snapshot version, the old one will be "replaced".
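Put together, loading then looks roughly like this (same illustrative stream names and snapshot envelope as in my earlier sketch):

```csharp
using System.Text.Json;
using EventStore.Client;

var client = new EventStoreClient(
    EventStoreClientSettings.Create("esdb://localhost:2113?tls=false"));

var state = new OrderState();
ulong readFrom = 0;

// 1. Try to load the latest (and, thanks to $maxCount = 1, only) snapshot.
var snapshotRead = client.ReadStreamAsync(
    Direction.Backwards, "order-123-snapshot", StreamPosition.End, maxCount: 1);

if (await snapshotRead.ReadState == ReadState.Ok)
{
    await foreach (var resolved in snapshotRead)
    {
        var envelope = JsonSerializer.Deserialize<SnapshotEnvelope>(resolved.Event.Data.Span)!;
        state = envelope.State;
        readFrom = envelope.Revision + 1;   // continue right after the snapshotted revision
    }
}

// 2. Read only the events appended after the snapshot was taken.
await foreach (var resolved in client.ReadStreamAsync(
    Direction.Forwards, "order-123", new StreamPosition(readFrom)))
{
    state.Apply(resolved);
}

public class OrderState
{
    public int ProductItemsCount { get; set; }
    public void Apply(ResolvedEvent e) { /* dispatch on e.Event.EventType */ }
}

public record SnapshotEnvelope(OrderState State, ulong Revision);
```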

What's easier and what's not depends on your perspective, so if you see that approach as more straightforward, then it's okay - it'll work. You can always try both solutions in a smaller PoC and see what suits your case better.

Regarding having all the needed information after storing a snapshot - it depends on what you're doing with the events and what information they carry. If they're, as in your example, OrderCreated or OrderUpdated - so upsert-like - then yes, the snapshot will be enough.

However, if you have more granular domain events like OrderInitiated, OrderConfirmed, ProductItemAddedToOrder, then it may be more complex. Suppose you're using them to trigger some business operation or to update projections. Having just a snapshot instead of the series of such events removes some of the business information: you won't be able to get it back if you archive the old events. For that case, I'd suggest putting new events in a new stream and publishing an event with a link to the old stream. Then projections can handle all the events if they ever need to be rebuilt.

There is one more aspect - storage size and performance. If you keep snapshots in the same stream and you snapshot, e.g., once every three events, then for 10,000 events in your stream you have over 3,000 additional snapshot events. If you use a separate stream, you can have a single snapshot event.

Keeping streams short-lived is not about the inner workings of Event Store. It's a general best practice that's commonly suggested for all event sourcing solutions. Read more in my blog post: https://event-driven.io/en/how_to_do_event_versioning/.

@Vlad_Andronache did my explanation help?

Folks, I wrote two articles to sum up my thoughts on snapshotting:

I'm planning to write a third part with practical samples for the "closing the books" pattern. I'll post a link once it's out 🙂