Streams with active data

I’ve got a system that I’m slowly but surely nudging towards full event sourcing.

Today there are two databases, Live and Archive. While data is active it’s in the Live db; when it’s done, it’s moved to Archive and is considered read-only (except for audit info).

If we open the door to using EventStore, I don’t really see a benefit to the archive procedure anymore (and to be honest, doing it in SQL wasn’t exactly my idea either!).

The question is: how do we subscribe to a stream that only contains the ‘active/live’ data? The only way I can think of is maintaining a read model with the stream IDs and then subscribing to them one by one, but this seems convoluted (a rough sketch of what I mean follows). If we moved streams to a second EventStore as part of archiving, we could delete the old streams and subscribe to $all, I guess, but that would tie us to the implementation of using two stores.
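
To show what I mean, here’s a rough sketch against the TypeScript client (@eventstore/db-client); loadActiveStreamIds and the stream names are made-up stand-ins for the read model:

```typescript
import { EventStoreDBClient, START } from "@eventstore/db-client";

const client = EventStoreDBClient.connectionString("esdb://localhost:2113?tls=false");

// Stand-in for the read model that tracks which streams are currently live.
async function loadActiveStreamIds(): Promise<string[]> {
  return ["case-1001", "case-1002"]; // would really be fed by a projection
}

async function main(): Promise<void> {
  for (const streamId of await loadActiveStreamIds()) {
    // One catch-up subscription per live stream; this is the convoluted part.
    const subscription = client.subscribeToStream(streamId, { fromRevision: START });
    void (async () => {
      for await (const { event } of subscription) {
        console.log(streamId, event?.type); // placeholder for real handling
      }
    })();
  }
}

main().catch(console.error);
```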

Any pointers?

/Peter

  • How are you differentiating between live and archive ATM? In other words: what does “when it’s done” mean? Are there particular events that denote the end of a stream (e.g. do you have an explicit DeletedXYZ event or a Year/QuarterClosed event), or did you choose to actively archive streams using an ArchivedXYZ event?
  • Out of interest, what is the motivation for having a stream of ‘live streams’ (pardon the pun)?

The simplest thing that comes to mind: maybe you could move the stream from Live to Archive and delete it from Live once archival has succeeded? I don’t know whether deleted streams surface in the $all stream; if they do, this won’t work.
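
Very roughly, and assuming the TypeScript client (@eventstore/db-client), it could look something like this; host names are made up, and error handling/batching are omitted:

```typescript
import { EventStoreDBClient, jsonEvent, FORWARDS, START } from "@eventstore/db-client";

const live = EventStoreDBClient.connectionString("esdb://live-host:2113?tls=false");
const archive = EventStoreDBClient.connectionString("esdb://archive-host:2113?tls=false");

// Copy a finished stream into the archive store, then delete it from live.
async function archiveStream(streamId: string): Promise<void> {
  const events = live.readStream(streamId, { direction: FORWARDS, fromRevision: START });
  for await (const { event } of events) {
    if (!event) continue; // skip unresolvable link events
    await archive.appendToStream(
      streamId,
      jsonEvent({ type: event.type, data: event.data as any }) // payload copied verbatim
    );
  }
  // Only delete once the copy has fully succeeded.
  await live.deleteStream(streamId);
}
```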

  • How are you differentiating between live and archive ATM? In other words: what does “when it’s done” mean? Are there particular events that denote the end of a stream (e.g. do you have an explicit DeletedXYZ event or a Year/QuarterClosed event), or did you choose to actively archive streams using an ArchivedXYZ event?

There’s a ‘complete’ event, and then archiving is done async. (We don’t use event sourcing terminology, as the former lead dev wasn’t very comfortable with concepts outside of the MS SQL stack, but that’s what they really are.)
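
In EventStore terms I picture that trigger becoming a filtered subscription; a rough sketch, where the completion event type and the archival stub are made up:

```typescript
import { EventStoreDBClient, START, eventTypeFilter } from "@eventstore/db-client";

const client = EventStoreDBClient.connectionString("esdb://live-host:2113?tls=false");

// Stand-in for the copy-then-delete routine sketched above.
async function archiveStream(streamId: string): Promise<void> {
  console.log(`archiving ${streamId}`);
}

async function watchForCompletions(): Promise<void> {
  const subscription = client.subscribeToAll({
    fromPosition: START,
    filter: eventTypeFilter({ prefixes: ["CaseCompleted"] }), // made-up event type
  });
  for await (const { event } of subscription) {
    if (!event) continue;
    void archiveStream(event.streamId); // fire-and-forget keeps archiving async
  }
}

watchForCompletions().catch(console.error);
```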

  • Out of interest, what is the motivation for having a stream of ‘live streams’ (pardon the pun)?

I removed a paragraph before posting as I thought it was redundant, but I now realise it had all the important info in it :). A couple of reasons:

  • We maintain running statistics, pushed to clients at a regular interval. This is currently handled through a SQL query that loads a good chunk of the Live db and projects it into a queryable, in-memory data structure, which from that point on is updated from ‘events’. (See the sketch after this list.)

  • All clients are realtime, meaning all changes go to all clients within a given partition. There’s some redundant optimization at the moment, but in the future I think we could simplify a lot by just keeping all state on the client; active events number in the thousands, 10-20k max, so memory should not be an issue.

  • The system is in healthcare, so it’s vital that we have consistent data ASAP. Having one event stream would simplify that a lot: clients can detect when they’ve missed something, ask for everything from a checkpoint, etc. There are a couple of different mechanisms in place for that now; I’d like to have just one.
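
For the first and third points, a minimal sketch of what I’m after: one $all subscription resumed from a checkpoint, feeding an in-memory stats projection. Event types, the stats shape, and the checkpoint handling are all made up:

```typescript
import { EventStoreDBClient, START, excludeSystemEvents } from "@eventstore/db-client";

const client = EventStoreDBClient.connectionString("esdb://live-host:2113?tls=false");

// Made-up running-stats projection; pushed to clients on a timer elsewhere.
const stats = { openCases: 0, completedToday: 0 };

// Hypothetical checkpoint; persisting it is what lets a client (or this
// process) ask for everything it missed after a disconnect or restart.
let checkpoint: { commit: bigint; prepare: bigint } | undefined;

async function run(): Promise<void> {
  const subscription = client.subscribeToAll({
    fromPosition: checkpoint ?? START,
    filter: excludeSystemEvents(),
  });
  for await (const { event } of subscription) {
    if (!event) continue;
    // Made-up event types standing in for the real domain events.
    if (event.type === "CaseOpened") stats.openCases += 1;
    if (event.type === "CaseCompleted") {
      stats.openCases -= 1;
      stats.completedToday += 1;
    }
    checkpoint = event.position; // assuming events from $all expose a position
  }
}

run().catch(console.error);
```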

The simplest thing that comes to mind: maybe you could move the stream from Live to Archive and delete it from Live once archival has succeeded? I don’t know whether deleted streams surface in the $all stream; if they do, this won’t work.

Yep, but like I said, that would force us to always run a second EventStore; behaviour would be tied to infrastructure. This is configurable in the current implementation.

/Peter