Temporal Stream Architecture Idea

Gannon · December 27, 2022, 2:43am

I had an idea for how to handle temporal streams and my POC is working really well.

I realised that most of the time you are only interested in streams that were created within some distance of the present. For example, let’s say you had 10 years’ worth of GPS position data. You might only be interested in positions recorded within the last couple of months, up until now. You would need a read model with a moving time window that keeps shifting forward, so it only ever shows information from within this historical distance. The window might only be refreshed every couple of days, so you aren’t constantly rebuilding your projection.
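A minimal sketch of how the lower bound of such a window might be computed. The 60-day look-back, the 2-day refresh interval and the function name are illustrative assumptions, not values from the POC:

```python
from datetime import datetime, timedelta, timezone


def window_start_epoch(
    now: datetime,
    lookback: timedelta = timedelta(days=60),  # assumed "couple of months"
    refresh: timedelta = timedelta(days=2),    # assumed "every couple of days"
) -> int:
    """Return the epoch-seconds lower bound of the moving window.

    `now` is first snapped down to a multiple of `refresh`, so the bound
    (and anything derived from it) only changes once per refresh interval.
    """
    refresh_s = int(refresh.total_seconds())
    now_s = int(now.timestamp())
    snapped = now_s - now_s % refresh_s
    return snapped - int(lookback.total_seconds())


if __name__ == "__main__":
    print(window_start_epoch(datetime.now(timezone.utc)))
```

Snapping to the refresh interval means every call inside the same two-day bucket yields the same bound, so the projection only needs rebuilding when the bucket rolls over.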

To create this window, I have implemented a stream naming convention that prefixes each stream name with an epoch-seconds timestamp. I then generate a numeric range regex filter so I can exclude all streams whose prefix falls before the start of the window.

To include all streams created within the last couple of months, plus all future streams, this is an example of the numeric range regex that is generated at runtime (lower bound 1661218463, upper bound 2147483647, the largest signed 32-bit integer):
(166121846[3-9]|16612184[7-9][0-9]|1661218[5-9][0-9]{2}|1661219[0-9]{3}|16612[2-9][0-9]{4}|1661[3-9][0-9]{5}|166[2-9][0-9]{6}|16[7-9][0-9]{7}|1[7-9][0-9]{8}|20[0-9]{8}|21[0-3][0-9]{7}|214[0-6][0-9]{6}|2147[0-3][0-9]{5}|21474[0-7][0-9]{4}|214748[0-2][0-9]{3}|2147483[0-5][0-9]{2}|21474836[0-3][0-9]|214748364[0-7])
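A rough sketch of one way such a pattern can be generated. This is a common range-splitting technique and not necessarily what the POC does; the function names are made up for illustration. The range is cut at "round" boundaries so that each piece can be written as a fixed prefix, at most a few digit classes, and a run of `[0-9]`:

```python
import re


def _pattern(start: int, stop: int) -> str:
    """Regex for one sub-range whose bounds have the same number of digits."""
    prefix, wild = "", 0
    for a, b in zip(str(start), str(stop)):
        if a == b:
            prefix += a                # identical digit: literal
        elif a == "0" and b == "9":
            wild += 1                  # full 0-9 span: counted, emitted at the end
        else:
            prefix += f"[{a}-{b}]"     # partial span: digit class
    if wild == 1:
        prefix += "[0-9]"
    elif wild > 1:
        prefix += f"[0-9]{{{wild}}}"
    return prefix


def numeric_range_regex(lo: int, hi: int) -> str:
    """Alternation matching every integer in [lo, hi] written in decimal."""
    if not 0 <= lo <= hi:
        raise ValueError("expected 0 <= lo <= hi")
    stops = {hi}
    # Split points reached by filling lo's trailing digits with 9s.
    power = 10
    stop = lo - lo % power + power - 1
    while lo <= stop <= hi:
        stops.add(stop)
        power *= 10
        stop = lo - lo % power + power - 1
    # Split points reached by zeroing the trailing digits of hi + 1.
    power = 10
    stop = (hi + 1) - (hi + 1) % power - 1
    while lo < stop <= hi:
        stops.add(stop)
        power *= 10
        stop = (hi + 1) - (hi + 1) % power - 1
    parts, start = [], lo
    for stop in sorted(stops):
        parts.append(_pattern(start, stop))
        start = stop + 1
    return "(" + "|".join(parts) + ")"


if __name__ == "__main__":
    rx = re.compile(numeric_range_regex(1661218463, 2147483647))
    print(bool(rx.fullmatch("1661218463")), bool(rx.fullmatch("1661218462")))
```

With a lower bound of 1661218463 and an upper bound of 2147483647, this sketch should produce the same alternation as the pattern above.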

This pattern is then combined with some other regex filters and passed to the StreamFilter.RegularExpression function when I create the SubscriptionFilterOptions.
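To illustrate how the time-range pattern might be combined with another filter, here is a small sketch that only uses Python's `re` module and does not call any Event Store client. The `<epoch>-gps-<device>` naming scheme and the `stream_name` helper are hypothetical; the post does not say what follows the epoch prefix:

```python
import re

# Time-range alternation copied from the post (1661218463 .. 2147483647).
TIME_RANGE = (
    r"(166121846[3-9]|16612184[7-9][0-9]|1661218[5-9][0-9]{2}|1661219[0-9]{3}"
    r"|16612[2-9][0-9]{4}|1661[3-9][0-9]{5}|166[2-9][0-9]{6}|16[7-9][0-9]{7}"
    r"|1[7-9][0-9]{8}|20[0-9]{8}|21[0-3][0-9]{7}|214[0-6][0-9]{6}"
    r"|2147[0-3][0-9]{5}|21474[0-7][0-9]{4}|214748[0-2][0-9]{3}"
    r"|2147483[0-5][0-9]{2}|21474836[0-3][0-9]|214748364[0-7])"
)

# Anchor at the start and require the delimiter straight after the prefix,
# so digits elsewhere in the name (or an 11-digit prefix) cannot match.
STREAM_FILTER = re.compile(rf"^{TIME_RANGE}-gps-")


def stream_name(epoch_seconds: int, device_id: str) -> str:
    """Hypothetical naming convention: '<epoch seconds>-gps-<device id>'."""
    return f"{epoch_seconds}-gps-{device_id}"


if __name__ == "__main__":
    for name in (
        stream_name(1672108980, "truck-7"),  # inside the window
        stream_name(1600000000, "truck-7"),  # before the window
        "gps-1672108980",                    # timestamp is not a prefix
    ):
        print(name, bool(STREAM_FILTER.search(name)))
```

The same anchored pattern string would be what gets handed to the subscription filter. Without the `^` anchor and the trailing delimiter, a timestamp appearing anywhere in the stream name could let an old stream through.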

alexey.zimarev · January 2, 2023, 11:25am

That’s one approach. Alternatively, if you only need events from a particular time range back in time, you can just set the stream TTL and run the scavenge regularly, so old events will be purged. That works unless you really want to keep the data. I would also not recommend keeping all the data in ESDB, as it’s an operational database, not a place to keep archived data.

Gannon · January 6, 2023, 12:35am

This response has prompted a discussion amongst the team. We had overlooked that Event Store is not a long-term storage option. It does still fulfil two of the three requirements that originally led us to choose Event Store: APIs for event sourcing (best in class), and the flexibility to change data types/entities without friction.

alexey.zimarev · February 4, 2023, 12:48pm

I mean, it is as good for long-term data storage as any other operational database like Postgres or MongoDB (without their new archive feature). It just might be too expensive as everything will be in the hot storage tier.