Has anyone had experience with how to deal with deployments in scenarios where 100,000+ events are placed into the Event Store in quick succession but have needed to make provisions in case there are issues with the events. I.e. providing ‘rollback procedures’.
Currently I am migrating from an monolithic architecture to an event sourced architecture where data is being periodically added at increasingly frequent times so need to have an efficient process for this if possible. There are two ways in which I can think of dealing with this.
Idea 1:
-
Shut down the Event Store and all microservices that subscribe to the Event Store before adding these events and take a backup of the Event Store, and the databases that back the microservices.
-
Start up everything and add the events into the event store.
-
If everything goes fine, then great. However if there are issues with the 100,000s of events that are placed into the Event Store then shut everything down and restore backups of event store, and microservices databases.
The plus here is that we can react quickly to issues with the events, but there is a lot of downtime and doesn’t seem to fit well with what Microservices are meant for.
Idea 2:
- Just let the events be added to the Event Store and if there are issues with them, then add correction events after that will fix any issues.
The plus here is that there is no down time, but you risk there being ‘data’ issues until correction events have been sourced, but I guess that’s a price to pay with eventual consistency?
Any preference of approaches that other people have seen?
Kind regards,
Mark