We have recently started looking at refactoring our current monolithic system into microservices (hosted in Docker), using EventStore as a message bus.
The main question we are trying to answer is: can we design a system using EventStore that prevents, as far as possible, any loss of business data?
One scenario we have discussed is a service consuming events from Stream A (assume a catch-up subscription), performing a resource-intensive calculation, for example, and publishing the resulting events to Streams B and C, which are in turn subscribed to by other services. However, between an event appearing on Stream A and the resulting events being published to Streams B and C, the connection to EventStore is lost and a circuit breaker holds those events in an in-memory queue. Then, to make things worse, the container running the service crashes, meaning the events are lost without ever having been published to EventStore.
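To make that window concrete, here is a rough sketch of the flow we have in mind, written in TypeScript against a hypothetical minimal client interface (`EventStoreClient`, `expensiveCalculation`, and the stream names are placeholders for illustration, not our actual code or a real client API):

```typescript
// Hypothetical minimal client interface -- a stand-in for whatever EventStore
// client library the service would use, not a real API.
interface EventData {
  type: string;
  data: unknown;
}

interface EventStoreClient {
  subscribeFrom(stream: string, onEvent: (event: EventData) => Promise<void>): void;
  append(stream: string, events: EventData[]): Promise<void>;
}

// Placeholder for the real resource-intensive business logic.
function expensiveCalculation(event: EventData): { b: unknown; c: unknown } {
  return { b: event.data, c: event.data };
}

// The flow described above: consume from Stream A via a catch-up
// subscription, do the expensive work, then publish the results to B and C.
function startProcessor(client: EventStoreClient): void {
  client.subscribeFrom("stream-A", async (event) => {
    const result = expensiveCalculation(event);

    // If the connection to EventStore drops here, a circuit breaker queues
    // these appends in memory; if the container then dies, the results are
    // gone and nothing records that the Stream A event was ever handled.
    await client.append("stream-B", [{ type: "CalculatedB", data: result.b }]);
    await client.append("stream-C", [{ type: "CalculatedC", data: result.c }]);
  });
}
```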
Is there a recommended pattern for dealing with this kind of scenario? We discussed having each service publish a ‘receipt’ for each event it consumes, written only after it has successfully published to Streams B and C, so that if the service or its container crashes, a new container running that service can check the last event successfully processed and resume from there. However, what if the service/container crashes between the event being published to Stream B and it being published to Stream C?
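A sketch of that ‘receipt’ idea, reusing the hypothetical `EventStoreClient`, `EventData`, and `expensiveCalculation` from the sketch above (again illustrative only, not a real API); the comment marks the exact gap we are worried about:

```typescript
// Sketch of the 'receipt' idea. The receipt stream name is an arbitrary
// placeholder we made up for illustration.
const RECEIPT_STREAM = "stream-A-receipts";

async function handleEvent(
  client: EventStoreClient,
  eventNumber: number, // position of the event in Stream A
  event: EventData,
): Promise<void> {
  const result = expensiveCalculation(event);

  await client.append("stream-B", [{ type: "CalculatedB", data: result.b }]);
  // <-- a crash here leaves B written, C missing, and no receipt, so a new
  //     container would reprocess the event and append to Stream B again.
  await client.append("stream-C", [{ type: "CalculatedC", data: result.c }]);

  // Receipt: the last Stream A event that was fully processed. On restart a
  // new container reads the latest receipt and resumes from eventNumber + 1.
  await client.append(RECEIPT_STREAM, [
    { type: "EventProcessed", data: { sourceStream: "stream-A", eventNumber } },
  ]);
}
```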