Best practices to handle exceptions in projections

I’m trying to define best practices for complex scenarios where something can go wrong in my projections. Let’s imagine I have an aggregate ‘Account’, and I store events in one stream per account.

Then, I have two projections. One projects data to DynamoDB. The second projects data to ElasticSearch to perform complex queries more efficiently.

  1. What would happen if one of the projections is faster? How can I sync these two projections? If I don’t, it might happen that I get search results in ElasticSearch but don’t yet have the data in DynamoDB. I guess I could have both as part of the same projection, but as far as I know, the best practice is to have one projection per read model.

  2. I’m using checkpoints for these projections. How should I handle errors while projecting data? Imagine I have a problem projecting data for Account_1 to DynamoDB. Can I stop projecting data only for Account_1, or should I stop projecting data for every account?

  3. If I have a problem projecting Account_1 to DynamoDB, should this affect the projection for ElasticSearch? What is the best way to handle these errors?

Thanks,

Alberto

Alberto,

From my experience, it sounds like you should emit an event to an endpoint; that endpoint then pulls the data it needs from the Eventstore and updates your persistence.
It might not be anywhere near as efficient as the current model, but if both read models need to stay in sync, I can’t see another way of doing it.

Something like this (a rough sketch follows below):

  1. Emit an event.
  2. The read model endpoint consumes it and pulls any additional data from the Eventstore.
  3. Perform updates to both read models (rollback if one fails?).
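
Here’s a minimal sketch of steps 2–3, assuming hypothetical helpers (`read_account_stream`, `dynamo_upsert`, `dynamo_delete`, `elastic_upsert`) rather than any specific client API:

```python
# Hypothetical endpoint handler: the helper functions below are placeholders,
# not a real EventStoreDB/DynamoDB/Elasticsearch API.

def handle_account_event(event):
    # Rebuild the view from the whole stream rather than from the single
    # event, so both read models receive the same complete document.
    account_view = read_account_stream(event.stream_id)

    dynamo_upsert(account_view)
    try:
        elastic_upsert(account_view)
    except Exception:
        # There is no cross-store transaction, so "rollback" really means a
        # compensating action here (or retrying until both stores converge).
        dynamo_delete(account_view.account_id)
        raise
```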

What would it happen if one of the projections is faster?

I’d expect Elastic to always lag behind due to its ingestion speed (it’s not great if you update documents one by one). Usually, you want to batch Elastic ops and execute them in bulk, which gives you a natural delay. Plus, Elastic takes its time to do the indexing, so it’s highly unlikely that search results would be available faster than anything else (unless the Dynamo projection fails).
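
For illustration, a minimal sketch of that batching, using the bulk helper from the official Python Elasticsearch client; the index name and flush threshold are made up:

```python
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch("http://localhost:9200")
buffer = []

def project_to_elastic(account_id, document):
    # Accumulate operations instead of indexing one by one.
    buffer.append({
        "_op_type": "index",
        "_index": "accounts",
        "_id": account_id,
        "_source": document,
    })
    if len(buffer) >= 500:
        flush()

def flush():
    # One bulk request instead of hundreds of individual ones;
    # this batching is where the natural delay behind Dynamo comes from.
    helpers.bulk(es, buffer)
    buffer.clear()
```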

You can also put both projections in the same subscription, so they will always be in sync. But then you need to decide where to store the checkpoint.
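
A sketch of that single-subscription option, with one checkpoint covering both stores. The subscription, checkpoint store, and projection functions (`subscribe_to_all`, `load_checkpoint`, `save_checkpoint`, `project_to_dynamo`, `project_to_elastic`) are placeholders for whatever client you actually use:

```python
def run_combined_projection():
    position = load_checkpoint("accounts-read-models")

    for event in subscribe_to_all(from_position=position):
        # Both read models are updated before the checkpoint moves, so they
        # can never drift apart by more than the event currently in flight.
        project_to_dynamo(event)
        project_to_elastic(event)

        # A single checkpoint for both stores: if either write fails, the
        # checkpoint stays put and the event is reprocessed on restart,
        # which means both handlers must be idempotent.
        save_checkpoint("accounts-read-models", event.position)
```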

Can I stop projecting data only for Account_1?

I don’t think it’s possible. Projections, in general, rely on the global order of events.

I can see two types of potential failures in projections:

  • Bug in the code: you need to decide on a strategy. One option is to log and ignore the error for the sake of keeping the projection running; set up an alert on that log message so you get properly notified when it triggers, and then decide what to do.
  • Transient infra/network failure: use retries (effectively endless, perhaps with a circuit breaker to handle backpressure). See the sketch after this list.
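
To make that concrete, here is a hedged sketch combining both strategies; the projection handler and the transient-error check are assumptions, so adapt them to whatever your Dynamo and Elastic clients actually raise:

```python
import logging
import time

log = logging.getLogger("projections")

def apply_with_policy(event, handler, is_transient, max_backoff=60):
    delay = 1
    while True:
        try:
            handler(event)  # e.g. project_to_dynamo or project_to_elastic
            return
        except Exception as e:
            if is_transient(e):
                # Transient infra/network failure: retry effectively forever,
                # backing off so we don't hammer a struggling store.
                time.sleep(delay)
                delay = min(delay * 2, max_backoff)
            else:
                # Likely a bug: log it (and alert on this message), then keep
                # the projection moving instead of blocking every account.
                log.error("Projection failed for event %s: %s", event.id, e)
                return
```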

If I have a problem projecting Account_1 to DynamoDB, should this affect the projection for ElasticSearch?

Again, if you put them in one subscription, they will either both run or both get stuck. It depends on the business. I don’t think it’s a technical issue.

I also used a different pattern when I used to project to Elastic. Elastic is generally bad at updating individual fields in existing documents. So, instead of handling complex projections from ESDB to Elastic, we projected to MongoDB (we needed it anyway), subscribed to the Mongo oplog, and projected complete documents to Elastic instead. It should be quite easy to do with Dynamo, since it has a few options for listening to changes (with Kinesis or without). It would solve all your concerns, basically.
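
As an illustration of that pattern with Dynamo, a sketch of a Lambda function attached to the table’s DynamoDB Stream that pushes complete documents into Elastic. The table/index names and the `account_id` key are made up, and the Elastic calls assume the 8.x Python client:

```python
from boto3.dynamodb.types import TypeDeserializer
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")
deserialize = TypeDeserializer().deserialize

def lambda_handler(event, context):
    # Records arrive in the shape Lambda delivers for DynamoDB Stream triggers.
    for record in event["Records"]:
        key = deserialize(record["dynamodb"]["Keys"]["account_id"])
        if record["eventName"] == "REMOVE":
            es.delete(index="accounts", id=key)
            continue
        # NewImage is the complete item after the change, so Elastic always
        # receives a whole document and never a partial field update.
        # Note: numbers come back as Decimal; convert them if your serializer
        # complains.
        image = record["dynamodb"]["NewImage"]
        doc = {k: deserialize(v) for k, v in image.items()}
        es.index(index="accounts", id=key, document=doc)
```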