Enriching events - would this break EventStore?

steven.blair · April 8, 2020, 6:56pm

Take this small example of a Sale

SaleStartedEvent { startedDate : whatever }

LineEvent { id : 1}
LineEvent { id : 1}
LineEvent { id : 1}
LineEvent { id : 1}
LineEvent { id : 1}

SaleCompleted { completedDate : whatever }

We want to make it easier for consumers of SaleCompleted to get information about the Sale which completed (number of items, voids, promotions etc)

Here are the options I think are available. I wouldn’t mind hearing other people’s views.

Option 1 - Projections

Each sale has its own stream, so we could use a partition for each Sale.

If the event being processed in the projection is SaleCompleted, transform it into EnrichedCompletedEvent (we would hold the other parts of the Sale in the state).

Does a partition run in-memory at all times?

Say we have 20 million Sales, would we have 20 million partitions in memory forever?

Or (fingers crossed) are partitions only held in memory while being accessed, and then dropped at some point?

Would the partition streams built by the projections eventually be truncated, and would they take up a lot of disk space? (Remember, we are talking millions of streams.)

Option 2 - Enriched Completed event

When we persist SaleCompleted to the EventStore, we just enrich it then.

It’s very simple, and we have all the information we need to add in.

Anyone subscribing to $et-SaleCompleted gets this nice enriched event.

The downside is if we need to change the shape of this event. Realistically, only the events going forward would have the changes. It wouldn’t be easy to replay (we might be able to frig something with a projection).

Option 3 - Consumer asks for latest Sale on receipt of Completed event

We send out our slim SaleCompleted to consumers. If they want more information about the sale, they then hit the BC REST endpoint:

GET api/sales/:id

There we rehydrate the Sale and return the latest state, meaning that if we make any changes, consumers always get the latest payload.

The downside here is the amount of traffic hitting the Sales BC. Again, we are talking millions of sales flowing through the system, and each time a Sale is completed a further round trip (per consumer) would be required.

The REST endpoint and the EventStore would be taking a beating here.

Sean_Farrow · April 8, 2020, 7:39pm

Hi,

Is using a separate read side an option?

If so, could you subscribe to a category stream?

Sean.

steven.blair · April 8, 2020, 7:52pm

Sean.

How would you see that working?

One thing we are trying to avoid is a read model needing any sort of intelligence to enrich the data.

So a read model could listen for all the events for the sale, and stitch together what happened.

Is that what you mean?

Sean_Farrow · April 8, 2020, 8:18pm

Yes, that’s exactly what I mean.

This does assume the events contain all the details the read model requires and you don’t have to fetch data from any other system.

Thanks,

Sean.

steven.blair · April 9, 2020, 7:18am

Sean,

That’s pretty much what we are doing just now, and we are hitting some problems where the read model needs to know more than I think it should (i.e. some calculations that the domain holds onto).

So from that perspective, we can either go back to the domain when the CompletedEvent arrives, collect the latest state, and persist it to the read model (hosing our domain + eventstore), or make our SQL smart enough to enrich the data, which leaks domain knowledge into the read model.

I appreciate there might not be a perfect solution, but just curious as to other people’s views.

Sean_Farrow · April 9, 2020, 11:26am

Hi Steven,

Neither of these is great imo. Is there a reason the read model needs state from elsewhere?

Could the state be included in events, or is there a missing piece to the domain? Or possibly this is guiding you towards a new bounded context?

I’m just speculating of course.

Thanks,

Sean.

steven.blair · April 9, 2020, 11:36am

The extra state just helps with the query side of things.

Without that, our query side becomes overly complicated.

Our domain has all the pieces, but the trick is how we get that to the read model.

Just now, the most desirable option is to enrich the event we spit out from the domain with all the nice stuff in it, which really simplifies the read model.

This of course has the problem I mentioned earlier: if we ever needed to replay the sales again, we would get the same enriched event spat out.

If we stuck with the slim Completed event, the read model could then pull the latest state, so replaying would mean we could make changes easily.

The downside is the extra strain it puts on the BC with all the extra hits.

Just to make sure we are on the same page here:

Enriched event:

{
  "saleId" : "2DE5BAC5-880A-4DB7-B602-91754D6D8043",
  "amountToPay" : 9.56,
  "amountPaid" : 9.56,
  "changeGiven" : 0,
  "paymentType" : "EFT",
  "promotionAmountApplied" : 0,
  "amountToPayWasVoided" : {
    "amount" : 19.56,
    "reason" : "do we even have this?",
    "anything else" : "blah"
  }
}

When the read model gets this, it simply dumps it into a flat table, which means we can easily query on “Give me all sales which were voided” etc.

Compare that with the slim approach

{ "saleId" : "2DE5BAC5-880A-4DB7-B602-91754D6D8043" }

When this arrives at the consumer, they would then do:

GET api/sales/2DE5BAC5-880A-4DB7-B602-91754D6D8043

The endpoint returns whatever shape of state we decide, meaning that if we need to fix a bug in the domain, we can simply replay our slim completed event, which instructs the read model (and anyone else) to GET the latest details.
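
To make the trade-off concrete, here is a rough sketch of how a read model might handle the two event shapes. Everything in it is an assumption for illustration: `makeReadModel`, `stubFetchSale` and the row shape are hypothetical names, and `fetchSale` only stands in for the GET api/sales/:id call.

```javascript
// Sketch only: makeReadModel, stubFetchSale and the row shape are hypothetical.
// fetchSale stands in for GET api/sales/:id; it is synchronous here purely to
// keep the example short (a real HTTP call would be asynchronous).
function makeReadModel(fetchSale) {
  const rows = new Map();
  return {
    rows,
    // Enriched event: the payload is already flat, so store a copy as-is.
    onEnriched(event) {
      rows.set(event.saleId, { ...event });
    },
    // Slim event: one extra round trip to the Sales BC per event received.
    onSlim(event) {
      rows.set(event.saleId, fetchSale(event.saleId));
    },
  };
}

let apiCalls = 0;
function stubFetchSale(saleId) {
  apiCalls += 1;
  return { saleId, amountToPay: 9.56, amountPaid: 9.56, paymentType: "EFT" };
}

const consumers = [1, 2, 3].map(() => makeReadModel(stubFetchSale));
const slimEvents = [{ saleId: "sale-1" }, { saleId: "sale-2" }];

for (const consumer of consumers) {
  for (const event of slimEvents) {
    consumer.onSlim(event);
  }
}

console.log(apiCalls); // 6: two completed sales times three consumers
```

The enriched handler never touches the API, while the slim handler costs one call per event per consumer, which is the extra load on the BC described above.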

Peter_Hageus · April 9, 2020, 11:40am

Looks like a lot of the ‘enriched’ properties depend on business rules that can change. I wouldn’t consider that enrichment, but part of the event.

Amount to pay can depend on temporary discount rules etc. I always include similar properties in events. I know the arguments not to do it, but the advantages outweigh them by far imho.

/Peter

James_Woodley · April 9, 2020, 11:41am

My worry with that approach would be you’re ending up attacking your own API. Every event (sale in this case) you raise will also make an API call??

Have everything your read side needs in the event. If that changes based on other factors, then your event isn’t immutable, and those changes should result in events of their own that can then update the read model accordingly.

steven.blair · April 9, 2020, 11:44am

James,

Exactly our concern. Each time we persist a Completed event, it would then result in at least another call (depending on the number of consumers) back into the API to grab the other data regarding the sale.

steven.blair · April 9, 2020, 11:50am

Peter,

But what would happen if we changed the rules in the domain, or even something crazy like there being a bug in the domain lol

If we have this enriched event, based on what’s happened already, we can’t really fix that now can we?

It’s like we are trading off helping consumers against the ability to replay through our streams and change / fix things.

Could another solution be using partitions on ES?

I really don’t know the overhead of using them, especially if we are talking about millions.

Imagine something like:

fromStream('$ce-Sales')
  .partitionBy(function (e) {
    // one partition per sale
    return 'Sale-' + e.data.saleId;
  })
  .when({
    'SaleCompleted': function (s, e) {
      // we know this is the last event, so we can either emit or transformBy and build an enriched event
    },
    $any: function (s, e) {
      // store state from events building up our sale
    }
  });

If we are talking about constant use, what would this do to the ES (CPU, memory, disk etc)?
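
For anyone who wants to play with the idea outside Event Store, the per-sale fold can be simulated in plain JavaScript. This is only a sketch of the logic, not the projections runtime: `partitions`, `emitted`, `handle` and the `lineCount` field are made-up names, and dropping a partition's state once the sale completes is a choice made in this sketch. It says nothing about how Event Store itself manages partition state in memory or on disk, which is still the open question.

```javascript
// Plain-JavaScript simulation of a per-sale partitioned fold.
// Not the Event Store projections API; all names here are hypothetical.
const partitions = new Map(); // partition key -> state for one open sale
const emitted = [];           // enriched events produced so far

function partitionKey(e) {
  return "Sale-" + e.data.saleId;
}

function handle(e) {
  const key = partitionKey(e);
  const state = partitions.get(key) || { lineCount: 0 };

  if (e.eventType === "SaleCompleted") {
    // Last event for the sale: build the enriched event from the held state.
    emitted.push({
      eventType: "EnrichedCompletedEvent",
      data: {
        saleId: e.data.saleId,
        completedDate: e.data.completedDate,
        lineCount: state.lineCount,
      },
    });
    // Sketch-only choice: the state is no longer needed, so drop it.
    partitions.delete(key);
    return;
  }

  if (e.eventType === "LineEvent") {
    state.lineCount += 1;
  }
  partitions.set(key, state);
}

const events = [
  { eventType: "SaleStartedEvent", data: { saleId: "A" } },
  { eventType: "LineEvent", data: { saleId: "A", id: 1 } },
  { eventType: "SaleStartedEvent", data: { saleId: "B" } },
  { eventType: "LineEvent", data: { saleId: "A", id: 2 } },
  { eventType: "LineEvent", data: { saleId: "B", id: 1 } },
  { eventType: "SaleCompleted", data: { saleId: "A", completedDate: "2020-04-09" } },
];
events.forEach(handle);

console.log(emitted[0].data.lineCount); // 2
console.log(partitions.size);           // 1 (only the still-open sale B)
```

In this simulation the amount of state held is proportional to the number of open sales rather than the total number of sales ever seen.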

Peter_Hageus · April 9, 2020, 11:54am

> Peter,
>
> But what would happen if we changed the rules in the domain, or even something crazy like there being a bug in the domain lol
>
> If we have this enriched event, based on what’s happened already, we can’t really fix that now can we?

Yeah, that is one of the arguments against. But it accurately reflects the state of the domain at the time. There’s always compensating events…

/Peter

steven.blair · April 9, 2020, 12:48pm

> There’s always compensating events…

How would you see that working?
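
One possible reading of "compensating events", purely as a sketch and possibly not what Peter has in mind: the original enriched event stays in the stream untouched, and a later event carries the corrected values, which the read model applies on top. The event name `SaleCompletedCorrected` and its `corrections` field are invented for this example.

```javascript
// Sketch only: "SaleCompletedCorrected" and its "corrections" field are
// hypothetical names, not something from this thread or from Event Store.
function project(events) {
  const rows = new Map();
  for (const e of events) {
    if (e.eventType === "SaleCompleted") {
      // Original enriched event: stored as it was recorded at the time.
      rows.set(e.data.saleId, { ...e.data });
    } else if (e.eventType === "SaleCompletedCorrected") {
      // Compensating event: overwrite only the fields that were wrong.
      const row = rows.get(e.data.saleId);
      if (row) {
        Object.assign(row, e.data.corrections);
      }
    }
  }
  return rows;
}

const stream = [
  { eventType: "SaleCompleted", data: { saleId: "sale-1", amountToPay: 9.56, amountPaid: 9.56 } },
  { eventType: "SaleCompletedCorrected", data: { saleId: "sale-1", corrections: { amountToPay: 8.56 } } },
];

const rows = project(stream);
console.log(rows.get("sale-1").amountToPay); // 8.56
```

Replaying the stream then reproduces both the original value and the fix, in order, without rewriting history.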