(Another) question on the appropriate usage of ES projections

Ryan_Palmer · June 12, 2019, 2:50pm

Raith, surely the answer to that is to ask the client what should happen in each case? Maybe at some point they will say ‘oh just fire off an email to say it went wrong, we’ll call them on the phone and deal with it’

It’s like Greg said in some of his talks, often the easiest and cheapest thing to do for the 1% of difficult edge cases is just to get a human involved, but ultimately the stakeholders make that decision as they hold the purse strings?

Ryan_Palmer · June 12, 2019, 2:51pm

by client there I meant the business owners, if that wasn’t clear!

Chris_Condron1 · June 12, 2019, 3:06pm

Hi Ryan,
This gets into fine-grained vs coarse-grained events and public vs private schemas.
I would not want to send all of the private events to the client to generate the report (esp. for a mobile app.) for both coupling and performance reasons.
So in general I would want a coarse-grained summary event to be pushed to (or pulled by) the client for the report.
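As a rough illustration of that idea (not Event Store's projection API), here is a minimal Python sketch that folds fine-grained private events into one coarse-grained summary event. The event names (`ItemAdded`, `ItemRemoved`, `OrderShipped`, `OrderSummary`) and field names are hypothetical, made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrivateEvent:
    """Fine-grained internal event (hypothetical shape)."""
    order_id: str
    kind: str          # e.g. "ItemAdded", "ItemRemoved", "OrderShipped"
    amount: float = 0.0


def summarize(order_id: str, events: list[PrivateEvent]) -> dict:
    """Fold one order's private events into a single public summary event."""
    total = 0.0
    status = "open"
    for e in events:
        if e.order_id != order_id:
            continue  # ignore events that belong to other orders
        if e.kind == "ItemAdded":
            total += e.amount
        elif e.kind == "ItemRemoved":
            total -= e.amount
        elif e.kind == "OrderShipped":
            status = "shipped"
    # The client only ever sees this shape, never the private events.
    return {"type": "OrderSummary", "orderId": order_id,
            "total": total, "status": status}
```

The client is then coupled only to the `OrderSummary` shape, so the private events can change without breaking the report.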

So, in a low-volume system where I was looking to keep the deployed topology simple, using internal projections for this would work great.

In general, the transient state can be seen as outside the hexagon (hexagonal architecture) and is what the clients should interact with; the core business events should be what the aggregates, process managers and CEP see and use.

note: The hexagon here is a logical division so the stream the internal projection creates is “outside”

Chris_Condron1 · June 12, 2019, 3:18pm

Hi Raith,

So if once the truck arrives it can’t be returned, then it is likely the cancel operation on the order can be rejected.

The crux here is that it is not an IT/Development call on how to make the system magically unwind; the business needs to answer the questions on what to do in each case.

That is the crux of the difference between a saga (or process manager) and an atomic transition. For a transition we can just say, everybody out of the pool, and reverse everything (mainly because we don’t let anything else happen while we’re in the middle of it).

For a saga, as we go along, business or real-world things have happened, so when we get to a failure point we need to clean all of that up. It is a trap to think of this as unwinding or recovering the previous state, as those things will still have happened. We are going to want to know, for example, that the truck has 500 miles on the odometer, that we need to pay for the gas and time for the partial delivery, and that this client has a history of canceling.

Handling problems in a saga means moving forward, not back.
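To make "forward, not back" concrete, here is a minimal Python sketch under made-up business rules. The event names (`TruckDispatched`, `MilesDriven`, `PartialDeliveryCharged` and so on) and the rule itself are hypothetical; in practice the business decides what happens in each case. The point is only that cancelling appends new events and never edits or deletes the history.

```python
def cancel_delivery(history: list[dict]) -> list[dict]:
    """Return the NEW events to append when a delivery is cancelled.

    Nothing in `history` is removed or rewritten; the cancellation is
    handled by recording what happens next.
    """
    new_events = [{"type": "DeliveryCancelled"}]
    dispatched = any(e["type"] == "TruckDispatched" for e in history)
    if dispatched:
        # The truck really did drive; that cost does not disappear.
        miles = sum(e.get("miles", 0) for e in history
                    if e["type"] == "MilesDriven")
        new_events.append({"type": "TruckRecalled"})
        new_events.append({"type": "PartialDeliveryCharged", "miles": miles})
    else:
        # Nothing has happened in the real world yet, so a refund is enough.
        new_events.append({"type": "OrderRefunded"})
    return new_events
```

Because the original events stay in place, facts such as the miles driven or a client's history of cancelling remain available to later processing.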

And in many cases, phone a friend (or business user) is absolutely the way to go.

Chris_Condron1 · June 12, 2019, 3:20pm

transition => transaction

Ryan_Palmer · June 12, 2019, 3:20pm

Hi Chris,

That makes sense. I can see why this is a topic which generates so much debate, as there are many levels of data to navigate both physically and conceptually.

Greg_Young1 · June 13, 2019, 3:38am

Just include the last processed seq in the state, same as any read model, no?
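A minimal Python sketch of that suggestion, assuming events arrive with a monotonically increasing sequence number. The class and field names are hypothetical and this is not Event Store's API. The read model carries the last sequence number it applied, so a reader can tell how current the state is and redelivered events can be skipped.

```python
class BalanceReadModel:
    """Read model that records the last event sequence number it has applied."""

    def __init__(self) -> None:
        self.balance = 0
        self.last_seq = -1  # nothing processed yet

    def apply(self, seq: int, event: dict) -> bool:
        """Apply an event once; return False if it was already processed."""
        if seq <= self.last_seq:
            return False  # duplicate or redelivery, state is unchanged
        if event["type"] == "Deposited":
            self.balance += event["amount"]
        elif event["type"] == "Withdrawn":
            self.balance -= event["amount"]
        self.last_seq = seq
        return True

    def is_caught_up_to(self, seq: int) -> bool:
        """True if the state already reflects the event numbered `seq`."""
        return self.last_seq >= seq
```

A caller that has just written event number `n` can check `is_caught_up_to(n)` before trusting the state, which is one way to avoid being misled by eventual consistency.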

Ryan_Palmer · June 13, 2019, 8:54am

Hi Greg

Do you think a docs section on the website with a brief ‘internal vs external projections’ comparison would be a good idea?

Perhaps it would risk being either too brief or too in depth as it is a multi-faceted topic as is apparent in this thread.

It is just one of the things I have struggled to square in my mind as a newbie. I know there aren’t any hard answers (as always, ‘it depends’) but some things to consider each way would be really useful.

I would offer to contribute but I don’t understand enough myself yet and would almost certainly muddy the waters even more!

Cheers,

Ryan

Ryan_Palmer · June 13, 2019, 8:56am

Or perhaps a blog post, as you can dig a bit deeper there. I found the existing ones really useful.

Raith_Munro · June 18, 2019, 11:46am

Hi again Ryan.

Thanks for re-igniting my interest in this - my organisation has suspended research on our microservice project, so I hadn’t given this much recent thought. Returning to it with fresh eyes, I have hit upon an explanation of why one should NOT use EventStore projections to materialise your read model or aggregates. This contradicts my previous answer as, at the time of suspending the project, we were still of the mind that we could accept the compromises as a price to pay for the benefit of convenience. But now I feel differently…

ES Projections are forward-only processing of the event streams.
Eventually you will want to change history.
Ideally you will leave the original event(s) in place and write new ones to cancel or correct them.
If you process the event streams by your own projection logic then you will start by reading the events backwards (newest to oldest) to ascertain what prior events have been superseded.
You can then take appropriate action when projecting the events forwards (oldest to newest).
In-process ES Projections don’t have this “look ahead” advantage.
Instead, you will have to rewrite an edited stream with events changed as necessary.
But that will also require that you reset the Projection.
Your Projection will most likely be materialising by category.
So the reset will cause all entities of a category to be rematerialised.
You are probably emitting these materialised states to new streams.
And likely subscribing to and triggering off of changes to these state materialisations.
But that means your entire category of entities will reset and replay their history of states.
You would need a means of disconnecting your downstream subscribers/cache/whatever, and then catching-up again (with whatever states have changed in the meantime - not just the one you edited).
And even if you can do that then you have still paid a high performance price for the sake of editing a stream.
By rolling your own materialiser/projector, you can process the events with the advantage of “look-ahead”, or rewrite a stream without affecting others.

We were originally tempted by the convenience of the ES projection engine, and a belief that it would be more performant than our own alternatives. However, the limitations detailed above (being forward-only, and likely being applied to the whole category upon reset) mean that this is no longer in favour.

If I have my facts wrong, or have arrived at a flawed conclusion then please do correct me.

Thanks,

Raith

Greg_Young1 · June 30, 2019, 8:19pm

I am very confused by this email.

“ES Projections are forward-only processing of the event streams.”
Any projection should be forward-only processing. If you are doing something else, it is broken.
“Eventually you will want to change history.”
Hmm…
“If you process the event streams by your own projection logic then you will start by reading the events backwards (newest to oldest) to ascertain what prior events have been superseded.”
Not really … Have never seen a valid case of this occurring. You would literally need an unusual case of something like snapshot events for this to even be remotely valid.
“In-process ES Projections don’t have this ‘look ahead’ advantage.”
And they shouldn’t…

John_Lazos · July 1, 2019, 1:27pm

To my understanding, forward-only projection is part of the architecture. Since events are immutable, projected state ought to be cacheable. In fact, though I’m very new to ES and may have this wrong, I think that’s where the checkpoint functionality comes in. The hit will be when we need to update the actual logic of the projection, e.g. considering a new event type when building up state. Then you’d need to reset the projection, which may take time depending on how many events you have.

I’m trying to figure out at the moment whether to use continuous or one-time projections for read models. I’m not sure of the performance hit of having dozens to hundreds of continuous projections running. Also, state can only be read in a continuous projection by querying the result stream – I prefer the /state endpoint of one time projections. But then there’s the trade off of needing to re-run the projection whenever you want to update the read model to the latest events.

Greg_Young1 · July 1, 2019, 4:03pm

“I’m trying to figure out at the moment whether to use continuous or one-time projections for read models. I’m not sure of the performance hit of having dozens to hundreds of continuous projections running.”

For most systems, you don’t want to do this but want something external. Remember there isn’t any good way of querying; your state would essentially be in a KV store.

John_Lazos · July 1, 2019, 6:23pm

My current world involves caching projections into a relational database. Which is fine, but a lot of extra work, so hoping a database purpose-built for event sourcing like EventStore can help get away from that.

“Remember there isn’t any good way of querying; your state would essentially be in a KV store.”

So far I’ve found projection partitions to work quite nicely – I’ve built projections where the partition key is the id of the resource. If we need to do more advanced queries, I’d create a different projection that partitions in a way more suitable for that query.

Raith_Munro · July 1, 2019, 6:26pm

Hi Greg,

I bow to your wisdom (I literally do), but I still find it hard to accept your word that I will never need to write previously occurring events out of history.

By which I mean I will write a new event that nullifies a previous event, such that projecting my events causes the old one to be ignored/skipped.

And yet, despite my certainty that I’ll need this, I can’t put my finger on an example that I am happy to stake my case on.

I think I need help to overcome my mental-block by which I want to be able to change the past. Perhaps the thought that every potentially-regrettable action needs a thorny compensation event scares me.

But I think we agree that if someone DID want to rewrite history by either correcting an inappropriate event stream (diabolical, I know), or by recording a new event which nullifies an old one (which requires you to know about subsequent nullifications as you process the stream), then in-ES projections would become unsuitable for materialising aggregates (because ES does not support broken behaviour).

I’m going to give this a long hard think, and see if I can come up with any example where my “it was all a dream” pattern ™ is desirable despite its broken-ness.

Bobby Ewing

Greg_Young1 · July 1, 2019, 7:22pm

inline.

“Hi Greg,”

“I bow to your wisdom (I literally do), but I still find it hard to accept your word that I will never need to write previously occurring events out of history.”

“By which I mean I will write a new event that nullifies a previous event, such that projecting my events causes the old one to be ignored/skipped.”

“And yet, despite my certainty that I’ll need this, I can’t put my finger on an example that I am happy to stake my case on.”

It is common to write an event that nullifies a previous one. Projections, however, generally do not skip it; they do it, then undo it at the nullification/reversal. There is a material difference here. Let me propose a concrete example. You, working at the bank, fat-finger a transfer to my account: $1000 instead of $100.

It is discovered tomorrow that this occurred. We don’t remove that the issue occurred! That the issue occurred could be quite important if I also used $800 today from my account to secure a line of credit from the bank … no? How did I do this?

There is also the question of running the projections historically … If you run it until this morning, what should you see? What was the balance as of 0800?

There is an entire other can of worms here, in that very often the correction will be written today but marked as applying to yesterday (in other words, predated), which begs the question of which timeline we are interested in … no? This is a bitemporal problem; n-temporal is common.
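A small Python sketch of the two timelines, assuming each event carries both the time it was recorded and the time it applies to. The field names `recorded` and `effective` and the integer day numbers are hypothetical. The same question, "what was the balance on day 1?", gets a different answer depending on what was known when it was asked.

```python
def balance(events: list[dict], effective_until: int, known_at: int) -> int:
    """Balance for business time `effective_until`, using only events
    that had been recorded by `known_at`."""
    return sum(
        e["amount"]
        for e in events
        if e["recorded"] <= known_at and e["effective"] <= effective_until
    )


events = [
    # The mistaken transfer, recorded on day 1 and effective on day 1.
    {"recorded": 1, "effective": 1, "amount": 1000},
    # The correction, written on day 2 but marked as applying to day 1.
    {"recorded": 2, "effective": 1, "amount": -900},
]
```

Asked on day 1, the day-1 balance is 1000 (`known_at=1`); asked on day 2, the day-1 balance is 100 (`known_at=2`). Both answers are correct for their own timeline.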

“I think I need help to overcome my mental-block by which I want to be able to change the past. Perhaps the thought that every potentially-regrettable action needs a thorny compensation event scares me.”

“But I think we agree that if someone DID want to rewrite history by either correcting an inappropriate event stream (diabolical, I know), or by recording a new event which nullifies an old one (which requires you to know about subsequent nullifications as you process the stream), then in-ES projections would become unsuitable for materialising aggregates (because ES does not support broken behaviour).”

Projections would not be an issue; they could even show it was broken for this period of time (maybe even a specific projection showing me “where were accounts broken due to corrections occurring!”). There is even a common pattern for this, which is a nullification event. Projections in ES do not have internal support for this (it needs to be done outside via a link, but it is possible). That said, ES-projections should not be used for materializing aggregates!!! If you are interested why, we can discuss more, but it’s a really bad idea.

Raith_Munro · July 1, 2019, 10:13pm

Thanks again Greg,
I completely get your example. I’ve followed your ‘as at vs. as of’ explanations. I think it’d be great if all corrections could roll-forward with a nullification event. But I still believe that in some cases, in some systems, it would be neater to retro-actively skip the fat-finger error rather than have to undo it down the chain - for simplicity and to avoid the temporal can-o-worms. Whilst this is a pattern I would like to explore further, it will have to wait because…

…you have just arrived at answering the OP question, which is much more interesting to me/us.

Ryan asked:
Most of the posts I find on this topic either end with another question like ‘why / what is this for’ which doesn’t get answered or a ‘you can do x y or z but I wouldn’t’ with little explanation what you would do instead and most importantly why.

Greg answered:
ES-projections should not be used for materializing aggregates!!! If you are interested why, we can discuss more, but it’s a really bad idea…

We’ve been trying to advise each other, pooling our limited experiences (speaking for myself here, not all the other posters) and sharing our observations and opinions. I’m/we’re not entirely bad at this, but I/we are struggling to accept that ES projections shouldn’t be used for materialising aggregates. It still seems to me/us that it’s a plausible solution. I’m not arguing against GY, it’s just not enough to be told ‘no’… I can’t help it… I gotta hit that big red “Really Bad Idea” button. Coz I’ve simply GOT TO KNOW WHY.

So, please PLEASE, if it’s possible to explain why, then this is the thread for that clarification.

He holds his breath, and pushes the button…

Raith_Munro · July 1, 2019, 10:41pm

I’m already off the idea of using ES projections for materialisation - but my arguments against were flawed, being based on my “broken” implementation.
And Joao contributed the argument that ES projections are only eventually consistent - but it sounds as though we can avoid being misled by tracking the seq number.

I don’t think it’s number of Projections or Streams - especially in an environment with a modest data size and low event throughput.

I wish I could realise it for myself without being spoon-fed the answer, but it’s just become so blinking frustrating.

What is the limitation of ES Projections?

Why is it a Really Bad Idea?

Thanks in advance…

Greg_Young1 · July 1, 2019, 11:45pm

Responding from phone, so forgive typos etc. There are quite a few reasons not to, though you can.

The single largest is you have no way of querying. While I have dealt with some systems this applied to, it has been rare. If I have my customers as state, how do I see all in Mississippi? It’s possible but very rarely useful…
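To illustrate the querying point, here is a minimal Python sketch in which a plain dict stands in for per-customer projection state. The customer ids and fields are hypothetical, and this is a stand-in for a key-value view, not Event Store's API. Looking up one customer by key is direct, but "all customers in Mississippi" has no index to use and must visit every entry.

```python
# Per-customer state keyed by customer id, as a KV store would hold it.
state_by_customer = {
    "cust-1": {"name": "Ann", "state": "Mississippi"},
    "cust-2": {"name": "Bob", "state": "Ohio"},
    "cust-3": {"name": "Cy", "state": "Mississippi"},
}


def get(customer_id: str) -> dict:
    """Key lookup: the one access pattern a KV store is good at."""
    return state_by_customer[customer_id]


def customers_in(us_state: str) -> list[str]:
    """Attribute query: has to scan every customer's state."""
    return sorted(
        cid for cid, s in state_by_customer.items() if s["state"] == us_state
    )
```

An external read model, such as a relational table with an index on the state column, answers the second question without scanning everything, which is the usual reason for projecting outside the store.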