Projection capabilities

I am heaving a hard time understanding what’s possible with continuous projections and what’s not.

The most simple problem I want to solve looks like this (all others problems are impossible to solve if this simple one isn’t possible):

  • I have a million users.
  • Each user writes his UI interactions into the stream ui-{userId}.
  • Let’s say there are (amongst others) the EventTypes start and stop in ui-{userId}.

No I want to create projections that result in the following behaviour:

  • Each time a user stops within 4 hours after starting an event is appended to leftearly-{userId}.

Is this possible with a (or some) continuously running projections without creating a million projections (which I assume is not a good thing to do) and without a projection that has a state which contains all the last starts for a million users?

And if this is possible (which I hope very much), how?

The way I see it, when a stop is received by the projection the projection must somehow get access to the last start event in the stream ui-{userId}. But maybe I just don’t see the simple solution here.

Thanks in advance!

Ahhh, foreachStream() is the magical concept, right?

foreachStream()

yes indeed it is

I am not sure how you can do it because the projections runtime is reactive, it is called for each event appended to the log, but there’s no scheduler there. So, the “after 4 hours” condition implementation for me sounds quite hard.

Luckily the condition is not “after 4 hours” but “when user stops earlier than 4hs”
So this pseudocode should do it:

.foreachStream()
.when(

    start: save datetime
    
    stop: if (now - {saved datetime} < 4h )
          then {emit ("to early")}

To help me with architectural choices::

How is foreachStrem() implemented?
Specifically: How is the state for each stream persisted?

Is it a problem to have a million users and save a few hundred bytes per stream as the state of the projection?

Thanks for your support!

heiner,

I think if you have a million streams needed datetime track, then you would have a partition stream per user (i.e a million new streams created)
That might not be a problem itself, but if you mutate the state (event something like s.count++) it adds a new event each time.

If I may, I’d actually get to the business case. It sounds like some sort of analytics, and tools like Google Tag Manager, or Kibana can do the work way better. Are you sure you want to build those things?