I want to be able to subscribe to all user-created streams and exclude the ‘$’ system-generated ones.
My code currently looks like:
_subscription = _connection.SubscribeToAllFrom(position,
    false,                 // resolveLinkTos
    EventAppeared,
    ProcessingStarted,     // liveProcessingStarted
    SubscriptionDropped,
    new UserCredentials(eventstoreUsername, eventstorePassword));
And in EventAppeared I’m checking for system-generated streams and ignoring them:

// System streams begin with "$"; StartsWith is safer than Contains,
// which would also drop any user stream with a "$" anywhere in its name.
if (resolvedEvent.OriginalStreamId.StartsWith("$")) return;
But I’m sure there must be a way to subscribe to all while excluding the system-generated streams.
There isn’t, unless it’s been added recently.
You could do it yourself by making an "all" projection, but there are
performance penalties with this.
Are you worried about the cost of bringing over system events and ignoring them?
Here’s my scenario (for a bit of context).
My website, on startup, reads all streams/events and creates in-memory read models.
This is working great so far with over (an estimated) 10,000 user-created events.
I’m just wanting to make the time from app start to app ready as fast as possible. I know that Eventstore itself will probably generate a large number of its own events over time, which could affect my initial start time.
Think about storing the in-memory model to a file and loading it on start. We
use Kryo for cheap, simple in-memory models. That will speed up the startup
time by a lot more. If you change the model, just delete the snapshot file.
When loading more than 30,000 events this is necessary in my use cases.
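A minimal sketch of that idea in C# (Kryo is a Java serializer, so JSON.NET stands in here; the ReadModel class, Snapshot shape, and file path are all hypothetical). Storing the last $all position alongside the model lets the catch-up subscription resume from the snapshot instead of position 0:

using System.IO;
using Newtonsoft.Json;

public class ReadModel { /* your in-memory state */ }

// Hypothetical snapshot shape: the model plus the last $all position it
// reflects, so replay can resume from there rather than from the start.
public class Snapshot
{
    public long CommitPosition { get; set; }
    public long PreparePosition { get; set; }
    public ReadModel Model { get; set; }
}

public static class SnapshotFile
{
    private const string Path = "readmodel.snapshot.json";

    // Returns null when no snapshot exists (delete the file after changing
    // the model, as Idar suggests, to force a full rebuild).
    public static Snapshot TryLoad() =>
        File.Exists(Path)
            ? JsonConvert.DeserializeObject<Snapshot>(File.ReadAllText(Path))
            : null;

    public static void Save(Snapshot snapshot) =>
        File.WriteAllText(Path, JsonConvert.SerializeObject(snapshot));
}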
The main one we use that’s of substantial size is $statistics; you can
limit the number we keep using $maxAge/$maxCount stream metadata.
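For example, a hedged sketch of capping it with the .NET client from inside an async method (the stats stream name is per-node, e.g. "$stats-127.0.0.1:2113", so check your own instance; admin credentials assumed):

// Assumption: stats streams are written per node and named
// "$stats-<ip>:<port>". This caps retained stats events at 1000.
await _connection.SetStreamMetadataAsync(
    "$stats-127.0.0.1:2113",
    ExpectedVersion.Any,
    StreamMetadata.Create(maxCount: 1000),
    new UserCredentials("admin", "changeit"));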
10,000 events should be very, very fast to start up either way (on the
order of < 3 seconds for an in-memory model)
Thanks Greg - Maybe it’s something I don’t really need to worry about then. I’ll look into setting the $maxCount for stats.
Thanks Idar - If the startup time ever becomes a problem I’ll revisit your suggestion.
Idar is right: if you find that startup times become unacceptable (it takes a lot for that to be the case), then you’ll need to persist your read model. I had a project where startup times grew to over 5 minutes. It was only then that I went to a persisted model (startup became instant), and it was really easy to do. I persisted to SQL Server and had a separate process maintain the read model; this way multiple processes could look to the same read model regardless of starting/stopping.

Of course, every case is different. If you have only one process, or don’t care about 100% consistency across multiple processes, a catch-up subscription persisted to a file is the way to go.
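A minimal sketch of the file-persisted checkpoint half of that (file name and format are hypothetical): save the last processed $all position, and hand it back to SubscribeToAllFrom on restart so only new events are replayed.

using System.IO;
using EventStore.ClientAPI;

public static class CheckpointFile
{
    private const string FilePath = "checkpoint.txt";

    // null means "no checkpoint yet"; SubscribeToAllFrom will then
    // replay from the beginning of the $all stream.
    public static Position? Load()
    {
        if (!File.Exists(FilePath)) return null;
        var parts = File.ReadAllText(FilePath).Split('/');
        return new Position(long.Parse(parts[0]), long.Parse(parts[1]));
    }

    // Call this from EventAppeared; every N events is plenty, since
    // replaying a few events after a crash is harmless if handlers
    // are idempotent.
    public static void Save(Position position)
    {
        File.WriteAllText(FilePath, position.CommitPosition + "/" + position.PreparePosition);
    }
}

On startup you would pass CheckpointFile.Load() as the position argument to SubscribeToAllFrom, and inside EventAppeared periodically save resolvedEvent.OriginalPosition.Value.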
Just out of interest, how many events did you have when you reached the 5-minute startup mark?
Also, did you have your eventstore and website running on the same box?
I’m currently running in azure with eventstore on a different box to the site. I’m sure latency will be a big factor.
Have you benchmarked this?
I’ve left that company and I never actually ran the metric but I’ll take an educated guess, ~750 new streams a month * 12 months * ~1200 events per project = 10.8 million events. That’s not counting the $ streams that were ignored.
The first time, I hit >5 minutes real quick because I was storing PDF BLOBs directly in my events, probably a dozen ~50KB events in each stream. The sheer amount of data crossing the wire, being deserialized, and shuffled around in memory was a huge drag. I reconfigured the system to store hashes of the files in the events, while the real file data was offloaded elsewhere. It helped startup times for a while, until, as time wore on, the number of events grew quite large after about a year. That’s when I went to a persisted read model. It worked perfectly when I left, except when you had to rebuild the read model (that took hours, but could be done separately).
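For illustration, a hypothetical event shape for that hash approach (names invented): the stream stays small because the event carries only a digest and a pointer, while the PDF bytes live in external blob storage.

// Hypothetical event: store a content hash and a location, not the bytes.
public class PdfAttached
{
    public string Sha256 { get; set; }   // digest of the PDF content
    public string BlobUri { get; set; }  // where the actual file lives
}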
I’ll benchmark over the weekend and report my results.
No they were all on different servers, but on the same network (in fact, some may have been on the same physical hardware as it was a VMWare ESX environment and they were all virtual machines). I didn’t handle that infrastructure component.
Startup times are very dependent on what you do in your read model. The Eventstore is always much faster than your processing. I have projections which read 2.5 million events in no time, and projections which take around 15-20 minutes with 2.5 million events.
Thanks Chris. That really puts things into perspective. It’s way more events than I’ll ever have to worry about.
You’re welcome. By the way, I wrote that $all projection code before I fully understood the power of the built-in $by_category and $by_event_type projections. Since your original question was about NOT having system events come through, I suggest you take a few hours and play around with those and see if you can work up a way to have a single stream with all of the events you need linked from it. It would have satisfied my requirements had I known about it.
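For example, with the built-in projections enabled, $by_category maintains a "$ce-<category>" link stream and $by_event_type a "$et-<event type>" one. A hedged sketch of subscribing to one of them in the same style as the snippet above ("$ce-order" is an invented stream name; note that resolveLinkTos must be true because these streams contain link events):

_subscription = _connection.SubscribeToStreamFrom(
    "$ce-order",   // hypothetical category stream built by $by_category
    null,          // null checkpoint => catch up from the beginning
    true,          // resolveLinkTos: resolve links to the original events
    EventAppeared,
    ProcessingStarted,
    SubscriptionDropped,
    new UserCredentials(eventstoreUsername, eventstorePassword));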
Some benchmarks as promised (in azure):
–Test 1–
–Eventstore VM–
Windows VM Running Eventstore
A1 (1 core 1.75 GB memory)
–Website–
Small instance
(1 core 1.75 GB memory)
15:22:47 App startup
15:23:48 Eventstore subscription subscribed from position 0
27.85 seconds to read events from store
Position 85827885/85827885; events processed 104528; system events ignored 753
Dispatching Events
25000 of 104528 Processed
50000 of 104528 Processed
75000 of 104528 Processed
100000 of 104528 Processed
172.52 seconds to app ready
–Test 2–
–Eventstore VM–
Windows VM Running Eventstore
A1 (1 core 1.75 GB memory)
–Website–
Large Instance
(4 cores 7 GB memory)
OK, and if in your replay you just do x++ to count your events?
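In other words (my reading of Greg’s suggestion; handler name invented), swap the dispatching handler for one that only counts, to see how much of the startup time is Eventstore read time versus read-model processing:

private long _eventCount;

// Counting-only handler: no deserialization, no dispatch, just x++.
private void CountOnlyEventAppeared(EventStoreCatchUpSubscription subscription, ResolvedEvent resolvedEvent)
{
    _eventCount++;
}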