I want to be able to subscribe to all user-created streams and exclude the ‘$’ system-generated ones.
My code currently looks like:
_subscription = _connection.SubscribeToAllFrom(position,
    false,                 // resolveLinkTos
    EventAppeared,
    ProcessingStarted,     // liveProcessingStarted
    SubscriptionDropped,
    new UserCredentials(eventstoreUsername, eventstorePassword));
And in EventAppeared I’m checking for system-generated streams and ignoring them:

// System streams begin with "$"; StartsWith is safer than Contains,
// which would also drop any user stream with a "$" anywhere in its name.
if (resolvedEvent.OriginalStreamId.StartsWith("$")) return;
But I’m sure there must be a way to subscribe to all while excluding the system-generated streams.
There isn’t, unless it’s been added recently.
You could do it yourself by making an "all" projection, but there are
performance penalties with this.
Are you worried about the cost of bringing over system events and ignoring them?
Here’s my scenario (for a bit of context).
My website, on startup, reads all streams/events and creates in-memory read models.
This is working great so far with over (an estimated) 10,000 user-created events.
I’m just wanting to make the time from app start to app ready as fast as possible. I know that Eventstore itself will probably generate a large number of its own events over time, which could affect my initial start time.
Think about storing the in-memory model to a file and loading it on start. We
use Kryo for cheap, simple in-memory models. That will speed up the startup
time by a lot more. If you change the model, just delete the snapshot file.
When loading more than 30,000 events this is necessary in my use cases.
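A minimal sketch of that idea in C# (Kryo is a Java serializer, so JSON.NET stands in here; the ReadModel class, Snapshot shape, and file path are all hypothetical). Storing the last $all position alongside the model lets the catch-up subscription resume from the snapshot instead of position 0:

using System.IO;
using Newtonsoft.Json;

public class ReadModel { /* your in-memory state */ }

// Hypothetical snapshot shape: the model plus the last $all position it
// reflects, so replay can resume from there rather than from the start.
public class Snapshot
{
    public long CommitPosition { get; set; }
    public long PreparePosition { get; set; }
    public ReadModel Model { get; set; }
}

public static class SnapshotFile
{
    private const string Path = "readmodel.snapshot.json";

    // Returns null when no snapshot exists (delete the file after changing
    // the model, as Idar suggests, to force a full rebuild).
    public static Snapshot TryLoad() =>
        File.Exists(Path)
            ? JsonConvert.DeserializeObject<Snapshot>(File.ReadAllText(Path))
            : null;

    public static void Save(Snapshot snapshot) =>
        File.WriteAllText(Path, JsonConvert.SerializeObject(snapshot));
}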
The main one we use that’s of substantial size is $statistics; you can
limit the number we keep using $maxAge/$maxCount stream metadata.
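For example, a hedged sketch of capping it with the .NET client from inside an async method (the stats stream name is per-node, e.g. "$stats-127.0.0.1:2113", so check your own instance; admin credentials assumed):

// Assumption: stats streams are written per node and named
// "$stats-<ip>:<port>". This caps retained stats events at 1000.
await _connection.SetStreamMetadataAsync(
    "$stats-127.0.0.1:2113",
    ExpectedVersion.Any,
    StreamMetadata.Create(maxCount: 1000),
    new UserCredentials("admin", "changeit"));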
10,000 events should be very, very fast to start up either way (on the
order of < 3 seconds for an in-memory model)
Thanks Greg - Maybe it’s something I don’t really need to worry about then. I’ll look into setting the $maxCount for stats.
Thanks Idar - If the startup time ever becomes a problem I’ll revisit your suggestion.
Idar is right: if you find that startup times become unacceptable (it takes a lot for that to be the case), then you’ll need to persist your read model. I had a project where startup times grew to over 5 minutes. It was only then that I went to a persisted model (startup became instant), and it was really easy to do. I persisted to SQL Server and had a separate process maintain the read model; this way multiple processes could look to the same read model regardless of starting/stopping.

Of course, every case is different. If you have only one process, or don’t care about 100% consistency across multiple processes, a catch-up subscription persisted to a file is the way to go.
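A minimal sketch of the file-persisted checkpoint half of that (file name and format are hypothetical): save the last processed $all position, and hand it back to SubscribeToAllFrom on restart so only new events are replayed.

using System.IO;
using EventStore.ClientAPI;

public static class CheckpointFile
{
    private const string FilePath = "checkpoint.txt";

    // null means "no checkpoint yet"; SubscribeToAllFrom will then
    // replay from the beginning of the $all stream.
    public static Position? Load()
    {
        if (!File.Exists(FilePath)) return null;
        var parts = File.ReadAllText(FilePath).Split('/');
        return new Position(long.Parse(parts[0]), long.Parse(parts[1]));
    }

    // Call this from EventAppeared; every N events is plenty, since
    // replaying a few events after a crash is harmless if handlers
    // are idempotent.
    public static void Save(Position position)
    {
        File.WriteAllText(FilePath, position.CommitPosition + "/" + position.PreparePosition);
    }
}

On startup you would pass CheckpointFile.Load() as the position argument to SubscribeToAllFrom, and inside EventAppeared periodically save resolvedEvent.OriginalPosition.Value.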
Just out of interest, how many events did you have when you reached the 5-minute startup mark?
Also, did you have your eventstore and website running on the same box?
I’m currently running in azure with eventstore on a different box to the site. I’m sure latency will be a big factor.
Have you benchmarked this?
I’ve left that company and I never actually ran the metric but I’ll take an educated guess, ~750 new streams a month * 12 months * ~1200 events per project = 10.8 million events. That’s not counting the $ streams that were ignored.
The first time, I hit >5 minutes real quick because I was storing PDF BLOBs directly in my events, probably a dozen ~50KB events in each stream. The sheer amount of data crossing the wire, being deserialized, and shuffled around in memory was a huge drag. I reconfigured the system to store hashes of the files in the events, while the real file data was offloaded elsewhere. It helped startup times for a while, until, as time wore on, the number of events grew quite large after about a year. That’s when I went to a persisted read model. It worked perfectly when I left, except when you had to rebuild the read model (that took hours, but could be done separately).
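For illustration, a hypothetical event shape for that hash approach (names invented): the stream stays small because the event carries only a digest and a pointer, while the PDF bytes live in external blob storage.

// Hypothetical event: store a content hash and a location, not the bytes.
public class PdfAttached
{
    public string Sha256 { get; set; }   // digest of the PDF content
    public string BlobUri { get; set; }  // where the actual file lives
}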
I’ll benchmark over the weekend and report my results.
No they were all on different servers, but on the same network (in fact, some may have been on the same physical hardware as it was a VMWare ESX environment and they were all virtual machines). I didn’t handle that infrastructure component.
Startup times are very dependent on what you do in your read model. The Eventstore is always much faster than your processing. I have projections which read 2.5 million events in no time, and projections which take around 15-20 minutes with 2.5 million events.
Thanks Chris. That really puts things into perspective. It’s way more events than I’ll ever have to worry about.
You’re welcome. By the way, I wrote that $all projection code before I fully understood the power of the built-in $by_category and $by_event_type projections. Since your original question was about NOT having system events come through, I suggest you take a few hours and play around with those and see if you can work up a way to have a single stream with all of the events you need linked from it. It would have satisfied my requirements had I known about it.
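For example, with the built-in projections enabled, $by_category maintains a "$ce-<category>" link stream and $by_event_type a "$et-<event type>" one. A hedged sketch of subscribing to one of them in the same style as the snippet above ("$ce-order" is an invented stream name; note that resolveLinkTos must be true because these streams contain link events):

_subscription = _connection.SubscribeToStreamFrom(
    "$ce-order",   // hypothetical category stream built by $by_category
    null,          // null checkpoint => catch up from the beginning
    true,          // resolveLinkTos: resolve links to the original events
    EventAppeared,
    ProcessingStarted,
    SubscriptionDropped,
    new UserCredentials(eventstoreUsername, eventstorePassword));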
Some benchmarks as promised (in azure):
–Test 1–
–Eventstore VM–
Windows VM Running Eventstore
A1 (1 core 1.75 GB memory)
–Website–
Small instance
(1 core 1.75 GB memory)
15:22:47 App startup
15:23:48 Eventstore subscription subscribed from position 0
27.85 seconds to read events from store
Position 85827885/85827885; events processed 104528; system events ignored 753
Dispatching Events
25000 of 104528 Processed
50000 of 104528 Processed
75000 of 104528 Processed
100000 of 104528 Processed
172.52 seconds to app ready
–Test 2–
–Eventstore VM–
Windows VM Running Eventstore
A1 (1 core 1.75 GB memory)
–Website–
Large Instance
(4 cores 7 GB memory)
OK, and if in your replay you just do x++ to count your events?
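In other words (my reading of Greg’s suggestion; handler name invented), swap the dispatching handler for one that only counts, to see how much of the startup time is Eventstore read time versus read-model processing:

private long _eventCount;

// Counting-only handler: no deserialization, no dispatch, just x++.
private void CountOnlyEventAppeared(EventStoreCatchUpSubscription subscription, ResolvedEvent resolvedEvent)
{
    _eventCount++;
}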