Is EventStoreDB overkill for hobby/side projects in terms of cost?

visuell · November 12, 2021, 7:34am

Hello, I hope I am not in the wrong place with this question. If so, I would like to apologize in advance. And by the way, you have a fantastic product. For this reason I would like to use it for a new project, but I am concerned about my solution architecture and the resulting costs.

The project will have about 10,000 users. In a document database this would represent 10,000 documents (aka streams in ESDB). In addition, each user has about 5000 subcollection documents --> 10,000 users * 5000 documents = 50,000,000 streams.

One stream per document:
my-app/
├─ User-Streams/
│ ├─ user-1
│ ├─ …
│ ├─ user-10000
├─ Document-Streams/
│ ├─ document-1
│ ├─ …
│ ├─ document-50000000

If I decided to have a single main document stream per user to reduce the number of total streams, I would have to iterate all the individual events when changes are made. Since documents are edited on a regular basis, I would prefer to have one stream aggregate per document.

Main stream for documents:
my-app/
├─ User-Streams/
│ ├─ user-1
│ ├─ …
│ ├─ user-10000
├─ Document-Streams/
│ ├─ documents-of-user-1
│ ├─ …
│ ├─ documents-of-user-10000
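To make the trade-off between the two layouts concrete, here is a minimal sketch using a toy in-memory store. It is not the EventStoreDB client API: `InMemoryStore`, `rebuild_document`, and the event shape are hypothetical names invented for illustration. It only shows that a per-user stream forces a reader to scan events of unrelated documents, while a per-document stream does not.

```python
from collections import defaultdict


class InMemoryStore:
    """Toy append-only store: stream name -> list of events (illustration only)."""

    def __init__(self):
        self.streams = defaultdict(list)

    def append(self, stream, event):
        self.streams[stream].append(event)

    def read(self, stream):
        # Return a copy so callers cannot mutate the stored history.
        return list(self.streams.get(stream, []))


def rebuild_document(events, document_id):
    """Fold events into the latest text of one document.

    Returns (text, scanned), where scanned is the number of events read.
    """
    text, scanned = "", 0
    for event in events:
        scanned += 1
        if event["document_id"] == document_id:
            text = event["text"]
    return text, scanned


per_document = InMemoryStore()  # layout 1: one stream per document
per_user = InMemoryStore()      # layout 2: one stream per user

# Assumed sample data: 3 documents of user-1, each edited 5 times.
for doc in range(1, 4):
    for rev in range(1, 6):
        event = {"document_id": doc, "text": f"doc {doc} rev {rev}"}
        per_document.append(f"document-{doc}", event)
        per_user.append("documents-of-user-1", event)

text_a, scanned_a = rebuild_document(per_document.read("document-2"), 2)
text_b, scanned_b = rebuild_document(per_user.read("documents-of-user-1"), 2)
print(text_a, scanned_a)  # same text, only document-2's events scanned
print(text_b, scanned_b)  # same text, every event of the user scanned
```

Both layouts recover the same document state; the difference is how many events have to be read to get there, and that gap grows with the number of documents per user.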

The projections would be stored in a document database, which would result in 10,000 UserCollections with 5000 SubDocuments each. So the read side would not be a problem.

I know that this is a very individual question, but I hope that someone can give me some suggestions.

For this project I have a private vhost cluster (3 nodes) with 4 cores, 16 GB RAM and a 1 TB SSD.
However, according to the official “Instance Sizing Guide”, my needs are equivalent to an M128 Production Instance to handle this amount of streams.

So I am asking myself: is my small project, without much revenue, the wrong use case for ESDB?

alexey.zimarev · November 12, 2021, 3:08pm

Sorry, but your question starts as technical, then it goes to modelling without providing much context about the domain itself.

Avoiding the details, if something can be represented as 10K docs in a document database, it could be represented as 10K streams. But, regardless of what “subcollection document” means (I am not familiar with the term), it seems like a document on its own, so summing up 10K plus all the “subcollection documents” you get to the same 50M objects in the database.

As it’s unclear what the average stream size would be, it’s hard to guess the hardware requirements for such a database, but I wouldn’t expect them to exceed what you’d need to run MongoDB for a similar database. Of course, the disk needs to be larger, as updates here won’t be state replacements but appends.
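The point about appends versus state replacements can be put into rough numbers. The sketch below is a back-of-the-envelope estimate only: the stream count comes from the question, while `UPDATES_PER_STREAM` and `AVG_SIZE_BYTES` are pure assumptions, and index, metadata, and replication overhead are ignored.

```python
def estimate_bytes(streams, updates_per_stream, avg_size_bytes, append_only):
    """Rough payload size: append-only keeps every version, replace keeps one."""
    versions_kept = updates_per_stream if append_only else 1
    return streams * versions_kept * avg_size_bytes


STREAMS = 50_000_000      # from the question: 10,000 users * 5000 documents
UPDATES_PER_STREAM = 20   # assumption: average number of edits per document
AVG_SIZE_BYTES = 500      # assumption: average size of one event or document

replace_in_place = estimate_bytes(
    STREAMS, UPDATES_PER_STREAM, AVG_SIZE_BYTES, append_only=False
)
append_only = estimate_bytes(
    STREAMS, UPDATES_PER_STREAM, AVG_SIZE_BYTES, append_only=True
)

print(f"replace in place: {replace_in_place / 1e9:.0f} GB")
print(f"append only:      {append_only / 1e9:.0f} GB")
```

Under these assumed figures the append-only store needs as many times more disk as there are updates per stream, which is why the average stream size matters more for sizing than the number of streams alone.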

I would also congratulate you, as not every developer can call a system with 10K active users a “hobby project”.

fredrik · November 12, 2021, 3:08pm

Streams are very cheap; there is (in my understanding) no storage reason to join streams or to have fewer streams for any given set of events.

jageall · November 12, 2021, 4:16pm

You are misreading the sizing guide too:

The Working Set is the number of streams being read from and written to concurrently. It’s essential to recognise that writing one million events into one million streams is a very different scenario than writing one million events into a single stream

The 62M figure refers to streams being accessed concurrently, which means that all 50M streams in your project would need to be accessed concurrently to require that size of instance.
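As a rough illustration of the difference between total streams and working set, the sketch below counts the distinct streams touched within a time window. The `working_set` helper, the 60-second window, and the access log are hypothetical; the sizing guide's exact measurement method is not stated in this thread.

```python
def working_set(accesses, window_start, window_end):
    """Count distinct streams read or written in [window_start, window_end).

    accesses: iterable of (timestamp_seconds, stream_name) pairs.
    """
    return len({
        stream
        for timestamp, stream in accesses
        if window_start <= timestamp < window_end
    })


# Assumed access log: a handful of reads and writes over two minutes.
accesses = [
    (0, "document-1"),
    (1, "document-1"),
    (2, "document-7"),
    (3, "user-1"),
    (65, "document-9"),
    (70, "document-1"),
]

TOTAL_STREAMS = 50_000_000  # stream count from the question

first_minute = working_set(accesses, 0, 60)
second_minute = working_set(accesses, 60, 120)
print(first_minute, second_minute, TOTAL_STREAMS)
```

The number that drives instance sizing in this reading is the per-window count, not `TOTAL_STREAMS`; a database can hold tens of millions of streams while only a small fraction of them is active at any moment.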