Projections - Idempotency and event order

albert.ortizl · November 3, 2024, 6:04am

Hi! I have a question about projections (client ones).

I would like to use a projection to keep the balance of debits and credits of a bank account. I need to ensure that this balance is accurate because I will perform business checks reading this model.

I will have a stream per account, like: account-{id}

I have planned to use:

fromCategory("account")
            .foreachStream()
            .when({ // the code

My questions:

Are projections idempotent? I mean, do they ensure that an event is never processed twice?
Are projections atomic? I mean, do they ensure that only one event is processed at a time per projection, so there are no concurrency errors?
Do projections ensure the order of events? Meaning, are there no race conditions that could cause me to receive out-of-order events?

I would like to understand how they behave, since I didn’t find this in your docs and testing these edge cases is kind of difficult.

Thanks in advance!

yves.lorphelin · November 3, 2024, 4:00pm

In short, yes: idempotent, one event at a time, and ordered.

I’d suggest you look at this :

github.com

EventStore/EventStore/blob/master/src/EventStore.Projections.Core.Javascript.Tests/Specs/account-balancer.js

options({
	biState: true
});

fromCategory("transaction")
	.partitionBy(function(e) {
		if (e.eventType === "header") return "description";
		if (e.body.accountId.startsWith("ESDBB"))
			return e.body.accountId;
		return undefined;
	})
	.when({
		$init: function() {
			return { balance: 0 };
		},
		$initShared: function() {
			return {
				numberOfAccounts: 0,
				totalBalance: 0
			};

This file has been truncated.

(The last guarantee, about ordering, is relaxed when using fromStreams(); see here.)
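For the account-{id} case in the question, a minimal sketch along the same lines might look like the following. The event type names (Credited, Debited) and the amount field are assumptions for illustration, not something from the thread, and replay is a hypothetical local helper for exercising the handlers outside the server; it is not part of the projections API.

```javascript
// Handlers for a per-account balance. "Credited", "Debited" and "amount"
// are hypothetical names; adapt them to your own event schema.
const accountHandlers = {
	$init: function() {
		return { balance: 0 };
	},
	Credited: function(state, e) {
		state.balance += e.body.amount;
		return state;
	},
	Debited: function(state, e) {
		state.balance -= e.body.amount;
		return state;
	}
};

// Register only inside the projection engine, where fromCategory is defined.
if (typeof fromCategory === "function") {
	fromCategory("account")
		.foreachStream()
		.when(accountHandlers);
}

// Hypothetical local helper: folds a list of events through the handlers,
// one at a time and in order, ignoring event types with no handler.
function replay(handlers, events) {
	let state = handlers.$init();
	for (const e of events) {
		const handler = handlers[e.eventType];
		if (handler) state = handler(state, e) || state;
	}
	return state;
}
```

With foreachStream() each account stream gets its own state, so replay here stands in for what happens within a single partition.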

albert.ortizl · November 4, 2024, 11:05am

Cool! Thanks for your quick answer. It would be great if you could document these guarantees, maybe in a new section, since they are not clear from the docs.

albert.ortizl · November 5, 2024, 5:42pm

Hi again, I have a follow-up question. As I mentioned before, I will have a stream per account, like: account-{id}

And:

fromCategory("account")
            .foreachStream()
            .when({ // the code

This code will lead to millions of partitions, because I have a stream per account and millions of accounts in my system. Is that OK in terms of performance? Would you recommend a different approach for this kind of read model?

yves.lorphelin · November 5, 2024, 7:53pm

EventStoreDB is capable of storing billions of streams & events
Yes

albert.ortizl · November 5, 2024, 8:07pm

My question was more about having millions of partitions within the projection, is that also ok in terms of performance or shall I create my own projection in a dedicated datastore? Thanks in advance!

albert.ortizl · November 6, 2024, 5:13pm

So, really high cardinality.

yves.lorphelin · November 6, 2024, 5:56pm

On the number of partitions:
It shouldn’t be a problem; think of it as each partition having its own state.
"is that also ok in terms of performance or shall I create my own projection in a dedicated datastore?"
You should perform some realistic perf tests.
User projections do create overhead in terms of CPU, and additional events need to be appended to the database (so there is write amplification that uses IOPS and disk space).
Scavenging also needs to happen (to delete old checkpoints used by the projection engine).
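If the perf tests push you towards a dedicated datastore, keep in mind that deduplication and ordering then become the responsibility of your own code. A minimal sketch of one common way to handle that, assuming each delivered event carries its stream name and a per-stream revision, and that events for a given stream arrive in order; the event shape, the applyEvent name, and the in-memory Map standing in for the datastore are all hypothetical:

```javascript
// In-memory stand-in for a dedicated read-model store:
// streamId -> { balance, lastRevision }
const readModel = new Map();

// Applies one event to the store. Events at or below the last applied
// revision for their stream are skipped, so a redelivered event does not
// change the balance twice. Returns true if the event was applied.
function applyEvent(store, e) {
	const current = store.get(e.streamId) || { balance: 0, lastRevision: -1 };
	if (e.revision <= current.lastRevision) return false; // duplicate, ignore
	let delta = 0;
	if (e.eventType === "Credited") delta = e.amount; // hypothetical event type
	if (e.eventType === "Debited") delta = -e.amount; // hypothetical event type
	store.set(e.streamId, {
		balance: current.balance + delta,
		lastRevision: e.revision
	});
	return true;
}
```

In a real datastore the balance and the last applied revision would need to be written in the same transaction, otherwise a crash between the two writes can still double-count an event.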

yves.reynhout · November 6, 2024, 6:06pm

I think the more fundamental question here is: what do you intend to do with all those partitions and how will you learn about them (by convention perhaps)?

albert.ortizl · November 6, 2024, 6:16pm

They would be the read model, keeping the state and balance for each account. I will perform some tests as Yves suggested. Thanks for your expertise and quick answers.