Several questions regarding Eventstore projections

I’m trying out projections and have several questions regarding their proper usage:

  1. What’s the recommended size of a projection state (a rough range, e.g. <100Kb or <1Mb…)?

  2. Somewhere on the mailing list there was a thread (https://groups.google.com/forum/?fromgroups=#!searchin/event-store/millions/event-store/w98t-E8roFI/Cvpugv82Jq0J) asking about the viability of millions of projections. Considering a domain of banking and a projection which shows aggregated customer data for each customer, would I be creating a separate projection instance for each customer dynamically, e.g.

fromStream('customer-1').when(…) queried as projections/customer-1/state, where the projection itself is somehow (?) created on the fly when a new CustomerCreated event is registered,

or should I use partitions:

fromAll().partitionBy(function(e) { return e.body.customerId }).when(…) queried as projections/customer/state?partition=1

  3. Related to the previous question, given the following events:
  • CustomerCreated { customerId }
  • AccountOpened { accountId, holders: [set of customerId] }

What would be the simplest way to create a CustomerAccounts projection? Is it

fromCategory('account').when({
  AccountOpened: function(s, e) {
    // holders is assumed to be an array, so iterate by index rather than with for...in
    for (var i = 0; i < e.body.holders.length; i++) {
      var holder = e.body.holders[i];
      emit('customer-' + holder, 'CustomerAccountOpened', { customerId: holder, accountId: e.body.accountId });
    }
  }
})

fromCategory('customer').partitionBy(function(e) {
  return e.body.customerId
}).when({
  CustomerCreated: function(s, e) {
    s.id = e.body.customerId
    s.accounts = []
  },
  CustomerAccountOpened: function(s, e) {
    s.accounts.push(e.body.accountId)
  }
})

The problem with the above is that every event which doesn’t contain a customerId needs to be translated by an additional projection. If there are lots of such events, it’s probably simpler to hold the projection in a document store and project events by hand instead of creating eventstore projections which only emit integrating events. Am I missing anything here?

Thanks.

Answering the first two questions quickly; I will respond to the last one later, as there is quite a bit to say about it.

  1. What’s the recommended size of a projection state (a rough range, e.g. <100Kb or <1Mb…)?

Generally, small is best. Projection state should not be very big; if it is, something is probably wrong.

  2. Somewhere on the mailing list there was a thread (https://groups.google.com/forum/?fromgroups=#!searchin/event-store/millions/event-store/w98t-E8roFI/Cvpugv82Jq0J) asking about the viability of millions of projections. Considering a domain of banking and a projection which shows aggregated customer data for each customer, would I be creating a separate projection instance for each customer dynamically, e.g.

fromStream('customer-1').when(…) queried as projections/customer-1/state, where the projection itself is somehow (?) created on the fly when a new CustomerCreated event is registered.

Millions is 100% expected, but you would use a foreach for it rather than creating one projection per customer. A foreach says to run this projection on every stream in the category, similar to the example you gave later.
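To picture "one projection definition, many states", here is a minimal sketch in plain JavaScript that runs outside Event Store. It is only a model of the idea, not the projections API: `runPartitioned`, the sample events and the `CustomerRenamed` event type are hypothetical names made up for illustration.

```javascript
// Plain-JavaScript model of a partitioned projection (NOT the Event Store API).
// The handlers are defined once; state is kept separately per partition key.
function runPartitioned(events, partitionBy, handlers) {
  const states = new Map(); // partition key -> state object
  for (const e of events) {
    const handler = handlers[e.eventType];
    if (!handler) continue; // events without a handler are skipped
    const key = String(partitionBy(e));
    if (!states.has(key)) states.set(key, {});
    handler(states.get(key), e);
  }
  return states;
}

// Hypothetical sample events (CustomerRenamed is invented for this sketch).
const events = [
  { eventType: 'CustomerCreated', body: { customerId: 1 } },
  { eventType: 'CustomerCreated', body: { customerId: 2 } },
  { eventType: 'CustomerRenamed', body: { customerId: 1, name: 'Ann' } },
];

const states = runPartitioned(
  events,
  e => e.body.customerId,
  {
    CustomerCreated: (s, e) => { s.id = e.body.customerId; },
    CustomerRenamed: (s, e) => { s.name = e.body.name; },
  }
);
```

Each customer ends up with its own small state object, which is also why the per-partition state can stay small even when there are millions of partitions.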

OK, for the last question: can you explain more about what you want to get? Are you just trying to get a list of customer accounts? If so, then …

“If there are lots of such events it’s probably simpler to hold the projection in a document store and project events by hands instead of creating eventstore projections”

Yes, this is probably correct. I left off “which only emit integrating events.” since that’s not all they can do and I don’t understand the meaning. The JS projections are good for some types of problems. A document store is good for some types of problems. A graph DB is good for some types of problems. I would not limit yourself to picking only one :)
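As a rough illustration of the "project by hand" option, the sketch below folds the two event types from the question straight into per-customer documents, with no intermediate events emitted. Everything here is an assumption made for illustration: the `Map` stands in for a real document store, and `loadOrCreate`, `project` and the sample IDs are hypothetical names.

```javascript
// In-memory stand-in for a document store: customerId -> document.
const customerDocs = new Map();

// Returns the customer's document, creating an empty one on first use.
function loadOrCreate(customerId) {
  if (!customerDocs.has(customerId)) {
    customerDocs.set(customerId, { id: customerId, accounts: [] });
  }
  return customerDocs.get(customerId);
}

// Hand-written projection: one handler per event type.
const handlers = {
  CustomerCreated(e) {
    loadOrCreate(e.body.customerId);
  },
  AccountOpened(e) {
    // One account can have several holders; update each holder's document.
    for (const holder of e.body.holders) {
      loadOrCreate(holder).accounts.push(e.body.accountId);
    }
  },
};

// Dispatches a single event to its handler, ignoring unknown event types.
function project(e) {
  const handler = handlers[e.eventType];
  if (handler) handler(e);
}

// Hypothetical sample events.
const sampleEvents = [
  { eventType: 'CustomerCreated', body: { customerId: 'c1' } },
  { eventType: 'CustomerCreated', body: { customerId: 'c2' } },
  { eventType: 'AccountOpened', body: { accountId: 'a1', holders: ['c1', 'c2'] } },
  { eventType: 'AccountOpened', body: { accountId: 'a2', holders: ['c1'] } },
];
sampleEvents.forEach(project);
```

The trade-off is that the fan-out from one AccountOpened event to several customers happens in ordinary code, so no extra translating projection is needed.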

fromCategory('account').foreachStream()

and

fromCategory('customer').partitionBy(function(e) {
  return e.body.customerId
})

seem to be essentially identical, no? (Both are partitioning their state in the same way, one per customer.)
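One way to see the similarity is to model both constructs as "fold events into one state per key", differing only in where the key comes from. The sketch below is plain JavaScript, not the Event Store API; `foldByKey`, `countEvents`, the `streamId` field and the sample events are hypothetical, and it assumes each customer's events live in a stream named customer-{id}.

```javascript
// Plain-JavaScript model (NOT the Event Store API): fold events into one state per key.
function foldByKey(events, keyOf, step) {
  const states = {};
  for (const e of events) {
    const key = keyOf(e);
    if (states[key] === undefined) states[key] = {};
    step(states[key], e);
  }
  return states;
}

// Counts events per partition, initialising the counter on first use.
function countEvents(s, e) {
  if (s.count === undefined) s.count = 0;
  s.count++;
}

// Hypothetical events; each customer's events live in the stream customer-{id}.
const sample = [
  { streamId: 'customer-1', body: { customerId: 1 } },
  { streamId: 'customer-2', body: { customerId: 2 } },
  { streamId: 'customer-1', body: { customerId: 1 } },
];

// Model of foreachStream(): the partition key is the stream ID.
const perStream = foldByKey(sample, e => e.streamId, countEvents);

// Model of partitionBy(...): the key is derived from the event body.
const perCustomer = foldByKey(sample, e => 'customer-' + e.body.customerId, countEvents);
```

Under that naming assumption the two results are the same; they would diverge only if events for one customer were spread across several streams.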

Previous message got marked as deleted somehow

Yes, this is probably correct. I left off “which only emit integrating events.” since that’s not all they can do and I don’t understand the meaning.

I had only that specific class of problems in mind (when you need to artificially transform an event with one projection in order to process it with another one).

fromCategory('account').foreachStream()

and

fromCategory('customer').partitionBy(function(e) {
  return e.body.customerId
})

seem to be essentially identical, no? (Both are partitioning their state in the same way, one per customer.)

It seems I’ve misunderstood the foreachStream construct. Given that an event has been posted to a stream 'account-1', how am I supposed to query the state partitions of a persistent projection named Accounts which looks like

fromCategory('account').foreachStream().whenAny(function(s, e) { if (s.count == undefined) { s.count = 0 } s.count++ });

?

It seems I’ve misunderstood the foreachStream construct. Given that an event has been posted to a stream 'account-1', how am I supposed to query the state partitions of a persistent projection named Accounts which looks like

fromCategory('account').foreachStream().whenAny(function(s, e) { if (s.count == undefined) { s.count = 0 } s.count++ });

The state is exposed via

/streams/{projection-name}-{stream}-state, IIRC (I don’t have it in front of me at this second), e.g. /streams/yourname-account-0001-state. I will verify for you.
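Purely as string formatting, the pattern recalled above could be composed like this. The pattern is unverified in this thread (it is given from memory), so treat the helper as an assumption rather than a documented endpoint; `stateStreamPath` is a hypothetical name.

```javascript
// Builds the state-stream path using the pattern recalled in the reply above.
// ASSUMPTION: the pattern /streams/{projection-name}-{stream}-state is unverified ("IIRC").
function stateStreamPath(projectionName, streamId) {
  return '/streams/' + projectionName + '-' + streamId + '-state';
}

const path = stateStreamPath('yourname', 'account-0001');
```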

There are some changes coming in this area, though. Some things have been added: the results of states are now actually a stream, and there is also a transform method for filtering/transforming results.

As I mentioned in another email, Projections is not 100% done at this point and should be considered experimental. The Event Store, used just as an event store, is reaching a point of stability.

Projections is not 100% done at this point and should be considered experimental.

Any ETA here? ;)

Cheers,

Alexey.

Coming :) It’s not that they are broken etc.; it’s mostly a need for documentation and guidance.