EventStore as timeseries + deduplication projection + one more documentation question

Hey! I hope I can get some advice here.


I’m implementing a PoC, which I think is not a typical ‘aggregate’ or ‘DDD’ scenario. Basically I have a log of external API responses. It receives events fairly infrequently: a few times a day, plus every time the API is called because of user interaction. The events’ maxAge is a few days here, mostly for debugging purposes; in a real scenario an hour would be more than enough. I want to linkTo only events with unique content into another stream, which will log only ‘unique’ API state events (API endpoint + call parameters + response), also with a reasonable maxAge of a few days at most. Later I will copy 2-3 API responses per endpoint per user per day into yet another stream (responses actually don’t change that often); I’m thinking about using a projection for that too.

Is this a good use case for a server-side projection? Or is it better handled at the application level? Or is it even a good use case for event sourcing at all?

I’m just realising it is more like a time series plus an event log.

For now it simply looks easier to me to use the event store for both and not introduce an MQ and a dedicated time-series database. I’d prefer server-side projections, since they are actually easier to write (~20 lines of code?) and maintain than building a whole application infrastructure around this, with scalability, fault tolerance, etc.
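For what it’s worth, the core of such a deduplication projection really is about 20 lines. Below is a rough sketch of the handler logic, runnable outside EventStoreDB by stubbing `linkTo` (in a real user-defined projection the runtime provides `linkTo`, `fromCategory`, etc.); the stream names, event type, and body fields are my assumptions, not anything from the docs:

```javascript
// Minimal stand-in for the projection runtime's linkTo, so the handler
// logic can be exercised locally (the real runtime provides linkTo).
const links = [];
function linkTo(streamName, event) { links.push({ streamName, event }); }

// Dedup handler: link an event into the 'deduplicated' stream only when
// its content key (endpoint + call parameters + response) is new.
function handle(state, event) {
  const key = JSON.stringify([event.body.endpoint,
                              event.body.params,
                              event.body.response]);
  if (!state.seen[key]) {
    state.seen[key] = true;
    linkTo('deduplicated-' + event.streamId, event);
  }
  return state;
}

// Inside EventStoreDB the same handler would be registered roughly as:
// fromCategory('apiResponse')
//   .partitionBy(e => e.streamId)
//   .when({ $init: () => ({ seen: {} }), ApiResponseReceived: handle });
```

Note that keeping every seen key in projection state grows without bound; with a short maxAge on the source stream that may be acceptable, but it is something to watch.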

I’m thinking about the following scale: 20-100 external API endpoints, pinged a few times a day; ~15,000 users with not much interaction, but it will come in bursts, e.g. when a user interacts with the client app, it will trigger a number of external API calls. I do some basic deduplication at the application layer, e.g. caching external API responses for at least a minute, so let’s say the ‘current’ streams all together won’t receive more than 15,000 events per minute. I’m thinking about keeping the archive streams in the event store for a year or two, and then offloading them to external archive storage. The number of streams will be roughly (1 current + 1 deduplicated + a few archive streams) × (number of external API endpoints) × (number of users), i.e. approximately 5 × 50 × 15,000 ≈ 3.75 million streams. The app will write to the ‘current’ stream, consume from the ‘deduplicated’ stream, and from time to time look into the archive streams.

Sorry if I’m missing something obvious; I’m a software engineer, but I don’t consider myself an architect.


I’m struggling to understand what the event attributes a projection receives actually mean, i.e. this list of attributes:

  • isJson: true/false
  • data: {}
  • body: {}
  • bodyRaw: string
  • sequenceNumber: integer
  • metadataRaw: {}
  • linkMetadataRaw: string
  • partition: string
  • eventType: string
  • streamId: string

Most of them are quite straightforward, but e.g. what is the difference between body and data? bodyRaw, I guess, will be the body as a string. I guess body will not be present if the event is not isJson, is this correct? What is linkMetadataRaw? I’m also receiving metadata and linkMetadata as objects, which is great, but they are not documented.

I can answer the second question. bodyRaw is indeed the payload string, regardless of whether the event content is JSON or not. body is the deserialised object when the payload is JSON. metadata is likewise the deserialised metadata object, and metadataRaw is the raw string. Normally, custom projections should not be used with non-JSON payloads, as it becomes hard to work with those events.

linkMetadata and linkMetadataRaw are only available when you consume linked events. The projection will receive the resolved event (the one the link points to), but also the link event’s metadata in those fields.
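To make the body/bodyRaw distinction concrete, here is a small sketch of a handler that picks the right payload field; the function name and return shape are made up for illustration, but the attribute names follow the list above:

```javascript
// Sketch: which payload field to read on an event a projection receives.
// bodyRaw is always the payload as a string; body is the deserialised
// object, present only when isJson is true. linkMetadata exists only
// when the event was reached through a link (e.g. created by linkTo).
function describeEvent(event) {
  const payload = event.isJson ? event.body : event.bodyRaw;
  return {
    type: event.eventType,
    stream: event.streamId,
    payload,
    meta: event.metadata,              // resolved event's metadata
    linkMeta: event.linkMetadata ?? null // link event's metadata, if any
  };
}
```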