Appending an event to multiple streams

decima · September 9, 2022, 8:06pm

I saw somewhere that it’s improper to append an event into multiple streams. I’m working on a system that (among other things) maps medical devices to patients. When the service does this, it emits a “DEVICE_MAPPED_TO_PATIENT” event which contains the patient and device IDs. What’s the best way to link the patient and device streams? Should I just append the event to both of them? Should I have a “mappings” stream that projects a “MAPPED_TO_DEVICE” event into the patient stream and a “MAPPED_TO_PATIENT” event into the device stream? Some other thing I’m not thinking of? Mappings are many devices to one patient and can be unmapped / remapped.

I think my broader confusion is how one maps the concept of a foreign key into event world.

yves.lorphelin · September 11, 2022, 9:02am

(The following is just to make sure we share a common understanding.)

Appending the same event to multiple different instance streams is indeed considered improper.
Instance stream means: the stream representing an entity, which is also the transactional boundary (as in transactions & optimistic locks) in the database.
Entity means: a ‘thing’ or a process.
It is OK, though, to have derived streams containing those events (think of the category, event type, and all streams in the case of EventStoreDB), because those are meant to ease consumption by reactive components (read models, process managers).

It is considered “improper” because it breaks the fundamental rule of data modelling: a piece of information is owned & managed by a single concept. If you have the same piece of information managed in multiple streams, as you described, which one is the source of truth?

With those preliminaries in mind:
It seems to me you have a missing concept in your stream design: something about “devices that are used by patients”.
What kind of devices are we talking about, and what does “mapped” mean?

Are those implants, meaning only 1 patient & 1 device for the lifetime of the device?
Are those devices lent out or used for a limited amount of time, let’s say a blood pressure device taken home?
Are those devices used for a very limited amount of time, let’s say some medical equipment used during surgery?
Is it all of the above?

So the main questions are:

Why does this “device mapped to patient” need to be tracked?
Is there one process to map & track, or are there several different ones?

It feels to me that “device mapped to patient” is a process, thus an entity with a lifecycle, and each instance of that process has its own entity stream, where the information about which device & which patient is owned by that process / stream.
Moreover, the process could be different for different types of devices used in different cases.
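
To make the “process as its own entity stream” idea a bit more concrete, here is a minimal Go sketch that rebuilds the state of one mapping process by replaying its stream. Everything in it is an assumption for illustration: the `Event` envelope, the `MappingState` type, and the `DEVICE_UNMAPPED_FROM_PATIENT` event name are hypothetical (only `DEVICE_MAPPED_TO_PATIENT` comes from the question), and no event store client is involved.

```go
package main

import "fmt"

// Event is a hypothetical, simplified event envelope for this sketch.
type Event struct {
	Type      string // e.g. "DEVICE_MAPPED_TO_PATIENT"
	DeviceID  string
	PatientID string
}

// MappingState is the current state of one mapping process instance.
type MappingState struct {
	DeviceID  string
	PatientID string // empty when the device is not mapped
	Active    bool
}

// Fold replays the events of a single mapping stream, oldest first,
// and returns the resulting state.
func Fold(events []Event) MappingState {
	var s MappingState
	for _, e := range events {
		switch e.Type {
		case "DEVICE_MAPPED_TO_PATIENT":
			s = MappingState{DeviceID: e.DeviceID, PatientID: e.PatientID, Active: true}
		case "DEVICE_UNMAPPED_FROM_PATIENT": // hypothetical event name
			s.PatientID = ""
			s.Active = false
		}
	}
	return s
}

func main() {
	stream := []Event{
		{Type: "DEVICE_MAPPED_TO_PATIENT", DeviceID: "d-1", PatientID: "p-1"},
		{Type: "DEVICE_UNMAPPED_FROM_PATIENT", DeviceID: "d-1"},
		{Type: "DEVICE_MAPPED_TO_PATIENT", DeviceID: "d-1", PatientID: "p-2"},
	}
	fmt.Printf("%+v\n", Fold(stream))
}
```

The point of the sketch is only that the device and patient IDs live in one place, the mapping stream, and everything else derives from it.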

For reporting / UI, i.e. read models or reactive purposes, the patient-device map is a projection of multiple streams: the patient, the device & the “process of mapping a device to a patient”.

What I’m hinting at is: I feel there is a deeper model you need to explore.

yves.lorphelin · September 11, 2022, 10:11am

Side note: Would love to have more details about the domain you’re working in.

I have some more questions

What if a device is mapped to a patient whose identity is unknown?
What if a device is mapped to a patient whose identity is known after the fact?
What if a device is mapped to a patient whose identity is known during the fact?
What if the device is unknown in the system?

You triggered some thoughts @decima: https://github.com/ylorph/The-Inevitable-Event-Centric-Book/issues/57

alexey.zimarev · September 13, 2022, 3:21pm

I’d ask the question “who controls this action?”. From the question, a “service” emits an event, but what entity does it operate on? At least one piece of state should be used to control the decision flow. It could be both, when one patient cannot have two devices of the same type, and a device cannot be mapped to more than one patient. I’d check what happens in real life, how these checks are performed, and model it.

decima · September 16, 2022, 1:47am

Haha I love causing a stir! Sorry for the delayed reply; I’m realizing it was unwise to post a question right before traveling (still am through October, so will continue to be spotty). This is all incredibly useful! I really appreciate you squaring off some of my internal definitions.

To answer some of these questions though, we make a variety of tools that help patients stay adherent to their medication, and track how well they’re doing. There’s a ton of apps that do this kind of thing, but we’re sort of unique in the space for using physical devices. Event sourcing came up as a possible solution to ease the transition away from a monolithic architecture towards microservices. The transition is still in experimental stages, but so far we’re really impressed by the kinds of things it seems to be able to enable!
So in rough order:

A device will only have 1 patient, but a patient may have multiple devices (rare, but some folks do it)
Most of our patient base have long-term treatment plans, so they use the devices for a while (months to years), and then send them back for decommissioning, generally when the drug is no longer indicated. From a data perspective, the device is not reused.
The device collects adherence data, which needs to be associated back to that patient.

For the follow up questions, because of the logistics of certifying devices, and getting them sent to the patient, the physical process enforces a strict order of device data populated -> patient data populated -> mapping -> shipping / activation. On the data side, we operate with the assumption that that’s always true, but I have considered weird scenarios where it breaks down (e.g. I’ve been lobbying for us to work with safe injection sites where patients would necessarily be anonymous). Building for those cases isn’t getting dev time atm tho.

The solution I’ve got right now which seems to work well is that there are 3 separate streams:

patient-$PATIENT_ID
device-$DEVICE_ID
patient_device_mapping-$DEVICE_ID

I’ve then built a projection (in Go, connected to a persistent subscription to the patient_device_mapping event type stream) that pushes a LinkTo of the mapping into the patient and device streams. The source of truth is still the patient_device_mapping stream, but from the perspective of a projection reading the streams, it’s obvious what’s going on. With the exception of the LinkTo events, these are each instance streams.
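
For readers who want to picture the LinkTo step, here is a rough Go sketch of building the two link events for one mapping event. It assumes the EventStoreDB convention that a link event has type `$>` and a body of the form `revision@stream`; check that against the docs for your server version. The `LinkEvent` struct and both helper functions are hypothetical, and the actual append through a client library is deliberately left out.

```go
package main

import "fmt"

// LinkEvent is a hypothetical stand-in for whatever event type
// your client library expects when appending.
type LinkEvent struct {
	TargetStream string
	EventType    string
	Data         []byte
}

// NewLink builds a link pointing at one event in the source stream.
// Assumption: links use type "$>" and a "revision@stream" body.
func NewLink(targetStream, sourceStream string, sourceRevision uint64) LinkEvent {
	return LinkEvent{
		TargetStream: targetStream,
		EventType:    "$>",
		Data:         []byte(fmt.Sprintf("%d@%s", sourceRevision, sourceStream)),
	}
}

// LinksForMapping returns the links to write into the patient and device
// streams for a mapping event stored at the given revision.
func LinksForMapping(deviceID, patientID string, revision uint64) []LinkEvent {
	source := "patient_device_mapping-" + deviceID
	return []LinkEvent{
		NewLink("patient-"+patientID, source, revision),
		NewLink("device-"+deviceID, source, revision),
	}
}

func main() {
	for _, l := range LinksForMapping("d-1", "p-9", 0) {
		fmt.Printf("%s <- %s %s\n", l.TargetStream, l.EventType, l.Data)
	}
}
```

The mapping event itself is stored only once, in the patient_device_mapping stream; the patient and device streams just receive pointers to it.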

Post-mapping, our patient events actually include the device ID in the metadata (this predates experiments with ES), so I’ve also experimented with a version that skips the LinkTo and just watches for the change in metadata while reading the patient stream. In the lifecycle of a patient / device, mapping events don’t happen often (for a device, it’s literally twice: mapping and unmapping), so writing 2 extra events once in a while doesn’t feel like a high cost, and I prefer the explicitness of “this thing happened” showing up as an event in the stream, rather than promoting metadata to be part of the data model.

After talking this out, I suspect the “most correct” version of this is to leave the patient and device streams alone, and have patient_device_mapping events trigger the generation of a new derived stream patient_device that includes everything in the patient, device and mapping streams. I’ll play with that, probably this weekend, and see how I like it.
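
A derived patient_device stream along those lines could be sketched as a merge of the relevant source streams ordered by their position in the global log. The Go code below is only an in-memory illustration under that assumption: the `Recorded` type, the `Derive` function, and the event type names in `main` are hypothetical, and a real version would read from a subscription and write to the derived stream rather than sorting a slice.

```go
package main

import (
	"fmt"
	"sort"
)

// Recorded is a hypothetical view of a stored event.
type Recorded struct {
	Position uint64 // position in the global log, assumed to give a total order
	Stream   string
	Type     string
}

// Derive keeps only events from the given source streams and
// returns them ordered by global position.
func Derive(all []Recorded, sources ...string) []Recorded {
	want := make(map[string]bool, len(sources))
	for _, s := range sources {
		want[s] = true
	}
	var out []Recorded
	for _, e := range all {
		if want[e.Stream] {
			out = append(out, e)
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Position < out[j].Position })
	return out
}

func main() {
	log := []Recorded{
		{Position: 1, Stream: "device-d-1", Type: "DEVICE_REGISTERED"},
		{Position: 2, Stream: "patient-p-1", Type: "PATIENT_REGISTERED"},
		{Position: 3, Stream: "patient-p-2", Type: "PATIENT_REGISTERED"},
		{Position: 4, Stream: "patient_device_mapping-d-1", Type: "DEVICE_MAPPED_TO_PATIENT"},
	}
	derived := Derive(log, "patient-p-1", "device-d-1", "patient_device_mapping-d-1")
	for _, e := range derived {
		fmt.Println(e.Position, e.Stream, e.Type)
	}
}
```

The source streams stay untouched; the derived stream is disposable and can be rebuilt from them at any time.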

Thank you so much for all the help!