Checkpoint internals

Hi,

We are currently exploring ESDB. I think we understand how checkpoints work from the client side, for example with a persistent subscription, but we are wondering how they work from the perspective of the server.

We’ve just appended around 630,000 events and the checkpoint for the last added event was 1,495,199,571. The current checkpoint is 3,208,058,305. We are running all system projections and no user defined projections.

Given the current checkpoint, there is roughly a 5,000:1 ratio between the checkpoint value and the number of events. So, on average, every event causes the checkpoint to increase by about 5,000. Does this make sense? Is this caused by running system projections?

Checkpoint is an unsigned long. Given the ratio, this would allow storing 3,618,549,444,923,920 events. This is of course quite a lot, but we are wondering why the checkpoint value increases as fast as it does.
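For anyone who wants to check the arithmetic, here is a minimal sketch using only the figures quoted above. It assumes the checkpoint is an unsigned 64-bit integer and that the observed average growth per event stays constant; the function names are made up for illustration.

```python
# Figures quoted in the question above.
EVENTS_APPENDED = 630_000
CURRENT_CHECKPOINT = 3_208_058_305

# Assumption: "unsigned long" means an unsigned 64-bit integer.
MAX_CHECKPOINT = 2**64 - 1


def checkpoint_growth_per_event(checkpoint: int, events: int) -> float:
    """Average amount the checkpoint advanced per appended event."""
    return checkpoint / events


def events_until_overflow(max_checkpoint: int, checkpoint: int, events: int) -> int:
    """Events that fit before the checkpoint overflows, at the observed ratio.

    Uses integer arithmetic to avoid float rounding on very large values.
    """
    return max_checkpoint * events // checkpoint


if __name__ == "__main__":
    growth = checkpoint_growth_per_event(CURRENT_CHECKPOINT, EVENTS_APPENDED)
    capacity = events_until_overflow(MAX_CHECKPOINT, CURRENT_CHECKPOINT, EVENTS_APPENDED)
    print(f"average growth per event: {growth:,.0f}")
    print(f"events until overflow:    {capacity:,}")
```

At the observed ratio this comes out at a bit over 5,000 per event and on the order of 3.6 quadrillion events, in line with the estimate above.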

Does anyone have better insight than we do into how checkpoints work? :slight_smile:

Cheers,

Peter

It’s basically a byte position in the global log. Each event has some bytes before and after the payload, so we can traverse the log back and forth faster. System projections produce links. Even though links are small, they still use space. If you have all the system projections running, you get three links for each event ($ce, $et, $correlationId), plus two links per new stream ($streams and $streams_by_category). You also get one metadata event for setting the default ACL for each new stream.
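To make the accounting concrete, here is a rough sketch that counts the records written to the log using the per-event and per-stream figures above. The byte sizes passed to `estimated_log_growth` are hypothetical placeholders, not measured values, and the function names are made up for illustration.

```python
# Counts taken from the explanation above (all system projections running).
LINKS_PER_EVENT = 3           # $ce, $et, $correlationId
LINKS_PER_NEW_STREAM = 2      # $streams, $streams_by_category
METADATA_PER_NEW_STREAM = 1   # default ACL metadata event


def records_written(events: int, new_streams: int) -> int:
    """Total log records: each event plus its links, plus per-stream extras."""
    per_event = events * (1 + LINKS_PER_EVENT)
    per_stream = new_streams * (LINKS_PER_NEW_STREAM + METADATA_PER_NEW_STREAM)
    return per_event + per_stream


def estimated_log_growth(events: int, new_streams: int,
                         event_bytes: int, link_bytes: int,
                         metadata_bytes: int) -> int:
    """Rough byte growth of the log, given assumed average record sizes.

    The sizes are inputs because they depend on payloads and record
    framing; they are not known constants.
    """
    links = events * LINKS_PER_EVENT + new_streams * LINKS_PER_NEW_STREAM
    metadata = new_streams * METADATA_PER_NEW_STREAM
    return events * event_bytes + links * link_bytes + metadata * metadata_bytes


if __name__ == "__main__":
    # One event appended to a brand-new stream: 1 event + 3 links + 2 links + 1 metadata.
    print(records_written(events=1, new_streams=1))
    # Hypothetical sizes, purely to show how the total is assembled.
    print(estimated_log_growth(1, 1, event_bytes=1_000, link_bytes=200, metadata_bytes=300))
```

So a single event written to a new stream already produces seven records in the log, which is why the position advances much faster than the payload size alone would suggest.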

You can actually look at it in $all in the UI (https://your-server:2113/web/index.html#/streams/$all). Open it in one browser window and append an event from another, and you’ll see what gets written.

I don’t think having such a large number is a concern. We have customers with databases that exceed tens of terabytes of data, and it’s just fine. Of course, it’s not endless, but we don’t recommend keeping databases that grow without limit; you’d need to get rid of some events anyway (archive, delete, etc.).
