Space usage I can't understand

I have an event store holding 321,340 events. I ran a script that goes through the $all stream and simply tallies them, grouping them by event type.

The results are not surprising; the distribution of events is what I expect. Except for one thing:

The last position is 28,730,889,440. I don’t fully understand how the position works, but I don’t see how it gets to 28 billion.

I also added up the byte counts for both data and metadata and got to 143 MB.
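For reference, the script is essentially doing something like this (a rough sketch rather than the exact code; the connection string is a placeholder):

using System;
using System.Collections.Generic;
using EventStore.Client;

var client = new EventStoreClient(
    EventStoreClientSettings.Create("esdb://localhost:2113?tls=false")); // placeholder connection string

var countsByType = new Dictionary<string, long>();
long payloadBytes = 0;
Position? lastPosition = null;

// Walk $all forwards from the start, with no filter.
await foreach (var e in client.ReadAllAsync(Direction.Forwards, Position.Start))
{
    var type = e.Event.EventType;
    countsByType[type] = countsByType.TryGetValue(type, out var n) ? n + 1 : 1;

    payloadBytes += e.Event.Data.Length + e.Event.Metadata.Length; // data + metadata bytes only
    lastPosition = e.OriginalPosition;                             // position of the last event seen
}

foreach (var (type, count) in countsByType)
    Console.WriteLine($"{type}: {count}");
Console.WriteLine($"Payload bytes: {payloadBytes}, last position: {lastPosition}");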

In terms of storage, we are at 14 GB, with 1.3 GB in indexes. I would like to know how to figure out what I’m doing to get to these numbers.

Note: This is all after doing a scavenge

Hi,

The position value represents the logical location in the global event log. For 14 GB of data, a position of 14,000,000,000 is not unreasonable. And since you run scavenging, had you not scavenged, the on-disk size would have been closer to the 28,730,889,440 you mentioned. Scavenging does not magically reset the position, so the numbers are not abnormal.
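As a rough sanity check (this treats the position as essentially a byte offset into the logical log, which is how it behaves in practice):

// The $all position behaves roughly like a byte offset into the logical transaction log.
const long lastPosition = 28_730_889_440;
Console.WriteLine(lastPosition / 1_000_000_000.0); // ≈ 28.7, i.e. ~28.7 GB of log written over time
// The ~14 GB currently on disk is what remains of that log after scavenging.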

Oh, that’s one out of the way, thanks.

What about the space? 14GB feels like a lot for less than 200MB of actual information.

A data directory listing may shed more light. Send a PM if you consider this to be sensitive information.

-rw-r--r-- 1 azureuser azureuser          8 Jan 16  2024 truncate.chk
-rw-r--r-- 1 azureuser azureuser          8 Aug 20 21:51 proposal.chk
-rw-r--r-- 1 azureuser azureuser          8 Aug 20 21:51 epoch.chk
drwxr-xr-x 5 root      root              64 Sep  5 10:57 ../
-r--r--r-- 1 azureuser azureuser  176996352 Sep 18 12:49 chunk-000000.000001
-r--r--r-- 1 azureuser azureuser  165085184 Sep 18 13:00 chunk-000030.000001
-r--r--r-- 1 azureuser azureuser  195575808 Sep 18 13:00 chunk-000031.000001
-r--r--r-- 1 azureuser azureuser  198344704 Sep 18 13:01 chunk-000032.000001
-r--r--r-- 1 azureuser azureuser  199012352 Sep 18 13:01 chunk-000033.000001
-r--r--r-- 1 azureuser azureuser  199876608 Sep 18 13:01 chunk-000034.000001
-r--r--r-- 1 azureuser azureuser  200478720 Sep 18 13:02 chunk-000035.000001
-r--r--r-- 1 azureuser azureuser  204500992 Sep 18 13:02 chunk-000036.000001
-r--r--r-- 1 azureuser azureuser  202690560 Sep 18 13:03 chunk-000037.000001
-r--r--r-- 1 azureuser azureuser  200261632 Sep 18 13:03 chunk-000038.000001
-r--r--r-- 1 azureuser azureuser  200241152 Sep 18 13:04 chunk-000039.000001
-r--r--r-- 1 azureuser azureuser  200318976 Sep 18 13:04 chunk-000040.000001
-r--r--r-- 1 azureuser azureuser  195268608 Sep 18 13:05 chunk-000041.000001
-r--r--r-- 1 azureuser azureuser  196222976 Sep 18 13:05 chunk-000042.000001
-r--r--r-- 1 azureuser azureuser  199061504 Sep 18 13:06 chunk-000043.000001
-r--r--r-- 1 azureuser azureuser  199663616 Sep 18 13:06 chunk-000044.000001
-r--r--r-- 1 azureuser azureuser  199999488 Sep 18 13:06 chunk-000045.000001
-r--r--r-- 1 azureuser azureuser  184029184 Sep 18 13:07 chunk-000046.000001
-r--r--r-- 1 azureuser azureuser  195985408 Sep 18 13:07 chunk-000047.000001
-r--r--r-- 1 azureuser azureuser  190332928 Sep 18 13:08 chunk-000048.000001
-r--r--r-- 1 azureuser azureuser  193769472 Sep 18 13:08 chunk-000049.000001
-r--r--r-- 1 azureuser azureuser  201768960 Sep 18 13:09 chunk-000050.000001
-r--r--r-- 1 azureuser azureuser  199229440 Sep 18 13:09 chunk-000051.000001
-r--r--r-- 1 azureuser azureuser  198119424 Sep 18 13:10 chunk-000052.000001
-r--r--r-- 1 azureuser azureuser  198844416 Sep 18 13:10 chunk-000053.000001
-r--r--r-- 1 azureuser azureuser  200208384 Sep 18 13:10 chunk-000054.000001
-r--r--r-- 1 azureuser azureuser  196063232 Sep 18 13:10 chunk-000055.000001
-r--r--r-- 1 azureuser azureuser  194068480 Sep 18 13:11 chunk-000056.000001
-r--r--r-- 1 azureuser azureuser  195338240 Sep 18 13:11 chunk-000057.000001
-r--r--r-- 1 azureuser azureuser  185462784 Sep 18 13:11 chunk-000058.000001
-r--r--r-- 1 azureuser azureuser  193101824 Sep 18 13:12 chunk-000059.000001
-r--r--r-- 1 azureuser azureuser  200118272 Sep 18 13:12 chunk-000060.000001
-r--r--r-- 1 azureuser azureuser  197263360 Sep 18 13:12 chunk-000061.000001
-r--r--r-- 1 azureuser azureuser  200388608 Sep 18 13:13 chunk-000062.000001
-r--r--r-- 1 azureuser azureuser  275230720 Sep 18 13:14 chunk-000001.000002
-r--r--r-- 1 azureuser azureuser  270987264 Sep 18 13:14 chunk-000004.000002
-r--r--r-- 1 azureuser azureuser  270094336 Sep 18 13:15 chunk-000011.000002
-r--r--r-- 1 azureuser azureuser  268939264 Sep 18 13:16 chunk-000018.000002
-r--r--r-- 1 azureuser azureuser  192425984 Sep 18 13:17 chunk-000025.000002
-r--r--r-- 1 azureuser azureuser  225837056 Sep 26 15:02 chunk-000029.000002
-r--r--r-- 1 azureuser azureuser  246382592 Sep 26 15:16 chunk-000068.000002
-r--r--r-- 1 azureuser azureuser  243412992 Sep 26 15:16 chunk-000073.000002
-r--r--r-- 1 azureuser azureuser  240906240 Sep 26 15:17 chunk-000078.000002
-r--r--r-- 1 azureuser azureuser  241324032 Sep 26 15:17 chunk-000083.000002
-r--r--r-- 1 azureuser azureuser  194187264 Sep 26 15:17 chunk-000088.000002
-r--r--r-- 1 azureuser azureuser  242769920 Sep 26 15:18 chunk-000095.000002
-r--r--r-- 1 azureuser azureuser  235499520 Oct  9 07:42 chunk-000065.000003
-r--r--r-- 1 azureuser azureuser  271347712 Oct  9 07:43 chunk-000092.000003
-r--r--r-- 1 azureuser azureuser  249479168 Oct  9 07:44 chunk-000100.000003
-r--r--r-- 1 azureuser azureuser  236036096 Oct 10 08:07 chunk-000063.000006
-r--r--r-- 1 azureuser azureuser  219443200 Oct 10 08:07 chunk-000106.000004
drwxr-xr-x 3 azureuser azureuser       4096 Oct 10 08:07 ./
drwxr-xr-x 4 azureuser azureuser       4096 Oct 10 08:08 index/
-rw-r--r-- 1 azureuser azureuser  268439552 Oct 10 12:21 chunk-000109.000000
-rw-r--r-- 1 azureuser azureuser          8 Oct 10 12:21 writer.chk
-rw-r--r-- 1 azureuser azureuser 1961140537 Oct 10 12:21 chaser.chk

index:
drwxr-xr-x 2 azureuser azureuser        72 Jan 16  2024 stream-existence/
-rw-r--r-- 1 azureuser azureuser    686912 Oct  9 07:44 8d574cbb-bc9d-40ff-8ef3-4cd208d11168.bloomfilter
-r--r--r-- 1 azureuser azureuser  63330320 Oct  9 07:44 8d574cbb-bc9d-40ff-8ef3-4cd208d11168
-rw-r--r-- 1 azureuser azureuser   1118336 Oct  9 07:44 0ee3f89c-935e-4085-b098-abb9e1df3661.bloomfilter
-r--r--r-- 1 azureuser azureuser 102212840 Oct  9 07:44 0ee3f89c-935e-4085-b098-abb9e1df3661
-rw-r--r-- 1 azureuser azureuser   9099328 Oct  9 07:45 a72dff91-0dd8-4518-b200-4efc4dab7bcf.bloomfilter
-r--r--r-- 1 azureuser azureuser 822078200 Oct  9 07:45 a72dff91-0dd8-4518-b200-4efc4dab7bcf
-rw-r--r-- 1 azureuser azureuser       193 Oct  9 07:45 indexmap
drwxr-xr-x 3 azureuser azureuser      4096 Oct 10 08:07 ../
drwxr-xr-x 4 azureuser azureuser      4096 Oct 10 08:08 ./
drwxr-xr-x 2 azureuser azureuser        27 Oct 10 08:08 scavenging/

index/stream-existence:
drwxr-xr-x 2 azureuser azureuser        72 Jan 16  2024 ./
drwxr-xr-x 4 azureuser azureuser      4096 Oct 10 08:08 ../
-rw-r--r-- 1 azureuser azureuser 273066752 Oct 10 12:21 streamExistenceFilter.dat
-rw-r--r-- 1 azureuser azureuser         8 Oct 10 12:21 streamExistenceFilter.chk

index/scavenging:
drwxr-xr-x 4 azureuser azureuser     4096 Oct 10 08:08 ../
-rw-r--r-- 1 azureuser azureuser 23478272 Oct 10 08:08 scavenging.db
drwxr-xr-x 2 azureuser azureuser       27 Oct 10 08:08 ./

Thanks a lot

Quick note:
the size of chaser.chk is strange; it should be 8 bytes.
It essentially contains a single long (8 bytes) at the beginning.

(Don’t delete or change it!)
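If you want to see what value it currently holds, you can read just the first 8 bytes without touching the file (a read-only sketch; the path is a placeholder):

using System;
using System.IO;

// Read-only peek at a checkpoint file: the value is a long stored in the first 8 bytes.
// Do NOT write to or truncate the file.
using var fs = File.OpenRead("/data/chaser.chk"); // placeholder path
var buffer = new byte[8];
fs.Read(buffer, 0, 8);
Console.WriteLine(BitConverter.ToInt64(buffer, 0)); // assumes little-endian, the usual case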

28,730,889,440 does correspond to an expected number given you’re up to chunk-000109.000000, and it is a logical number.

  • It is kind of complicated to explain exactly how this is calculated (there is a rough sketch after this list).

  • That number:

    • is not the total size of the data you appended
    • relates to the total size of the data you appended plus other things the database writes
    • is not reset to a lower number when scavenging
  • Using $all to calculate the amount of user data in the database can give an estimate of the total data size, but:

    • you need to take into account system events and linkTo events ($>) when system projections are on
    • it does not reflect the on-disk chunk structure that encapsulates each event (that’s about 100 bytes per event)
    • it does not reflect the fact that, after scavenging, there are additional structures in the chunk file needed to map between the physical location in the file and the logical position in the database log
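As a rough sketch of that relationship (this assumes the default chunk data size of 256 MiB; real chunks also carry headers and, after scavenging, position maps that this ignores):

// Rough mapping from a logical $all position to a chunk number,
// assuming the default chunk data size of 256 MiB (268,435,456 bytes).
const long chunkDataSize = 256L * 1024 * 1024;
const long lastPosition = 28_730_889_440;
Console.WriteLine(lastPosition / chunkDataSize); // ≈ 107
// That lines up with the directory listing above: the writer is currently on
// chunk-000109, so a last-event position landing around chunk 107 is expected.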

The size of your indexes suggests you have about 38M events in your database.
Is that correct?
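One way to get a ballpark figure like that is from the PTable file sizes in your index directory (a rough sketch, assuming roughly 24 bytes per index entry for the 64-bit index format, and ignoring file headers and cached midpoints):

using System;
using System.Linq;

// Rough index-entry estimate from the PTable file sizes in the listing above,
// assuming ~24 bytes per entry (64-bit index format); headers and midpoints ignored.
long[] ptableSizes = { 63_330_320, 102_212_840, 822_078_200 };
Console.WriteLine(ptableSizes.Sum() / 24); // ≈ 41 million, the same order as the ~38M above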

Not that I know of. I did a read of $all without filters and got 321,340 events.

I did the read again and inserted the events into a clean store running in a container, and it ended up at 708 MB.

I have a few persistent subscriptions, too.

I need to be more precise about the index size:

  • it suggests 38M entries, and that should roughly equal the number of events in the database

  • my first concern is the size of the chaser.chk file

    • is it the same on the other nodes in the same cluster?
    • if it’s not a cluster: I would check in the backups
  • is this a cluster?

    • if yes, what are the sizes of the files on the other nodes?
  • how did you do your $all read?

  • how many persistent subscriptions?

    • no parked messages?

It is not a cluster, it is the dev environment, so no backups.

I did the read using the ReadAllAsync method from the .NET client with no filters, from start, forwards.

Around 40 persistent subscriptions. They had a few parked messages; replaying them had no impact on the space used.

Probably the simplest thing to do, if you need to reclaim the space, is to use that copy you made as the new dev environment.

Without a detailed analysis of those files and the full log, it’s hard to know what’s happening (plus there is the size of the chaser.chk file).

Do you scavenge on the production cluster as well?

It only went into production recently, so there’s not much to scavenge there yet.

How can I read ALL the events? I mean, the persistent subscriptions use events, and I don’t see them in my ReadAll iteration.

Are you excluding system events while reading, perhaps?

Side Q: What version of the server and client SDK are you on?

If you have persistent subscriptions, you at least have checkpoints to scavenge :)

SDK Version: 23.2.1
Server version: 22.10.2.0

Reading like this:

var all = client.ReadAllAsync(Direction.Forwards, position);
await foreach (var @event in all)
{
    ...
}

I made a backup and loaded it on my machine:

It rebuilt the chaser to be the size we expected.

Then I just deleted the persistent subscriptions, ran a scavenge, and it freed 11.6 GB

I think I’m going to consider an alternative to persistent subscriptions.

Why would deleting the persistent subscriptions free up so much space?
My understanding was that a checkpoint would be written every so often, and that as time went on these would drop off the system (and so be picked up by a scavenge).
Did you delete any Projections feeding your Persistent Subscriptions as well?

This is by design. Even if you do not specify a client-side filter, the server will filter out (i.e. disallow you from reading) the checkpoints and parked messages of persistent subscriptions (to $all).

Steven, I really don’t know; I’ll rebuild them and see what happens.
