I have an event store holding 321,340 events. I ran a script to go through the $all stream and simply count them all, grouping them by event type.
The results are not surprising; the distribution of events is what I expect. Except:
The last position is 28,730,889,440. I don’t fully understand how the position works, but I don’t get how it reaches 28 billion.
I also added up the byte counts for both data and metadata and got to 143 MB.
In terms of storage, we are at 14 GB, with 1.3 GB in indexes. I would like to know how to figure out what I’m doing to get to these numbers.
Note: this is all after doing a scavenge.
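For reference, the counting was done with something roughly like this (a simplified sketch, not the exact script; it assumes the usual EventStore.Client usings and an EventStoreClient named client):

// Count events per type and sum data + metadata bytes over $all.
var counts = new Dictionary<string, long>();
long totalBytes = 0;
await foreach (var e in client.ReadAllAsync(Direction.Forwards, Position.Start))
{
    counts[e.Event.EventType] = counts.GetValueOrDefault(e.Event.EventType) + 1;
    totalBytes += e.Event.Data.Length + e.Event.Metadata.Length;
}
Console.WriteLine($"{counts.Values.Sum()} events, {totalBytes / 1_048_576.0:F1} MB");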
Hi,
The position value represents the logical location in the global event log, so it grows with the total amount of data ever written. For a 14 GB database, a position of 14,000,000,000 would not be unreasonable; and since you have scavenged, had you not scavenged the data on disk would have been closer to the 28,730,889,440 you mentioned. Scavenging does not magically reset the position, so the numbers are not abnormal.
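To see that number for yourself, reading the last event of $all shows the current position (a minimal sketch, assuming the usual .NET client setup with an EventStoreClient named client):

// The printed pair is the logical log position discussed above.
var last = client.ReadAllAsync(Direction.Backwards, Position.End, maxCount: 1);
await foreach (var e in last)
    Console.WriteLine(e.OriginalPosition); // e.g. C:28730889440/P:28730889440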
Oh, that’s one out of the door, thanks.
What about the space? 14GB feels like a lot for less than 200MB of actual information.
A data directory listing may shed more light. Send a PM if you consider this to be sensitive information.
-rw-r--r-- 1 azureuser azureuser 8 Jan 16 2024 truncate.chk
-rw-r--r-- 1 azureuser azureuser 8 Aug 20 21:51 proposal.chk
-rw-r--r-- 1 azureuser azureuser 8 Aug 20 21:51 epoch.chk
drwxr-xr-x 5 root root 64 Sep 5 10:57 ../
-r--r--r-- 1 azureuser azureuser 176996352 Sep 18 12:49 chunk-000000.000001
-r--r--r-- 1 azureuser azureuser 165085184 Sep 18 13:00 chunk-000030.000001
-r--r--r-- 1 azureuser azureuser 195575808 Sep 18 13:00 chunk-000031.000001
-r--r--r-- 1 azureuser azureuser 198344704 Sep 18 13:01 chunk-000032.000001
-r--r--r-- 1 azureuser azureuser 199012352 Sep 18 13:01 chunk-000033.000001
-r--r--r-- 1 azureuser azureuser 199876608 Sep 18 13:01 chunk-000034.000001
-r--r--r-- 1 azureuser azureuser 200478720 Sep 18 13:02 chunk-000035.000001
-r--r--r-- 1 azureuser azureuser 204500992 Sep 18 13:02 chunk-000036.000001
-r--r--r-- 1 azureuser azureuser 202690560 Sep 18 13:03 chunk-000037.000001
-r--r--r-- 1 azureuser azureuser 200261632 Sep 18 13:03 chunk-000038.000001
-r--r--r-- 1 azureuser azureuser 200241152 Sep 18 13:04 chunk-000039.000001
-r--r--r-- 1 azureuser azureuser 200318976 Sep 18 13:04 chunk-000040.000001
-r--r--r-- 1 azureuser azureuser 195268608 Sep 18 13:05 chunk-000041.000001
-r--r--r-- 1 azureuser azureuser 196222976 Sep 18 13:05 chunk-000042.000001
-r--r--r-- 1 azureuser azureuser 199061504 Sep 18 13:06 chunk-000043.000001
-r--r--r-- 1 azureuser azureuser 199663616 Sep 18 13:06 chunk-000044.000001
-r--r--r-- 1 azureuser azureuser 199999488 Sep 18 13:06 chunk-000045.000001
-r--r--r-- 1 azureuser azureuser 184029184 Sep 18 13:07 chunk-000046.000001
-r--r--r-- 1 azureuser azureuser 195985408 Sep 18 13:07 chunk-000047.000001
-r--r--r-- 1 azureuser azureuser 190332928 Sep 18 13:08 chunk-000048.000001
-r--r--r-- 1 azureuser azureuser 193769472 Sep 18 13:08 chunk-000049.000001
-r--r--r-- 1 azureuser azureuser 201768960 Sep 18 13:09 chunk-000050.000001
-r--r--r-- 1 azureuser azureuser 199229440 Sep 18 13:09 chunk-000051.000001
-r--r--r-- 1 azureuser azureuser 198119424 Sep 18 13:10 chunk-000052.000001
-r--r--r-- 1 azureuser azureuser 198844416 Sep 18 13:10 chunk-000053.000001
-r--r--r-- 1 azureuser azureuser 200208384 Sep 18 13:10 chunk-000054.000001
-r--r--r-- 1 azureuser azureuser 196063232 Sep 18 13:10 chunk-000055.000001
-r--r--r-- 1 azureuser azureuser 194068480 Sep 18 13:11 chunk-000056.000001
-r--r--r-- 1 azureuser azureuser 195338240 Sep 18 13:11 chunk-000057.000001
-r--r--r-- 1 azureuser azureuser 185462784 Sep 18 13:11 chunk-000058.000001
-r--r--r-- 1 azureuser azureuser 193101824 Sep 18 13:12 chunk-000059.000001
-r--r--r-- 1 azureuser azureuser 200118272 Sep 18 13:12 chunk-000060.000001
-r--r--r-- 1 azureuser azureuser 197263360 Sep 18 13:12 chunk-000061.000001
-r--r--r-- 1 azureuser azureuser 200388608 Sep 18 13:13 chunk-000062.000001
-r--r--r-- 1 azureuser azureuser 275230720 Sep 18 13:14 chunk-000001.000002
-r--r--r-- 1 azureuser azureuser 270987264 Sep 18 13:14 chunk-000004.000002
-r--r--r-- 1 azureuser azureuser 270094336 Sep 18 13:15 chunk-000011.000002
-r--r--r-- 1 azureuser azureuser 268939264 Sep 18 13:16 chunk-000018.000002
-r--r--r-- 1 azureuser azureuser 192425984 Sep 18 13:17 chunk-000025.000002
-r--r--r-- 1 azureuser azureuser 225837056 Sep 26 15:02 chunk-000029.000002
-r--r--r-- 1 azureuser azureuser 246382592 Sep 26 15:16 chunk-000068.000002
-r--r--r-- 1 azureuser azureuser 243412992 Sep 26 15:16 chunk-000073.000002
-r--r--r-- 1 azureuser azureuser 240906240 Sep 26 15:17 chunk-000078.000002
-r--r--r-- 1 azureuser azureuser 241324032 Sep 26 15:17 chunk-000083.000002
-r--r--r-- 1 azureuser azureuser 194187264 Sep 26 15:17 chunk-000088.000002
-r--r--r-- 1 azureuser azureuser 242769920 Sep 26 15:18 chunk-000095.000002
-r--r--r-- 1 azureuser azureuser 235499520 Oct 9 07:42 chunk-000065.000003
-r--r--r-- 1 azureuser azureuser 271347712 Oct 9 07:43 chunk-000092.000003
-r--r--r-- 1 azureuser azureuser 249479168 Oct 9 07:44 chunk-000100.000003
-r--r--r-- 1 azureuser azureuser 236036096 Oct 10 08:07 chunk-000063.000006
-r--r--r-- 1 azureuser azureuser 219443200 Oct 10 08:07 chunk-000106.000004
drwxr-xr-x 3 azureuser azureuser 4096 Oct 10 08:07 ./
drwxr-xr-x 4 azureuser azureuser 4096 Oct 10 08:08 index/
-rw-r--r-- 1 azureuser azureuser 268439552 Oct 10 12:21 chunk-000109.000000
-rw-r--r-- 1 azureuser azureuser 8 Oct 10 12:21 writer.chk
-rw-r--r-- 1 azureuser azureuser 1961140537 Oct 10 12:21 chaser.chk
index:
drwxr-xr-x 2 azureuser azureuser 72 Jan 16 2024 stream-existence/
-rw-r--r-- 1 azureuser azureuser 686912 Oct 9 07:44 8d574cbb-bc9d-40ff-8ef3-4cd208d11168.bloomfilter
-r--r--r-- 1 azureuser azureuser 63330320 Oct 9 07:44 8d574cbb-bc9d-40ff-8ef3-4cd208d11168
-rw-r--r-- 1 azureuser azureuser 1118336 Oct 9 07:44 0ee3f89c-935e-4085-b098-abb9e1df3661.bloomfilter
-r--r--r-- 1 azureuser azureuser 102212840 Oct 9 07:44 0ee3f89c-935e-4085-b098-abb9e1df3661
-rw-r--r-- 1 azureuser azureuser 9099328 Oct 9 07:45 a72dff91-0dd8-4518-b200-4efc4dab7bcf.bloomfilter
-r--r--r-- 1 azureuser azureuser 822078200 Oct 9 07:45 a72dff91-0dd8-4518-b200-4efc4dab7bcf
-rw-r--r-- 1 azureuser azureuser 193 Oct 9 07:45 indexmap
drwxr-xr-x 3 azureuser azureuser 4096 Oct 10 08:07 ../
drwxr-xr-x 4 azureuser azureuser 4096 Oct 10 08:08 ./
drwxr-xr-x 2 azureuser azureuser 27 Oct 10 08:08 scavenging/
index/stream-existence:
drwxr-xr-x 2 azureuser azureuser 72 Jan 16 2024 ./
drwxr-xr-x 4 azureuser azureuser 4096 Oct 10 08:08 ../
-rw-r--r-- 1 azureuser azureuser 273066752 Oct 10 12:21 streamExistenceFilter.dat
-rw-r--r-- 1 azureuser azureuser 8 Oct 10 12:21 streamExistenceFilter.chk
index/scavenging:
drwxr-xr-x 4 azureuser azureuser 4096 Oct 10 08:08 ../
-rw-r--r-- 1 azureuser azureuser 23478272 Oct 10 08:08 scavenging.db
drwxr-xr-x 2 azureuser azureuser 27 Oct 10 08:08 ./
Thanks a lot
Quick note:
The size of chaser.chk is strange; it should be 8 bytes. It essentially contains a long (8 bytes) at the beginning (do not delete or change it!).
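If you want to peek at what it contains without touching it, something like this works (read-only sketch; it assumes the checkpoint is the usual little-endian 8-byte long, and the path is just a placeholder):

// Reads only the first 8 bytes; never write to *.chk files.
var buf = new byte[8];
using (var fs = File.OpenRead("/path/to/db/chaser.chk"))
    fs.ReadExactly(buf);   // .NET 7+; use fs.Read(buf, 0, 8) on older runtimes
Console.WriteLine(BitConverter.ToInt64(buf, 0));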
28,730,889,440: that does correspond to an expected number given you’re up to chunk-000109.000000, and it is a logical number.
It’s kind of complicated to explain exactly how it is calculated, but that number is not the total size of the data you appended; it relates to the total size of the data you appended plus other things stored in the database, and it is not reset to a lower number when scavenging.
Using $all to calculate the amount of user data in the database can give an estimation of the total data size, but:
you need to take into account system events and linkTos ($>) when the system projections are on
it does not reflect the on-disk chunk structure that encapsulates events (that’s about 100 bytes per event)
it does not reflect the fact that after scavenging there are additional structures in the chunk file needed to map between the physical location in the file and the logical position in the database log
(see the rough estimate sketched below)
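As a very rough illustration with the numbers from this thread (my arithmetic; the ~100 bytes of framing per event is the approximation above, not an exact figure):

// Back-of-the-envelope only: user payload + approximate per-event chunk framing.
const long eventCount   = 321_340;            // events counted via $all
const long payloadBytes = 143L * 1024 * 1024; // ~143 MB of data + metadata
const long framingBytes = 100;                // approx. chunk structure per event
var estimate = payloadBytes + eventCount * framingBytes;
Console.WriteLine($"~{estimate / 1_048_576.0:F0} MB"); // ≈ 174 MB

That is nowhere near 14 GB, so the chunks must be dominated by records other than those 321,340 events.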
The size of your indexes suggests you have about 38M events in your database.
Is that correct?
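(For reference, my arithmetic behind that figure, assuming the 64-bit PTable format at roughly 24 bytes per index entry and ignoring midpoints/footers: the three PTable files in the listing above add up to 63,330,320 + 102,212,840 + 822,078,200 ≈ 988 MB, and 988 MB / 24 bytes ≈ 41M entries, i.e. on the order of 38M.)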
Not that I know of; I did a read of $all without filters and got 321,340 events.
I did the read again and inserted the events in a clean store in a container, and it ended up with 708 MB
I have a few persistent subscriptions, too.
I need to be more precise about the index size: it suggests about 38M entries, and that should be roughly equal to the number of events in the database.
My first concern is the size of the chaser.chk file:
is it the same on other nodes in the same cluster?
if it’s not a cluster: I would check the backups
is this a cluster? If yes, what are the sizes of the files on the other nodes?
How did you do your $all read?
How many persistent subscriptions?
It is not a cluster; it is the dev environment, so no backups.
I did the read using the ReadAllAsync method from the .NET client with no filters, from start, forwards.
Around 40 persistent subscriptions; they had a few parked messages, and replaying them had no impact on the space used.
Probably the simplest thing to do is to use that copy you made as the new dev environment if you need to reclaim the space.
Without a detailed analysis of those files and the full log it’s hard to know what’s happening (plus the size of the chaser.chk file).
Do you scavenge on the production cluster as well?
It got into production only recently, so there is not much to scavenge there yet.
How can I read ALL the events? I mean, the persistent subscriptions use events, and I don’t see them in my ReadAll iteration.
Are you excluding system events while reading, perhaps?
Side Q: What version of the server and client SDK are you on?
alexey
October 16, 2024, 8:00am
If you have persistent subscriptions, you at least have checkpoints to scavenge
SDK Version: 23.2.1
Server version: 22.10.2.0
Reading like this:
var all = client.ReadAllAsync(Direction.Forwards, position);
await foreach (var @event in all)
{
    // ...
}
I made a backup and loaded it on my machine:
It rebuilt chaser.chk to the size we expected.
Then I just deleted the persistent subscriptions, ran a scavenge, and it freed 11.6 GB.
I think I’m going to consider an alternative to persistent subscriptions.
Why would deleting the persistent subscriptions free up so much space?
My understanding was that a checkpoint would be written every so often, but that as time went on these would drop off the system (and so be picked up by a scavenge).
Did you delete any Projections feeding your Persistent Subscriptions as well?
This is by design. Even if you do not specify a client-side filter, the server will filter out (aka disallow you from reading) the checkpoints and parked messages of persistent subscriptions (to $all).
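If you do want to see them, you can read those streams directly by name. A minimal sketch (the names below are my recollection of the $persistentsubscription-{stream}::{group}-parked / -checkpoint convention and are illustrative only; double-check the exact stream names on your server):

// Hypothetical stream and group names, for illustration only.
var parked = client.ReadStreamAsync(
    Direction.Forwards,
    "$persistentsubscription-my-stream::my-group-parked",
    StreamPosition.Start);
await foreach (var e in parked)
    Console.WriteLine($"{e.Event.EventNumber} {e.Event.EventType}");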
Steven, I really don’t know; I’ll rebuild them and see what happens.