I feel I’m going over already-discussed ground (but can’t find any specific threads). I’m performing a scavenge on a cluster where a LOT of streams (containing plenty of data) have been deleted. I run the scavenge on one node at a time, and although it completes successfully (over 24 hours later)… the amount of drive space freed up is minuscule.
Back history: we imported 69M events from another source into ES. This caused the ES disk usage to jump from about 12G to well over 100G. It has since been decided that we don’t need those new events in ES (and they were causing issues for us), so each of the new streams was soft deleted (all of the imported data went into new streams). I had 2 expectations… 1) on the next scavenge, ES would free up the space used by those streams BEFORE the soft-delete mark… 2) we’d get back down to about the 12G level.
The scavenge has reduced the db directory to about 90G… so technically lower, but not by much, and certainly nowhere near the 12G from before the import.
One thing I should note: although the streams with the mass-migrated data were deleted (soft delete), new/fresh data has since been appended to those same streams. So an individual stream might have tens or hundreds of thousands of events BEFORE the soft-delete mark… but only a few thousand after. I’m not sure whether that makes a difference, but wanted to note it here.
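For what it’s worth, here’s my mental model of how this should work, as a toy sketch (this is just an illustration of the truncate-before idea, not EventStore’s actual implementation — the class and names are made up):

```python
# Toy model of the soft-delete / scavenge interaction as I understand it:
# a soft delete records a "truncate before" point in the stream's metadata,
# and a later scavenge is then free to physically drop events numbered
# below that point, while events appended after the delete survive.
# Hypothetical illustration only -- not EventStore internals.

class Stream:
    def __init__(self):
        self.events = []          # list of (event_number, payload)
        self.truncate_before = 0  # the soft-delete mark ($tb-style metadata)

    def append(self, payload):
        self.events.append((len(self.events), payload))

    def soft_delete(self):
        # Mark every event written so far as eligible for scavenging.
        self.truncate_before = len(self.events)

    def scavenge(self):
        # Physically drop events below the truncate-before point.
        self.events = [(n, p) for n, p in self.events
                       if n >= self.truncate_before]

s = Stream()
for i in range(5):
    s.append(f"imported-{i}")   # the bulk-imported events
s.soft_delete()                 # the stream is then soft deleted
s.append("fresh-0")             # fresh data written afterwards
s.scavenge()
print(s.events)                 # only the post-delete event remains
```

If that model is right, the tens/hundreds of thousands of pre-delete events should be reclaimable even though the streams have live events after the mark — which is why the small space saving surprises me.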
Is there any obvious step that I’ve missed? Scavenging freed up disk space fine prior to this large (for us) migration of data.
Any help highly appreciated.