Thank you. I am in the lucky case where all of my streams contain just user–product associations, so I can segment this data in any way without losing metadata; user characteristics and product characteristics are stored elsewhere.
May I just reiterate to confirm I understood what you explained:
- set up a different instance of ES, read from my streams up until 3 months ago (minus, say, a 1-day buffer; see the last step) and publish everything to this new instance; this will be the backup. I will have to live with the events’ timestamps being different in this new instance (a sketch of what I have in mind follows the list);
- set $maxAge on my current live instance to 3 months and wait for the db folder to shrink down (which, if I’ve understood correctly, means running a scavenge, since $maxAge by itself only makes the old events unreadable);
- set $maxAge back to ‘forever’, otherwise I will keep losing the older-than-max-age data each day? (Here I am assuming $maxAge works as a sliding window; this is why I was thinking about the 1-day buffer in the first step. The second sketch below covers these last two steps.)
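For the first step, here is a minimal sketch of the copy loop I have in mind, assuming an ES version that exposes the HTTP (Atom) API on port 2113; the host names, the stream name `user-product-associations`, and the page size are all placeholders of mine, and error handling is mostly elided:

```python
import uuid
from datetime import datetime, timedelta, timezone

import requests

# Hypothetical hosts and stream name -- adjust to the real setup.
SOURCE = "http://source-es:2113"
BACKUP = "http://backup-es:2113"
STREAM = "user-product-associations"
PAGE = 20

# 3 months back, minus the 1-day buffer from the last step.
cutoff = datetime.now(timezone.utc) - timedelta(days=90 + 1)

done = False
start = 0
while not done:
    # Read a page of the stream, oldest first, with event bodies embedded.
    resp = requests.get(
        f"{SOURCE}/streams/{STREAM}/{start}/forward/{PAGE}?embed=body",
        headers={"Accept": "application/vnd.eventstore.atom+json"},
    )
    resp.raise_for_status()
    entries = sorted(resp.json()["entries"], key=lambda e: e["eventNumber"])
    if not entries:
        break  # stream exhausted
    for entry in entries:
        # The Atom 'updated' field carries the original write time;
        # stop once we reach the buffer window.
        written = datetime.strptime(
            entry["updated"][:19], "%Y-%m-%dT%H:%M:%S"
        ).replace(tzinfo=timezone.utc)
        if written >= cutoff:
            done = True
            break
        # Re-publish to the backup; it gets a fresh timestamp (and id) there.
        requests.post(
            f"{BACKUP}/streams/{STREAM}",
            headers={
                "Content-Type": "application/json",
                "ES-EventType": entry["eventType"],
                "ES-EventId": str(uuid.uuid4()),
            },
            data=entry["data"],
        ).raise_for_status()
    start += len(entries)
```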
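And for steps two and three, the $maxAge toggle as I picture it: $maxAge lives in the stream metadata and is expressed in seconds, and ‘forever’ is just metadata without that key. I’m assuming the metadata resource takes the same ES-EventId/ES-EventType headers as a normal event POST, and that the admin credentials are still the defaults; treat both as guesses on my part:

```python
import json
import uuid

import requests

LIVE = "http://live-es:2113"          # hypothetical live instance
STREAM = "user-product-associations"  # hypothetical stream name
THREE_MONTHS = 90 * 24 * 60 * 60      # $maxAge is expressed in seconds


def set_stream_metadata(metadata: dict) -> None:
    # Stream metadata is written by POSTing to the stream's metadata resource.
    requests.post(
        f"{LIVE}/streams/{STREAM}/metadata",
        headers={
            "Content-Type": "application/json",
            "ES-EventType": "$metadata",
            "ES-EventId": str(uuid.uuid4()),
        },
        data=json.dumps(metadata),
    ).raise_for_status()


# Step 2: cap the stream at 3 months, then kick off a scavenge so the
# chunk files on disk actually get rewritten and shrink.
set_stream_metadata({"$maxAge": THREE_MONTHS})
requests.post(f"{LIVE}/admin/scavenge", auth=("admin", "changeit")).raise_for_status()

# Step 3: back to 'forever' -- metadata without a $maxAge key -- so the
# sliding window stops dropping events older than 3 months each day.
set_stream_metadata({})
```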
About just moving the files: it seems I am also lucky in that regard, as they appear to be split up roughly by day. I’m wishfully thinking I could just move them and everything would still work. Sure, this could work fine provided ES does not crash on the next reboot. I presume it checks for some consistency between the data in the index, the .chk files, and the chunk-* files, but I don’t know what the internals are. Or maybe it dies catastrophically if I try to retrieve an event whose chunk is gone, etc.