Just FYI, we recently had a PR merged that significantly changes this area of ES. I believe it is earmarked for 4.2, but we have been running a custom build in production under high load for almost 6 months now.
In summary, when an index merge occurs ES does two things: 1) it merges the indexes into a single file using a merge sort algorithm, and 2) whilst it does this, it skips any records whose data is no longer available. 1) is fine, but 2) is reasonably heavyweight (it has to check that the relevant chunk of the data is still there). The optimise index merge flag you mention is an optimisation for that part only (using a bloom filter).
With the aforementioned merged PR, 2) will be completely removed from the index merge process and instead be performed during a node scavenge operation (which is when data is already cleaned up, so it makes sense to pair them).
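To make the difference concrete, here is a rough Python sketch of the two approaches. It is not ES's actual code: the entry layout, the function names and the `data_exists` callback are all made up for illustration, and I've assumed ascending sort order for simplicity.

```python
import heapq
from typing import Callable, Iterable, Iterator, Tuple

# Hypothetical index entry: (stream_hash, version, log_position).
Entry = Tuple[int, int, int]


def merge_with_check(
    files: Iterable[Iterable[Entry]],
    data_exists: Callable[[int], bool],
) -> Iterator[Entry]:
    """Old behaviour: merge sorted index files and, during the merge,
    drop entries whose data is gone (one existence check per entry)."""
    for entry in heapq.merge(*files):
        if data_exists(entry[2]):
            yield entry


def merge_only(files: Iterable[Iterable[Entry]]) -> Iterator[Entry]:
    """New behaviour at merge time: a plain merge of the sorted files."""
    return heapq.merge(*files)


def scavenge_index(
    entries: Iterable[Entry],
    data_exists: Callable[[int], bool],
) -> Iterator[Entry]:
    """New behaviour at scavenge time: drop stale entries in a separate
    pass that the operator starts (and can stop)."""
    return (e for e in entries if data_exists(e[2]))


# Two small sorted "index files"; the data at log position 200 has been scavenged.
file_a = [(1, 0, 100), (2, 0, 300)]
file_b = [(1, 1, 200), (3, 0, 400)]
scavenged_positions = {200}


def exists(pos: int) -> bool:
    return pos not in scavenged_positions


old_result = list(merge_with_check([file_a, file_b], exists))
merged = list(merge_only([file_a, file_b]))
new_result = list(scavenge_index(merged, exists))
```

Both routes end up with the same index contents. The only change is when the per-entry existence checks are paid for: during the merge, or later during the scavenge.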
This has numerous benefits (you’ll see them more in large clusters):
1) merges are faster (as it now only performs a merge sort between the files)
2) merges have less impact on the cluster’s performance (as they no longer have to check whether the data exists)
Both of which are what you want, because merges occur at a time of ES’s choosing, not yours.
Finally, 3) the index entries are scavenged immediately after the data is scavenged, and this happens at a time of your choice (and can be stopped at will, etc.).
I can’t remember what the config option for it is off the top of my head, but it should be simple to find.