Recommended configuration for Azure nodes with attached storage

Hello. We’re in the process of migrating from a custom event store (stored directly in Azure blob files) to EventStore hosted on Azure VMs. We have about 3 TB of data, the indexes are around 100 GB, and there are several billion events. The typical load isn’t very high: 500-1000 msg/sec, with spikes of up to 2-3 times that.

Right now we’re looking into using L4 VMs: the large local storage would hold the index, and attached Premium SSD storage would hold the data.

Here are some things we’re considering. We’d like your opinion on them based on your experience, and any recommendations you can provide:

  1. Use more than 4 reader threads because of the high latency of attached storage (around 2-4 ms). 12 seems to work better, but have you found a more optimal number?
  2. Right now we’re testing with a single 4 TB Premium SSD drive. We’re going to test striping 8x 512 GB drives with a 64 KB stripe size. Have you tried striping? With what stripe size? Did it help?
  3. Use file system compression for chunks (not the index). It seems like a good fit, since there is less data to move over the network and the files are immutable. CPU overhead is negligible. Have you tried this long term?
  4. Does a 3-node cluster seem adequate for HA?
  5. Change the default chunk file size from 256 MB to 1 GB.
  6. Since we use an L-series machine, there’s no read-only cache available for Premium SSD. How big of a difference would it really make with ES?

Does this look good? Bad? Do you have any other suggestions on how to configure a VM correctly for working in Azure with attached storage?
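
For reference, here is a rough back-of-the-envelope sketch of the numbers behind items 1, 2 and 5. It is only an estimate under stated assumptions, not a measurement: it assumes each reader thread issues one blocking read at a time, that every read goes to the attached disk (no cache hits), and that sizes are in binary units. The function names are made up for illustration.

```python
# Back-of-the-envelope numbers for items 1, 2 and 5 above.
# Assumptions (not measurements): each reader thread issues one blocking read
# at a time, every read goes to the attached disk, and sizes use binary units.

def max_reads_per_sec(threads: int, latency_ms: float) -> float:
    """Upper bound on disk-bound reads/sec when each thread waits latency_ms per read."""
    return threads * 1000.0 / latency_ms

def chunk_count(data_gib: int, chunk_mib: int) -> int:
    """Number of chunk files needed to hold data_gib of data (rounded up)."""
    return -(-data_gib * 1024 // chunk_mib)

for threads in (4, 12):
    low = max_reads_per_sec(threads, 4.0)   # at the 4 ms end of the latency range
    high = max_reads_per_sec(threads, 2.0)  # at the 2 ms end of the latency range
    print(f"{threads} threads: {low:.0f}-{high:.0f} reads/sec")

data_gib = 3 * 1024  # about 3 TB of data
print("chunks at 256 MB:", chunk_count(data_gib, 256))
print("chunks at 1 GB:", chunk_count(data_gib, 1024))
print("striped capacity in GB:", 8 * 512)
```

If these assumptions roughly hold, the read ceiling scales linearly with the number of reader threads, which would be consistent with 12 threads working better than 4 for us. Cache hits would raise the real figure, so treat this as a lower-end estimate of what the disk path allows.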

To clarify: we’re also trying to optimize for high throughput, for migration and testing purposes.
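
To show why throughput matters for the migration, here is a minimal sketch of how long a full replay might take at different sustained write rates. The event count is a hypothetical placeholder for "several billion", and the 10,000 events/sec rate is an illustrative target, not something we have measured.

```python
# Rough migration-time estimate.
# EVENT_COUNT is a hypothetical placeholder for "several billion" events.
EVENT_COUNT = 3_000_000_000

SECONDS_PER_DAY = 86_400

def migration_days(events: int, events_per_sec: float) -> float:
    """Days needed to replay `events` at a sustained write rate."""
    return events / events_per_sec / SECONDS_PER_DAY

# 1,000 and 3,000 are the typical and peak production rates mentioned above;
# 10,000 is an assumed migration target used only for comparison.
for rate in (1_000, 3_000, 10_000):
    print(f"{rate} events/sec: {migration_days(EVENT_COUNT, rate):.1f} days")
```

Under these assumptions, replaying at production-level rates would take weeks, which is one reason sustained write throughput matters for the migration.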

Thanks, Slav