Data integrity on Azure

Hi,

We plan to deploy to Azure (yes, yes, I know, there be demons - nobody ever said corporations make sane decisions). We’ll run as an HA cluster within a three-zone fault domain. The data disks will be geo-replicated so we can at least save the data in the event of a data-center-wide failure, and we’ll stripe the disks if we have to in order to meet performance goals.

I am wondering two things:

  1. Is there any way to detect corruption early/proactively? I think I am right in saying that the chaser would detect a corruption on write, but I am wondering what happens if a sector of the disk goes bad after that. I don’t want to be in the position of only discovering errors as we scan over that part of the log as part of day-to-day activity. Is this a valid concern? Would regularly reintroducing a clean fourth EventStore node to the cluster be a way to achieve this, or perhaps a periodic scrub job (sketched below, after question 2)? Or does the fact that writes happen across 3 Azure storage nodes (both locally and remotely) represent a strong enough model?

  2. Have striped, geo-replicated disks themselves ever proven to be the source of any issues? It seems to me that the ordering of replicated writes must be kept fairly strong across the constituent disks in the array, so I am worried about consistency after a fail-over (the second sketch below illustrates the failure mode I have in mind). Since the disks are knitted together locally, from the point of view of the server, the replication service can’t know anything about their relationship, which seems kind of flaky. Hopefully I am wrong, but I would appreciate any relevant knowledge/perspectives on this.
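To make question 1 concrete, here is the kind of proactive scrub job I have in mind. This is purely a sketch: the data directory, the chunk naming, and the sidecar checksum files are all invented for illustration and are not EventStore’s actual on-disk format.

```python
import hashlib
import json
from pathlib import Path

# Hypothetical layout: each data chunk "chunk-NNNNNN.db" has a sidecar
# "chunk-NNNNNN.sha256" written when the chunk is sealed. None of this
# is EventStore's real format -- it just illustrates the idea.
DATA_DIR = Path("/var/lib/eventstore/data")

def scrub(data_dir: Path) -> list[Path]:
    """Re-read every chunk and compare against its recorded checksum.

    Returns the chunks whose current contents no longer match, i.e.
    sectors that went bad *after* the original write was verified.
    """
    corrupt = []
    for chunk in sorted(data_dir.glob("chunk-*.db")):
        sidecar = chunk.with_suffix(".sha256")
        if not sidecar.exists():
            continue  # chunk not yet sealed/checksummed
        expected = sidecar.read_text().strip()
        h = hashlib.sha256()
        with chunk.open("rb") as f:
            for block in iter(lambda: f.read(1 << 20), b""):
                h.update(block)
        if h.hexdigest() != expected:
            corrupt.append(chunk)
    return corrupt

if __name__ == "__main__":
    bad = scrub(DATA_DIR)
    if bad:
        # Surface the damage before routine reads ever touch these chunks.
        print(json.dumps([str(p) for p in bad], indent=2))
        raise SystemExit(1)
    print("scrub clean")
```

Run from cron or similar, something like this would surface bad sectors before day-to-day activity ever scans over them.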
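And to illustrate the worry in question 2: a toy model of a record striped across two disks whose replicas fail over at different points in time. Everything here (record format, stripe size) is made up; it just shows why I think the ordering relationship across the array matters.

```python
import struct

# Toy model: one record striped across two disks, each disk geo-replicated
# independently. At fail-over the two replicas may be at different points
# in time, so a record spanning the stripe boundary can tear.
record = struct.pack("<I8s", 0xDEADBEEF, b"payload!")  # 12-byte record
STRIPE = 6  # toy stripe-unit size in bytes

disk_a, disk_b = record[:STRIPE], record[STRIPE:]

# Disk A's replica is current; disk B's replica lags by one write.
replica_a = disk_a
replica_b = b"\x00" * len(disk_b)

magic, payload = struct.unpack("<I8s", replica_a + replica_b)
print(hex(magic), payload)  # the magic number looks valid, the payload is torn
```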

Cheers

"1. Is there any way to detect corruptions early/proactively? I think
I am right in saying that the chaser stuff you have would detect a
corruption on write? But I am wondering what happens if a sector of
the disk goes bad after this. I don't want to be in the position of
only discovering errors as we scan over that part of the log as part
of day to day activity. Is this a valid concern? Would perhaps
regularly reintroducing a clean 4th eventstore node to the cluster be
a way to achieve this? Or do we think that the fact that writes happen
across 3 azure storage nodes (both locally and remotely) represent a
strong enough model?"

Azure storage is already writing it to three places, so I would expect it to be highly unlikely that all three would fail. Even if they did, on a 3-node cluster you are doing this three times, so you really have nine copies.
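Back-of-envelope, assuming replica failures are independent (a big assumption, and the per-replica loss probability below is made up purely for illustration):

```python
# Back-of-envelope only: treats the nine copies as failing independently,
# which a correlated event (e.g. a region-wide storage outage) violates.
p = 1e-4            # assumed probability of losing any one replica
copies = 9          # 3 Azure storage replicas x 3 cluster nodes
print(p ** copies)  # probability all nine are lost: 1e-36 under independence
```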

Azure attached disks have themselves proven to be the source of major issues in the past - notably the November 2014 global storage outage, which lost fsync’d data (apparently truncating files to page boundaries, if my memory serves me correctly). If you are in three fault domains (finally possible), why not run on local SSDs and asynchronously replicate to another region?
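Even something crude bounds your loss window. A minimal sketch (host and path names are made up, and whether shipping files this naively is safe for a live log is exactly the open question):

```python
import subprocess
import time

# Purely illustrative: periodically ship the data directory to a standby
# region with rsync over ssh. A real setup needs fencing, monitoring,
# consistent snapshots of in-flight files, and a tested restore procedure.
SRC = "/var/lib/eventstore/data/"
DST = "standby.westeurope.example.com:/var/lib/eventstore/data/"

while True:
    subprocess.run(["rsync", "-a", "--partial", SRC, DST], check=True)
    time.sleep(60)  # worst-case loss is roughly this interval plus transfer time
```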

I believe that was when we were in Vilnius in 2014 :slight_smile:

Yeah, good question. What replication technology do you suggest?