Restart failed — Record too large

I totally understand that this can happen to any database, but what is the recommended way to deal with it when it happens with Event Store? Being told that it is pretty easy to handle doesn’t give me any pointer to how I should actually handle it. Is there any recommended way?

We have a tool we use that perhaps we should publish.

Basically the process is this: figure out where the last good record is, set the writer checkpoint to the end of that record, and make sure the other checkpoints are at that position or lower.

This works in the case where there is a semi-written record, etc. If there is true corruption it won’t work (e.g. if the disk actually has bad sectors).
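For what it’s worth, here is a minimal sketch of the checkpoint-file side of that process. It assumes the node is stopped, you have a full backup of the db directory, you have already located the end position of the last good record by other means, and that the checkpoint files (writer.chk, chaser.chk, epoch.chk, truncate.chk) each hold a single 64-bit little-endian position. Verify all of that against your version before trying anything like this; it is an outline, not a supported procedure.

```python
import struct
import sys
from pathlib import Path

def read_chk(path: Path) -> int:
    # Assumption: each .chk file holds a single 64-bit little-endian position.
    return struct.unpack("<q", path.read_bytes()[:8])[0]

def write_chk(path: Path, value: int) -> None:
    path.write_bytes(struct.pack("<q", value))

def rollback_writer(db_dir: str, last_good_end: int) -> None:
    db = Path(db_dir)
    writer = db / "writer.chk"
    print(f"writer.chk: {read_chk(writer)} -> {last_good_end}")
    write_chk(writer, last_good_end)

    # "Make sure other checkpoints are there or lower": only report any checkpoint
    # that sits beyond the new writer position instead of rewriting it blindly,
    # since the right fix may differ per checkpoint.
    for name in ("chaser.chk", "epoch.chk", "truncate.chk"):
        chk = db / name
        if chk.exists() and read_chk(chk) > last_good_end:
            print(f"WARNING: {name} = {read_chk(chk)} is beyond {last_good_end}")

if __name__ == "__main__":
    # Usage: python rollback_writer.py /path/to/db <end-position-of-last-good-record>
    rollback_writer(sys.argv[1], int(sys.argv[2]))
```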

Thinking about it, it might be useful to add an --unsafe-recover option to Event Store itself that does this process (asking for Y/N confirmation), or to include the utility we already have. The information would still be highly technical, asking questions like “found partially written record @ do you want to truncate”.

That would be good; perhaps include it in the commercial package? It would save a couple of hours in an emergency.

Our problems were amplified because the Event Store ran for several days before the problems were visible (skip-db-verify, don’t ask…), so we lost a lot of data when truncating. We have since learned, though…

/Peter

So in this case only one record would be lost, right? Would it be possible for the tool to log the partial record to a file, so the content of the partial event can be inspected?
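Even something simple like the sketch below would help, assuming you already know which chunk file the partial record sits in and roughly which physical start/end offsets it occupies (the paths and offsets in the example are hypothetical):

```python
from pathlib import Path

def dump_partial(chunk_file: str, start: int, end: int, out_file: str) -> None:
    # Save the suspect byte range so the partially written event can be inspected later.
    data = Path(chunk_file).read_bytes()[start:end]
    Path(out_file).write_bytes(data)
    # Print a small hex/ASCII preview so obvious content (event type, JSON fragments)
    # is visible straight away.
    for offset in range(0, min(len(data), 256), 16):
        row = data[offset:offset + 16]
        hex_part = " ".join(f"{b:02x}" for b in row)
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in row)
        print(f"{start + offset:10d}  {hex_part:<48}  {text}")

# Hypothetical example:
# dump_partial("/var/lib/eventstore/chunk-000042.000000", 123456, 130000, "partial-record.bin")
```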

Peter, what do you do differently now that you are aware of this problem?

We don’t set skip-db-verify any more, as it can hide problems. In our case we ran for two more weeks and accepted new data before realising the store was corrupt… Resetting a projection exposed the problem. (The reason for enabling it was slow startup, but Event Store has since been optimised for this, I think.)

I also have checks in place for too-big events (I think Event Store also guards against them now; this was 2-3 years ago).
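Roughly along these lines - the 1 MiB limit here is only an example; pick something comfortably below whatever your server version accepts:

```python
import json

MAX_EVENT_BYTES = 1 * 1024 * 1024  # 1 MiB, illustrative only

class EventTooLargeError(Exception):
    pass

def serialize_checked(event_type: str, payload: dict) -> bytes:
    """Serialize the payload and fail fast if it is too large to append safely."""
    data = json.dumps(payload).encode("utf-8")
    if len(data) > MAX_EVENT_BYTES:
        raise EventTooLargeError(
            f"{event_type}: {len(data)} bytes exceeds limit of {MAX_EVENT_BYTES}"
        )
    return data
```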

Where possible, the Windows write cache is also disabled. (This should probably be in big red letters in the install instructions; it took a while before I realised the implications.)

/Peter

Is there a way to verify the integrity of the data while the db is in use?

Verify runs while the node is running (MD5 checksum calculations).
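If you want an extra out-of-band signal on top of that, one crude approach (not the built-in verify, just a sketch) is to keep an MD5 manifest of the chunk files and diff it against an earlier one. Completed chunks should normally not change in place (scavenges write new file versions), so an unexplained change in a chunk present in both manifests is worth investigating; the currently active chunk will legitimately differ while the node is writing.

```python
import hashlib
import json
import sys
from pathlib import Path

def manifest(db_dir: str) -> dict[str, str]:
    # MD5 of every chunk file in the db directory, keyed by file name.
    result = {}
    for chunk in sorted(Path(db_dir).glob("chunk-*")):
        result[chunk.name] = hashlib.md5(chunk.read_bytes()).hexdigest()
    return result

def compare(old_file: str, db_dir: str) -> None:
    old = json.loads(Path(old_file).read_text())
    new = manifest(db_dir)
    for name, digest in old.items():
        if name in new and new[name] != digest:
            print(f"changed: {name}")

if __name__ == "__main__":
    # Usage: python chunk_manifest.py /path/to/db > manifest.json
    #        python chunk_manifest.py /path/to/db manifest.json   (to compare)
    if len(sys.argv) == 2:
        print(json.dumps(manifest(sys.argv[1]), indent=2))
    else:
        compare(sys.argv[2], sys.argv[1])
```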

So how come Peter ran the store for weeks without detecting the corruption? ^^

Because I had disabled the check.

I think there are two different things here:

Verify/check runs at startup, but in parallel to the rest of the setup.

The question was how to trigger a verify during normal operation.

Two ways I can think of are resetting a projection (or reading from $all) and running a scavenge. This would detect any previous problems, I think, even if verify was turned off?
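For the $all option, a minimal sketch of what I mean - it simply forces every record to be read and reports where, if anywhere, the read blows up. It assumes a recent server and the esdbclient gRPC Python package (pip install esdbclient); the connection settings are illustrative, and older versions would need one of the older clients instead.

```python
from esdbclient import EventStoreDBClient

def walk_all(uri: str = "esdb://localhost:2113?Tls=false") -> None:
    client = EventStoreDBClient(uri=uri)
    count = 0
    try:
        for _event in client.read_all():
            count += 1
            if count % 100_000 == 0:
                print(f"read {count} events so far...")
    except Exception as exc:
        # A read that dies part-way through is a strong hint that something is wrong
        # on disk (or in the node) around that position.
        print(f"read failed after {count} events: {exc!r}")
        raise
    print(f"read {count} events from $all without errors")

if __name__ == "__main__":
    walk_all()
```

A scavenge can then be triggered separately (for example from the admin UI or the /admin/scavenge endpoint), which also ends up reading through the chunks.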

/Peter

If the checks are disabled, there is little that can be done - there are unsafe options and we somewhat expected people to know when they should and should not be using them.

The best option here is to use a file system which cannot suffer from this kind of thing - ZFS is the only good option.

  1. I set --skip-db-verify to False

  2. I start ES

  3. I go to sleep

  4. At 3AM a neutrino comes from space, enters my SSD and messes with some bytes

  5. Data is corrupt

  • In this scenario, does ES continue to run without telling me anything for weeks?

What exactly would you like us to do about lying filesystems? By default we do MD5 verification of all data. I am happy to explore further options if you have them.

I don’t understand why you think this is an uncommon situation for a computer. It has been known for a very long time; see https://blogs.oracle.com/bonwick/zfs-end-to-end-data-integrity for example.

I’m having a similar issue. Just wanted to see if you ever published your tool. If not, could you make it available to me so that I can fix the writer checkpoint? Thanks.