Extremely slow writes to SSD

In my team we noticed that write performance to disk differed quite a lot between our desktops. Everyone has an SSD, but not the same brand. When I run the test client with the command “wrflw” I get approx. 1700 req/s with a latency of 0.58 ms. When the other members of the team run it they get approx. 74 req/s and a latency of 13.41 ms. We have turned off all antivirus scanning and can’t see anything else competing for resources that would justify that performance drop. Is there anything we need to enable on the slow disks that might be enabled by default on the one in my machine?

wrflw is a waiting version of write flood, i.e. it writes single threaded and waits for the previous write to finish (so it measures latency, not throughput). Given your SSD is at 0.58 ms and theirs is at 13 ms, it makes sense that they get much less than you do: with one blocking write at a time, 1000 ms / 13.41 ms is roughly 75 req/s, while 1000 ms / 0.58 ms is roughly 1700 req/s. SSDs can have wildly varying synchronous access times (and many lie!). Note this is not in any way a throughput test, as it is a single thread with a single blocking writer.

Is there anything we can do that you know of, or do we need a special kind of disk to get decent performance?

Does wrfl also write to disk? Because that was roughly the same between the machines.

wrfl is a throughput test, not a latency test; it runs on multiple
threads and buffers.
wrflw is a single threaded test for disk latency; it has nothing to
do with throughput.

try running

testclient --write-window=1
wrfl 10 100000

This will do 10 threads running the same test wrflw runs.
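
To make the latency/throughput distinction concrete, here is a rough illustration (plain TypeScript, not the actual testclient code; simulateAppend and the function names are just stand-ins for one acknowledged write and the two test shapes):

// Stand-in for one acknowledged write that takes `latencyMs` to complete.
const simulateAppend = (latencyMs: number): Promise<void> =>
  new Promise<void>(resolve => setTimeout(resolve, latencyMs));

// wrflw-style: one writer, each write waits for the previous one, so the rate
// is capped at roughly 1000 / latencyMs req/s regardless of disk bandwidth.
async function blockingWriter(writes: number, latencyMs: number): Promise<void> {
  for (let i = 0; i < writes; i++) {
    await simulateAppend(latencyMs);
  }
}

// wrfl 10 with a write window of 1: ten such writers running concurrently.
// The result scales with the number of writers, but each writer is still
// dominated by per-write latency, so this is still not a throughput test.
async function tenBlockingWriters(writes: number, latencyMs: number): Promise<void> {
  await Promise.all(
    Array.from({ length: 10 }, () => blockingWriter(writes / 10, latencyMs))
  );
}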

Tried that now, and there is a huge difference between our machines even though they are almost the same (except the SSD). The result from my run was:

[07284,05,11:29:52.635] Command exited with code 0.

[07284,07,11:29:52.639] Completed. Successes: 100000, failures: 0 (WRONG VERSION: 0, P: 0, C: 0, F: 0, D: 0)

[07284,07,11:29:52.639] 100000 requests completed in 19711ms (5073,31 reqs per sec).

And from a team member it was basically a factor of 10 slower:

[11340,05,11:34:49.497] Command exited with code 0.

[11340,07,11:34:49.497] Completed. Successes: 100000, failures: 0 (WRONG VERSION: 0, P: 0, C: 0, F: 0, D: 0)

[11340,07,11:34:49.497] 100000 requests completed in 184371ms (542,38 reqs per sec).

Any ideas of what we can do to improve the performance? If we run a regular disk write test the performance is as it should be.

You are not getting my point. You're doing a latency test, not a
throughput test. Note that both yours and theirs scaled with the
number of concurrent threads.

If you set a write window of, say, 1000 and do a wrfl 10 100000, my
guess is the differences will go away (throughput test).
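
Assuming the same testclient syntax as above, that would be:

testclient --write-window=1000
wrfl 10 100000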

btw your disk is likely lying about durability.

I did get your point, and the result above is from running wrfl 10 100000 with a write window of 1 as you wrote in the previous comment. Setting the write window to 1000 removed some of the difference, but mine is still twice as fast (20.000 req/s vs 10.000 req/s).

Is write window something you can configure a client with?

I’m not a disk expert, so yeah, maybe it is lying :). That is probably another discussion; I just want to get to the bottom of why I get writes more than a factor of 10 faster when we run the application and try to migrate a bunch of data to Event Store.

Because you are writing single threaded and blocking, you are leaving
both your client and ES itself doing nothing for a large period of
time.

You may also want to consider batching writes as opposed to doing one
at a time if single threaded.
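
As a rough sketch of batching per stream (placeholder names, not the exact 3.3 client API, which has its own EventData type and append call):

// Placeholder type for an event waiting to be written.
interface PendingEvent { stream: string; data: Uint8Array; }

// Placeholder for the client's append call: one round trip writes a whole
// array of events to a single stream.
const appendToStream = async (stream: string, events: Uint8Array[]): Promise<void> => {
  // the real client call for one batched append to this stream goes here
};

// Group pending events by stream and do one append per stream instead of one
// append per event: far fewer round trips, so per-write latency matters less.
async function flushBatched(pending: PendingEvent[]): Promise<void> {
  const byStream = new Map<string, Uint8Array[]>();
  for (const e of pending) {
    const list = byStream.get(e.stream) ?? [];
    list.push(e.data);
    byStream.set(e.stream, list);
  }
  for (const [stream, events] of byStream) {
    await appendToStream(stream, events); // one batched, acknowledged write per stream
  }
}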

But why does my machine still have such a high throughput? Is it because the disk is lying about durability?

Is it possible to do batch writes to multiple streams? I didn’t find it in the API for the 3.3 client.

You don't have "such a high throughput"; you have a "much lower
latency". There is a difference. As to whether or not your disk is
lying about durability, I would be inclined to say most likely
(especially if it is a laptop).

There are many ways of batching writes. You can batch many per
stream. You can also become async during your update, i.e. not
blocking on every write but handling possible failure asynchronously
(e.g. building a buffer in the client). This will help saturate the
node (like the difference between wrfl with a write window of 1 and
1000 on the testclient, even if off a single thread).
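
A sketch of that second approach (again placeholder names, not the exact client API): a single-threaded writer that keeps up to N appends in flight at once and only blocks when that window is full, roughly what the testclient's write window does.

// Placeholder for one async append; in a real client this would be the
// connection's append call returning a promise.
const appendAsync = async (eventNo: number): Promise<void> => {
  // real append for event `eventNo` goes here
};

// Single threaded writer that keeps up to `window` appends un-acknowledged at
// once. It only waits when the window is full and handles failures
// asynchronously instead of blocking on every single write.
async function writeWithWindow(total: number, window: number): Promise<void> {
  const inFlight = new Set<Promise<void>>();
  for (let i = 0; i < total; i++) {
    const p: Promise<void> = appendAsync(i)
      .catch(() => { /* handle or retry the failed write asynchronously */ })
      .finally(() => inFlight.delete(p));
    inFlight.add(p);
    if (inFlight.size >= window) {
      await Promise.race(inFlight); // block only when the window is full
    }
  }
  await Promise.all(inFlight); // drain the remaining in-flight writes
}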

Thank you. We probably have enough to figure out what to do. I’m sitting at a desktop machine, but you are probably right that it is my disk that is lying, since there are three others that have the same issue.

One question though. If you write async to Event Store, will Event Store preserve the ordering of the events? I guess not, since I can’t imagine how it could do so, but maybe if it is the same client?

Through the same client it should. Try the --better-ordering option;
it should cover > 99% of the cases.