Invalid order of events has been detected

Has anyone encountered this before…?

Got a 500 response in the UI on the streams view when it /streams/$streams/head/50?embed=rich

Invalid order of events has been detected in read index for the event stream ‘$streams’. The event 3868 at position 42014713862 goes after the event 3872 at position 47573233473

  • What version of Event Store are you running?

  • Can you provide us with the most recent logs up to the point after it’s rebuilt the indexes and loaded the PTables?

  • Are there any other entries in your logs that show invalid order of events?

When I asked for the logs, I wanted to see the logs where Event Store started up and output it’s effective configuration etc etc. Apologies if that wasn’t clear in the initial reply.

Rebuilding your indexes would likely resolve the issue, however you would need to upgrade to upgrade to 3.9.3 as there were a set of bug fixes related to index merges between 32bit and 64bit indexes. 64bit indexes were introduced in 3.9.0.

Running 3.9.3 - and actually had a problem with slow loading of PTables on restart of a node.

https://groups.google.com/forum/#!topic/event-store/JIZDrSa-WM8

Restarted a node, and the log is here

[PID:31848:001 2017.02.24 12:53:23.810 INFO ProgramBase`1 ]

ES VERSION: 3.9.3.0 (oss-v3.9.3/64fbfb52500348f0ec37516adbc2eef7a918221f, Wed, 30 Nov 2016 13:49:53 +0200)

OS: Windows (Microsoft Windows NT 6.2.9200.0)

RUNTIME: .NET 4.0.30319.42000 (64-bit)

GC: 3 GENERATIONS

LOGS: E:\EventStore\logs\log-EventStoreNode2

MODIFIED OPTIONS:

DB: E:\EventStore\data-EventStoreNode2 (Command Line)

LOG: E:\EventStore\logs\log-EventStoreNode2 (Command Line)

INT IP: 10.0.38.14 (Command Line)

EXT IP: 10.0.38.14 (Command Line)

INT TCP PORT: 2111 (Command Line)

EXT TCP PORT: 2112 (Command Line)

INT HTTP PORT: 2113 (Command Line)

EXT HTTP PORT: 2114 (Command Line)

CLUSTER SIZE: 3 (Command Line)

DISCOVER VIA DNS: false (Command Line)

GOSSIP SEED: 10.0.38.13:1113 (Command Line)

DISABLE HTTP CACHING: True (Command Line)

RUN PROJECTIONS: all (Command Line)

START STANDARD PROJECTIONS: true (Command Line)

SKIP DB VERIFY: true (Command Line)

DEFAULT OPTIONS:

HELP: False ()

VERSION: False ()

CONFIG: ()

DEFINES: ()

WHAT IF: False ()

MONO MIN THREADPOOL SIZE: 10 ()

INT SECURE TCP PORT: 0 ()

EXT SECURE TCP PORT ADVERTISE AS: 0 ()

EXT SECURE TCP PORT: 0 ()

EXT IP ADVERTISE AS: ()

EXT TCP PORT ADVERTISE AS: 0 ()

EXT HTTP PORT ADVERTISE AS: 0 ()

INT IP ADVERTISE AS: ()

INT SECURE TCP PORT ADVERTISE AS: 0 ()

INT TCP PORT ADVERTISE AS: 0 ()

INT HTTP PORT ADVERTISE AS: 0 ()

INT TCP HEARTBEAT TIMEOUT: 700 ()

EXT TCP HEARTBEAT TIMEOUT: 1000 ()

INT TCP HEARTBEAT INTERVAL: 700 ()

EXT TCP HEARTBEAT INTERVAL: 2000 ()

FORCE: False ()

NODE PRIORITY: 0 ()

MIN FLUSH DELAY MS: 2 ()

COMMIT COUNT: -1 ()

PREPARE COUNT: -1 ()

ADMIN ON EXT: True ()

STATS ON EXT: True ()

GOSSIP ON EXT: True ()

DISABLE SCAVENGE MERGING: False ()

SCAVENGE HISTORY MAX AGE: 30 ()

CLUSTER DNS: fake.dns ()

CLUSTER GOSSIP PORT: 30777 ()

STATS PERIOD SEC: 30 ()

CACHED CHUNKS: -1 ()

READER THREADS COUNT: 4 ()

CHUNKS CACHE SIZE: 536871424 ()

MAX MEM TABLE SIZE: 1000000 ()

HASH COLLISION READ LIMIT: 100 ()

INDEX: ()

MEM DB: False ()

PROJECTION THREADS: 3 ()

WORKER THREADS: 5 ()

INT HTTP PREFIXES: ()

EXT HTTP PREFIXES: ()

ENABLE TRUSTED AUTH: False ()

ADD INTERFACE PREFIXES: True ()

CERTIFICATE STORE LOCATION: ()

CERTIFICATE STORE NAME: ()

CERTIFICATE SUBJECT NAME: ()

CERTIFICATE THUMBPRINT: ()

CERTIFICATE FILE: ()

CERTIFICATE PASSWORD: ()

USE INTERNAL SSL: False ()

SSL TARGET HOST: n/a ()

SSL VALIDATE SERVER: True ()

AUTHENTICATION TYPE: internal ()

AUTHENTICATION CONFIG: ()

PREPARE TIMEOUT MS: 2000 ()

COMMIT TIMEOUT MS: 2000 ()

UNSAFE DISABLE FLUSH TO DISK: False ()

BETTER ORDERING: False ()

UNSAFE IGNORE HARD DELETE: False ()

INDEX CACHE DEPTH: 16 ()

GOSSIP INTERVAL MS: 1000 ()

GOSSIP ALLOWED DIFFERENCE MS: 60000 ()

GOSSIP TIMEOUT MS: 500 ()

ENABLE HISTOGRAMS: False ()

LOG HTTP REQUESTS: False ()

ALWAYS KEEP SCAVENGED: False ()

[PID:31848:001 2017.02.24 12:53:24.244 INFO ProgramBase`1 ] Quorum size set to 2

[PID:31848:001 2017.02.24 12:53:24.283 INFO ProgramBase`1 ] Cannot find plugins path: E:\EventStore\EventStore-3.9.3\plugins

[PID:31848:001 2017.02.24 12:53:25.079 INFO VNodeBuilder ] INSTANCE ID: 2a2c23ea-1ddb-4ff3-a113-8f44e344cee4

[PID:31848:001 2017.02.24 12:53:25.079 INFO VNodeBuilder ] DATABASE: E:\EventStore\data-EventStoreNode2

[PID:31848:001 2017.02.24 12:53:25.079 INFO VNodeBuilder ] WRITER CHECKPOINT: 646745645846 (0x96950B6316)

[PID:31848:001 2017.02.24 12:53:25.079 INFO VNodeBuilder ] CHASER CHECKPOINT: 646745645846 (0x96950B6316)

[PID:31848:001 2017.02.24 12:53:25.079 INFO VNodeBuilder ] EPOCH CHECKPOINT: 643483538990 (0x95D29B962E)

[PID:31848:001 2017.02.24 12:53:25.079 INFO VNodeBuilder ] TRUNCATE CHECKPOINT: -1 (0xFFFFFFFFFFFFFFFF)

[PID:31848:001 2017.02.24 12:53:27.930 TRACE MessageHierarchy ] MessageHierarchy initialization took 00:00:02.7190923.

[PID:31848:001 2017.02.24 13:03:23.457 TRACE TFChunkDb ] Deleting temporary file E:\EventStore\data-EventStoreNode2\d945240d-b6c1-4bcb-a803-72f5b2ab2fb8.scavenge.tmp…

[PID:31848:001 2017.02.24 13:03:23.592 TRACE TFChunk ] CACHED TFChunk #2409-2409 (chunk-002409.000000) in 00:00:00.1061160.

[PID:31848:009 2017.02.24 13:03:27.362 TRACE TFChunk ] CACHED TFChunk #2408-2408 (chunk-002408.000000) in 00:00:03.7679754.

[PID:31848:001 2017.02.24 13:03:28.557 INFO MiniWeb ] Starting MiniWeb for /web/es/js/projections ==> E:\EventStore\EventStore-3.9.3\projections

[PID:31848:001 2017.02.24 13:03:28.557 INFO MiniWeb ] Starting MiniWeb for /web/es/js/projections/v8/Prelude ==> E:\EventStore\EventStore-3.9.3\Prelude

[PID:31848:001 2017.02.24 13:03:28.557 TRACE MiniWeb ] Binding MiniWeb to /web/es/js/projections/{*remaining_path}

[PID:31848:001 2017.02.24 13:03:28.557 TRACE MiniWeb ] Binding MiniWeb to /web/es/js/projections/v8/Prelude/{*remaining_path}

[PID:31848:001 2017.02.24 13:03:28.573 TRACE MiniWeb ] Binding MiniWeb to /web/es/js/projections/{*remaining_path}

[PID:31848:001 2017.02.24 13:03:28.573 TRACE MiniWeb ] Binding MiniWeb to /web/es/js/projections/v8/Prelude/{*remaining_path}

[PID:31848:001 2017.02.24 13:03:28.603 INFO MiniWeb ] Starting MiniWeb for /web ==> E:\EventStore\EventStore-3.9.3\clusternode-web

[PID:31848:001 2017.02.24 13:03:28.603 TRACE MiniWeb ] Binding MiniWeb to /web/{*remaining_path}

[PID:31848:001 2017.02.24 13:03:28.603 INFO MiniWeb ] Starting MiniWeb for /web ==> E:\EventStore\EventStore-3.9.3\clusternode-web

[PID:31848:001 2017.02.24 13:03:28.603 TRACE MiniWeb ] Binding MiniWeb to /web/{*remaining_path}

[PID:31848:014 2017.02.24 13:03:28.630 INFO ClusterVNodeControll] ========== [10.0.38.14:2113] SYSTEM INIT…

[PID:31848:014 2017.02.24 13:03:28.677 INFO TcpServerListener ] Starting Normal TCP listening on TCP endpoint: 10.0.38.14:2112.

[PID:31848:015 2017.02.24 13:03:28.699 INFO IndexCommitter ] TableIndex initialization…

[PID:31848:014 2017.02.24 13:03:28.720 TRACE InMemoryBus ] SLOW BUS MSG [MainBus]: SystemInit - 60ms. Handler: TcpService.

[PID:31848:014 2017.02.24 13:03:28.720 INFO TcpServerListener ] Starting Normal TCP listening on TCP endpoint: 10.0.38.14:2111.

[PID:31848:015 2017.02.24 13:03:28.753 TRACE PTable ] Loading and Verification of PTable ‘10bcea24-0299-469f-b47b-fe8b19f14c9a’ started…

[PID:31848:014 2017.02.24 13:03:28.817 INFO HttpAsyncServer ] Starting HTTP server on [http://10.0.38.14:2114/]…

[PID:31848:014 2017.02.24 13:03:28.838 INFO HttpAsyncServer ] HTTP server is up and listening on [http://10.0.38.14:2114/]

[PID:31848:014 2017.02.24 13:03:28.838 TRACE InMemoryBus ] SLOW BUS MSG [MainBus]: SystemInit - 119ms. Handler: HttpService.

[PID:31848:014 2017.02.24 13:03:28.838 INFO HttpAsyncServer ] Starting HTTP server on [http://10.0.38.14:2113/]…

[PID:31848:014 2017.02.24 13:03:28.838 INFO HttpAsyncServer ] HTTP server is up and listening on [http://10.0.38.14:2113/]

[PID:31848:014 2017.02.24 13:03:28.867 TRACE QueuedHandlerMRES ] SLOW QUEUE MSG [MainQueue]: SystemInit - 241ms. Q: 0/6.

[PID:31848:014 2017.02.24 13:03:28.867 INFO ClusterVNodeControll] ========== [10.0.38.14:2113] Service ‘StorageReader’ initialized.

[PID:31848:014 2017.02.24 13:03:28.867 INFO ClusterVNodeControll] ========== [10.0.38.14:2113] Service ‘StorageWriter’ initialized.

[PID:31848:014 2017.02.24 13:03:28.904 TRACE GossipServiceBase ] CLUSTER HAS CHANGED

[PID:31848:014 2017.02.24 13:03:28.904 TRACE GossipServiceBase ] Old:

[PID:31848:014 2017.02.24 13:03:28.904 TRACE GossipServiceBase ] VND {2a2c23ea-1ddb-4ff3-a113-8f44e344cee4} [Unknown, 10.0.38.14:2111, 10.0.38.14:0, 10.0.38.14:2112, 10.0.38.14:0, 10.0.38.14:2113, 10.0.38.14:2114] -1/646745645846/646745645846/E667@643483538990:{c9acbca3-5ab9-4fce-95f7-289111714e15} | 2017-02-24 13:03:28.842

[PID:31848:014 2017.02.24 13:03:28.904 TRACE GossipServiceBase ] New:

[PID:31848:014 2017.02.24 13:03:28.904 TRACE GossipServiceBase ] VND {2a2c23ea-1ddb-4ff3-a113-8f44e344cee4} [Unknown, 10.0.38.14:2111, 10.0.38.14:0, 10.0.38.14:2112, 10.0.38.14:0, 10.0.38.14:2113, 10.0.38.14:2114] -1/646745645846/646745645846/E667@643483538990:{c9acbca3-5ab9-4fce-95f7-289111714e15} | 2017-02-24 13:03:28.842

[PID:31848:014 2017.02.24 13:03:28.904 TRACE GossipServiceBase ] MAN {00000000-0000-0000-0000-000000000000} [Manager, 10.0.38.13:1113, 10.0.38.13:1113] | 2017-02-24 13:03:28.873

[PID:31848:014 2017.02.24 13:03:28.904 TRACE GossipServiceBase ] --------------------------------------------------------------------------------

[PID:31848:008 2017.02.24 13:03:29.310 TRACE InMemoryBus ] SLOW BUS MSG [Worker #1 Bus]: SendOverHttp - 396ms. Handler: HttpSendService.

[PID:31848:006 2017.02.24 13:03:29.310 TRACE InMemoryBus ] SLOW BUS MSG [Worker #2 Bus]: SendOverHttp - 292ms. Handler: HttpSendService.

[PID:31848:008 2017.02.24 13:03:29.339 TRACE QueuedHandlerThreadP] SLOW QUEUE MSG [Worker #1]: SendOverHttp - 396ms. Q: 0/0.

[PID:31848:006 2017.02.24 13:03:29.339 TRACE QueuedHandlerThreadP] SLOW QUEUE MSG [Worker #2]: SendOverHttp - 292ms. Q: 0/0.

[PID:31848:014 2017.02.24 13:03:29.676 TRACE GossipServiceBase ] CLUSTER HAS CHANGED (gossip received from [10.0.38.13:1113])

[PID:31848:014 2017.02.24 13:03:29.676 TRACE GossipServiceBase ] Old:

[PID:31848:014 2017.02.24 13:03:29.676 TRACE GossipServiceBase ] VND {2a2c23ea-1ddb-4ff3-a113-8f44e344cee4} [Initializing, 10.0.38.14:2111, 10.0.38.14:0, 10.0.38.14:2112, 10.0.38.14:0, 10.0.38.14:2113, 10.0.38.14:2114] -1/646745645846/646745645846/E667@643483538990:{c9acbca3-5ab9-4fce-95f7-289111714e15} | 2017-02-24 13:03:29.664

[PID:31848:014 2017.02.24 13:03:29.676 TRACE GossipServiceBase ] MAN {00000000-0000-0000-0000-000000000000} [Manager, 10.0.38.13:1113, 10.0.38.13:1113] | 2017-02-24 13:03:28.873

[PID:31848:014 2017.02.24 13:03:29.676 TRACE GossipServiceBase ] New:

[PID:31848:014 2017.02.24 13:03:29.676 TRACE GossipServiceBase ] VND {7704a94a-590c-4dff-baf2-22d63ad9957f} [Slave, 10.0.38.15:3111, n/a, 10.0.38.15:3112, n/a, 10.0.38.15:3113, 10.0.38.15:3114] 647965620640/647965628906/647965620973/E667@643483538990:{c9acbca3-5ab9-4fce-95f7-289111714e15} | 2017-02-24 13:03:28.996

[PID:31848:014 2017.02.24 13:03:29.676 TRACE GossipServiceBase ] VND {2a2c23ea-1ddb-4ff3-a113-8f44e344cee4} [Initializing, 10.0.38.14:2111, 10.0.38.14:0, 10.0.38.14:2112, 10.0.38.14:0, 10.0.38.14:2113, 10.0.38.14:2114] -1/646745645846/646745645846/E667@643483538990:{c9acbca3-5ab9-4fce-95f7-289111714e15} | 2017-02-24 13:03:29.675

[PID:31848:014 2017.02.24 13:03:29.676 TRACE GossipServiceBase ] VND {5a3459c8-c065-45a0-9848-ff4b80582f91} [Master, 10.0.38.13:1111, n/a, 10.0.38.13:1112, n/a, 10.0.38.13:1113, 10.0.38.13:1114] 647966135740/647966136087/647966136087/E667@643483538990:{c9acbca3-5ab9-4fce-95f7-289111714e15} | 2017-02-24 13:03:29.286

[PID:31848:014 2017.02.24 13:03:29.676 TRACE GossipServiceBase ] --------------------------------------------------------------------------------

[PID:31848:015 2017.02.24 13:03:31.106 TRACE PTable ] Loading PTable (Version: 2) ‘10bcea24-0299-469f-b47b-fe8b19f14c9a’ (1000241 entries, cache depth 16) done in 00:00:02.3532578.

[PID:31848:015 2017.02.24 13:03:31.106 TRACE PTable ] Loading and Verification of PTable ‘19f5aeeb-9bc2-40ac-b85a-7d58945f3aa6’ started…

[PID:31848:015 2017.02.24 13:03:34.342 TRACE PTable ] Loading PTable (Version: 2) ‘19f5aeeb-9bc2-40ac-b85a-7d58945f3aa6’ (2000359 entries, cache depth 16) done in 00:00:03.2334558.

[PID:31848:015 2017.02.24 13:03:34.342 TRACE PTable ] Loading and Verification of PTable ‘33fcc1c5-94e9-4f5a-9e9a-a944236d76a2’ started…

[PID:31848:007 2017.02.24 13:03:34.418 TRACE QueuedHandlerThreadP] SLOW QUEUE MSG [MonitoringQueue]: SystemInit - 5778ms. Q: 0/0.

[PID:31848:015 2017.02.24 13:03:40.191 TRACE PTable ] Loading PTable (Version: 2) ‘33fcc1c5-94e9-4f5a-9e9a-a944236d76a2’ (4000865 entries, cache depth 16) done in 00:00:05.8489470.

[PID:31848:015 2017.02.24 13:03:40.191 TRACE PTable ] Loading and Verification of PTable ‘4f64e409-50c0-4c99-8d25-d4e1c71913a8’ started…

[PID:31848:014 2017.02.24 13:03:48.000 TRACE GossipServiceBase ] CLUSTER HAS CHANGED (gossip received from [10.0.38.15:3113])

[PID:31848:014 2017.02.24 13:03:48.000 TRACE GossipServiceBase ] Old:

[PID:31848:014 2017.02.24 13:03:48.000 TRACE GossipServiceBase ] VND {7704a94a-590c-4dff-baf2-22d63ad9957f} [Slave, 10.0.38.15:3111, n/a, 10.0.38.15:3112, n/a, 10.0.38.15:3113, 10.0.38.15:3114] 648003075317/648003190784/648003190784/E667@643483538990:{c9acbca3-5ab9-4fce-95f7-289111714e15} | 2017-02-24 13:03:47.005

[PID:31848:014 2017.02.24 13:03:48.000 TRACE GossipServiceBase ] VND {2a2c23ea-1ddb-4ff3-a113-8f44e344cee4} [Initializing, 10.0.38.14:2111, 10.0.38.14:0, 10.0.38.14:2112, 10.0.38.14:0, 10.0.38.14:2113, 10.0.38.14:2114] -1/646745645846/646745645846/E667@643483538990:{c9acbca3-5ab9-4fce-95f7-289111714e15} | 2017-02-24 13:03:47.997

[PID:31848:014 2017.02.24 13:03:48.000 TRACE GossipServiceBase ] VND {5a3459c8-c065-45a0-9848-ff4b80582f91} [Master, 10.0.38.13:1111, n/a, 10.0.38.13:1112, n/a, 10.0.38.13:1113, 10.0.38.13:1114] 648003075317/648003190784/648003190784/E667@643483538990:{c9acbca3-5ab9-4fce-95f7-289111714e15} | 2017-02-24 13:03:45.119

[PID:31848:014 2017.02.24 13:03:48.000 TRACE GossipServiceBase ] New:

[PID:31848:014 2017.02.24 13:03:48.000 TRACE GossipServiceBase ] VND {7704a94a-590c-4dff-baf2-22d63ad9957f} [Slave, 10.0.38.15:3111, n/a, 10.0.38.15:3112, n/a, 10.0.38.15:3113, 10.0.38.15:3114] 648003075317/648003190784/648003190784/E667@643483538990:{c9acbca3-5ab9-4fce-95f7-289111714e15} | 2017-02-24 13:03:47.769

[PID:31848:014 2017.02.24 13:03:48.000 TRACE GossipServiceBase ] VND {2a2c23ea-1ddb-4ff3-a113-8f44e344cee4} [Initializing, 10.0.38.14:2111, 10.0.38.14:0, 10.0.38.14:2112, 10.0.38.14:0, 10.0.38.14:2113, 10.0.38.14:2114] -1/646745645846/646745645846/E667@643483538990:{c9acbca3-5ab9-4fce-95f7-289111714e15} | 2017-02-24 13:03:48.000

[PID:31848:014 2017.02.24 13:03:48.000 TRACE GossipServiceBase ] VND {5a3459c8-c065-45a0-9848-ff4b80582f91} [Master, 10.0.38.13:1111, n/a, 10.0.38.13:1112, n/a, 10.0.38.13:1113, 10.0.38.13:1114] 648003075317/648003190784/648003190784/E667@643483538990:{c9acbca3-5ab9-4fce-95f7-289111714e15} | 2017-02-24 13:03:47.387

[PID:31848:014 2017.02.24 13:03:48.000 TRACE GossipServiceBase ] --------------------------------------------------------------------------------

[PID:31848:014 2017.02.24 13:03:48.237 TRACE GossipServiceBase ] CLUSTER HAS CHANGED (gossip received from [10.0.38.15:3113])

[PID:31848:014 2017.02.24 13:03:48.237 TRACE GossipServiceBase ] Old:

[PID:31848:014 2017.02.24 13:03:48.237 TRACE GossipServiceBase ] VND {7704a94a-590c-4dff-baf2-22d63ad9957f} [Slave, 10.0.38.15:3111, n/a, 10.0.38.15:3112, n/a, 10.0.38.15:3113, 10.0.38.15:3114] 648003075317/648003190784/648003190784/E667@643483538990:{c9acbca3-5ab9-4fce-95f7-289111714e15} | 2017-02-24 13:03:47.769

[PID:31848:014 2017.02.24 13:03:48.237 TRACE GossipServiceBase ] VND {2a2c23ea-1ddb-4ff3-a113-8f44e344cee4} [Initializing, 10.0.38.14:2111, 10.0.38.14:0, 10.0.38.14:2112, 10.0.38.14:0, 10.0.38.14:2113, 10.0.38.14:2114] -1/646745645846/646745645846/E667@643483538990:{c9acbca3-5ab9-4fce-95f7-289111714e15} | 2017-02-24 13:03:48.000

[PID:31848:014 2017.02.24 13:03:48.237 TRACE GossipServiceBase ] VND {5a3459c8-c065-45a0-9848-ff4b80582f91} [Master, 10.0.38.13:1111, n/a, 10.0.38.13:1112, n/a, 10.0.38.13:1113, 10.0.38.13:1114] 648003075317/648003190784/648003190784/E667@643483538990:{c9acbca3-5ab9-4fce-95f7-289111714e15} | 2017-02-24 13:03:47.387

[PID:31848:014 2017.02.24 13:03:48.237 TRACE GossipServiceBase ] New:

[PID:31848:014 2017.02.24 13:03:48.237 TRACE GossipServiceBase ] VND {7704a94a-590c-4dff-baf2-22d63ad9957f} [PreReplica, 10.0.38.15:3111, n/a, 10.0.38.15:3112, n/a, 10.0.38.15:3113, 10.0.38.15:3114] 648003075317/648003190784/648003190784/E667@643483538990:{c9acbca3-5ab9-4fce-95f7-289111714e15} | 2017-02-24 13:03:48.006

[PID:31848:014 2017.02.24 13:03:48.237 TRACE GossipServiceBase ] VND {2a2c23ea-1ddb-4ff3-a113-8f44e344cee4} [Initializing, 10.0.38.14:2111, 10.0.38.14:0, 10.0.38.14:2112, 10.0.38.14:0, 10.0.38.14:2113, 10.0.38.14:2114] -1/646745645846/646745645846/E667@643483538990:{c9acbca3-5ab9-4fce-95f7-289111714e15} | 2017-02-24 13:03:48.237

[PID:31848:014 2017.02.24 13:03:48.237 TRACE GossipServiceBase ] VND {5a3459c8-c065-45a0-9848-ff4b80582f91} [Master, 10.0.38.13:1111, n/a, 10.0.38.13:1112, n/a, 10.0.38.13:1113, 10.0.38.13:1114] 648003075317/648003190784/648003190784/E667@643483538990:{c9acbca3-5ab9-4fce-95f7-289111714e15} | 2017-02-24 13:03:47.892

[PID:31848:014 2017.02.24 13:03:48.237 TRACE GossipServiceBase ] --------------------------------------------------------------------------------

[PID:31848:014 2017.02.24 13:03:51.599 TRACE GossipServiceBase ] Looks like node [10.0.38.13:1113] is DEAD (Gossip send failed).

[PID:31848:014 2017.02.24 13:03:51.599 TRACE GossipServiceBase ] CLUSTER HAS CHANGED (gossip send failed to [10.0.38.13:1113])

[PID:31848:014 2017.02.24 13:03:51.599 TRACE GossipServiceBase ] Old:

[PID:31848:014 2017.02.24 13:03:51.599 TRACE GossipServiceBase ] VND {7704a94a-590c-4dff-baf2-22d63ad9957f} [PreReplica, 10.0.38.15:3111, n/a, 10.0.38.15:3112, n/a, 10.0.38.15:3113, 10.0.38.15:3114] 648003075317/648003190784/648003190784/E667@643483538990:{c9acbca3-5ab9-4fce-95f7-289111714e15} | 2017-02-24 13:03:50.007

[PID:31848:014 2017.02.24 13:03:51.599 TRACE GossipServiceBase ] VND {2a2c23ea-1ddb-4ff3-a113-8f44e344cee4} [Initializing, 10.0.38.14:2111, 10.0.38.14:0, 10.0.38.14:2112, 10.0.38.14:0, 10.0.38.14:2113, 10.0.38.14:2114] -1/646745645846/646745645846/E667@643483538990:{c9acbca3-5ab9-4fce-95f7-289111714e15} | 2017-02-24 13:03:51.000

[PID:31848:014 2017.02.24 13:03:51.599 TRACE GossipServiceBase ] VND {5a3459c8-c065-45a0-9848-ff4b80582f91} [Master, 10.0.38.13:1111, n/a, 10.0.38.13:1112, n/a, 10.0.38.13:1113, 10.0.38.13:1114] 648003075317/648003190784/648003190784/E667@643483538990:{c9acbca3-5ab9-4fce-95f7-289111714e15} | 2017-02-24 13:03:47.892

[PID:31848:014 2017.02.24 13:03:51.599 TRACE GossipServiceBase ] New:

[PID:31848:014 2017.02.24 13:03:51.599 TRACE GossipServiceBase ] VND {7704a94a-590c-4dff-baf2-22d63ad9957f} [PreReplica, 10.0.38.15:3111, n/a, 10.0.38.15:3112, n/a, 10.0.38.15:3113, 10.0.38.15:3114] 648003075317/648003190784/648003190784/E667@643483538990:{c9acbca3-5ab9-4fce-95f7-289111714e15} | 2017-02-24 13:03:50.007

[PID:31848:014 2017.02.24 13:03:51.599 TRACE GossipServiceBase ] VND {2a2c23ea-1ddb-4ff3-a113-8f44e344cee4} [Initializing, 10.0.38.14:2111, 10.0.38.14:0, 10.0.38.14:2112, 10.0.38.14:0, 10.0.38.14:2113, 10.0.38.14:2114] -1/646745645846/646745645846/E667@643483538990:{c9acbca3-5ab9-4fce-95f7-289111714e15} | 2017-02-24 13:03:51.000

[PID:31848:014 2017.02.24 13:03:51.599 TRACE GossipServiceBase ] VND {5a3459c8-c065-45a0-9848-ff4b80582f91} [Master, 10.0.38.13:1111, n/a, 10.0.38.13:1112, n/a, 10.0.38.13:1113, 10.0.38.13:1114] 648003075317/648003190784/648003190784/E667@643483538990:{c9acbca3-5ab9-4fce-95f7-289111714e15} | 2017-02-24 13:03:51.600

[PID:31848:014 2017.02.24 13:03:51.599 TRACE GossipServiceBase ] --------------------------------------------------------------------------------

[PID:31848:015 2017.02.24 13:03:52.519 TRACE PTable ] Loading PTable (Version: 2) ‘4f64e409-50c0-4c99-8d25-d4e1c71913a8’ (8002199 entries, cache depth 16) done in 00:00:12.3272827.

[PID:31848:015 2017.02.24 13:03:52.519 TRACE PTable ] Loading and Verification of PTable ‘0501b8ae-436c-48d8-ac7c-4ee7380798ec’ started…

[PID:31848:015 2017.02.24 13:04:17.778 TRACE PTable ] Loading PTable (Version: 2) ‘0501b8ae-436c-48d8-ac7c-4ee7380798ec’ (16004020 entries, cache depth 16) done in 00:00:25.2585920.

[PID:31848:015 2017.02.24 13:04:17.778 TRACE PTable ] Loading and Verification of PTable ‘f1f59b71-ac90-45f2-8e0b-1838f1aa23df’ started…

[PID:31848:015 2017.02.24 13:05:10.530 TRACE PTable ] Loading PTable (Version: 2) ‘f1f59b71-ac90-45f2-8e0b-1838f1aa23df’ (32007459 entries, cache depth 16) done in 00:00:52.7511587.

[PID:31848:015 2017.02.24 13:05:10.530 TRACE PTable ] Loading and Verification of PTable ‘e6bd86c6-98b0-448c-a6a9-ccc04935494e’ started…

[PID:31848:015 2017.02.24 13:06:57.938 TRACE PTable ] Loading PTable (Version: 2) ‘e6bd86c6-98b0-448c-a6a9-ccc04935494e’ (64014785 entries, cache depth 18) done in 00:01:47.4053950.

[PID:31848:015 2017.02.24 13:06:57.938 TRACE PTable ] Loading and Verification of PTable ‘5de6ee31-4c16-463e-bd01-8f688bb6d17b’ started…

[PID:31848:015 2017.02.24 13:10:27.300 TRACE PTable ] Loading PTable (Version: 2) ‘5de6ee31-4c16-463e-bd01-8f688bb6d17b’ (128029013 entries, cache depth 19) done in 00:03:29.3572783.

[PID:31848:015 2017.02.24 13:10:27.300 TRACE PTable ] Loading and Verification of PTable ‘d03afbe5-76c7-4799-8cfc-42bf1edd9ae6’ started…

[PID:31848:015 2017.02.24 13:19:19.271 TRACE PTable ] Loading PTable (Version: 2) ‘d03afbe5-76c7-4799-8cfc-42bf1edd9ae6’ (256060873 entries, cache depth 20) done in 00:08:51.9606324.

[PID:31848:015 2017.02.24 13:19:19.271 TRACE PTable ] Loading and Verification of PTable ‘31cafd23-862f-46ab-af36-0d20bf2dfb5a’ started…

[PID:31848:014 2017.02.24 13:33:51.643 TRACE GossipServiceBase ] CLUSTER HAS CHANGED (gossip received from [])

[PID:31848:014 2017.02.24 13:33:51.643 TRACE GossipServiceBase ] Old:

[PID:31848:014 2017.02.24 13:33:51.643 TRACE GossipServiceBase ] VND {7704a94a-590c-4dff-baf2-22d63ad9957f} [PreReplica, 10.0.38.15:3111, n/a, 10.0.38.15:3112, n/a, 10.0.38.15:3113, 10.0.38.15:3114] 648003075317/648003190784/648003190784/E667@643483538990:{c9acbca3-5ab9-4fce-95f7-289111714e15} | 2017-02-24 13:33:51.154

[PID:31848:014 2017.02.24 13:33:51.643 TRACE GossipServiceBase ] VND {2a2c23ea-1ddb-4ff3-a113-8f44e344cee4} [Initializing, 10.0.38.14:2111, 10.0.38.14:0, 10.0.38.14:2112, 10.0.38.14:0, 10.0.38.14:2113, 10.0.38.14:2114] -1/646745645846/646745645846/E667@643483538990:{c9acbca3-5ab9-4fce-95f7-289111714e15} | 2017-02-24 13:33:51.404

[PID:31848:014 2017.02.24 13:33:51.643 TRACE GossipServiceBase ] VND {5a3459c8-c065-45a0-9848-ff4b80582f91} [Master, 10.0.38.13:1111, n/a, 10.0.38.13:1112, n/a, 10.0.38.13:1113, 10.0.38.13:1114] 648003075317/648003190784/648003190784/E667@643483538990:{c9acbca3-5ab9-4fce-95f7-289111714e15} | 2017-02-24 13:03:51.625

[PID:31848:014 2017.02.24 13:33:51.643 TRACE GossipServiceBase ] New:

[PID:31848:014 2017.02.24 13:33:51.643 TRACE GossipServiceBase ] VND {7704a94a-590c-4dff-baf2-22d63ad9957f} [PreReplica, 10.0.38.15:3111, n/a, 10.0.38.15:3112, n/a, 10.0.38.15:3113, 10.0.38.15:3114] 648003075317/648003190784/648003190784/E667@643483538990:{c9acbca3-5ab9-4fce-95f7-289111714e15} | 2017-02-24 13:33:51.154

[PID:31848:014 2017.02.24 13:33:51.643 TRACE GossipServiceBase ] VND {2a2c23ea-1ddb-4ff3-a113-8f44e344cee4} [Initializing, 10.0.38.14:2111, 10.0.38.14:0, 10.0.38.14:2112, 10.0.38.14:0, 10.0.38.14:2113, 10.0.38.14:2114] -1/646745645846/646745645846/E667@643483538990:{c9acbca3-5ab9-4fce-95f7-289111714e15} | 2017-02-24 13:33:51.643

[PID:31848:014 2017.02.24 13:33:51.643 TRACE GossipServiceBase ] --------------------------------------------------------------------------------

[PID:31848:015 2017.02.24 13:39:19.826 TRACE PTable ] Loading PTable (Version: 2) ‘31cafd23-862f-46ab-af36-0d20bf2dfb5a’ (512071915 entries, cache depth 21) done in 00:20:00.5314237.

[PID:31848:015 2017.02.24 13:39:19.826 TRACE PTable ] Loading and Verification of PTable ‘7b44e437-8bda-479f-90e7-5350422b2b34’ started…

I am not sure this is "slow" it appears to be 8 minutes to load and
verify a 256m entry index and about 3.5 minutes for a 128m index file.
What are the file sizes on these (I imagine pretty large!). What type
of storage is this running on? Was this off of a cold boot where they
might not be in cache but normally are loaded from cache?

Hi Anders,
Just wanted to follow up regarding this and our suggested resolution to the issue. Have you managed to start rebuilding the indexes?

You can find some instructions on how to do that here

Thanks for the follow up Pieter - I am in the process of doing a scavenge on all the nodes, as some data has marked for retention. this takes quite a while though. Large database, and non SSD disks maybe… there are 3300 chunks and each chunk seems to take about 7 minuttes… It seems to be merge PTables at the same time… After this I will try to rebuild the index

Working on the same Cluster as Anders…

Projections are running behind and so are Scavenges it would seem:

10.0.38.15:3114
34ee8290-6a72-44e0-8a05-6792252bced7

2017-02-27 07:20
Not Finished

10.0.38.13:1114
101669e9-beb1-4a20-a68c-882b12248dbd

2017-02-27 07:17
Not Finished

10.0.38.14:2114
e36736a0-2b74-4564-9293-cacd98bd03b2

2017-02-24 17:13
Not Finished

10.0.38.15:3114
d57704f9-3931-40c4-bb1f-bef5f5fabcf7

2017-02-21 01:00
Not Finished

10.0.38.15:3114
801db6ec-101f-49c6-868e-a10a534629e5

2017-02-15 01:00
2017-02-20 06:06
Success
10.0.38.14:2114
6a0ae893-9feb-42ff-bdc4-c373c4e89f93

2017-02-20 01:00
Not Finished

10.0.38.13:1114
9581d3ee-b81e-4c65-ab65-cb6dcb6afe74

2017-02-20 01:00
Not Finished

10.0.38.13:1114
c4e94e9e-27cc-4403-a768-78b04d9e01d6

2017-02-15 01:00
2017-02-19 18:15
Success
10.0.38.14:2114
43e423f8-e73e-471e-a2db-d354a522f929

2017-02-17 01:00
Not Finished

10.0.38.14:2114
cbb92afc-f75a-4ce9-a960-352b19ee2ec2

2017-02-15 01:00
Not Finished

10.0.38.13:1114
00877c06-4662-4a4c-af4a-3241afa8a473

2017-02-12 01:00
Not Finished

10.0.38.15:3114
d98b7b31-4166-4dd1-832d-ace450d4a6af

2017-02-12 01:00
Not Finished

10.0.38.15:3114
9ccb7d40-c343-4d45-9ab1-ad95dacf0275

2017-02-08 01:00
2017-02-11 18:44
Success
10.0.38.13:1114
14198645-5573-4d07-a046-80737ddeeebb

2017-02-08 01:00
2017-02-11 10:30
Success
10.0.38.14:2114
694fe0d9-e04a-44c3-8013-e9524483b1cd

2017-02-11 01:00
Not Finished

Memory Mapped files of two indexes are taking up 11 gigabytes of RAM…

The cluster is running on VM’s with some unknown, not fast storage…

Is this a case of scavenging not being able to catch up with all the data we are feeding the cluster?

Solution might be to either move to SSD’s or simply have lower retention on some of the data?

if you click through you will see status of the scavenge, what is it?

Here’s one that is labelled as ‘Unfinished’:
It’s about half way through…

Scavenge 10.0.38.13:1114

9581d3ee-b81e-4c65-ab65-cb6dcb6afe74

  • Next Previous

Status
Space Saved (bytes)
Time Taken
Result
Scavenging chunks 1525 - 1525 complete
0
00:13:06.4590644
No chunks scavenged
Scavenging chunks 1524 - 1524 complete
0
00:13:00.6613432
No chunks scavenged
Scavenging chunks 1523 - 1523 complete
0
00:12:19.1160682
No chunks scavenged
Scavenging chunks 1522 - 1522 complete
0
00:12:49.5455594
No chunks scavenged
Scavenging chunks 1521 - 1521 complete
0
00:12:11.4693753
No chunks scavenged
Scavenging chunks 1520 - 1520 complete
0
00:12:34.6111482
No chunks scavenged
Scavenging chunks 1519 - 1519 complete
0
00:12:03.2272822
No chunks scavenged
Scavenging chunks 1518 - 1518 complete
0
00:12:05.6755757
No chunks scavenged
Scavenging chunks 1517 - 1517 complete
824,928
00:07:00.7840943
1 chunk(s) scavenged
Scavenging chunks 1516 - 1516 complete
37,064,366
00:03:41.5393536
1 chunk(s) scavenged

Category projection is starting and stopping all the time:

Name
Status
Mode
Done
Read / Write in Progress
Write Queues
Partitions Cached
Rate (events/s)
Events
Processed
Buffered
$by_category
Running/PausedRequested
Continuous
53.1%
0 / 1
1 / 0
1
3493.0
2

[PID:55460:008 2017.02.28 09:09:05.196 TRACE CoreProjectionCheckp] Writing checkpoint for $by_category at C:484777901271/P:484777901271 with expected version number 174633

[PID:55460:008 2017.02.28 09:09:04.913 TRACE CoreProjectionCheckp] Checkpoint has been written for projection $by_category at sequence number 174633 (current)

[PID:55460:008 2017.02.28 09:09:04.910 TRACE CoreProjectionCheckp] Writing checkpoint for $by_category at C:484776774100/P:484776774100 with expected version number 174632

[PID:55460:008 2017.02.28 09:09:04.664 TRACE CoreProjectionCheckp] Writing checkpoint for $by_category at C:484775650446/P:484775650446 with expected version number 174631

And a lot of these:

“Failed to write events to stream $ce-Turbine10MinuteDataTac1. Error: CommitTimeout”

There’s quite a bit going on here, some of the issues have been addressed in 4.0.0 and the one in particular regarding the projection starting and stopping is the reader merely backing off because it can’t process events fast enough.
The issue regarding the projection failing to write events due to timeouts has been addressed and the projection will be faulted after a certain number of retries.

Just to clarify, the issue is that the projection would infinitely retry writing those events that have failed due to the commit timeout.

After updating to 4-rc1 the scavenge process seems to have problems - it fails after a few chunks for all nodes with the same error:

Failed: Error while scavenging DB: The process cannot access the file ‘E:\EventStore\data-EventStoreNode1\e47925cb-16e6-4024-b46f-90b123ed95df.scavenge.tmp’ because it is being used by another process…

From the log:

[PID:35596:044 2017.03.01 17:21:24.350 TRACE TFChunkScavenger ] SCAVENGING: started to scavenge & merge chunks: #2-3 (chunk-000002.000003)

[PID:35596:044 2017.03.01 17:21:24.350 TRACE TFChunkScavenger ] Resulting temp chunk file: e47925cb-16e6-4024-b46f-90b123ed95df.scavenge.tmp.

[PID:35596:044 2017.03.01 17:22:11.878 INFO TFChunkScavenger ] Got exception while scavenging chunk: #2-3. This chunk will be skipped

Exception: System.Exception: Data sizes violation. Chunk: E:\EventStore\data-EventStoreNode1\e47925cb-16e6-4024-b46f-90b123ed95df.scavenge.tmp, IsScavenged: True, LogicalDataSize: 430, PhysicalDataSize: 434.

at EventStore.Core.TransactionLog.Chunks.TFChunk.TFChunk.TryAppend(LogRecord record, Boolean isUpgrade) in c:\projects\eventstore\src\EventStore.Core\TransactionLog\Chunks\TFChunk\TFChunk.cs:line 775

at EventStore.Core.TransactionLog.Chunks.TFChunkScavenger.WriteRecord(TFChunk newChunk, LogRecord record, Int32 lengthOffset, Boolean isUpgrade) in c:\projects\eventstore\src\EventStore.Core\TransactionLog\Chunks\TFChunkScavenger.cs:line 554

at EventStore.Core.TransactionLog.Chunks.TFChunkScavenger.<>c__DisplayClass12.b__3(PrepareLogRecord prepare) in c:\projects\eventstore\src\EventStore.Core\TransactionLog\Chunks\TFChunkScavenger.cs:line 203

at EventStore.Core.TransactionLog.Chunks.TFChunkScavenger.TraverseChunk(TFChunk chunk, Action1 processPrepare, Action1 processCommit, Action`1 processSystem) in c:\projects\eventstore\src\EventStore.Core\TransactionLog\Chunks\TFChunkScavenger.cs:line 505

at EventStore.Core.TransactionLog.Chunks.TFChunkScavenger.ScavengeChunks(Boolean alwaysKeepScavenged, IList`1 oldChunks, Int64& spaceSaved) in c:\projects\eventstore\src\EventStore.Core\TransactionLog\Chunks\TFChunkScavenger.cs:line 188.

[PID:35596:044 2017.03.01 17:22:11.878 ERROR TFChunkScavenger ] Failed to delete the temp chunk. Retrying 0/5. Reason: System.IO.IOException: The process cannot access the file ‘E:\EventStore\data-EventStoreNode1\e47925cb-16e6-4024-b46f-90b123ed95df.scavenge.tmp’ because it is being used by another process.

at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)

at System.IO.File.InternalDelete(String path, Boolean checkHost)

at EventStore.Core.TransactionLog.Chunks.TFChunkScavenger.DeleteTempChunk(String tmpChunkPath, Int32 retries) in c:\projects\eventstore\src\EventStore.Core\TransactionLog\Chunks\TFChunkScavenger.cs:line 296

[PID:35596:044 2017.03.01 17:22:11.878 ERROR TFChunkScavenger ] Failed to delete the temp chunk. Retrying 1/5. Reason: System.IO.IOException: The process cannot access the file ‘E:\EventStore\data-EventStoreNode1\e47925cb-16e6-4024-b46f-90b123ed95df.scavenge.tmp’ because it is being used by another process.

at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)

at System.IO.File.InternalDelete(String path, Boolean checkHost)

at EventStore.Core.TransactionLog.Chunks.TFChunkScavenger.DeleteTempChunk(String tmpChunkPath, Int32 retries) in c:\projects\eventstore\src\EventStore.Core\TransactionLog\Chunks\TFChunkScavenger.cs:line 296

[PID:35596:044 2017.03.01 17:22:11.878 ERROR TFChunkScavenger ] Failed to delete the temp chunk. Retrying 2/5. Reason: System.IO.IOException: The process cannot access the file ‘E:\EventStore\data-EventStoreNode1\e47925cb-16e6-4024-b46f-90b123ed95df.scavenge.tmp’ because it is being used by another process.

at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)

at System.IO.File.InternalDelete(String path, Boolean checkHost)

at EventStore.Core.TransactionLog.Chunks.TFChunkScavenger.DeleteTempChunk(String tmpChunkPath, Int32 retries) in c:\projects\eventstore\src\EventStore.Core\TransactionLog\Chunks\TFChunkScavenger.cs:line 296

[PID:35596:044 2017.03.01 17:22:11.878 ERROR TFChunkScavenger ] Failed to delete the temp chunk. Retrying 3/5. Reason: System.IO.IOException: The process cannot access the file ‘E:\EventStore\data-EventStoreNode1\e47925cb-16e6-4024-b46f-90b123ed95df.scavenge.tmp’ because it is being used by another process.

at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)

at System.IO.File.InternalDelete(String path, Boolean checkHost)

at EventStore.Core.TransactionLog.Chunks.TFChunkScavenger.DeleteTempChunk(String tmpChunkPath, Int32 retries) in c:\projects\eventstore\src\EventStore.Core\TransactionLog\Chunks\TFChunkScavenger.cs:line 296

[PID:35596:044 2017.03.01 17:22:11.878 ERROR TFChunkScavenger ] Failed to delete the temp chunk. Retrying 4/5. Reason: System.IO.IOException: The process cannot access the file ‘E:\EventStore\data-EventStoreNode1\e47925cb-16e6-4024-b46f-90b123ed95df.scavenge.tmp’ because it is being used by another process.

at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)

at System.IO.File.InternalDelete(String path, Boolean checkHost)

at EventStore.Core.TransactionLog.Chunks.TFChunkScavenger.DeleteTempChunk(String tmpChunkPath, Int32 retries) in c:\projects\eventstore\src\EventStore.Core\TransactionLog\Chunks\TFChunkScavenger.cs:line 296

[PID:35596:044 2017.03.01 17:22:11.878 ERROR TFChunkScavenger ] Failed to delete the temp chunk. Retry limit of 5 reached. Reason: System.IO.IOException: The process cannot access the file ‘E:\EventStore\data-EventStoreNode1\e47925cb-16e6-4024-b46f-90b123ed95df.scavenge.tmp’ because it is being used by another process.

at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)

at System.IO.File.InternalDelete(String path, Boolean checkHost)

at EventStore.Core.TransactionLog.Chunks.TFChunkScavenger.DeleteTempChunk(String tmpChunkPath, Int32 retries) in c:\projects\eventstore\src\EventStore.Core\TransactionLog\Chunks\TFChunkScavenger.cs:line 296

[PID:35596:044 2017.03.01 17:22:11.878 ERROR StorageScavenger ] SCAVENGING: error while scavenging DB.

System.IO.IOException: The process cannot access the file ‘E:\EventStore\data-EventStoreNode1\e47925cb-16e6-4024-b46f-90b123ed95df.scavenge.tmp’ because it is being used by another process.

at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)

at System.IO.File.InternalDelete(String path, Boolean checkHost)

at EventStore.Core.TransactionLog.Chunks.TFChunkScavenger.DeleteTempChunk(String tmpChunkPath, Int32 retries) in c:\projects\eventstore\src\EventStore.Core\TransactionLog\Chunks\TFChunkScavenger.cs:line 308

at EventStore.Core.TransactionLog.Chunks.TFChunkScavenger.DeleteTempChunk(String tmpChunkPath, Int32 retries) in c:\projects\eventstore\src\EventStore.Core\TransactionLog\Chunks\TFChunkScavenger.cs:line 303

at EventStore.Core.TransactionLog.Chunks.TFChunkScavenger.DeleteTempChunk(String tmpChunkPath, Int32 retries) in c:\projects\eventstore\src\EventStore.Core\TransactionLog\Chunks\TFChunkScavenger.cs:line 303

at EventStore.Core.TransactionLog.Chunks.TFChunkScavenger.DeleteTempChunk(String tmpChunkPath, Int32 retries) in c:\projects\eventstore\src\EventStore.Core\TransactionLog\Chunks\TFChunkScavenger.cs:line 303

at EventStore.Core.TransactionLog.Chunks.TFChunkScavenger.DeleteTempChunk(String tmpChunkPath, Int32 retries) in c:\projects\eventstore\src\EventStore.Core\TransactionLog\Chunks\TFChunkScavenger.cs:line 303

at EventStore.Core.TransactionLog.Chunks.TFChunkScavenger.DeleteTempChunk(String tmpChunkPath, Int32 retries) in c:\projects\eventstore\src\EventStore.Core\TransactionLog\Chunks\TFChunkScavenger.cs:line 303

at EventStore.Core.TransactionLog.Chunks.TFChunkScavenger.ScavengeChunks(Boolean alwaysKeepScavenged, IList`1 oldChunks, Int64& spaceSaved) in c:\projects\eventstore\src\EventStore.Core\TransactionLog\Chunks\TFChunkScavenger.cs:line 286

at EventStore.Core.TransactionLog.Chunks.TFChunkScavenger.Scavenge(Boolean alwaysKeepScavenged, Boolean mergeChunks) in c:\projects\eventstore\src\EventStore.Core\TransactionLog\Chunks\TFChunkScavenger.cs:line 74

at EventStore.Core.Services.Storage.StorageScavenger.Scavenge(ScavengeDatabase message) in c:\projects\eventstore\src\EventStore.Core\Services\Storage\StorageScavenger.cs:line 99

Scavenge
Started
Completed
Result
10.0.38.13:1114
d19009a8-e2c0-4f6a-8589-3036766fbb43

2017-03-01 18:17
2017-03-01 18:22
Failed
10.0.38.15:3114
2ce74ef7-f5f2-4628-9137-5bbb5a524a10

2017-03-01 17:49
2017-03-01 18:13
Failed
10.0.38.14:2114
3c0c54f1-3ee8-4f3c-8269-56b5fd9b216e

2017-03-01 17:49
2017-03-01 18:12
Failed
10.0.38.13:1114
929dfd2c-8f91-4471-9432-68d1282c6a21

2017-03-01 17:50
2017-03-01 18:11
Failed

Hi Anders, thanks for raising this issue. This is being looked at currently.

You can expect a patch tomorrow. Apologies for the inconvenience.

That’s great Pieter.

With 4-rc1 we are still seing a lot of CommitTimeouts when running the category projection:

March 2nd 2017, 09:03:38.020

message:[PID:57704:032 2017.03.02 08:03:38.020 INFO CoreProjectionCheckp] Failed to write events to stream $ce-Turbine10MinuteDataTac2. Error: CommitTimeout @version:1 @timestamp:March 2nd 2017, 09:03:38.020 beat.hostname:dkcdcesd03 beat.name:dkcdcesd03 count:1 fields: - input_type:log offset:51,500,432 source:E:\EventStore\logs\log-EventStoreNode3\2017-03-02\10.0.38.15-3114-cluster-node.log type:EventStore log host:dkcdcesd03 tags:beats_input_codec_plain_applied _id:AVqOC8Ukl0kY62kKhShD _type:EventStore log _index:eventstore-logs-2017.03.02 _score:

 March 2nd 2017, 09:03:35.018

message:[PID:57704:032 2017.03.02 08:03:35.018 INFO CoreProjectionCheckp] Failed to write projection checkpoint to stream $projections-$by_category-checkpoint. Error: CommitTimeout @version:1 @timestamp:March 2nd 2017, 09:03:35.018 beat.hostname:dkcdcesd03 beat.name:dkcdcesd03 count:1 fields: - input_type:log offset:51,499,147 source:E:\EventStore\logs\log-EventStoreNode3\2017-03-02\10.0.38.15-3114-cluster-node.log type:EventStore log host:dkcdcesd03 tags:beats_input_codec_plain_applied _id:AVqOC8Qsl0kY62kKhSg8 _type:EventStore log _index:eventstore-logs-2017.03.02 _score:

 March 2nd 2017, 09:03:30.006

message:[PID:57704:032 2017.03.02 08:03:30.006 INFO CoreProjectionCheckp] Failed to write events to stream $ce-Turbine10MinuteDataTac1. Error: CommitTimeout @version:1 @timestamp:March 2nd 2017, 09:03:30.006 beat.hostname:dkcdcesd03 beat.name:dkcdcesd03 count:1 fields: - input_type:log offset:51,497,055 source:E:\EventStore\logs\log-EventStoreNode3\2017-03-02\10.0.38.15-3114-cluster-node.log type:EventStore log host:dkcdcesd03 tags:beats_input_codec_plain_applied _id:AVqOC6lal0kY62kKhSfZ _type:EventStore log _index:eventstore-logs-2017.03.02 _score: