Hello,
We are seeing a pretty significant delay (sometime up to 14 seconds) in the events delivery for the persistent subscriptions even for a not a big load (~200 events/s).
We are running EventStore in the clustered mode (3 instances) in the Kubernetes (on-prem and AKS), subscribing to the categorized system projection and process each event concurrently. Delay is calculated by the subscriber as a time difference between Created timestamp in the event and the current date – without any load this delay is around 60-100 ms, but if there are a lot of events have been processed prior the load test (600k+) and we start generating ~200 events/s to the subscriber, delay increases exponentially. Event Store UI shows rate ~2000 events/s for the category projection
During the test event store log is full of the following warnings:
[00001,34,14:37:59.984] SLOW QUEUE MSG ["Projection Core #0"]: "UnwrapEnvelopeMessage" - 70ms. Q: 0/252. [00001,135,14:38:00.222] SLOW QUEUE MSG ["StorageReaderQueue #3"]: "ReadStreamEventsForward" - 303ms. Q: 10/41. [00001,34,14:38:00.503] SLOW BUS MSG ["bus"]: "UnwrapEnvelopeMessage" - 72ms. Handler: "UnwrapEnvelopeHandler". [00001,34,14:38:00.503] SLOW QUEUE MSG ["Projection Core #0"]: "UnwrapEnvelopeMessage" - 72ms. Q: 0/253.
Could you please suggest what we might be doing wrong ?
Also do you have any best practices for the server configuration (number of threads, timeouts etc)?
Best regards,
Kirill
Current server configuration:
[00014,01,13:15:24.738] “OS:” Linux (Unix 4.15.0.1077)
[00014,01,13:15:24.762] “RUNTIME:” “5.16.0.220 (tarball Mon Nov 26 17:22:46 UTC 2018)” (64-bit)
[00014,01,13:15:24.764] “GC:” “2 GENERATIONS”
[00014,01,13:15:24.764] “LOGS:” “/var/log/eventstore”
MODIFIED OPTIONS:START STANDARD PROJECTIONS: true (Environment Variable) INT IP: 10.244.0.96 (Environment Variable) EXT IP: 10.244.0.96 (Environment Variable) INT HTTP PORT: 2114 (Environment Variable) EXT HTTP PORT: 2113 (Environment Variable) INT TCP PORT: 1112 (Environment Variable) EXT TCP PORT: 1113 (Environment Variable) INT TCP HEARTBEAT TIMEOUT: 10000 (Environment Variable) EXT TCP HEARTBEAT TIMEOUT: 10000 (Environment Variable) INT TCP HEARTBEAT INTERVAL: 5000 (Environment Variable) EXT TCP HEARTBEAT INTERVAL: 5000 (Environment Variable) CLUSTER SIZE: 1 (Environment Variable) CLUSTER DNS: eventstore.default.svc.cluster.local (Environment Variable) CLUSTER GOSSIP PORT: 2114 (Environment Variable) RUN PROJECTIONS: All (Environment Variable) INT HTTP PREFIXES: http://*:2114/ (Environment Variable) EXT HTTP PREFIXES: http://*:2113/ (Environment Variable) GOSSIP ALLOWED DIFFERENCE MS: 600000 (Environment Variable) ADD INTERFACE PREFIXES: false (Config File)
DEFAULT OPTIONS:
CONFIG: /etc/eventstore/eventstore.conf (<DEFAULT>) HELP: False (<DEFAULT>) VERSION: False (<DEFAULT>) LOG: /var/log/eventstore (<DEFAULT>) DEFINES: <empty> (<DEFAULT>) WHAT IF: False (<DEFAULT>) DISABLE HTTP CACHING: False (<DEFAULT>) MONO MIN THREADPOOL SIZE: 10 (<DEFAULT>) INT SECURE TCP PORT: 0 (<DEFAULT>) EXT SECURE TCP PORT ADVERTISE AS: 0 (<DEFAULT>) EXT SECURE TCP PORT: 0 (<DEFAULT>) EXT IP ADVERTISE AS: <empty> (<DEFAULT>) EXT TCP PORT ADVERTISE AS: 0 (<DEFAULT>) EXT HTTP PORT ADVERTISE AS: 0 (<DEFAULT>) INT IP ADVERTISE AS: <empty> (<DEFAULT>) INT SECURE TCP PORT ADVERTISE AS: 0 (<DEFAULT>) INT TCP PORT ADVERTISE AS: 0 (<DEFAULT>) INT HTTP PORT ADVERTISE AS: 0 (<DEFAULT>) GOSSIP ON SINGLE NODE: False (<DEFAULT>) CONNECTION PENDING SEND BYTES THRESHOLD: 10485760 (<DEFAULT>) CONNECTION QUEUE SIZE THRESHOLD: 50000 (<DEFAULT>) FORCE: False (<DEFAULT>) NODE PRIORITY: 0 (<DEFAULT>) MIN FLUSH DELAY MS: 2 (<DEFAULT>) COMMIT COUNT: -1 (<DEFAULT>) PREPARE COUNT: -1 (<DEFAULT>) ADMIN ON EXT: True (<DEFAULT>) STATS ON EXT: True (<DEFAULT>) GOSSIP ON EXT: True (<DEFAULT>) DISABLE SCAVENGE MERGING: False (<DEFAULT>) SCAVENGE HISTORY MAX AGE: 30 (<DEFAULT>) DISCOVER VIA DNS: True (<DEFAULT>) GOSSIP SEED: <empty> (<DEFAULT>) STATS PERIOD SEC: 30 (<DEFAULT>) CACHED CHUNKS: -1 (<DEFAULT>) READER THREADS COUNT: 4 (<DEFAULT>) CHUNKS CACHE SIZE: 536871424 (<DEFAULT>) MAX MEM TABLE SIZE: 1000000 (<DEFAULT>) HASH COLLISION READ LIMIT: 100 (<DEFAULT>) DB: /var/lib/eventstore (<DEFAULT>) INDEX: <empty> (<DEFAULT>) MEM DB: False (<DEFAULT>) SKIP DB VERIFY: False (<DEFAULT>) WRITE THROUGH: False (<DEFAULT>) UNBUFFERED: False (<DEFAULT>) CHUNK INITIAL READER COUNT: 5 (<DEFAULT>) PROJECTION THREADS: 3 (<DEFAULT>) WORKER THREADS: 5 (<DEFAULT>) PROJECTIONS QUERY EXPIRY: 5 (<DEFAULT>) FAULT OUT OF ORDER PROJECTIONS: False (<DEFAULT>) ENABLE TRUSTED AUTH: False (<DEFAULT>) CERTIFICATE STORE LOCATION: <empty> (<DEFAULT>) CERTIFICATE STORE NAME: <empty> (<DEFAULT>) CERTIFICATE SUBJECT NAME: <empty> (<DEFAULT>) CERTIFICATE THUMBPRINT: <empty> (<DEFAULT>) CERTIFICATE FILE: <empty> (<DEFAULT>) CERTIFICATE PASSWORD: <empty> (<DEFAULT>) USE INTERNAL SSL: False (<DEFAULT>) DISABLE INSECURE TCP: False (<DEFAULT>) SSL TARGET HOST: n/a (<DEFAULT>) SSL VALIDATE SERVER: True (<DEFAULT>) AUTHENTICATION TYPE: internal (<DEFAULT>) AUTHENTICATION CONFIG: <empty> (<DEFAULT>) DISABLE FIRST LEVEL HTTP AUTHORIZATION: False (<DEFAULT>) PREPARE TIMEOUT MS: 2000 (<DEFAULT>) COMMIT TIMEOUT MS: 2000 (<DEFAULT>) UNSAFE DISABLE FLUSH TO DISK: False (<DEFAULT>) BETTER ORDERING: False (<DEFAULT>) UNSAFE IGNORE HARD DELETE: False (<DEFAULT>) SKIP INDEX VERIFY: False (<DEFAULT>) INDEX CACHE DEPTH: 16 (<DEFAULT>) OPTIMIZE INDEX MERGE: False (<DEFAULT>) GOSSIP INTERVAL MS: 1000 (<DEFAULT>) GOSSIP TIMEOUT MS: 500 (<DEFAULT>) ENABLE HISTOGRAMS: False (<DEFAULT>) LOG HTTP REQUESTS: False (<DEFAULT>) LOG FAILED AUTHENTICATION ATTEMPTS: False (<DEFAULT>) ALWAYS KEEP SCAVENGED: False (<DEFAULT>) SKIP INDEX SCAN ON READS: False (<DEFAULT>) REDUCE FILE CACHE PRESSURE: False (<DEFAULT>) INITIALIZATION THREADS: 1 (<DEFAULT>) STRUCTURED LOG: True (<DEFAULT>) MAX AUTO MERGE INDEX LEVEL: 2147483647 (<DEFAULT>) WRITE STATS TO DB: True (<DEFAULT>)