Hi there,
I’m doing some perf testing to measure writing events to EventStore, and I’m running into a curious behavior.
I’m writing 10K batches of 10 events to each of 10 different streams (100K events per stream, 1M in total), using the AppendToStreamAsync method in the TCP API.
I’m seeing errors like this one in the EventStore console output:
[PID:06776:017 2015.05.07 20:13:21.601 INFO CoreProjectionCheckp] Failed to write projection checkpoint to stream $$$projections-$users-checkpoint. Error: CommitTimeout
When I read the data back in, the streams have more events on them than I wrote to them (I’m writing 100K events to each stream, but I’m seeing ~150K events, the number varies). Some events are being duplicated.
Can you help me understand this behavior?
Thanks,
Anne
I’m not saying you can’t run projections, but for perf testing they cause write amplification.
Makes sense.
I’m still seeing duplicate events when I turn off projections. Let me know how to address this.
Thanks,
Anne
Can you show the code please?
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Net;
using System.Text;
using System.Threading.Tasks;
using EventStore.ClientAPI;

// The class declaration was not included in the original paste;
// "WritePerfTest" is a placeholder name.
public static class WritePerfTest
{
    public static void Run()
    {
        int numConnections = 10;
        int numBatches = 10000;
        int numEventsPerBatch = 10;
        int numEvents = numBatches * numEventsPerBatch;
        Console.WriteLine("Connections: " + numConnections + ", Events per connection: " + numEvents);
        Console.WriteLine("Event batches: " + numBatches + ", events per batch: " + numEventsPerBatch);
        Console.WriteLine("Total events: " + (numConnections * numEvents));

        // One connection per stream, all against the local node.
        IEventStoreConnection[] connections = new IEventStoreConnection[numConnections];
        for (int i = 0; i < numConnections; i++)
        {
            connections[i] = EventStoreConnection.Create(/*settings,*/ new IPEndPoint(IPAddress.Loopback, 1113));
            connections[i].ConnectAsync().Wait();
        }

        // The same batches (and therefore the same event IDs) are reused for every stream.
        List<EventData[]> eventBatches = createBatches(numBatches, numEventsPerBatch);

        Stopwatch s = new Stopwatch();
        s.Start();
        Parallel.For(0, numConnections,
            i =>
            {
                appendBatches(connections[i], "test-stream-" + i.ToString(), eventBatches);
            });
        s.Stop();

        Console.WriteLine("Elapsed time: " + s.ElapsedMilliseconds + " ms");
        Console.WriteLine("Rate: " + ((numEvents * numConnections) / s.Elapsed.TotalSeconds) + " msg/sec");
        Console.ReadLine();
    }

    // Builds numBatches batches; each event's type and payload are its sequence number.
    private static List<EventData[]> createBatches(int numBatches, int eventsPerBatch)
    {
        List<EventData[]> eventBatches = new List<EventData[]>(numBatches);
        for (int i = 0; i < numBatches; i++)
        {
            EventData[] batch = new EventData[eventsPerBatch];
            for (int j = 0; j < eventsPerBatch; j++)
            {
                int eventNum = (i * eventsPerBatch) + j;
                batch[j] = new EventData(Guid.NewGuid(), eventNum.ToString(), false, Encoding.UTF8.GetBytes(eventNum.ToString()), null);
            }
            eventBatches.Add(batch);
        }
        return eventBatches;
    }

    // Queues every batch without awaiting, then blocks until all appends complete.
    private static void appendBatches(IEventStoreConnection connection, string streamName, List<EventData[]> eventBatches)
    {
        List<Task> tasks = new List<Task>(eventBatches.Count);
        for (int i = 0; i < eventBatches.Count; i++)
        {
            tasks.Add(connection.AppendToStreamAsync(streamName, ExpectedVersion.Any, eventBatches[i]));
        }
        Task.WaitAll(tasks.ToArray());
    }
}
You mention that you get more events than you wrote. How are you measuring that?
Hi Greg,
Two ways: through the browser, and using the ReadStreamEventsForwardAsync TCP API call. Through the browser, I know I wrote 100K events per stream, so I’m able to glance at the number of messages written to a stream and see that there are more there. Reading them back in code, I get the same number of events displayed in the browser, and I’m able to programmatically check for duplicates.
Anne
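For anyone wanting to reproduce the programmatic duplicate check, here is a minimal sketch of one way it could be done. It is not the code used in the test above. It assumes the events of a single stream have already been read back and their payloads decoded to strings; since each payload in the test is the event's sequence number, any payload that appears more than once within one stream indicates a duplicated write. The names DuplicateCheck and FindDuplicates are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateCheck
{
    // Returns every payload that occurs more than once, mapped to its occurrence count.
    // Assumption: payloads are the decoded event bodies of ONE stream, where each
    // body is expected to be unique (the event's sequence number as a string).
    public static Dictionary<string, int> FindDuplicates(IEnumerable<string> payloads)
    {
        return payloads
            .GroupBy(p => p)
            .Where(g => g.Count() > 1)
            .ToDictionary(g => g.Key, g => g.Count());
    }

    public static void Main()
    {
        // Stand-in for payloads read back from a stream; "1" and "3" are repeated.
        var readBack = new List<string> { "0", "1", "1", "2", "3", "3", "3" };

        foreach (var pair in FindDuplicates(readBack))
        {
            Console.WriteLine("Event " + pair.Key + " appears " + pair.Value + " times");
        }
    }
}
```

Grouping by payload rather than by event ID is a deliberate choice for this test, because the payload is what identifies the logical event that was supposed to be written exactly once per stream.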