DeadlineExceeded on AppendToStream

Hi,

We are trying to migrate one of our production services from MSSQL to ESDB by running an import that uses AppendToStreamAsync. However, every 100 events or so it stops working with a DeadlineExceeded exception. This is what we’re seeing in the client log:

[16:00:18 INF] Processing PageAddedEventV1(4d2e3b7c-5d99-4ced-af84-44dfa4dc27f7) [03511016]...
[16:00:18 DBG] Starting gRPC call. Method type: 'ServerStreaming', URI: 'https://esdb3.joure1.net:2113/event_store.client.streams.Streams/Read'.
[16:00:18 DBG] Sending message.
[16:00:18 VRB] Serialized 'EventStore.Client.Streams.ReadReq' to 80 byte message.
[16:00:18 VRB] Message sent.
[16:00:18 VRB] Response headers received.
[16:00:18 DBG] Reading message.
[16:00:18 VRB] Deserializing 56 byte message to 'EventStore.Client.Streams.ReadResp'.
[16:00:18 VRB] Received message.
[16:00:18 DBG] Reading message.
[16:00:18 INF] Appending PageAddedEventV1(4d2e3b7c-5d99-4ced-af84-44dfa4dc27f7) to stream pageaggregate-b11dc7e0-bf6d-5bf1-8c03-f218e803ea8a...
[16:00:18 VRB] No message returned.
[16:00:18 DBG] Append to stream - pageaggregate-b11dc7e0-bf6d-5bf1-8c03-f218e803ea8a@Any.
[16:00:18 DBG] Finished gRPC call.
[16:00:18 DBG] Sending message.
[16:00:18 VRB] Serialized 'EventStore.Client.Streams.BatchAppendReq' to 2148 byte message.
[16:00:18 VRB] Message sent.
[16:00:21 VRB] Deserializing 143 byte message to 'EventStore.Client.Streams.BatchAppendResp'.
[16:00:21 VRB] Received message.
[16:00:21 DBG] Reading message.
[16:00:51 ERR] Processing failed
System.AggregateException: One or more errors occurred. (Status(StatusCode="DeadlineExceeded", Detail="Timeout"))
 ---> Grpc.Core.RpcException: Status(StatusCode="DeadlineExceeded", Detail="Timeout")
   at EventStore.Client.Streams.BatchAppendResp.ToWriteResult()
   at EventStore.Client.EventStoreClient.StreamAppender.Receive()
   at EventStore.Client.EventStoreClient.StreamAppender.AppendInternal(Options options, IEnumerable`1 events, CancellationToken cancellationToken)
   at EventStore.Client.EventStoreClient.AppendToStreamAsync(String streamName, StreamState expectedState, IEnumerable`1 eventData, Action`1 configureOperationOptions, Nullable`1 deadline, UserCredentials userCredentials, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
   at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
   at System.Threading.Tasks.Task.Wait()
   at Program.<Main>$(String[] args) in ...\Program.cs:line 175

We’ve checked the flow for every imported event, but there are no significant differences between them. It just stops after reading the BatchAppendResp.

Our code has already been tested in a staging environment, where we were able to import over 900K events without any exceptions.

Program.cs

try
{
    //***** Skip events that have already been imported into the stream;
    var events = client.ReadStreamAsync(Direction.Forwards, streamName, StreamPosition.Start);
    if (events.AnyAsync(a => a.OriginalEvent.EventId == eventId).Result)
    {
        logger.LogWarning($"Skipping {messageName}({messageId}) [{sequenceId:00000000}] as it already has been appended...");
        continue;
    }
}
catch
{
    //***** Do nothing; the read may fail (e.g. the stream does not exist yet) and the append is attempted regardless;
}

//***** Append the event to the stream, blocking on the async call;
var eventData = new EventData[] { new(eventId, messageName, data, headers, mediaType) };
logger.LogInformation($"Appending {messageName}({messageId}) to stream {streamName}...");
client.AppendToStreamAsync(streamName, StreamState.Any, eventData).Wait();
logger.LogInformation($"{messageName}({messageId}) appended to stream {streamName}");
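
(Side note: the AppendToStreamAsync overload visible in the stack trace above also accepts an explicit deadline, so the call in this loop could pass a longer one per call. A minimal sketch; the 30-second value and the switch to await instead of .Wait() are purely illustrative:)

//***** Same append as above, but with an explicit per-call gRPC deadline (30 s is an arbitrary example);
await client.AppendToStreamAsync(
    streamName,
    StreamState.Any,
    eventData,
    deadline: TimeSpan.FromSeconds(30));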

We are running ESDB 21.10.1.0 with client version 22.0.0.

Any thoughts?

Cheers,

Peter

We’ve just noticed a difference in server version between staging and production: staging was running 21.10.2.0 and production 21.10.1.0. We updated production to 21.10.2.0 and at first it seemed to work like a charm, but after 11K events or so a deadline exception was thrown again. The problem is that the client connection is lost after that, and since the client is a singleton, the service has to be restarted. Is there another approach that would allow more resilient client behaviour? Is a singleton required, or could the connection be created per request?
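
(For reference, the deadline can also be raised at the client level rather than per call. Below is a minimal sketch of building the client from settings via a small factory, so a fresh instance could be recreated after a fatal failure instead of living as a process-wide singleton. The host is the one from the log above; the DefaultDeadline property is an assumption about this client version:)

using EventStore.Client;

//***** Hypothetical factory: build the client from settings so a fresh instance can be created when needed;
static EventStoreClient CreateClient()
{
    var settings = EventStoreClientSettings.Create("esdb://esdb3.joure1.net:2113?tls=true");

    //***** Assumption: DefaultDeadline exists in this client version and raises the default gRPC deadline for all operations;
    settings.DefaultDeadline = TimeSpan.FromSeconds(60);

    return new EventStoreClient(settings);
}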

We wrote a wrapper around AppendToStreamAsync, since it seems DeadlineExceeded can be treated as a transient exception. This is the wrapper code; perhaps it can be helpful for someone:

/// <summary>
/// Helper methods for the <see cref="EventStoreClient" />.
/// </summary>
public static class EventStoreClientHelper
{
    /// <summary>
    /// Try to append event data to a stream within a certain amount of retries in case of deadline exceptions returning failure if exceeded.
    /// </summary>
    /// <param name="client">The client to append with.</param>
    /// <param name="streamName">The name of the stream to append to.</param>
    /// <param name="eventData">The event data to append.</param>
    /// <param name="retries">The maximum number of attempts in case of deadline exceptions.</param>
    /// <returns>A success result containing the next expected stream revision, or a failure result.</returns>
    public static OperationResult AppendToStream(this EventStoreClient client, string streamName, EventData[] eventData, int retries)
    {
        var attempt = 1;
        while (attempt <= retries)
        {
            //***** Attempt the append and return the next expected revision on success;
            try
            {
                return OperationResult.CreateSuccessResult(client.AppendToStreamAsync(streamName, StreamState.Any, eventData).Result.NextExpectedStreamRevision);
            }
            catch (AggregateException aex)
            {
                //***** Retry on DeadlineExceeded;
                if (aex.InnerException is not RpcException { StatusCode: StatusCode.DeadlineExceeded })
                    return OperationResult.CreateFailure(aex.InnerException ?? aex);
            }
            catch (RpcException rpcex)
            {
                //***** Retry on DeadlineExceeded;
                if (rpcex.StatusCode != StatusCode.DeadlineExceeded)
                    return OperationResult.CreateFailure(rpcex);
            }
            catch (Exception ex)
            {
                return OperationResult.CreateFailure(ex);
            }

            //***** DeadlineExceeded; retry;
            attempt++;
        }

        //***** All retries exhausted;
        return OperationResult.CreateFailure("Attempts failed");
    }

    /// <summary>
    /// Try to append event data to a stream within a certain amount of retries in case of deadline exceptions returning failure if exceeded.
    /// </summary>
    /// <param name="client">The client to append with.</param>
    /// <param name="streamName">The name of the stream to append to.</param>
    /// <param name="expectedRevision">The expected revision of the stream.</param>
    /// <param name="eventData">The event data to append.</param>
    /// <param name="retries">The maximum number of attempts in case of deadline exceptions.</param>
    /// <returns>A success result containing the next expected stream revision, or a failure result.</returns>
    public static OperationResult AppendToStream(this EventStoreClient client, string streamName, StreamRevision expectedRevision, EventData[] eventData, int retries)
    {
        var attempt = 1;
        while (attempt <= retries)
        {
            //***** Attempt the append and return the next expected revision on success;
            try
            {
                return OperationResult.CreateSuccessResult(client.AppendToStreamAsync(streamName, expectedRevision, eventData).Result.NextExpectedStreamRevision);
            }
            catch (AggregateException aex)
            {
                //***** Retry on DeadlineExceeded;
                if (aex.InnerException is not RpcException { StatusCode: StatusCode.DeadlineExceeded })
                    return OperationResult.CreateFailure(aex.InnerException ?? aex);
            }
            catch (RpcException rpcex)
            {
                //***** Retry on DeadlineExceeded;
                if (rpcex.StatusCode != StatusCode.DeadlineExceeded)
                    return OperationResult.CreateFailure(rpcex);
            }
            catch (Exception ex)
            {
                return OperationResult.CreateFailure(ex);
            }

            //***** DeadlineExceeded; retry;
            attempt++;
        }

        //***** All retries exhausted;
        return OperationResult.CreateFailure("Attempts failed");
    }
}
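
Usage from the import loop looks roughly like this. OperationResult is not shown here, so the Success and Error members below are only illustrative placeholders:

//***** Retry the append up to 5 times on DeadlineExceeded before giving up;
var result = client.AppendToStream(streamName, eventData, retries: 5);

//***** Hypothetical members; adjust to however your OperationResult reports the outcome;
if (!result.Success)
    logger.LogError("Append to {StreamName} failed: {Error}", streamName, result.Error);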

How many events are in the batch? Have you tried to increase the deadline timeout?

Hi Alexey, single-event batches. We changed the deadline to 1 minute. The strange thing is that it only happens with 21.10.1.0; 21.10.2.0 seems to work without noticeable issues (using the wrapper as well). Also, the maximum number of attempts was exceeded at some point, so the wrapper is of limited use.