Incomparable multi-stream checkpoint tags after upgrading to 4.1.0

Hi Folks

After upgrading a 3-node cluster from 4.0.3 to 4.1.0 on Ubuntu 14.04.5 LTS (approximate disk usage 519G), we started receiving the error below on one of our projections. Before the upgrade, this projection was at 100% and at the end of the 3rd subscribed stream. Is this a bug, and if so, are there any potential workarounds?

Incomparable multi-stream checkpoint tags. '$ce-RatingRule: 9217; $ce-RuleResults2b: 231657210; $ce-RuleResults3a: 186523; ' and '$ce-RatingRule: 9203; $ce-RuleResults2b: 231657210; $ce-RuleResults3a: 10265619; '

at EventStore.Projections.Core.Services.Processing.CheckpointTag.ThrowIncomparable (EventStore.Projections.Core.Services.Processing.CheckpointTag left, EventStore.Projections.Core.Services.Processing.CheckpointTag right) [0x0000c] in :0
at EventStore.Projections.Core.Services.Processing.CheckpointTag.op_GreaterThan (EventStore.Projections.Core.Services.Processing.CheckpointTag left, EventStore.Projections.Core.Services.Processing.CheckpointTag right) [0x002af] in :0
at EventStore.Projections.Core.Services.Processing.EmittedStream.SubmitWriteEventsInRecovery () [0x00019] in :0
at EventStore.Projections.Core.Services.Processing.EmittedStream.ProcessWrites () [0x00061] in :0
at EventStore.Projections.Core.Services.Processing.EmittedStream.EmitEvents (EventStore.Projections.Core.Services.Processing.EmittedEvent[] events) [0x00124] in :0
at EventStore.Projections.Core.Services.Processing.ProjectionCheckpoint.EmitEventsToStream (System.String streamId, EventStore.Projections.Core.Services.Processing.EmittedEventEnvelope[] emittedEvents) [0x00143] in :0
at EventStore.Projections.Core.Services.Processing.ProjectionCheckpoint.ValidateOrderAndEmitEvents (EventStore.Projections.Core.Services.Processing.EmittedEventEnvelope[] events) [0x00051] in :0
at EventStore.Projections.Core.Services.Processing.CoreProjectionCheckpointManager.EventsEmitted (EventStore.Projections.Core.Services.Processing.EmittedEventEnvelope[] scheduledWrites, System.Guid causedBy, System.String correlationId) [0x0007e] in :0
at EventStore.Projections.Core.Services.Processing.ResultWriter.EventsEmitted (EventStore.Projections.Core.Services.Processing.EmittedEventEnvelope[] scheduledWrites, System.Guid causedBy, System.String correlationId) [0x00000] in :0
at EventStore.Projections.Core.Services.Processing.EventSubscriptionBasedProjectionProcessingPhase.FinalizeEventProcessing (EventStore.Projections.Core.Services.Processing.EventProcessedResult result, EventStore.Projections.Core.Services.Processing.CheckpointTag eventCheckpointTag, System.Single progress) [0x00051] in :0
at EventStore.Projections.Core.Services.Processing.CommittedEventWorkItem.WriteOutput () [0x00035] in :0
at EventStore.Projections.Core.Services.Processing.WorkItem.Process (System.Int32 onStage, System.Action`2[T1,T2] readyForStage) [0x000a6] in :0
at EventStore.Projections.Core.Services.Processing.StagedProcessingQueue.ProcessEntry (EventStore.Projections.Core.Services.Processing.StagedProcessingQueue+TaskEntry entry) [0x00037] in :0
at EventStore.Projections.Core.Services.Processing.StagedProcessingQueue.Process (System.Int32 max) [0x00027] in :0
at EventStore.Projections.Core.Services.Processing.CoreProjectionQueue.ProcessOneEventBatch () [0x0001c] in :0
at EventStore.Projections.Core.Services.Processing.CoreProjectionQueue.ProcessEvent () [0x00013] in :0
at EventStore.Projections.Core.Services.Processing.EventSubscriptionBasedProjectionProcessingPhase.ProcessEvent () [0x00000] in :0
at EventStore.Projections.Core.Services.Processing.CoreProjection.Tick () [0x00032] in :0


Cheers,
Brett

Hi Brett,

When a projection starts up, it reads all the events from just after its last checkpoint up to the point where it last emitted.

While doing this, it compares the metadata of the event that is physically there with the metadata of the event the projection would have emitted.

This error means that there is a mismatch between the metadata on these events.
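
To make "incomparable" concrete for the two tags in your error, here is a rough Python sketch of comparing multi-stream positions stream by stream. It is an illustration only, not Event Store's actual code; `compare_tags` and `IncomparableTags` are made-up names, and the real comparison may differ in detail. Two tags can only be ordered if one is at or ahead of the other on every stream.

```python
from typing import Dict


class IncomparableTags(Exception):
    """Raised when neither tag is at-or-ahead of the other on every stream."""


def compare_tags(left: Dict[str, int], right: Dict[str, int]) -> int:
    """Return 1 if left is ahead, -1 if behind, 0 if equal.

    Hypothetical helper: raises IncomparableTags when the streams
    disagree about which tag is ahead.
    """
    if left.keys() != right.keys():
        raise IncomparableTags("tags cover different streams")
    ahead = [s for s in left if left[s] > right[s]]
    behind = [s for s in left if left[s] < right[s]]
    if ahead and behind:
        raise IncomparableTags(f"ahead on {ahead}, behind on {behind}")
    if ahead:
        return 1
    if behind:
        return -1
    return 0


# The two tags from the error message above.
left = {"$ce-RatingRule": 9217, "$ce-RuleResults2b": 231657210, "$ce-RuleResults3a": 186523}
right = {"$ce-RatingRule": 9203, "$ce-RuleResults2b": 231657210, "$ce-RuleResults3a": 10265619}
```

With these values, the first tag is ahead on $ce-RatingRule (9217 vs 9203) but behind on $ce-RuleResults3a (186523 vs 10265619), so `compare_tags(left, right)` raises instead of returning an ordering.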

There are two steps you can take to try to fix this issue:

  1. Restart the master node, which forces the projections to restart. This will only work if the issue is some form of inconsistent state within the projections subsystem.

  2. Reset the projection. This will rebuild the projection from the start.
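
If you would rather script the reset than use the web UI, the sketch below builds the HTTP request in Python. It assumes the projections HTTP API exposes `POST /projection/{name}/command/reset` and that the node listens on the default HTTP port 2113; please verify both against the documentation for your version. The host, projection name, credentials, and the `build_reset_request` helper are placeholders. The request is only constructed here, not sent, because a reset rebuilds the projection from the start.

```python
import base64
import urllib.request
from urllib.parse import quote


def build_reset_request(base_url: str, projection_name: str,
                        user: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a POST to the assumed reset endpoint."""
    # Assumed endpoint shape: /projection/{name}/command/reset
    url = (f"{base_url.rstrip('/')}/projection/"
           f"{quote(projection_name, safe='')}/command/reset")
    # HTTP basic auth header built from the placeholder credentials.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=b"",
        method="POST",
        headers={"Authorization": f"Basic {token}"},
    )


# Placeholder values; replace with your own node address and credentials.
req = build_reset_request("http://localhost:2113", "my-projection", "admin", "changeit")
# To actually send it: urllib.request.urlopen(req)
```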

To help us determine whether this issue is specific to the new version, would you be able to run this same db on the previous version and see whether you still get the error?

You can do this either by restoring a backup from before the upgrade, or running the node with the previous version and allowing any new indexes to rebuild.

Could you also tell us whether this happens immediately as the node starts, or is there some time between the start and the error?

Thank you

Hi Hayley

Thanks for the reply. It will take a bit to test the version rollback scenario, but in the meantime:

  1. Restarting the master node does not fix the issue.

  2. Resetting the projection is something we are trying to avoid given how long it would take. We are keeping this as a last resort.

Will let you know how rollback goes.

Cheers,

Brett