Subscription "skips" events

Ah, I understand; thanks for clearing that up. Unfortunately, in this case the lastProcessedVersion + 1 check is the only thing preventing an inconsistent data model (because of skipped events).

Is this enough information for further investigation or do I need to test something in particular?

Regards,

Nicolas

We will have to try to get a reproduction of it before I can comment
much further on it.

Okay, let me know if it’s possible to reproduce it on your end with the repro code I sent.

I have not been able to reproduce this running against dev on linux.

Just curious...

Can you put up your code for "checking the event numbers"?

13:30:31.534 Fri Feb 20 12:30:31.4980 +00:00 2015 ProjectionHost
(MyHostname) : Trace, Catch-up Subscription to context-forum: event
appeared (context-forum, 4, $> @ 427009/427009).
13:30:31.534 Fri Feb 20 12:30:31.4980 +00:00 2015 ProjectionHost
(MyHostname) : Trace, Catch-up Subscription to context-forum: event
appeared (context-forum, 5, $> @ 430400/430400).
13:30:31.535 Fri Feb 20 12:30:31.4980 +00:00 2015 ProjectionHost
(MyHostname) : Trace, Catch-up Subscription to context-forum: event
appeared (context-forum, 6, $> @ 430637/430637).
13:30:31.535 Fri Feb 20 12:30:31.4980 +00:00 2015 ProjectionHost
(MyHostname) : Info, PersistentProjectionModule "forum": EventAppeared
with eventNumber: 4, Thread ID: 13
13:30:31.535 Fri Feb 20 12:30:31.4980 +00:00 2015 ProjectionHost
(MyHostname) : Trace, PersistentProjectionModule "forum": Event
appeared of type "ForumCreatedEvent", received: 4, expected: 2, used
subscription: 7340066
13:30:31.535 Fri Feb 20 12:30:31.4980 +00:00 2015 ProjectionHost
(MyHostname) : Error, PersistentProjectionModule "forum": Error during
projection processing, missed version, received: 4, expected: 2,
previous: 1 FatalErrorCritical

These are linkto events. What "event number" are you checking? They
are ordered by the LINKTO's event number, not the resolved event's
event number.

    var expectedVersion = _previousEventNumber + 1;
    var receivedVersion = resolvedEvent.Link.EventNumber;

    if (receivedVersion < expectedVersion)
    {
        Logger.Warn("VolatileProjectionModule \"{0}\": Skipping already processed event with version {1}, expected {2}",
            _contextKey, receivedVersion, expectedVersion);
        return;
    }

    if (receivedVersion > expectedVersion)
        throw new EventMissedException(expectedVersion, receivedVersion);
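The check above can be isolated into a small self-contained sketch, which may help when exercising the logic without a running subscription. The names SequenceChecker and SequenceResult are hypothetical; they are not part of the Event Store client API or of the original code, and the lock is only an assumption about how concurrent EventAppeared calls would be serialized.

```csharp
using System;

// Hypothetical names throughout; nothing here is Event Store API.
public enum SequenceResult { AlreadyProcessed, Next, Gap }

public sealed class SequenceChecker
{
    private readonly object _gate = new object();
    private int _previousEventNumber;

    public SequenceChecker(int lastProcessedEventNumber)
    {
        _previousEventNumber = lastProcessedEventNumber;
    }

    // Classifies the received number against the last processed one.
    // The position only advances when the exact successor arrives.
    public SequenceResult Check(int receivedVersion)
    {
        lock (_gate)
        {
            var expectedVersion = _previousEventNumber + 1;

            if (receivedVersion < expectedVersion)
                return SequenceResult.AlreadyProcessed; // duplicate, safe to skip

            if (receivedVersion > expectedVersion)
                return SequenceResult.Gap; // corresponds to EventMissedException above

            _previousEventNumber = receivedVersion;
            return SequenceResult.Next;
        }
    }
}
```

Starting from a last processed number of 1, feeding it 2, 2, 4, 3 yields Next, AlreadyProcessed, Gap, Next. Returning a result instead of throwing is a design choice for the sketch only; it leaves the decision to resubscribe to the caller.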

I already use the linkto event number, since the projection we use covers the whole bounded context stream, so only the bounded context id is relevant for us.

We are catching this EventMissedException and re-creating the subscription, which “mostly” works fine, but it’s a pretty heavy workaround and occasionally leads to new bugs that would not have occurred otherwise.

What I find noteworthy: every time these missed events happen, the first “wrong” call to EventAppeared is also made on a different thread.

These appear in three different Azure environments (Production, Staging, Dev) in several similar but not identical components (Projection processing, “Workflow”/EventHandler processing and generic subscriptions).

As this has happened quite a lot over the last few days, I’d like to remind you that this is still a critical issue for us.

The following scenario occurred a few days ago (numbers X-ed for clarity):

EventAppeared invoked for eventNumber XXXX718, Thread ID 12, entered lock.

… processing begins for XXXX718

EventAppeared invoked for eventNumber XXXX728, Thread ID 37 (!), waits because lock is still being held by processing of XXXX718

… processing finished for XXXX718

EventAppeared invoked for eventNumber XXXX719, Thread ID 12, waits because lock was entered by the invalid invocation of XXXX728 after XXXX718 finished

… processing begins for XXXX728 which throws an exception because XXXX728 is not the next event number after XXXX718 (obviously).

These invocations came from the same subscription (checked via hash code) and were only milliseconds apart, so there was no timeout or similar that could have triggered another invocation.

So… the same question remains: why does the subscription invoke a seemingly random event number on a different thread while the previous processing is still in progress?

As mentioned previously, the best way of getting people to spend time
looking at this issue (which only you appear to be having, which leads
me to believe it's not a mainline case) is to provide a test that does
not use any of your code.

As previously stated two emails ago:

"I have not been able to reproduce this running against dev on linux."

Without being able to reproduce there is zero chance of anyone fixing
your issue.

Cheers,

Greg