Unexpected response using http with embed=content

Quintin_Robinson · January 14, 2015, 9:58pm

Hello, we have been writing a nodejs abstraction for interacting with the eventstore via http and have had quite an experience learning the nuances thereof. One of the behaviors that was noticed that we would like to bring into question is how the responses should be structured in a scenario where no data or metadata is present on an event.

Currently, if an event has data or metadata present, then the parsed atom json representation of the response for the entry is something to the effect of:

{
  title,
  id,
  updated,
  author: { name },
  summary,
  content: {
    eventStreamId,
    eventNumber,
    eventType,
    data,
    metadata
  },
  links: [ { uri, relation } ]
}


The information embedded in the content provides us something valuable in these cases; in particular the number, type, data and metadata.

However, in the case where neither data nor metadata is present, the response is significantly different: the content field is omitted entirely.

{
  title,
  id,
  updated,
  author: { name },
  summary,
  links: [ { uri, relation } ]
}


This inhibits us quite a bit from an event processing standpoint as we lose a lot of value from the context of the event. However we have noticed that some of the content information can be reconstituted from the available atom information if we are so brave as to parse and trust it. For instance we are able to derive the number and stream info by parsing it out of the title field and the event type information by taking it from the summary field.
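The fallback described above could be sketched roughly as follows. This is only an illustration: `describeEntry` is a hypothetical helper name, and it assumes the entry title has the shape `<eventNumber>@<streamId>` and that the summary carries the event type, which is what the responses in this thread show but is not a documented contract.

```javascript
// Hypothetical helper (not part of any Event Store client): recover the event
// number, stream id and event type from a parsed Atom feed entry.
function describeEntry(entry) {
  // Prefer the embedded content when the server included it.
  if (entry.content) {
    const { eventStreamId, eventNumber, eventType } = entry.content;
    return { eventStreamId, eventNumber, eventType };
  }

  // Assumption: the title looks like "<eventNumber>@<streamId>".
  // Split on the first "@" only, since a stream id could itself contain "@".
  const at = entry.title.indexOf("@");
  const eventNumber = at > 0 ? Number(entry.title.slice(0, at)) : NaN;
  if (!Number.isInteger(eventNumber)) {
    throw new Error("Unrecognized entry title: " + entry.title);
  }

  return {
    eventStreamId: entry.title.slice(at + 1),
    eventNumber,
    // Assumption: the summary field holds the event type.
    eventType: entry.summary,
  };
}
```

Throwing on an unrecognized title is a deliberate choice here: if the title format ever changes, the failure is loud rather than silently producing wrong event numbers.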

The primary question this leads us to ask is whether or not this behavior is by design and, if so, which fields in these objects we can reliably count on to provide information about the events. Since the content field is only present when an event has been submitted with data or metadata, it seems as though we shouldn’t rely on that field for the event number, type or stream information. Conversely, it seems counterintuitive to have to parse this information out of the atom fields, as they seem quite overloaded for this purpose and there is no promise the format will not change in the future.

Just for a little further context, one of the reasons we use the embed mechanism is that we do not see another way to retrieve the metadata for any particular event via http; the only metadata-related operations available are at the stream level. If this is a misconception as well, could you please indicate what the appropriate approach would be?

Quintin_Robinson · January 16, 2015, 5:10pm

Good news: a coworker has identified that we can obtain the metadata of individual events by requesting the application/vnd.eventstore.event+json content type (sent in the Accept header), in contrast to application/json, which provides only the event data. Thankfully we can sidestep the original issue by taking the idiomatic approach of following the atom links for each individual event and specifying the eventstore-specific content type, which only requires a bit of extra effort for standard http libraries to parse. All of this being said, could you please clarify whether the issue in question is actually by design or an unintentional side effect of a, so to speak, empty event?
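As a rough sketch of that approach, the snippet below turns a parsed feed entry into the request needed to fetch the full event. `eventRequestFor` is a hypothetical helper name, the returned object shape is an assumption meant to be handed to whatever http client is in use, and the choice of the "alternate" link relation is taken from the feed output posted further down in this thread.

```javascript
// Media type that returns the full event (number, type, data and metadata).
const EVENT_JSON = "application/vnd.eventstore.event+json";

// Hypothetical helper: describe the GET request for a single event,
// given one parsed entry from the Atom feed.
function eventRequestFor(entry) {
  // Assumption: each entry exposes an "alternate" link pointing at the event.
  const link = (entry.links || []).find((l) => l.relation === "alternate");
  if (!link) {
    throw new Error("Entry has no alternate link: " + entry.id);
  }
  return {
    method: "GET",
    url: link.uri,
    headers: { Accept: EVENT_JSON },
  };
}
```

This costs one extra request per event compared with embed=content, but it does not depend on the content field being present in the feed.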

Greg_Young1 · January 16, 2015, 5:12pm

Missed first email.

Yes, application/vnd.eventstore.event+json will give you the information when you do a GET on an individual event.

The content, however, I think is an unintentional effect. I will take a look at it today, I hope.

Greg

Greg_Young1 · January 16, 2015, 10:58pm

What am I missing for the test?

public class reading_content_with_empty_event : HttpBehaviorSpecification
{
    private string _streamName;
    private JObject _feed;

    protected override void Given()
    {
        var creds = DefaultData.AdminCredentials;
        using (var conn = TestConnection.Create(_node.TcpEndPoint))
        {
            _streamName = Guid.NewGuid().ToString();
            conn.ConnectAsync().Wait();
            // Append a single event with no data and no metadata over the client API.
            conn.AppendToStreamAsync(_streamName, ExpectedVersion.Any, creds,
                new EventData(Guid.NewGuid(), "testing", true, null, null))
                .Wait();
        }
    }

    protected override void When()
    {
        // Read the stream back over HTTP with the content embedded in the feed.
        var uri = MakeUrl("/streams/" + _streamName, "embed=content");
        _feed = GetJson<JObject>(uri.ToString(), ContentType.AtomJson);
    }
}

{
  "title": "Event stream '06a17c2e-c0fe-4c23-9336-53e9c42411a7'",
  "id": "http://127.0.0.1:45002/streams/06a17c2e-c0fe-4c23-9336-53e9c42411a7",
  "updated": "2015-01-16T22:56:35.0020758Z",
  "streamId": "06a17c2e-c0fe-4c23-9336-53e9c42411a7",
  "author": {
    "name": "EventStore"
  },
  "headOfStream": true,
  "selfUrl": "http://127.0.0.1:45002/streams/06a17c2e-c0fe-4c23-9336-53e9c42411a7",
  "eTag": "0;-1390051759",
  "links": [
    {
      "uri": "http://127.0.0.1:45002/streams/06a17c2e-c0fe-4c23-9336-53e9c42411a7",
      "relation": "self"
    },
    {
      "uri": "http://127.0.0.1:45002/streams/06a17c2e-c0fe-4c23-9336-53e9c42411a7/head/backward/20",
      "relation": "first"
    },
    {
      "uri": "http://127.0.0.1:45002/streams/06a17c2e-c0fe-4c23-9336-53e9c42411a7/1/forward/20",
      "relation": "previous"
    },
    {
      "uri": "http://127.0.0.1:45002/streams/06a17c2e-c0fe-4c23-9336-53e9c42411a7/metadata",
      "relation": "metadata"
    }
  ],
  "entries": [
    {
      "title": "0@06a17c2e-c0fe-4c23-9336-53e9c42411a7",
      "id": "http://127.0.0.1:45002/streams/06a17c2e-c0fe-4c23-9336-53e9c42411a7/0",
      "updated": "2015-01-16T22:56:35.0020758Z",
      "author": {
        "name": "EventStore"
      },
      "summary": "testing",
      "content": {
        "eventStreamId": "06a17c2e-c0fe-4c23-9336-53e9c42411a7",
        "eventNumber": 0,
        "eventType": "testing",
        "data": "",
        "metadata": ""
      },
      "links": [
        {
          "uri": "http://127.0.0.1:45002/streams/06a17c2e-c0fe-4c23-9336-53e9c42411a7/0",
          "relation": "edit"
        },
        {
          "uri": "http://127.0.0.1:45002/streams/06a17c2e-c0fe-4c23-9336-53e9c42411a7/0",
          "relation": "alternate"
        }
      ]
    }
  ]
}

Quintin_Robinson · January 19, 2015, 2:05pm

Good question. I should have mentioned that I was using the binary release of 3.0.1, and I suspect that might have something to do with it. Although I've never consumed the .NET API, I still doubt it is doing anything specific to alleviate that. I do have a compiled dev build I can test against when I get to work. I don't know the version number, but perhaps you can derive that from the git hash it lists:

$ mono EventStore.ClusterNode.exe --version
EventStore version 0.0.0.0 (dev/e5158ca0e0b0fd99ec665e1b971b117648f8f20b, Mon, 5 Jan 2015 19:50:55 -0800)

I'll create a simple repro with curl and report back when I'm in the office.

Greg_Young1 · January 19, 2015, 3:04pm

I'm reading over HTTP there, just writing over the client API, though on dev. I don't know of any material changes to this, though.

Quintin_Robinson · January 19, 2015, 5:40pm

I am also unable to repro with curl, but I think maybe I should just describe how I got to the position I am in, and maybe that will help you suss it out in the future.

I suspect it was invalid data when importing for testing streams. Basically, I noticed an issue when pulling in data: if a field of a json object being stored ends in a CR, LF or both, then the projection parser has an issue (only with deserializing the object for projection) and the log emits "P: JSON Parsing error: SyntaxError: Unexpected token" for every projection instance run. I was able to narrow this down to CR and LF issues by inspecting the data. It should be noted that this only affects projections, as the event is stored properly and can still be read by other clients. So I started sanitizing my input using dos2unix in order to remove the trailing entries. I think somewhere in this process it generated the corrupted data in the store that led to the content issue.
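For comparison, the same sanitizing could be done at the point where the event body is built, assuming the import runs in Node.js rather than through dos2unix and the shell. `hostEntryBody` is a hypothetical helper name; the `host` field matches the import described in this post.

```javascript
// Hypothetical helper: turn one line of the imported hosts file into the
// JSON body of a "host-entry" event.
function hostEntryBody(line) {
  // Drop a trailing CR, LF or CRLF left over from Windows line endings.
  const host = line.replace(/[\r\n]+$/, "");
  // JSON.stringify escapes any control characters that remain inside the
  // value, so the stored body is always valid JSON.
  return JSON.stringify({ host });
}

console.log(hostEntryBody("0.0.0.0 images.adviews.de\r"));
// prints {"host":"0.0.0.0 images.adviews.de"}
```

Serializing with JSON.stringify instead of interpolating the line into a string template means a stray control character ends up escaped rather than raw in the stored bytes.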

FWIW I will include some examples that help repro and identify the CRLF issue but I am not able to reliably repro the content issue and I am going to keep an eye out for exact circumstances so when it comes up again I can report back here.

Simple script to pull down a groomed hosts file and enter every line into the event store (uses uuidgen; a different OS might need a substitute guid generator):

curl http://winhelp2002.mvps.org/hosts.txt | while read line; do curl -i -H "Content-Type: application/json" -H "ES-EventType: host-entry" -H "ES-EventId: $(uuidgen)" http://172.16.185.130:2113/streams/import-host -d "{ \"host\": \"$line\" }"; echo $line ; done


This should write ~15.5k events to the store

Retrieving an event works as expected:

curl -i -H "accept: application/vnd.eventstore.event+json" http://172.16.185.130:2113/streams/import-host/279


produces:

HTTP/1.1 200 OK
…truncated…
{
  "eventStreamId": "import-host",
  "eventNumber": 279,
  "eventType": "host-entry",
  "data": {
    "host": "0.0.0.0 images.adviews.de\r"
  },
  "metadata": ""
}


However when a projection is run against the events…

fromStream("import-host")
  .when({
    $init: function() {
      return {
        all: 0,
        valid: 0
      };
    },
    $any: function(state, event) {
      state.all++;
      if (event && event.body) {
        state.valid++;
      }
    }
  })


Showing this in the log:

P: JSON Parsing error: SyntaxError: Unexpected token

The result is something to the effect of:

{ "all": 15578, "valid": 0 }

I suspect this is a simple issue with the projection json deserializer but have not perused the code yet to see if I can take a hack at it.

I am disappointed I can't repro the original issue for you but again I will keep an eye out.

As this post has become off topic of the original thread would you like me to put this issue on github instead or perhaps create a separate topic?

Greg_Young1 · January 20, 2015, 1:31pm

"I suspect this is a simple issue with the projection json
deserializer but have not perused the code yet to see if I can take a
hack at it."

We use v8 for deserialization. I just tried writing an event with a trailing CRLF and it was picked up without issue. Perhaps you can send the actual json of a failing event?

Quintin_Robinson · January 20, 2015, 4:56pm

The example included in the previous post has the json of a failing event provided (event 279). The entire example has the repro steps used to produce the issue. Running the curl to populate the store as well as the projection code provided consistently produces the issue on both the prebuilt binary 3.0.1 store as well as the 2015.01.05 dev build for me with both instances running on ubuntu linux 14.04 (Linux 3.13.0-32-generic x86_64).

Just for clarification, the event is stored and retrievable; it is just that the contents of the event are not available within a projection.

I would be happy to help. If I can do anything to narrow it down further for you, please let me know.

Greg_Young1 · January 20, 2015, 5:21pm

If it’s as simple as a crlf I should be able to just insert one and have the issue no? This doesn’t happen.

From looking, you actually have invalid json, which I can produce really easily by putting a non-escaped lf in the middle of a string. My guess is you are getting tricked because the data in the string gets escaped when you view it embedded in content (if we didn’t, you would make us produce invalid json!)

Eg

Test{
"Foo": "bar.
Hand",
"Bar": "test"
}

Try looking over one of the native APIs and dump the binary of it. My guess is you have unescaped chars in the middle of your string, which is not allowed in json. The formatter is trying to help you by escaping it when it shows it to you (unless of course I have missed the place where you are escaping \r in your $line that you set to the string?)

Cheers,

Greg
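The difference between a raw and an escaped control character can be checked with V8's own JSON.parse, for example under Node.js. The snippet below is only an illustration of the point made above; whether the projection runtime takes exactly this code path is an assumption, and `parses` is a hypothetical helper name.

```javascript
// A raw carriage return (a single control character) inside a JSON string
// literal. In this JavaScript source, '\r' is the actual CR character.
const raw = '{ "host": "0.0.0.0 images.adviews.de\r" }';

// The escaped form: a backslash followed by the letter r, which is what
// valid JSON requires for a carriage return inside a string.
const escaped = '{ "host": "0.0.0.0 images.adviews.de\\r" }';

// Returns true when the text is valid JSON, false when JSON.parse throws.
function parses(text) {
  try {
    JSON.parse(text);
    return true;
  } catch (err) {
    return false;
  }
}

console.log(parses(raw));     // false
console.log(parses(escaped)); // true
```

Both forms look the same once a formatter has escaped the stored bytes for display, which is why the event read back over http appears valid while the projection fails on the raw bytes.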

Quintin_Robinson · January 21, 2015, 4:11pm

Yes, I think you are right on the money with the data being invalid, and that leading to projections failing to parse the body of the event while it is still stored inside the eventstore and escaped on retrieval. My understanding of the file being imported to the eventstore is that it was produced on a windows system, and only after running the input through dos2unix as I mentioned would the problem be alleviated, so it seems all the pieces fit.

I just have one question, and please forgive me if I have a misunderstanding. Since the eventstore is able to store the data and is kind enough to escape it when retrieving the event individually (making it deserializable by our handlers), would it also be possible to do the same escaping prior to running the projections, so that the same data is accessible in projections as is accessible via retrieving the event directly?

Thanks!

Greg_Young1 · January 21, 2015, 4:13pm

We could do this but it would be VERY expensive. We only do anything like this when you do embed=content (as we need to jam the object into the feed).

Quintin_Robinson · January 21, 2015, 4:24pm

Okay that is very good information to have, thank you! Ideally malformed data should never make it to the store but all of this exploration has been very educational.

Thanks