HTTP Traversal Problem

Guys,

It seems that when starting traversal from position 0 and paging forward in time (backward, from ATOM’s perspective), there’s no way to know that the last page is indeed the last page without going beyond the head of the stream and checking for entries == [].

I see that there’s a headOfStream field in the ATOM envelope, but it’s always set to false - even when de facto at the head of the stream.

The rel=“previous” link is always present as well. On the de facto last page of a stream, it points to a URI that goes beyond the head of the stream.

If these GETs for pages beyond the head are cached, then the cache will always hold the wrong data, since entries == []. A GET for such a page that is resolved by the cache will return a page with no entries.

I’ve tested this a number of ways, with varying amounts of data and various page sizes, and I get the same results consistently.

There is a rel link called "prev"; it should disappear when you are caught up.
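So detecting “caught up” comes down to checking for the absence of that link. A minimal sketch of that client-side check (the feed dict mirrors the JSON envelope shape shown below; `has_previous_link` is a hypothetical helper, not an official EventStore client API):

```python
# Sketch: decide whether we've reached the head of the feed by checking
# for a rel="previous" link, rather than trusting headOfStream.
# Hypothetical helper for illustration only.

def has_previous_link(feed):
    """Return True if the feed page carries a rel="previous" link."""
    return any(link["relation"] == "previous" for link in feed.get("links", []))

# A trimmed-down page, shaped like the EventStore Atom JSON envelope.
caught_up_page = {
    "headOfStream": False,  # reportedly unreliable, so we don't rely on it
    "links": [
        {"uri": "http://127.0.0.1:2113/streams/example", "relation": "self"},
        {"uri": "http://127.0.0.1:2113/streams/example/183/backward/20", "relation": "next"},
    ],
    "entries": [],
}

if not has_previous_link(caught_up_page):
    print("caught up: no rel=previous link on this page")
```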

A quick curl reports (snipped for brevity):

{
  "title": "Event stream '$stats-127.0.0.1:2113'",
  "id": "http://127.0.0.1:2113/streams/%24stats-127.0.0.1%3A2113",
  "updated": "2015-07-13T20:32:59.753821Z",
  "streamId": "$stats-127.0.0.1:2113",
  "author": {
    "name": "EventStore"
  },
  "headOfStream": true,
  "selfUrl": "http://127.0.0.1:2113/streams/%24stats-127.0.0.1%3A2113",
  "eTag": "176;248368668",
  "links": [
    {
      "uri": "http://127.0.0.1:2113/streams/%24stats-127.0.0.1%3A2113",
      "relation": "self"
    },
    {
      "uri": "http://127.0.0.1:2113/streams/%24stats-127.0.0.1%3A2113/head/backward/20",
      "relation": "first"
    },
    {
      "uri": "http://127.0.0.1:2113/streams/%24stats-127.0.0.1%3A2113/0/forward/20",
      "relation": "last"
    },
    {
      "uri": "http://127.0.0.1:2113/streams/%24stats-127.0.0.1%3A2113/156/backward/20",
      "relation": "next"
    },
    {
      "uri": "http://127.0.0.1:2113/streams/%24stats-127.0.0.1%3A2113/177/forward/20",
      "relation": "previous"
    },
    {
      "uri": "http://127.0.0.1:2113/streams/%24stats-127.0.0.1%3A2113/metadata",
      "relation": "metadata"
    }
  ],

It took me a second with curl to catch up to the current head (I am a bit
clumsy with editing), but I eventually got the previous link working,
copy/pasting into email (sorry for the long output). This tells you it is
the head of the feed.

Note there is no prev link and it’s not cacheable.

~/Code/PrivateI/fsharp   master±  curl -v -i
"http://127.0.0.1:2113/streams/%24stats-127.0.0.1%3A2113/184/forward/20"
-u admin:changeit -H "Accept: application/json"
* About to connect() to 127.0.0.1 port 2113 (#0)
* Trying 127.0.0.1...
* Adding handle: conn: 0x7fc758803000
* Adding handle: send: 0
* Adding handle: recv: 0
* Curl_addHandleToPipeline: length: 1
* - Conn 0 (0x7fc758803000) send_pipe: 1, recv_pipe: 0
* Connected to 127.0.0.1 (127.0.0.1) port 2113 (#0)
* Server auth using Basic with user 'admin'

GET /streams/%24stats-127.0.0.1%3A2113/184/forward/20 HTTP/1.1
Authorization: Basic YWRtaW46Y2hhbmdlaXQ=
User-Agent: curl/7.30.0
Host: 127.0.0.1:2113
Accept: application/json

< HTTP/1.1 200 OK
HTTP/1.1 200 OK
< Access-Control-Allow-Methods: GET, OPTIONS
Access-Control-Allow-Methods: GET, OPTIONS
< Access-Control-Allow-Headers: Content-Type, X-Requested-With,
X-PINGOTHER, Authorization, ES-LongPoll, ES-ExpectedVersion,
ES-EventId, ES-EventType, ES-RequiresMaster, ES-HardDelete,
ES-ResolveLinkTo, ES-ExpectedVersion
Access-Control-Allow-Headers: Content-Type, X-Requested-With,
X-PINGOTHER, Authorization, ES-LongPoll, ES-ExpectedVersion,
ES-EventId, ES-EventType, ES-RequiresMaster, ES-HardDelete,
ES-ResolveLinkTo, ES-ExpectedVersion
< Access-Control-Allow-Origin: *
Access-Control-Allow-Origin: *
< Access-Control-Expose-Headers: Location, ES-Position
Access-Control-Expose-Headers: Location, ES-Position
< Cache-Control: max-age=0, no-cache, must-revalidate
Cache-Control: max-age=0, no-cache, must-revalidate
< Vary: Accept
Vary: Accept
< ETag: "183;-43840953"
ETag: "183;-43840953"
< Content-Type: application/json; charset=utf-8
Content-Type: application/json; charset=utf-8
* Server Mono-HTTPAPI/1.0 is not blacklisted
< Server: Mono-HTTPAPI/1.0
Server: Mono-HTTPAPI/1.0
< Date: Mon, 13 Jul 2015 20:36:53 GMT
< Content-Length: 909
Content-Length: 909
< Keep-Alive: timeout=15,max=100
Keep-Alive: timeout=15,max=100

<
{
  "title": "Event stream '$stats-127.0.0.1:2113'",
  "id": "http://127.0.0.1:2113/streams/%24stats-127.0.0.1%3A2113",
  "updated": "0001-01-01T00:00:00Z",
  "streamId": "$stats-127.0.0.1:2113",
  "author": {
    "name": "EventStore"
  },
  "headOfStream": false,
  "links": [
    {
      "uri": "http://127.0.0.1:2113/streams/%24stats-127.0.0.1%3A2113",
      "relation": "self"
    },
    {
      "uri": "http://127.0.0.1:2113/streams/%24stats-127.0.0.1%3A2113/head/backward/20",
      "relation": "first"
    },
    {
      "uri": "http://127.0.0.1:2113/streams/%24stats-127.0.0.1%3A2113/0/forward/20",
      "relation": "last"
    },
    {
      "uri": "http://127.0.0.1:2113/streams/%24stats-127.0.0.1%3A2113/183/backward/20",
      "relation": "next"
    },
    {
      "uri": "http://127.0.0.1:2113/streams/%24stats-127.0.0.1%3A2113/metadata",
      "relation": "metadata"
    }
  ],
  "entries": []
* Connection #0 to host 127.0.0.1 left intact
}

Scratch that… this is a phantom issue. I misunderstood the intended approach.

Also, I’m sorry if my outline of the non-problem cast EventStore in a negative light.

The headOfStream issue is still outstanding. In speaking with James, though, he’s suggested that I not count on it.

[NOTE: There’s a footnote at the bottom of this message that outlines a current malfunction that Greg is aware of.]

I hadn’t paid attention to how cache control works.

I thought that all requests for pages would be cached. I thought that I would read up to the last page with data, and go no further. I thought that if I requested a page beyond the last page, that it would be cached, causing subsequent requests for that page to return a page that has an empty entries list.

In my understanding, my code would check whether to continue getting pages so as to avoid going beyond the last page, and thus avoid the caching of a page of empty entries.

My understanding of EventStore now is that the cache control headers vary based on whether the requested page is the last whole page of data or not.

By “whole page”, I mean that the page’s entries list contains as many entries as were requested by the page size parameter in the query.

e.g.: For a URI of http://127.0.0.1:2113/streams/sendFunds-D9F2EEB1-65AF-4417-8265-68F01B9753AC/39/backward/20, the “20” after the “backward” is the page size.
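To make that URI shape concrete, here is a small sketch that picks the page size (and the other traversal parameters) out of a stream URI of that form. `parse_stream_uri` is a hypothetical helper for illustration, assuming the /streams/{stream}/{position}/{direction}/{count} layout shown above:

```python
from urllib.parse import urlsplit, unquote

def parse_stream_uri(uri):
    """Split a stream-paging URI of the form
    /streams/{stream}/{position}/{direction}/{count} into its parts.
    Hypothetical helper; assumes the URI layout seen in the feed links."""
    parts = urlsplit(uri).path.strip("/").split("/")
    # parts == ["streams", stream, position, direction, count]
    _, stream, position, direction, count = parts
    return {
        "stream": unquote(stream),
        "position": position,      # a number, or "head"
        "direction": direction,    # "forward" or "backward"
        "page_size": int(count),
    }

uri = "http://127.0.0.1:2113/streams/sendFunds-D9F2EEB1-65AF-4417-8265-68F01B9753AC/39/backward/20"
print(parse_stream_uri(uri)["page_size"])  # the "20" after "backward"
```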

The last whole page of data is cacheable. The cache control response header is:

Cache-Control: max-age=31536000, public

(i.e., 1 year in the future, the recommended maximum for cache age)

For an incomplete page, the result is uncacheable. The cache control response header is:

Cache-Control: max-age=0, no-cache, must-revalidate

By “incomplete page”, I mean that there are fewer entries in the response than were requested by the page size parameter in the query.
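The whole-page vs. incomplete-page rule can be written down as a small predicate. This is a sketch of the behavior as I observed it (the header strings are the ones seen in the responses above, not a documented contract):

```python
def expected_cache_control(entries, page_size):
    """Return the Cache-Control value we'd expect for a page:
    a whole page (len(entries) == page_size) is cacheable for a year;
    an incomplete page must be revalidated every time.
    Sketch of observed behavior, not a documented contract."""
    if len(entries) == page_size:
        return "max-age=31536000, public"          # whole page: cache it
    return "max-age=0, no-cache, must-revalidate"  # incomplete: don't

print(expected_cache_control([{"n": i} for i in range(20)], 20))
print(expected_cache_control([], 20))
```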

An incomplete page will render a “rel=previous” link for continuing to traverse the stream (this is “previous” in ATOM reverse-chronology) toward the end (or “beginning”, in ATOM terms).

If the rel=previous link traverses past the end (head) of the stream, that response will also be uncacheable.

This allows incomplete pages to be requested repeatedly until they are complete, at which point they become cacheable, and it allows feed subscriptions to continue traversing forward (backward in ATOM terms) while caching only whole-page responses.
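Put together, a subscriber’s traversal loop might look like the following sketch. `fetch_page` and the in-memory pages are hypothetical stand-ins for an HTTP GET returning the decoded JSON envelope; only the control flow is the point:

```python
def previous_link(feed):
    """Return the rel="previous" URI from a feed page, or None."""
    for link in feed.get("links", []):
        if link["relation"] == "previous":
            return link["uri"]
    return None

def traverse(fetch_page, start_uri, page_size):
    """Walk a stream toward its head, yielding entries.
    Whole pages are safe to cache and we keep going; on an incomplete
    page we stop, and would re-poll that same URI later. fetch_page is
    a stand-in for an HTTP GET returning the decoded JSON envelope."""
    uri = start_uri
    while uri:
        page = fetch_page(uri)
        yield from page.get("entries", [])
        if len(page.get("entries", [])) < page_size:
            break                      # incomplete page: caught up, re-poll later
        uri = previous_link(page)      # whole page: keep going

# Toy in-memory "server" with two pages: one whole, one incomplete.
pages = {
    "/0/forward/2": {"entries": [{"n": 0}, {"n": 1}],
                     "links": [{"relation": "previous", "uri": "/2/forward/2"}]},
    "/2/forward/2": {"entries": [{"n": 2}], "links": []},
}
collected = list(traverse(pages.__getitem__, "/0/forward/2", 2))
print([e["n"] for e in collected])  # -> [0, 1, 2]
```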

FOOTNOTE: There is presently [Mon Jul 13 2015] an off-by-one error where the calculation of the last complete page results in that page not being identified as the last complete page, causing it to be uncacheable.