partitionBy

Would be appreciated. I want to update those blog entries, and I want to run these darned projections over the data (more than half a million events on my server now) and see what a proper statistically relevant sample looks like :D

Does this work yet?

I really want to do a pile of processing on my last.fm data, I thought of some wicked temporal queries

Rob,

the only significant change is that you need outputTo(,) to persist results. Otherwise the results stay in memory and are only available via HTTP requests.

-yuriy

I was using outputTo, but I was only getting 33 results (one for each stream, with a count of 1)

@yuriy see thread. It appears to be a bit of a bug. With outputTo it seems the query was not seeing the events in the stream.

Do you have this commit in your sources?

86819bff16ad106324863a93ec0d15a7b0076474

Anyway, let me run a very similar projection to check.

apparently I did not - I’ll give it another whirl then. (I updated the day you told me it had been fixed, had you not merged to dev yet?)

Anyway, no matter, I’ll report back with hopeful success :slight_smile:

Oh, I see that was yesterday evening and after the thread - okay - fine, well I’ll definitely be reporting back then!

It probably fixes a different issue, but if you only got the commit just before it, you have a completely broken fromStream/fromCategory.

Rob,

is it possible that you tried to output into the same stream/streams from a different projection? i.e. you first created output streams from one projection and then attempted to write to these streams from another projection?

if so, it is possible that the second projection ignored output to this stream.

I’m adding an option to restart projections (in development) and it will address this problem as well.

-yuriy

Probably not, see the sequential numbers?

It’s most likely I just had the buggy build, I’ll try again in an hour or so now I’ve done a pull.

An option to restart projections would be great for development

My plan for production is to version them and disable projections of a previous version if I don’t need them any more - does that sound reasonable?

I tried

fromCategory().
foreachStream().
when().
outputTo()

and I can see more than just one event handled in different streams (on current dev)

great, I can’t wait to see it for myself :slight_smile:

Rob,

I foresee a problem if you run one projection with

outputTo('test1', 'test1-{0}')

then disable it and run another projection with the same

outputTo('test1', 'test1-{0}')

At least with the current implementation it will either fail or not write any results until the new projection reaches the same position as the disabled one. (It fails if the second projection reads from an incompatible source.)

Restarting will make sure that we ignore any contents of any stream written by the previous version of the projection.

-yuriy

So I’d have to version the output streams too - that sounds reasonable.
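As a rough sketch of what versioning the output streams could look like: the version is baked into both names that would be passed to outputTo, so a new version of a projection never writes into streams created by an old one. The helper name versionedOutput and the "_v" naming scheme are assumptions made up for illustration; nothing here is part of the projections API.

```javascript
// Hypothetical helper (not part of Event Store): builds the two
// stream-name arguments used with outputTo(...) in this thread,
// with a version suffix baked into both of them.
function versionedOutput(baseName, version) {
  const name = baseName + '_v' + version;
  return {
    stream: name,                    // e.g. 'test1_v2'
    partitionPattern: name + '-{0}'  // e.g. 'test1_v2-{0}'
  };
}

const v1 = versionedOutput('test1', 1);
const v2 = versionedOutput('test1', 2);

console.log(v1.stream, v1.partitionPattern); // test1_v1 test1_v1-{0}
console.log(v2.stream, v2.partitionPattern); // test1_v2 test1_v2-{0}
```

The two values would then take the place of the literal names in the outputTo call of each projection version, so version 2 starts from empty output streams.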

I’m thinking algorithm improvements and such for outputs over time.

This really isn’t working for me. I’ve pulled the latest dev and written the following two projections

fromStream('github')
  .when({
    PushEvent: function(s, e) {
      linkTo('pelanguage-' + e.body.repo.language, e)
    }
  })

And

fromCategory('pelanguage')
  .foreachStream()
  .when({
    $init: function() { return { count: 0 } },
    PushEvent: function(s, e) {
      s.count++
    }
  }).outputTo('totaleventsbylanguage', 'totaleventsbylanguage-{0}')

The result is that I end up with a load of streams named pelanguage-LANGUAGE, and the fromCategory projection does absolutely nothing at all.
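For reference, here is what I expect the pair of projections to compute, sketched as plain JavaScript that runs outside Event Store. It only simulates the intent (link PushEvents into per-language streams, then keep a separate count per stream); the function names linkByLanguage and countPerStream and the sample events are made up for illustration, and none of it uses the real projections API.

```javascript
// Plain-JavaScript simulation only; none of this is the Event Store API.
// The sample events below are invented for illustration.
const github = [
  { type: 'PushEvent',  body: { repo: { language: 'JavaScript' } } },
  { type: 'PushEvent',  body: { repo: { language: 'CSharp' } } },
  { type: 'WatchEvent', body: { repo: { language: 'JavaScript' } } },
  { type: 'PushEvent',  body: { repo: { language: 'JavaScript' } } }
];

// Step 1: mimic the first projection. Every PushEvent is linked into
// a stream named 'pelanguage-<language>'; other event types are skipped.
function linkByLanguage(events) {
  const streams = {};
  for (const e of events) {
    if (e.type !== 'PushEvent') continue;
    const name = 'pelanguage-' + e.body.repo.language;
    (streams[name] = streams[name] || []).push(e);
  }
  return streams;
}

// Step 2: mimic the second projection. Each stream gets its own state,
// starting from { count: 0 } and incremented once per PushEvent.
function countPerStream(streams) {
  const results = {};
  for (const name of Object.keys(streams)) {
    const state = { count: 0 };
    for (const e of streams[name]) {
      if (e.type === 'PushEvent') state.count++;
    }
    results[name] = state; // keyed by source stream name
  }
  return results;
}

const totals = countPerStream(linkByLanguage(github));
console.log(totals);
```

With the sample events above, the JavaScript stream should end up with a count of 2 and the CSharp stream with a count of 1, i.e. one state per pelanguage stream rather than nothing at all.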

What am I doing wrong?

This is on a completely fresh set of data, I rm -rfed the entire data directory and logs directory before running this isolated test.

I’m confused though, I appear to have some artifacts left over from the last test - is there something else I have to delete other than the data directory?

There are warnings appearing in the log due to faulty de-serialisations, is this relevant?