EventStore in a worker role, Azure, blob

I’m playing around with using EventStore in a worker role. I’m sure there are many out there doing this. How is it done best?

I have EventStore in a zip file, stored as a blob in a storage account.

My idea is that a worker role downloads the zip file in OnStart (or maybe in a startup task?) and installs EventStore. As its db it should then use already existing files in the blob storage.

Can I get this to work? Will EventStore write to the db if it’s in blob storage?

Thanks

Seems like I can answer this myself. I got an error at startup when trying to specify a blob folder as the db:
Exit reason: URI formats are not supported.
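
The error suggests the db option expects a filesystem path rather than a blob URL. A minimal sketch of a pre-flight check follows; `looks_like_uri` is a hypothetical helper and the storage account URL is made up, neither is part of EventStore:

```python
import re

# Matches a leading "scheme://", e.g. "https://" (a Windows drive like "Z:\" does not match).
_URI_PREFIX = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*://")


def looks_like_uri(db_path: str) -> bool:
    """Return True when db_path starts with a URI scheme instead of being a filesystem path."""
    return bool(_URI_PREFIX.match(db_path))


# Hypothetical blob URL versus a path on a mounted drive.
print(looks_like_uri("https://myaccount.blob.core.windows.net/es/db"))
print(looks_like_uri(r"Z:\ClusterOne\db"))
```

Running a check like this before launching the node would turn the startup failure into an early, readable configuration error.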

What I don’t understand is, where is the actual db of ES placed when installing in worker role? Assuming it has to be persistent, and not disappear when instance is recycled…

I’ve never attempted to do this myself; I only have experience running EventStore in VMs, which works well once you understand Azure networking and the constraints of Azure storage.

If you want to go down this path, I think you may want the Azure File Service which is in preview:

http://blogs.msdn.com/b/windowsazurestorage/archive/2014/05/12/introducing-microsoft-azure-file-service.aspx

This article looks to describe how to mount such a drive in cloud service:

http://fabriccontroller.net/blog/posts/using-the-azure-file-service-in-your-cloud-services-web-roles-and-worker-role/

Like I said, I have no experience with ES in an Azure worker role, though it seems like you could get it working, and no experience with Azure File Service or its performance characteristics. I’ll be curious to learn how you fare.

There was an older service that let you mount a blob based VHD in a worker role but that never made it out of beta and is no longer supported, AFAIK.

HTH,

Brian

That is a great idea, thank you very much!
I started right away with this, I’ll get back here with results.

So, the idea:

A mapped drive is mounted on File Service.

3 instances of a WR with one folder each on that drive. At startup they will download the zip-file with ES from the blob storage, unzip and install it to localstorage, except for the db that goes into the instance’s folder on the mapped drive.
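
The plan above could be sketched roughly as follows. This is a minimal sketch under assumptions: the zip has already been downloaded into memory (the blob download itself is left out), and `install_from_zip`, `instance_db_path` and the `node0`-style folder names are hypothetical.

```python
import io
import zipfile
from pathlib import Path, PureWindowsPath


def install_from_zip(zip_bytes: bytes, install_dir: Path) -> list:
    """Unpack the already-downloaded archive into the role's local storage folder."""
    install_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        archive.extractall(install_dir)
        return archive.namelist()


def instance_db_path(share_root: str, instance_index: int) -> str:
    """Give each role instance its own db folder on the mapped drive (folder naming is made up)."""
    return str(PureWindowsPath(share_root) / f"node{instance_index}" / "db")


print(instance_db_path("Z:\\", 0))
```

Keeping the binaries on local storage and only the db on the share means a recycled instance can reinstall from the zip without touching its data folder.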

The gossip part though… How will ES on 3 instances get to gossip? There is the TCP direct connection, but the gossip is with http only I think…

I don’t know azure that well but what would be the issue with http that’s ok with tcp? Http is implemented on top of tcp

That could be a question to a post I wrote just now in DDD/CQRS, so to that: Yeah I know. Well, you’re right not a huge issue, but since there is latency in every little step, cutting down on it would be good.

For the question in this thread I’m thinking about when I launch the ClusterNode.exe with parameters, and seeding it for gossip. It only works seeding with http ports, right?

So, if the connection between the instances is with TCP, I’m not sure how to get the ES’s on them to gossip.

well, I’ll just try :slight_smile: setting the internal endpoint to use http.

I’m completely confused by what you are even trying to do here or why…

There is tcp and http.

Why would http not work and tcp work?

I’ve set up the file share as described by Sandrino.

With Azure powershell I’ve created three folders, one for each instance’s es db.

At worker role startup the ES.zip that I had in blob storage is downloaded and extracted to the localstorage of the worker role.
I’m now getting the mounting of the mapped drive to work, so that I can run the ClusterNode.exe with the db path set to that instance’s specific folder on the mapped drive.

So, the idea is that I will get a cluster like this. Now, obviously, unless these instances are on different racks, this clustering won’t ensure any HA. But I’m getting there.

Nevermind the http and tcp. It’s just that I don’t know yet if I’ll get the gossip to work between the es-instances on the three worker role-instances.

I thought this could be a way to run es in azure. I might be doing it awkwardly, I won’t mind if you tell me so :slight_smile:

"Nevermind the http and tcp. It's just that I don't know yet if I'll
get the gossip to work between the es-instances on the three worker
role-instances."

Why would gossip not work but tcp would?

Well, that’s my bad, the seeding uses tcp endpoints too right?
I guess it should work then, I overcomplicated the problems.

Anyway, if I get this File Service + worker role + ES setup to work, I know for sure I will be happy :slight_smile:

No, seeding/gossip uses http. I just fail to see why TCP would work and
http would not for some reason?
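
To make the seeding concrete, here is a minimal sketch that builds the command-line arguments for one node. It assumes the option names `--db`, `--int-ip`, `--ext-ip`, `--int-http-port`, `--cluster-size`, `--discover-via-dns` and `--gossip-seed` match your ES version (check them against the node's help output), and that seeds point at the other nodes' internal HTTP port. The IP addresses are hypothetical and `cluster_node_args` is not part of EventStore.

```python
def cluster_node_args(own_ip, all_ips, db_path, int_http_port=2112):
    """Build ClusterNode.exe arguments that seed gossip with the other nodes' HTTP endpoints."""
    peers = [ip for ip in all_ips if ip != own_ip]
    seeds = ",".join(f"{ip}:{int_http_port}" for ip in peers)
    return [
        "--db", db_path,
        "--int-ip", own_ip,
        "--ext-ip", own_ip,
        f"--int-http-port={int_http_port}",
        f"--cluster-size={len(peers) + 1}",
        "--discover-via-dns=false",
        f"--gossip-seed={seeds}",
    ]


# Hypothetical internal addresses of the three worker role instances.
ips = ["10.0.0.4", "10.0.0.5", "10.0.0.6"]
print(" ".join(cluster_node_args("10.0.0.5", ips, r"Z:\ClusterTwo\db")))
```

Each node leaves itself out of its own seed list, so the same function can be called on every instance with the full list of addresses.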

Mm… I had to dig back in the conversation to see where things went wrong. I think this is where I left something out, triggering that question of yours.
“The gossip part though… How will ES on 3 instances get to gossip? There is the TCP direct connection, but the gossip is with http only I think…”

… should be:

“… There is the TCP direct connection [between worker role instances] …”

I’m totally new with ES, so I’m not telling the one who built it how it works :smiley: I’m just trying to figure out how it could be setup on WR-instances of the same role, and get the gossip going.

To me it seems like a hurdle that the gossip uses http, if the communication between roles is with tcp. I don’t know what ace you have up your sleeve though, because it almost seems like there is a dead easy solution to this that you know about, but aren’t telling :stuck_out_tongue:

I don't know worker roles at all but if you can get the tcp
communications working between nodes I don't see what the issue would
be with the http communications?

We’ll see how it goes.

I’ve gotten the File Service working, and installed ES on LocalStorage on the instance while putting db in the file service share (mapped to z:\ )

I’ve set up the worker role InternalEndpoint to 2113.

What I get is this:

[02676,01,16:52:46.142]
ES VERSION: 3.0.1.0 (release-oss-v3.0.1/7fa876c111888dd5980dbd86d126e6abe13b05ab, Thu, 23 Oct 2014 22:27:04 +0100)
OS: Windows (Microsoft Windows NT 6.2.9200.0)
RUNTIME: .NET 4.0.30319.34014 (64-bit)
GC: 3 GENERATIONS
LOGS: Z:\ClusterOne\logs

HELP: False ()
VERSION: False ()
LOG: Z:\ClusterOne\logs (Command Line)
CONFIG: ()
DEFINES: ()
WHAT IF: False ()
INT IP: 127.0.0.1 ()
EXT IP: 127.0.0.1 ()
INT HTTP PORT: 2112 ()
EXT HTTP PORT: 2113 ()
INT TCP PORT: 1112 ()
INT SECURE TCP PORT: 0 ()
EXT TCP PORT: 1113 ()
EXT SECURE TCP PORT: 0 ()
INT TCP HEARTBEAT TIMEOUT: 700 ()
EXT TCP HEARTBEAT TIMEOUT: 1000 ()
INT TCP HEARTBEAT INTERVAL: 700 ()
EXT TCP HEARTBEAT INTERVAL: 2000 ()
FORCE: False ()
CLUSTER SIZE: 1 ()
NODE PRIORITY: 0 ()
MIN FLUSH DELAY MS: 2 ()
COMMIT COUNT: -1 ()
PREPARE COUNT: -1 ()
ADMIN ON EXT: True ()
STATS ON EXT: True ()
GOSSIP ON EXT: True ()
DISABLE SCAVENGE MERGING: False ()
DISCOVER VIA DNS: True ()
CLUSTER DNS: fake.dns ()
CLUSTER GOSSIP PORT: 30777 ()
GOSSIP SEED: ()
STATS PERIOD SEC: 30 ()
CACHED CHUNKS: -1 ()
CHUNKS CACHE SIZE: 536871424 ()
MAX MEM TABLE SIZE: 1000000 ()
DB: Z:\ClusterOne\db (Command Line)
MEM DB: False ()
SKIP DB VERIFY: False ()
RUN PROJECTIONS: System ()
PROJECTION THREADS: 3 ()
WORKER THREADS: 5 ()
HTTP PREFIXES: ()
ENABLE TRUSTED AUTH: False ()
CERTIFICATE STORE LOCATION: ()
CERTIFICATE STORE NAME: ()
CERTIFICATE SUBJECT NAME: ()
CERTIFICATE THUMBPRINT: ()
CERTIFICATE FILE: ()
CERTIFICATE PASSWORD: ()
USE INTERNAL SSL: False ()
SSL TARGET HOST: n/a ()
SSL VALIDATE SERVER: True ()
AUTHENTICATION TYPE: internal ()
PREPARE TIMEOUT MS: 2000 ()
COMMIT TIMEOUT MS: 2000 ()
UNSAFE DISABLE FLUSH TO DISK: False ()
GOSSIP INTERVAL MS: 1000 ()
GOSSIP ALLOWED DIFFERENCE MS: 60000 ()
GOSSIP TIMEOUT MS: 500 ()

[02676,01,16:52:46.454] Quorum size set to 1
[02676,01,16:52:46.454] Can’t find plugins path: C:\Resources\directory\ae44860629ce410cb4114efd6d9fa7a1..EventStore.StartupLocalStorage\plugins
[02676,01,16:52:46.470]
INSTANCE ID: 44138522-d284-4289-98f3-477562e60c8a
DATABASE: Z:\ClusterOne\db
WRITER CHECKPOINT: 0 (0x0)
CHASER CHECKPOINT: 0 (0x0)
EPOCH CHECKPOINT: -1 (0xFFFFFFFFFFFFFFFF)
TRUNCATE CHECKPOINT: -1 (0xFFFFFFFFFFFFFFFF)

[02676,01,16:52:46.720] MessageHierarchy initialization took 00:00:00.1285538.
[02676,01,16:52:46.986] CACHED TFChunk #0-0 (chunk-000000.000000) in 00:00:00.0076401.
[02676,01,16:52:47.454] Starting MiniWeb for /web/es/js/projections ==> C:\Resources\directory\ae44860629ce410cb4114efd6d9fa7a1..EventStore.StartupLocalStorage\projections
[02676,01,16:52:47.454] Starting MiniWeb for /web/es/js/projections/v8/Prelude ==> C:\Resources\directory\ae44860629ce410cb4114efd6d9fa7a1..EventStore.StartupLocalStorage\Prelude
[02676,01,16:52:47.454] Starting MiniWeb for /web/es/js/projections/resources ==> C:\Resources\directory\ae44860629ce410cb4114efd6d9fa7a1..EventStore.StartupLocalStorage\web-resources\js
[02676,01,16:52:47.454] Binding MiniWeb to /web/es/js/projections/{*remaining_path}
[02676,01,16:52:47.454] Binding MiniWeb to /web/es/js/projections/v8/Prelude/{*remaining_path}
[02676,01,16:52:47.454] Binding MiniWeb to /web/es/js/projections/resources/{*remaining_path}
[02676,01,16:52:47.501] Starting MiniWeb for /web ==> C:\Resources\directory\ae44860629ce410cb4114efd6d9fa7a1..EventStore.StartupLocalStorage\clusternode-web
[02676,01,16:52:47.501] Binding MiniWeb to /web/{*remaining_path}
[02676,01,16:52:47.517] Starting MiniWeb for /web/users ==> C:\Resources\directory\ae44860629ce410cb4114efd6d9fa7a1..EventStore.StartupLocalStorage\Users\web
[02676,01,16:52:47.517] Binding MiniWeb to /web/users/{*remaining_path}
[02676,10,16:52:47.564] ========== [127.0.0.1:2112] SYSTEM INIT…
Exiting with exit code: 1.
Exit reason: Http async server failed to start listening at [http://127.0.0.1:2113/].
[02676,11,16:52:47.626] TableIndex initialization…
[02676,11,16:52:47.673] ReadIndex building…
[02676,11,16:52:47.673] ReadIndex rebuilding done: total processed 0 records, time elapsed: 00:00:00.
[02676,10,16:52:47.736] SLOW BUS MSG [MainBus]: SystemInit - 78ms. Handler: StorageChaser.
[02676,10,16:52:47.799] Starting Normal TCP listening on TCP endpoint: 127.0.0.1:1113.
[02676,10,16:52:47.829] Starting HTTP server on [http://127.0.0.1:2113/]…
[02676,10,16:52:47.845] Attempting to add permissions for http://127.0.0.1:2113/ using netsh http add urlacl url=http://127.0.0.1:2113/ user=“WORKGROUP\RD00155D621ECF$”
[02676,10,16:52:48.720] Retrying HTTP server on [http://127.0.0.1:2113/]…
[02676,10,16:52:48.720] Failed to start http server
Access is denied
[02676,10,16:52:48.767] Exiting with exit code: 1.
Exit reason: Http async server failed to start listening at [http://127.0.0.1:2113/].
[02676,06,16:52:49.126] SLOW QUEUE MSG [MonitoringQueue]: SystemInit - 1125ms. Q: 0/0.
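
The option dump in the log above shows, in parentheses, where each value came from. A minimal sketch for picking out the explicitly set options, assuming the `NAME: value (source)` layout seen in this log and assuming that empty parentheses mean a default; `parse_option_line` is a hypothetical helper:

```python
import re

# "NAME: value (source)"; empty parentheses are treated here as "default" (an assumption).
_OPTION = re.compile(r"^(?P<name>[A-Z][A-Z ]*):\s*(?P<value>.*?)\s*\((?P<source>[^()]*)\)\s*$")


def parse_option_line(line):
    """Return (name, value, source) for one option line, or None if it does not match."""
    match = _OPTION.match(line.strip())
    if match is None:
        return None
    return match["name"], match["value"], match["source"] or "default"


sample = [
    r"INT IP: 127.0.0.1 ()",
    r"EXT HTTP PORT: 2113 ()",
    r"DB: Z:\ClusterOne\db (Command Line)",
]
for name, value, source in filter(None, map(parse_option_line, sample)):
    print(f"{name} = {value} [{source}]")
```

Listing the options this way makes it easy to see that only the log and db paths were set on the command line in this run.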

So, this is pretty good. I think it’s some permission issue hindering the http server setup.

No permissions to start a http server

ServiceDefinition.csdef - >

That’s it :slight_smile:

So, tomorrow I’ll start getting the other nodes up.

Questions to solve:

  • Graceful shutdown

  • Proper reinstallation upon recycling

  • Latency

among other things :slight_smile:

Graceful shutdown: we have made it a point to not depend on this.

True, I’ll try not to depend on that.

How would I query each instance to know if it is master?
Or better yet, can I eavesdrop the gossip to always stay updated on which one is the master?
I guess that’s what the Manager is for. But without it then?

Ok, so, really dumb question here…:

When connecting to a cluster, it says it needs to be done with a URI like this:
discover://user:password@myserver:1234

so, discover://user:[the password for my server]:1234
or discover://user:[password]@[myserver]:1234

and which server are we talking about here?
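
Structurally, that connection string follows the usual URI layout of `scheme://user:password@host:port`, which points to the second reading. A minimal sketch that splits it with Python's standard library (the values are just the placeholders from the example, and which server to name is not settled here):

```python
from urllib.parse import urlsplit

parts = urlsplit("discover://user:password@myserver:1234")

# The text before "@" is the user info, the text after it is host and port.
print(parts.scheme, parts.username, parts.password, parts.hostname, parts.port)
```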

"Or better yet, can I eavesdrop the gossip to always stay updated on
which one is the master?"

http GET /gossip

Cheers,

Greg
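
A minimal sketch of finding the master from the `/gossip` response. It assumes the JSON has a `members` list whose entries carry `state`, `isAlive`, `externalHttpIp` and `externalHttpPort` fields; those field names, the sample values and the helper `find_master` are assumptions to verify against what your node actually returns.

```python
import json


def find_master(gossip: dict):
    """Return the first live member reporting state "Master", or None."""
    for member in gossip.get("members", []):
        if member.get("state") == "Master" and member.get("isAlive", False):
            return member
    return None


# Made-up response body, shaped as assumed above; in practice it would come from GET /gossip.
body = json.loads("""
{"members": [
  {"state": "Slave",  "isAlive": true, "externalHttpIp": "10.0.0.4", "externalHttpPort": 2113},
  {"state": "Master", "isAlive": true, "externalHttpIp": "10.0.0.5", "externalHttpPort": 2113}
]}
""")
master = find_master(body)
print(master["externalHttpIp"], master["externalHttpPort"])
```

Polling any one node and applying a check like this would be one way to stay updated without running the Manager.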