Is there a clever way to set up clustered eventstore with nginx on docker?

I’m playing around with eventstore clustering, docker and nginx to teach myself some more docker and nginx. In the end I want to have an eventstore cluster running in a docker swarm using compose, but I’m starting with a single host since linking between hosts isn’t supported in docker compose at the moment.
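
For context, the single-host layout I have in mind looks roughly like the sketch below. This is a hypothetical compose v1 file, not the one from my repo — the build paths and exposed ports are assumptions:

```yaml
# Hypothetical docker-compose.yml sketch: three eventstore nodes plus nginx
# on one host. Build contexts, ports and service names are assumptions.
esnode1:
  build: ./eventstore
  ports:
    - "2113:2113"   # expose one node's HTTP port for poking at it directly
esnode2:
  build: ./eventstore
esnode3:
  build: ./eventstore
nginx:
  build: ./nginx
  ports:
    - "80:80"
  links:            # compose v1 links; nginx can resolve the nodes by name
    - esnode1
    - esnode2
    - esnode3
```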

I’m sort of new to nginx as well, so I might be doing multiple things wrong, but I do get a single node running. The problem is that when I try to create the cluster, I can’t get the nodes to find each other. My plan was to set up nginx as a reverse proxy in front of the three nodes and then just use the IP of the load balancer as the DNS name in the node configuration. The code as it looks now can be seen here: https://github.com/mastoj/dockerplayground/tree/14f15a4a54809406cfe267a6589e6e27e0c537a8

If anyone has any suggestions on how I can get that cluster running I would be really glad.

Thanks,

–Tomas

We don’t recommend running in Docker - it gains you nothing for a database (unless you happen to be using Linux-branded zones on Joyent’s cloud). Can you tell us more about what you’re trying to achieve with this?

Nginx just needs configuring as a reverse proxy; there are examples for Varnish in the docs which you can probably convert.
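
As a rough illustration of what "configuring as a reverse proxy" means here, a minimal nginx sketch might look like this. The upstream hostnames are placeholders, and note this only proxies the external HTTP port — gossip between nodes happens on the internal HTTP/TCP ports and won't pass through this proxy:

```nginx
# Minimal sketch: nginx reverse-proxying EventStore's external HTTP port.
# Hostnames/ports are assumptions; adjust to your container names.
upstream eventstore {
    server esnode1:2113;
    server esnode2:2113;
    server esnode3:2113;
}

server {
    listen 80;

    location / {
        proxy_pass http://eventstore;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```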

As of now I’m just playing with it to learn more about nginx, docker and eventstore clustering, but why don’t you recommend using docker? I see some advantages, like a uniform setup across all environments in a simple way, simple configuration, and fun :). Is there a performance hit that you don’t like? It most definitely is useful for setting up development environments, at least.

I did try to set nginx up as a reverse proxy. It worked fine for one node, but when I add more nodes they fail to discover each other. The nginx config is this one: https://github.com/mastoj/dockerplayground/blob/master/nginx/nginx.conf and the command to run the nodes can be found in this dockerfile: https://github.com/mastoj/dockerplayground/blob/master/eventstore/Dockerfile

–Tomas

My guess is that the error is me not understanding 100% how nginx works. First I thought the problem was that I hadn’t set the internal IP, but that’s now fixed for all the nodes and they still can’t connect to each other. I also think it might be because I don’t have a proper DNS; instead I’m trying to use nginx to fill that gap, without luck. I get a “Gossip send failed” for all nodes, which I guess means they can’t find each other through the reverse proxy I’ve set up.

One more reason to use docker: it’s a good demonstration of what you need to set up a cluster.

If anyone has time to have a quick look it would be much appreciated: https://github.com/mastoj/dockerplayground

Some log messages I got when running:

esnode1_1 | [00011,11,21:43:38.843] Subscriptions received state change to Unknown stopping listening.
esnode1_1 | [00011,11,21:43:38.843] SLOW BUS MSG [PersistentSubscriptionsBus]: BecomeUnknown - 76ms. Handler: PersistentSubscriptionService.
esnode1_1 | [00011,24,21:43:38.855] ELECTIONS: STARTING ELECTIONS.
esnode1_1 | [00011,24,21:43:38.855] ELECTIONS: (V=0) SHIFT TO LEADER ELECTION.
esnode1_1 | [00011,24,21:43:38.855] ELECTIONS: (V=0) VIEWCHANGE FROM [172.17.0.107:2112, {f1297dca-308a-4f44-ba3b-102207b5220c}].
esnode1_1 | [00011,24,21:43:38.867] SLOW BUS MSG [MainBus]: StartElections - 63ms. Handler: ElectionsService.
esnode1_1 | [00011,24,21:43:38.867] SLOW QUEUE MSG [MainQueue]: StartElections - 63ms. Q: 1/3.
esnode1_1 | [00011,24,21:43:38.913] Looks like node [192.168.99.101:2113] is DEAD (Gossip send failed).
esnode1_1 | [00011,24,21:43:38.913] CLUSTER HAS CHANGED (gossip send failed to [192.168.99.101:2113])
esnode1_1 | Old:
esnode1_1 | MAN {00000000-0000-0000-0000-000000000000} [Manager, 192.168.99.101:2113, 192.168.99.101:2113] | 2015-10-06 21:43:38.713
esnode1_1 | VND {f1297dca-308a-4f44-ba3b-102207b5220c} [Unknown, 172.17.0.107:1112, n/a, 172.17.0.107:1113, n/a, 172.17.0.107:2112, 172.17.0.107:2113] 7520503/7521001/7521001/E13@7504579:{f6efef66-e455-4b23-af0f-926eb97750b0} | 2015-10-06 21:43:38.894
esnode1_1 | New:
esnode1_1 | MAN {00000000-0000-0000-0000-000000000000} [Manager, 192.168.99.101:2113, 192.168.99.101:2113] | 2015-10-06 21:43:38.913
esnode1_1 | VND {f1297dca-308a-4f44-ba3b-102207b5220c} [Unknown, 172.17.0.107:1112, n/a, 172.17.0.107:1113, n/a, 172.17.0.107:2112, 172.17.0.107:2113] 7520503/7521001/7521001/E13@7504579:{f6efef66-e455-4b23-af0f-926eb97750b0} | 2015-10-06 21:43:38.894
esnode1_1 | --------------------------------------------------------------------------------
esnode3_1 | [00012,24,21:43:39.287] ELECTIONS: (V=0) TIMED OUT! (S=ElectingLeader, M=).
esnode3_1 | [00012,24,21:43:39.287] ELECTIONS: (V=1) SHIFT TO LEADER ELECTION.
esnode3_1 | [00012,24,21:43:39.287] ELECTIONS: (V=1) VIEWCHANGE FROM [172.17.0.105:2112, {e367c130-01e1-4151-ab56-674b42a317a2}].
esnode2_1 | [00012,24,21:43:39.478] ELECTIONS: (V=0) TIMED OUT! (S=ElectingLeader, M=).
esnode2_1 | [00012,24,21:43:39.478] ELECTIONS: (V=1) SHIFT TO LEADER ELECTION.
esnode2_1 | [00012,24,21:43:39.478] ELECTIONS: (V=1) VIEWCHANGE FROM [172.17.0.106:2112, {72d6e50c-bb0a-410f-85dd-a6cccfaad619}].
esnode1_1 | [00011,24,21:43:39.871] ELECTIONS: (V=0) TIMED OUT! (S=ElectingLeader, M=).
esnode1_1 | [00011,24,21:43:39.871] ELECTIONS: (V=1) SHIFT TO LEADER ELECTION.
esnode1_1 | [00011,24,21:43:39.871] ELECTIONS: (V=1) VIEWCHANGE FROM [172.17.0.107:2112, {f1297dca-308a-4f44-ba3b-102207b5220c}].
esnode3_1 | [00012,24,21:43:40.289] ELECTIONS: (V=1) TIMED OUT! (S=ElectingLeader, M=).
esnode3_1 | [00012,24,21:43:40.289] ELECTIONS: (V=2) SHIFT TO LEADER ELECTION.
esnode3_1 | [00012,24,21:43:40.289] ELECTIONS: (V=2) VIEWCHANGE FROM [172.17.0.105:2112, {e367c130-01e1-4151-ab56-674b42a317a2}].

Docker IP forwarding is interesting and often requires more config.
Your errors show that the nodes are not talking to each other.

So you’re saying my setup should work if it weren’t running on docker?

I think the nodes can reach each other, but I’m not sure they can find each other, if that makes any sense?

Perhaps you should look into weave and https://github.com/jwilder/nginx-proxy :)

I have looked at those, but I don’t think they will help. They’re more of a help if you want to add and remove nodes dynamically, but I have a more or less static scenario at the moment that doesn’t work. Have you tried setting it up with nginx-proxy?

My guess is that it doesn’t work because I’m not using a true DNS, without knowing exactly how that works. But if I read the error log, it seems like the nodes fail to locate each other through DNS, and my only guess is that this is because an actual DNS lookup is done when DNS discovery is used. I can probably use statically configured IPs, but that is error prone and annoying if I later want to increase the size of the cluster. I might be able to use https://github.com/phensley/docker-dns, but I haven’t had the time to check it out yet.

If you’re going to use DNS instead of providing gossip seeds you need to register A records for each node’s IP address. You may also have to set up different advertising addresses since there is NAT all over the show.
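
The gossip-seed alternative James mentions can be sketched like this. This is an illustrative command only — the container IPs are taken from the logs above, and the flag names follow the Event Store 3.x command line, so check them against your version’s documentation:

```shell
# Sketch: start one node with explicit gossip seeds instead of DNS discovery.
# Each node lists the OTHER nodes' internal HTTP (gossip) endpoints.
./eventstored --int-ip=172.17.0.105 --ext-ip=172.17.0.105 \
  --cluster-size=3 \
  --discover-via-dns=false \
  --gossip-seed=172.17.0.106:2112,172.17.0.107:2112
```

With static seeds like these, no DNS A records are needed, but you give up the ability to grow the cluster without editing every node’s seed list.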

Thank you for clarifying how it works, James. That makes it a little more annoying to set up a cluster in a development environment using docker.

Is there a good reason why discovery of nodes isn’t done with broadcasting the way elasticsearch does it?

"Is there a good reason why discovery of nodes isn't done with broadcasting the way elasticsearch does it?"

Yes, since many environments don't allow routing of broadcasts.

Is there a good reason why discovery of nodes isn’t done with broadcasting the way elasticsearch does it?

AWS (and most other clouds using SDN as far as I’m aware) don’t support broadcast.

Why do you want a cluster running on a development machine out of interest? You’d be better served with a single node in all likelihood.