Azure & ClusterNode

All,

I am trying to set up EventStore.ClusterNode.exe in three geographically dispersed regions (i.e. East US, West US, Northern Europe) within Azure, but I am having problems.

Has anyone been able to get something like this to work?

I am able to get a cluster to work if I set up a VPN within Azure, but that ties all the nodes to the same datacenter and affinity group.

I think the issue is around how ES does gossip. I am assuming that gossip looks at the --int-ip (or is it --ext-ip?) and puts that in the gossip message. If there were a way to set the “gossip” IP address to something that isn’t a local (bindable) IP address, I think I could get it to work. When Azure provisions a virtual machine, it gives it a local IP address and a separate external IP address. For instance, one of my virtual machines has a local/bindable IP address of 100.67.x.x and a public (not bindable) IP address of 23.96.x.x.

Here is the configuration that I have set up [extra line breaks for readability]:

Virtual Machine: xxxxx-east.cloudapp.net -> internal: 100.67.118.x, external: 23.96.55.x

EventStore.ClusterNode.exe --db c:\data\es\db --log c:\data\es\log
--int-ip 100.67.118.x
--ext-ip 100.67.118.x
--int-tcp-port=1000
--ext-tcp-port=2000
--int-http-port=1001
--ext-http-port=2001
--nodes-count 3
--use-dns-discovery-
--gossip-seed 23.99.16.x:1001
--gossip-seed 191.235.133.x:1001

Virtual Machine: xxxxx-west.cloudapp.net -> internal: 100.71.172.x, external: 23.99.16.x

EventStore.ClusterNode.exe --db c:\data\es\db --log c:\data\es\log
--ext-ip 100.71.172.x
--ext-ip 100.71.172.x
--int-tcp-port=1000
--ext-tcp-port=2000
--int-http-port=1001
--ext-http-port=2001
--nodes-count 3
--use-dns-discovery-
--gossip-seed 23.96.55.x:1001
--gossip-seed 191.235.133.x:1001

Virtual Machine: xxxxx-northeurope.cloudapp.net -> internal: 100.92.38.x, external: 191.235.133.x

EventStore.ClusterNode.exe --db c:\data\es\db --log c:\data\es\log
--ext-ip 100.92.38.x
--ext-ip 100.92.38.x
--int-tcp-port=1000
--ext-tcp-port=2000
--int-http-port=1001
--ext-http-port=2001
--nodes-count 3
--use-dns-discovery-
--gossip-seed 23.96.55.x:1001
--gossip-seed 23.99.16.x:1001

Here is an example of what I am seeing in the console on all of the virtual machines (over and over again):

[02216,11,22:37:12.048] ELECTIONS: (V=0) TIMED OUT! (S=ElectingLeader, M=).
[02216,11,22:37:12.048] ELECTIONS: (V=1) SHIFT TO LEADER ELECTION.
[02216,11,22:37:12.048] ELECTIONS: (V=1) VIEWCHANGE FROM [100.92.38.57:1001, {6c184f7c-fce4-40ff-a7f0-e7f59b011d31}].
[02216,11,22:37:13.064] ELECTIONS: (V=1) TIMED OUT! (S=ElectingLeader, M=).
[02216,11,22:37:13.064] ELECTIONS: (V=2) SHIFT TO LEADER ELECTION.
[02216,11,22:37:13.064] ELECTIONS: (V=2) VIEWCHANGE FROM [100.92.38.57:1001, {6c184f7c-fce4-40ff-a7f0-e7f59b011d31}].
[02216,11,22:37:14.079] ELECTIONS: (V=2) TIMED OUT! (S=ElectingLeader, M=).
[02216,11,22:37:14.079] ELECTIONS: (V=3) SHIFT TO LEADER ELECTION.
[02216,11,22:37:14.079] ELECTIONS: (V=3) VIEWCHANGE FROM [100.92.38.57:1001, {6c184f7c-fce4-40ff-a7f0-e7f59b011d31}].
[02216,11,22:37:15.079] ELECTIONS: (V=3) TIMED OUT! (S=ElectingLeader, M=).
[02216,11,22:37:15.079] ELECTIONS: (V=4) SHIFT TO LEADER ELECTION.
[02216,11,22:37:15.079] ELECTIONS: (V=4) VIEWCHANGE FROM [100.92.38.57:1001, {6c184f7c-fce4-40ff-a7f0-e7f59b011d31}].

Thanks,

Ryan

On two of them you’re setting --ext-ip twice and never setting --int-ip? That would default the internal IP to loopback.

While this appears to be a configuration issue, we should really add alias support to gossip as well, like we already support for the external interface.

Hey James,

I made the corrections (actually, it was correct on the server, just a bad local copy). I had shut down the VMs over the weekend, so now they have new IP addresses (see below). I have adjusted the command lines for the new IP addresses. I am still seeing the issue I wrote about in my prior post.

Here are the server IP addresses that are assigned by Azure:

EAST
External: 23.96.53.154
Internal: 100.67.202.11

WEST
External: 23.99.16.128
Internal: 100.71.24.65

NE
External: 191.235.129.114
Internal: 100.92.36.107

Here is an unedited version of each command line [line breaks for readability]:

EAST:
EventStore.ClusterNode.exe
--db c:\data\es\db
--log c:\data\es\log
--int-ip 100.67.202.11
--ext-ip 100.67.202.11
--int-tcp-port=1000
--ext-tcp-port=2000
--int-http-port=1001
--ext-http-port=2001
--nodes-count 3
--use-dns-discovery-
--gossip-seed 23.99.16.128:1001
--gossip-seed 191.235.129.114:1001

WEST:
EventStore.ClusterNode.exe
--db c:\data\es\db
--log c:\data\es\log
--int-ip 100.71.24.65
--ext-ip 100.71.24.65
--int-tcp-port=1000
--ext-tcp-port=2000
--int-http-port=1001
--ext-http-port=2001
--nodes-count 3
--use-dns-discovery-
--gossip-seed 23.96.53.154:1001
--gossip-seed 100.92.36.107:1001

NE:
EventStore.ClusterNode.exe
--db c:\data\es\db
--log c:\data\es\log
--int-ip 100.92.36.107
--ext-ip 100.92.36.107
--int-tcp-port=1000
--ext-tcp-port=2000
--int-http-port=1001
--ext-http-port=2001
--nodes-count 3
--use-dns-discovery-
--gossip-seed 23.96.53.154:1001
--gossip-seed 23.99.16.128:1001
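
With this many addresses in play, a small script can cross-check the seeds. The sketch below is only a sanity check, not anything Event Store ships: the addresses and seeds are copied from the command lines above, while the `NODES` layout and the `check_seeds` helper are made up for illustration. It assumes each seed is meant to be another node's external address and reports any seed that is not.

```python
# Addresses and gossip seeds copied from the command lines above.
# The dictionary layout and helper name are illustrative assumptions.
NODES = {
    "EAST": {"external": "23.96.53.154", "internal": "100.67.202.11",
             "seeds": ["23.99.16.128:1001", "191.235.129.114:1001"]},
    "WEST": {"external": "23.99.16.128", "internal": "100.71.24.65",
             "seeds": ["23.96.53.154:1001", "100.92.36.107:1001"]},
    "NE": {"external": "191.235.129.114", "internal": "100.92.36.107",
           "seeds": ["23.96.53.154:1001", "23.99.16.128:1001"]},
}


def check_seeds(nodes):
    """Return (node, seed, problem) for each seed that is not another node's external IP."""
    external = {cfg["external"]: name for name, cfg in nodes.items()}
    internal = {cfg["internal"]: name for name, cfg in nodes.items()}
    problems = []
    for name, cfg in nodes.items():
        for seed in cfg["seeds"]:
            host = seed.rsplit(":", 1)[0]  # strip the port
            if host in internal:
                problems.append((name, seed, f"internal address of {internal[host]}"))
            elif host not in external:
                problems.append((name, seed, "unknown address"))
            elif external[host] == name:
                problems.append((name, seed, "points at itself"))
    return problems


for node, seed, problem in check_seeds(NODES):
    print(f"{node}: {seed} -> {problem}")
```

On the values above this reports that WEST's second seed is NE's internal address rather than its external one. That may just be another copy/paste slip, and fixing it would not necessarily resolve the election timeouts.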

Hey Guys,

Is there another way I could configure things (see below) to make them work in Azure? Or do I need to wait for (or contribute) “alias” support as Greg mentioned in a previous post to this thread?

Thanks,

Ryan

Hi Ryan,

It does look that way. At the moment it appears you’d need to be bound to the same address in order to make it work. I don’t think that’s actually a very significant change; we’ll discuss and get back to you.

Cheers,

James

I think that the key is you wouldn’t be binding it (i.e. TCP binding) at all. The alias would be the IP address (maybe even a URL) that can be used to communicate with the external port. This allows for the cases where you have a firewall in front of every instance and you are forced to communicate through that firewall between instances.

Also, when the services gossip, they would gossip this alias instead of the external port.
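
To make the idea concrete, here is a minimal sketch of the proposed behaviour, not of how Event Store actually implements gossip. The `Endpoint` and `NodeConfig` types and the `advertise_as` field are hypothetical names; the addresses are the EAST node's internal and external IPs from earlier in the thread. The node keeps binding to its local address but publishes the alias, when one is set, in its gossip messages.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Endpoint:
    host: str
    port: int


@dataclass(frozen=True)
class NodeConfig:
    bind: Endpoint                      # address the process actually listens on
    advertise_as: Optional[str] = None  # hypothetical alias: public IP or host name

    def gossip_endpoint(self) -> Endpoint:
        """Endpoint this node would publish in its gossip messages."""
        host = self.advertise_as or self.bind.host
        return Endpoint(host, self.bind.port)


# EAST binds to its Azure-internal address but advertises its public one.
east = NodeConfig(bind=Endpoint("100.67.202.11", 1001), advertise_as="23.96.53.154")
# Without an alias the node falls back to advertising the bound address.
plain = NodeConfig(bind=Endpoint("100.67.202.11", 1001))

print(east.gossip_endpoint())
print(plain.gossip_endpoint())
```

The alias is never bound; it only changes what other nodes are told to connect to.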

Ryan

Yes, exactly. There is a very similar thing on the external IP for aliasing, for when it’s, say, an Elastic IP; otherwise messages don’t deliver properly.

There is a pull request changing the command line arguments. Let me look through all of that first, as it would conflict with this (this needs args added, as they were added for external HTTP).

Is that #89 “Move EventStore Options over to PowerArgs”?

Ryan

Yes. That’s a long-running one, though, so maybe it’s best for us to add this and then comment on it.

We are facing exactly the same issue when we try to launch an Event Store cluster in Azure. Do we have a fix for this? Please let us know.

I have not had a chance to try this out, but the new VNET-VNET networking configuration that Azure added a little while back might solve the problem:

http://msdn.microsoft.com/en-us/library/azure/dn690122.aspx

Basically, you would set up a VNET in each region, connect the VNETs to each other, then connect the Event Store nodes using the private network that the VMs are attached to (10.1.0.x, 10.2.0.x, etc.).

LMK if you run into issues setting up the networks. I have connected the VNETs, but have not connected Event Store nodes on the VNETs.
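
As a rough, untested sketch of how the node arguments might look with the VNET approach: the /24 ranges, the 10.x.0.4 host addresses, and the helper names below are all assumptions for illustration; only the flag names and port 1001 come from the command lines earlier in the thread. The first helper checks that the address ranges do not overlap (which, as far as I know, connected VNETs require); the second builds each node's IP and seed arguments from the private addresses.

```python
import ipaddress

# Assumed address ranges and VM addresses, one VNET per region (illustrative only).
VNETS = {"EAST": "10.1.0.0/24", "WEST": "10.2.0.0/24", "NE": "10.3.0.0/24"}
PRIVATE_IPS = {"EAST": "10.1.0.4", "WEST": "10.2.0.4", "NE": "10.3.0.4"}
GOSSIP_PORT = 1001  # the internal HTTP port used for the seeds earlier in the thread


def vnets_overlap(vnets):
    """True if any two of the given CIDR ranges share addresses."""
    nets = [ipaddress.ip_network(cidr) for cidr in vnets.values()]
    return any(a.overlaps(b) for i, a in enumerate(nets) for b in nets[i + 1:])


def node_args(name, ips, port=GOSSIP_PORT):
    """IP and seed arguments for one node, using only private addresses."""
    args = ["--int-ip", ips[name], "--ext-ip", ips[name]]
    for other, ip in ips.items():
        if other != name:
            args += ["--gossip-seed", f"{ip}:{port}"]
    return args


if vnets_overlap(VNETS):
    raise SystemExit("VNET address ranges overlap")
for name in PRIVATE_IPS:
    print(name, " ".join(node_args(name, PRIVATE_IPS)))
```

Because every node then binds to and advertises an address the others can route to, no alias should be needed, provided the VNET-to-VNET links carry the cluster's ports.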