Dead Nodes. Gossip Send failing.

I have a cluster of three nodes running on three Vagrant instances.

The nodes start and use DNS discovery to locate the IP addresses of the other nodes. All the nodes report as LIVE initially, then start to report as DEAD. Gossip send is failing but it’s unclear why.

The logs for the first node are at https://gist.github.com/stuartrexking/e23d9bd96b1f457c4be9 (the logs are the same for each node, as is the command used to start each one).

Thanks in advance.

Stu

The gossip port is set to 30777 by default; you need to change that to the external HTTP port (default is 2113).

Can you explain why I would need to do that? Everything else is default.

I made the change you suggested and I get the same result:

$ sudo eventstored --mem-db -log /var/log/eventstore.log --int-ip 172.20.20.10 --ext-ip 172.20.20.10 --cluster-size=3 --cluster-dns eventstore.service.consul --cluster-gossip-port=2113

logs: https://gist.github.com/stuartrexking/1ec7fa85bf5293a66564

From node 172.20.20.10, can you curl http://172.20.20.11:2113/gossip?
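
A minimal sketch of that check as a shell helper, so it can be repeated from each node against the others. The function names gossip_url and check_gossip are made up for illustration; only the /gossip path and the addresses come from this thread, and the port is passed in rather than assumed.

```shell
#!/bin/sh
# Build the gossip URL for one node (the /gossip path is the one used in this thread).
# $1 = node IP, $2 = HTTP port
gossip_url() {
    printf 'http://%s:%s/gossip\n' "$1" "$2"
}

# Query the gossip endpoint of each listed node on the given port
# and report whether it answered.
# Usage: check_gossip PORT IP [IP...]
check_gossip() {
    port=$1
    shift
    for ip in "$@"; do
        url=$(gossip_url "$ip" "$port")
        # -s: no progress output, -f: treat HTTP errors as failure,
        # --max-time: give up after 3 seconds
        if curl -sf --max-time 3 "$url" >/dev/null; then
            echo "$ip: reachable"
        else
            echo "$ip: unreachable"
        fi
    done
}
```

For example, run `check_gossip 2113 172.20.20.11` on 172.20.20.10. A "reachable" result only shows that the HTTP endpoint answers; the response body still has to be read to see how each member is reported.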

Basically, your nodes are not talking to each other. They are not "working then not working"; they are just not talking to each other.

My apologies, you need to set the Cluster Gossip Port to the Internal HTTP Port (default: 2112).
The nodes need to chat amongst themselves. They “gossip” over HTTP, which means each node needs an IP address and a port to reach the others on.

By default the Cluster Gossip Port is 30777, which is the manager port. You don’t have a manager in your setup, so you need to specify the port that the nodes will gossip over.
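
Applied to the start command posted earlier in this thread, that would look roughly like the following. It is the same command with only the gossip port changed to 2112; every other flag and value is copied as posted and has not been verified here.

```shell
# Same command as posted above for node 172.20.20.10,
# with the gossip port pointed at the internal HTTP port (2112).
sudo eventstored --mem-db -log /var/log/eventstore.log \
    --int-ip 172.20.20.10 --ext-ip 172.20.20.10 \
    --cluster-size=3 --cluster-dns eventstore.service.consul \
    --cluster-gossip-port=2112
```

The other two nodes would use the same port with their own addresses in --int-ip and --ext-ip.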

Yes, I can (https://gist.github.com/stuartrexking/8bf22720ac81e8a637c7), although the nodes are reported with isAlive false.

Stu

Thanks Pieter. That makes sense and solves the issue.