Slave not rejoining cluster after reboot

eol · January 11, 2022, 10:07am

Hi,

I have a problem with my 3-node cluster. It has been running well for a while now, but after I rebooted one of my slave nodes, it won’t rejoin the cluster.

I’ve checked my configs, and nothing has changed. I’ve rebooted the nodes separately before, and they usually just rejoin automatically. The cluster is set up with gossip seeds.

Any help would be much appreciated, thanks!

On my rebooted slave I have the following in my logs:

[PID:02292:012 2022.01.10 12:11:48.902 TRACE GossipServiceBase   ] Looks like node [{other_slave}:30777] is DEAD (Gossip send failed).
[PID:02292:012 2022.01.10 12:11:48.902 TRACE GossipServiceBase   ] CLUSTER HAS CHANGED (gossip send failed to [{other_slave}:30777])
[PID:02292:012 2022.01.10 12:11:48.902 TRACE GossipServiceBase   ] Old:
[PID:02292:012 2022.01.10 12:11:48.902 TRACE GossipServiceBase   ] MAN {00000000-0000-0000-0000-000000000000} <LIVE> [Manager, {master}:30777, {master}:30777] | 2022-01-10 12:11:48.042
[PID:02292:012 2022.01.10 12:11:48.902 TRACE GossipServiceBase   ] MAN {cc602019-84a4-4d95-beed-b69a1032a938} <LIVE> [Manager, {rebooted_slave}:30777,  {rebooted_slave}:30778] | 2022-01-10 12:11:48.870
[PID:02292:012 2022.01.10 12:11:48.902 TRACE GossipServiceBase   ] MAN {00000000-0000-0000-0000-000000000000} <LIVE> [Manager, {other_slave}:30777, {other_slave}:30777] | 2022-01-10 12:11:48.042
[PID:02292:012 2022.01.10 12:11:48.902 TRACE GossipServiceBase   ] New:
[PID:02292:012 2022.01.10 12:11:48.902 TRACE GossipServiceBase   ] MAN {00000000-0000-0000-0000-000000000000} <LIVE> [Manager, {master}:30777, {master}:30777] | 2022-01-10 12:11:48.042
[PID:02292:012 2022.01.10 12:11:48.902 TRACE GossipServiceBase   ] MAN {cc602019-84a4-4d95-beed-b69a1032a938} <LIVE> [Manager,  {rebooted_slave}:30777,  {rebooted_slave}:30778] | 2022-01-10 12:11:48.870
[PID:02292:012 2022.01.10 12:11:48.902 TRACE GossipServiceBase   ] MAN {00000000-0000-0000-0000-000000000000} <DEAD> [Manager, {other_slave}:30777, {other_slave}:30777] | 2022-01-10 12:11:48.902
[PID:02292:012 2022.01.10 12:11:48.902 TRACE GossipServiceBase   ] --------------------------------------------------------------------------------
[PID:02292:012 2022.01.10 12:11:49.041 TRACE GossipServiceBase   ] Looks like node [{master}:30777] is DEAD (Gossip send failed).
[PID:02292:012 2022.01.10 12:11:49.041 TRACE GossipServiceBase   ] CLUSTER HAS CHANGED (gossip send failed to [{master}:30777])
[PID:02292:012 2022.01.10 12:11:49.041 TRACE GossipServiceBase   ] Old:
[PID:02292:012 2022.01.10 12:11:49.041 TRACE GossipServiceBase   ] MAN {00000000-0000-0000-0000-000000000000} <LIVE> [Manager, {master}:30777, {master}:30777] | 2022-01-10 12:11:48.042
[PID:02292:012 2022.01.10 12:11:49.041 TRACE GossipServiceBase   ] MAN {cc602019-84a4-4d95-beed-b69a1032a938} <LIVE> [Manager,  {rebooted_slave}:30777,  {rebooted_slave}:30778] | 2022-01-10 12:11:48.979
[PID:02292:012 2022.01.10 12:11:49.041 TRACE GossipServiceBase   ] MAN {00000000-0000-0000-0000-000000000000} <DEAD> [Manager, {other_slave}:30777, {other_slave}:30777] | 2022-01-10 12:11:48.902
[PID:02292:012 2022.01.10 12:11:49.041 TRACE GossipServiceBase   ] New:
[PID:02292:012 2022.01.10 12:11:49.041 TRACE GossipServiceBase   ] MAN {00000000-0000-0000-0000-000000000000} <DEAD> [Manager,{master}:30777, {master}:30777] | 2022-01-10 12:11:49.041
[PID:02292:012 2022.01.10 12:11:49.041 TRACE GossipServiceBase   ] MAN {cc602019-84a4-4d95-beed-b69a1032a938} <LIVE> [Manager,  {rebooted_slave}:30777,  {rebooted_slave}:30778] | 2022-01-10 12:11:48.979
[PID:02292:012 2022.01.10 12:11:49.041 TRACE GossipServiceBase   ] MAN {00000000-0000-0000-0000-000000000000} <DEAD> [Manager, {other_slave}:30777, {other_slave}:30777] | 2022-01-10 12:11:48.902
[PID:02292:012 2022.01.10 12:11:49.041 TRACE GossipServiceBase   ] --------------------------------------------------------------------------------
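
Since every entry above ends in "Gossip send failed", one thing that can be ruled out first is plain network reachability from the rebooted slave to the other two nodes. Below is a minimal sketch of such a check, assuming gossip listens on TCP port 30777 as the log lines suggest. The function names `can_connect` and `check_peers` are made up for illustration, and the host names in the usage note are placeholders, not my real addresses. A successful connect only shows the port is open; it does not prove gossip itself is healthy.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def check_peers(peers, timeout: float = 2.0) -> dict:
    """Map each (host, port) pair to True/False depending on reachability."""
    return {(host, port): can_connect(host, port, timeout) for host, port in peers}
```

Run on the rebooted slave, something like `check_peers([("master.example", 30777), ("other-slave.example", 30777)])` would return a dict with one True/False entry per peer. If both come back False, a firewall rule or a changed network interface after the reboot seems more likely than a cluster config problem.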