EventStore.Client does not connect to cluster

I’ve set up a super-simple test scenario just to try to get EventStore.Client connecting to an EventStore cluster, but no matter what I do, it’s just not working. Hopefully someone can help me put a finger on my issue.

First, I ran `docker network create esclienttest` to create a network to host my test on.

Then, I created the following docker-compose file, which sets up the EventStore cluster and my test program on that network:

```yaml
version: '3.4'

services:
  esclienttest:
    image: esclienttest
    build:
      context: .
      dockerfile: ESClientTest/Dockerfile
    depends_on:
      - eventstore1
      - eventstore2
      - eventstore3

  eventstore1:
    image: eventstore/eventstore
    hostname: eventstore1
    ports:
      - 2113:2113
    environment:
      EVENTSTORE_CLUSTER_DNS: eventstore1
      EVENTSTORE_CLUSTER_SIZE: 3
      EVENTSTORE_CLUSTER_GOSSIP_PORT: 2112

  eventstore2:
    image: eventstore/eventstore
    hostname: eventstore2
    environment:
      EVENTSTORE_CLUSTER_DNS: eventstore1
      EVENTSTORE_CLUSTER_SIZE: 3
      EVENTSTORE_CLUSTER_GOSSIP_PORT: 2112

  eventstore3:
    image: eventstore/eventstore
    hostname: eventstore3
    environment:
      EVENTSTORE_CLUSTER_DNS: eventstore1
      EVENTSTORE_CLUSTER_SIZE: 3
      EVENTSTORE_CLUSTER_GOSSIP_PORT: 2112

networks:
  default:
    external:
      name: esclienttest
```

My test program contains the following code:

```csharp
using EventStore.ClientAPI;
using EventStore.ClientAPI.SystemData;
using Nito.AsyncEx;
using System;
using System.Diagnostics;
using System.Linq;
using System.Net;
using System.Net.Sockets;
using System.Threading.Tasks;

namespace ESClientTest {
    class Program {
        static async Task Main(string[] args) {
            var mre = new AsyncManualResetEvent(false);

            var settings = ConnectionSettings.Create()
                .UseConsoleLogger()
                .UseDebugLogger()
                .EnableVerboseLogging()
                .SetOperationTimeoutTo(Debugger.IsAttached
                    ? TimeSpan.FromMinutes(30)
                    : TimeSpan.FromSeconds(2));

            var connection = EventStoreConnection.Create(
                "ConnectTo=discover://admin:changeit@eventstore1:2112",
                settings,
                "hello world");

            connection.Closed += (_, e) => { };
            connection.Connected += (_, e) => { mre.Set(); };
            connection.Disconnected += (_, e) => { };
            connection.ErrorOccurred += (_, e) => { };
            connection.Reconnecting += (_, e) => { };

            await connection.ConnectAsync();
            await mre.WaitAsync();
        }
    }
}
```

I’ve tried every overload and every combination of cluster connection configuration I can think of. I’ve verified that the EventStore instances are definitely happily clustered together.

I’ve verified that “eventstore1” does resolve to an IPv4 address.

Here is the resulting program output:

```
An unhandled exception of type 'EventStore.ClientAPI.Exceptions.CannotEstablishConnectionException' occurred in System.Private.CoreLib.dll: 'Cannot resolve target end point.'
```

You have it configured to discover over http then to connect over tcp. It is failing to discover over http.
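For reference, the two ClientAPI connection-string shapes look roughly like this (the ports shown are the common defaults, not necessarily the right ones for the compose file above):

```
// Direct TCP connection to a single node
ConnectTo=tcp://admin:changeit@eventstore1:1113

// Cluster discovery: fetch gossip over HTTP first, then connect over TCP
ConnectTo=discover://admin:changeit@eventstore1:2113
```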

Thank you Greg.
From inside any of the running containers, I can `curl eventstore1:2113/gossip` and get the HTTP gossip data. There doesn’t seem to be any network issue, and the EventStore nodes themselves are clustering well.

I’ve spent almost a week on this now … what am I missing?

what happens if you try to telnet to the tcp ports that come back over gossip from outside?

to be clear: to get the exact gossip information, bring up your container, then do an HTTP GET to ipaddress:httpport/gossip

Let me know if I got the commands right:

```
curl eventstore1:2113/gossip
```

```json
{
  "members": [
    {
      "instanceId": "3f5541ce-843a-4914-9d42-2883f292fffe",
      "timeStamp": "2018-08-01T11:48:11.989709Z",
      "state": "Slave",
      "isAlive": true,
      "internalTcpIp": "172.21.0.4",
      "internalTcpPort": 1112,
      "internalSecureTcpPort": 0,
      "externalTcpIp": "0.0.0.0",
      "externalTcpPort": 1113,
      "externalSecureTcpPort": 0,
      "internalHttpIp": "172.21.0.4",
      "internalHttpPort": 2112,
      "externalHttpIp": "0.0.0.0",
      "externalHttpPort": 2113,
      "lastCommitPosition": 3644236,
      "writerCheckpoint": 3658976,
      "chaserCheckpoint": 3658976,
      "epochPosition": 0,
      "epochNumber": 0,
      "epochId": "b722f14e-af68-4623-82af-a76502a8b729",
      "nodePriority": 0
    },
    {
      "instanceId": "8781d089-42f2-4671-b63c-683c62f44cde",
      "timeStamp": "2018-08-01T11:48:11.986628Z",
      "state": "Master",
      "isAlive": true,
      "internalTcpIp": "172.21.0.3",
      "internalTcpPort": 1112,
      "internalSecureTcpPort": 0,
      "externalTcpIp": "0.0.0.0",
      "externalTcpPort": 1113,
      "externalSecureTcpPort": 0,
      "internalHttpIp": "172.21.0.3",
      "internalHttpPort": 2112,
      "externalHttpIp": "0.0.0.0",
      "externalHttpPort": 2113,
      "lastCommitPosition": 3644236,
      "writerCheckpoint": 3658976,
      "chaserCheckpoint": 3658976,
      "epochPosition": 0,
      "epochNumber": 0,
      "epochId": "b722f14e-af68-4623-82af-a76502a8b729",
      "nodePriority": 0
    },
    {
      "instanceId": "540e3b52-ab3b-438c-abbf-8771189f75d0",
      "timeStamp": "2018-08-01T11:48:12.747638Z",
      "state": "Slave",
      "isAlive": true,
      "internalTcpIp": "172.21.0.2",
      "internalTcpPort": 1112,
      "internalSecureTcpPort": 0,
      "externalTcpIp": "0.0.0.0",
      "externalTcpPort": 1113,
      "externalSecureTcpPort": 0,
      "internalHttpIp": "172.21.0.2",
      "internalHttpPort": 2112,
      "externalHttpIp": "0.0.0.0",
      "externalHttpPort": 2113,
      "lastCommitPosition": 3644236,
      "writerCheckpoint": 3658976,
      "chaserCheckpoint": 3658976,
      "epochPosition": 0,
      "epochNumber": 0,
      "epochId": "b722f14e-af68-4623-82af-a76502a8b729",
      "nodePriority": 0
    }
  ],
  "serverIp": "172.21.0.2",
  "serverPort": 2112
}
```

```
root@eventstore1:/# telnet 172.21.0.3:1112
telnet: could not resolve 172.21.0.3:1112/telnet: Name or service not known
root@eventstore1:/# telnet 172.21.0.3:2112
telnet: could not resolve 172.21.0.3:2112/telnet: Name or service not known
```

```
root@eventstore1:/# curl 172.21.0.3:2113/gossip
```

```json
{
  "members": [
    {
      "instanceId": "3f5541ce-843a-4914-9d42-2883f292fffe",
      "timeStamp": "2018-08-01T11:55:39.604055Z",
      "state": "Slave",
      "isAlive": true,
      "internalTcpIp": "172.21.0.4",
      "internalTcpPort": 1112,
      "internalSecureTcpPort": 0,
      "externalTcpIp": "0.0.0.0",
      "externalTcpPort": 1113,
      "externalSecureTcpPort": 0,
      "internalHttpIp": "172.21.0.4",
      "internalHttpPort": 2112,
      "externalHttpIp": "0.0.0.0",
      "externalHttpPort": 2113,
      "lastCommitPosition": 4354007,
      "writerCheckpoint": 4368736,
      "chaserCheckpoint": 4368736,
      "epochPosition": 0,
      "epochNumber": 0,
      "epochId": "b722f14e-af68-4623-82af-a76502a8b729",
      "nodePriority": 0
    },
    {
      "instanceId": "8781d089-42f2-4671-b63c-683c62f44cde",
      "timeStamp": "2018-08-01T11:55:40.39091Z",
      "state": "Master",
      "isAlive": true,
      "internalTcpIp": "172.21.0.3",
      "internalTcpPort": 1112,
      "internalSecureTcpPort": 0,
      "externalTcpIp": "0.0.0.0",
      "externalTcpPort": 1113,
      "externalSecureTcpPort": 0,
      "internalHttpIp": "172.21.0.3",
      "internalHttpPort": 2112,
      "externalHttpIp": "0.0.0.0",
      "externalHttpPort": 2113,
      "lastCommitPosition": 4354007,
      "writerCheckpoint": 4368736,
      "chaserCheckpoint": 4368736,
      "epochPosition": 0,
      "epochNumber": 0,
      "epochId": "b722f14e-af68-4623-82af-a76502a8b729",
      "nodePriority": 0
    },
    {
      "instanceId": "540e3b52-ab3b-438c-abbf-8771189f75d0",
      "timeStamp": "2018-08-01T11:55:39.483119Z",
      "state": "Slave",
      "isAlive": true,
      "internalTcpIp": "172.21.0.2",
      "internalTcpPort": 1112,
      "internalSecureTcpPort": 0,
      "externalTcpIp": "0.0.0.0",
      "externalTcpPort": 1113,
      "externalSecureTcpPort": 0,
      "internalHttpIp": "172.21.0.2",
      "internalHttpPort": 2112,
      "externalHttpIp": "0.0.0.0",
      "externalHttpPort": 2113,
      "lastCommitPosition": 4354007,
      "writerCheckpoint": 4368736,
      "chaserCheckpoint": 4368736,
      "epochPosition": 0,
      "epochNumber": 0,
      "epochId": "b722f14e-af68-4623-82af-a76502a8b729",
      "nodePriority": 0
    }
  ],
  "serverIp": "172.21.0.3",
  "serverPort": 2112
}
```

```
root@eventstore1:/# telnet 172.21.0.3:2112
telnet: could not resolve 172.21.0.3:2112/telnet: Name or service not known
```

And

`telnet 172.21.0.3 2112` ?

telnet doesn’t accept a colon in the argument in most *nix environments.

oops, thanks, that was my first *nix telnet:

After recreating that environment, each container’s local TCP IP had shifted within 172.21.x.x, and here’s what I got:

```
root@eventstore1:/# telnet 172.21.0.3 1112
Trying 172.21.0.3...
Connected to 172.21.0.3.
Escape character is '^]'.
Y��U�5?M���sS)�Connection closed by foreign host.
```

So it seems to work (without parsing the binary data).

This singleton connection “works” in the sense that a connection is made, but of course it fails in the sense that it’s not a cluster connection.

So far:

- We’ve established that the EventStore nodes are clustering well.
- We’ve established that gossip on the external ports works, and that connections can be made on both the internal and external TCP ports.
- We’ve established that the client code is correct, in the form shown below (with experiments made using all the different port numbers).

What’s next?

```csharp
var settings = ConnectionSettings.Create()
    .UseConsoleLogger()
    .UseDebugLogger()
    .EnableVerboseLogging()
    .SetOperationTimeoutTo(Debugger.IsAttached
        ? TimeSpan.FromMinutes(30)
        : TimeSpan.FromSeconds(2));

var connection = EventStoreConnection.Create(
    "ConnectTo=discover://admin:changeit@eventstore1:2112",
    settings,
    "hello world");
```

Is it possible that the EventStore client simply doesn’t work with cluster connections and no one has tested it?

I’m using EventStore.Client v4.1.1. I’ve attempted using EventStore.ClientAPI.NetCore as well, without success.

I’ve dug through the client code and seen that it attempts to connect to the EXTERNAL TCP ports.

Meanwhile, EventStore gossip advertises the external IP as 0.0.0.0. Could this be part of the issue?
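To make the suspicion concrete, here is a small sketch of what the client sees: it fetches gossip, picks a node, and dials that node’s *external* TCP endpoint. The `client_target` helper below is an illustration of that idea, not the library’s actual node-selection logic (which also ranks by state and priority); the gossip snippet is abridged from the output above.

```python
import json

# Abridged from the /gossip output above: only the fields relevant
# to picking a TCP endpoint to dial.
gossip = json.loads("""
{
  "members": [
    {"state": "Slave",  "isAlive": true,
     "externalTcpIp": "0.0.0.0", "externalTcpPort": 1113},
    {"state": "Master", "isAlive": true,
     "externalTcpIp": "0.0.0.0", "externalTcpPort": 1113}
  ]
}
""")

def client_target(gossip):
    """Illustrative only: return the external TCP endpoint of the
    first alive master found in the gossip members list."""
    for member in gossip["members"]:
        if member["isAlive"] and member["state"] == "Master":
            return member["externalTcpIp"], member["externalTcpPort"]
    return None

print(client_target(gossip))  # ('0.0.0.0', 1113) -- not a dialable address
```

Every member advertises `externalTcpIp` as `0.0.0.0`, so whichever node the client picks, the endpoint it tries to dial is unusable from another container.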

I’ve dug through EventStore code and seen that EventStore doesn’t accept write operations from an external tcp connection. Could this be part of the problem?

If all else fails, perhaps a five-minute skype call could help reach a solution for this? I’d be glad to make code or documentation PRs in return.

My skype id is benjamin.boyle

OK. I’ve got something working at last :slight_smile:
The key was telling the EventStore nodes to advertise their external IP as the container’s IP instead of 0.0.0.0 or the host’s IP address.

This is because EventStore.Client always uses the external IPs, even when connecting from a container on the same network.

Below I’ve pasted the docker-compose file that got everything working. The solution did introduce a problem with the EventStore web UI. It broke some links and redirects, because it now tries to redirect links to the internal docker IP address instead of the host’s external IP address.
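The working compose file itself didn’t survive the copy here, so below is only a minimal sketch of the kind of change described: pin each container to a known IP on a pre-created subnet, and set EventStore’s external IP to that address. The subnet, the addresses, and the exact per-node layout are illustrative assumptions, not the author’s exact file.

```yaml
# Network created up front with a known subnet, e.g.:
#   docker network create --subnet 172.21.0.0/16 esclienttest
version: '3.4'

services:
  eventstore1:
    image: eventstore/eventstore
    hostname: eventstore1
    networks:
      default:
        ipv4_address: 172.21.0.11        # fixed address (illustrative)
    environment:
      EVENTSTORE_CLUSTER_DNS: eventstore1
      EVENTSTORE_CLUSTER_SIZE: 3
      EVENTSTORE_CLUSTER_GOSSIP_PORT: 2112
      EVENTSTORE_EXT_IP: 172.21.0.11     # advertise the container's own IP
  # eventstore2 / eventstore3 follow the same pattern with their own addresses

networks:
  default:
    external:
      name: esclienttest
```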

https://github.com/EventStore/ClientAPI.NetCore/issues/35