Unable to connect to non-containerised cluster from Docker container

Hey guys,

I am having trouble connecting to an on-premise (non-containerised) EventStore v5.0.5 cluster from a Docker container.

I can connect successfully using a single-node connection string. I can also connect using a cluster connection string from a non-containerised .NET Core client (v5.0.5) running on Windows. However, when I use a cluster connection string from inside a Docker Linux container, I receive the following error.

[DEBUG] EventStoreConnection 'ConfigureEventStore': CloseTcpConnection IGNORED because _connection == null.
[INFO] EventStoreConnection 'ConfigureEventStore': Closed. Reason: Failed to resolve TCP end point to which to connect...
[Error] [ConfigureEventStore.Program] ["ExceptionDetail": {"HResult": -2146233088, "Message": "Cannot resolve target end point.", "Source": "System.Private.CoreLib", "InnerException": {"Type": "System.AggregateException", "HResult": -2146233088, "Message": "One or more errors occurred. (Failed to discover candidate in 10 attempts.)", "Source": null, "InnerException": {"HResult": -2146233088, "Message": "Failed to discover candidate in 10 attempts.", "Source": "EventStore.ClientAPI", "Type": "EventStore.ClientAPI.Exceptions.ClusterException"}, "InnerExceptions": [{"HResult": -2146233088, "Message": "Failed to discover candidate in 10 attempts.", "Source": "EventStore.ClientAPI", "Type": "EventStore.ClientAPI.Exceptions.ClusterException"}]}, "Type": "EventStore.ClientAPI.Exceptions.CannotEstablishConnectionException"}}] Failed to connect to EventStore
EventStore.ClientAPI.Exceptions.CannotEstablishConnectionException: Cannot resolve target end point. ---> System.AggregateException: One or more errors occurred. (Failed to discover candidate in 10 attempts.) ---> EventStore.ClientAPI.Exceptions.ClusterException: Failed to discover candidate in 10 attempts.
at EventStore.ClientAPI.Internal.ClusterDnsEndPointDiscoverer.<>c__DisplayClass10_0.<DiscoverAsync>b__0()
at System.Threading.Tasks.Task`1.InnerInvoke()
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location where exception was thrown ---
at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot)
--- End of inner exception stack trace ---

The container can ping the cluster DNS name and reach each node's ip:port individually via telnet. I have tried both cluster DNS discovery and gossip seed configuration, with no success. Here is the response from the gossip endpoint.

{
    "members": [
        {
            "instanceId": "443b3a7a-12de-4dea-b425-dc82bed323e4",
            "timeStamp": "2020-01-10T11:32:54.9347998Z",
            "state": "Slave",
            "isAlive": true,
            "internalTcpIp": "XXX.XX.144.137",
            "internalTcpPort": 1111,
            "internalSecureTcpPort": 0,
            "externalTcpIp": "XXX.XX.144.137",
            "externalTcpPort": 1112,
            "externalSecureTcpPort": 0,
            "internalHttpIp": "XXX.XX.144.137",
            "internalHttpPort": 2113,
            "externalHttpIp": "XXX.XX.144.137",
            "externalHttpPort": 2114,
            "lastCommitPosition": 5872249752,
            "writerCheckpoint": 5872265841,
            "chaserCheckpoint": 5872265841,
            "epochPosition": 5455549044,
            "epochNumber": 436,
            "epochId": "1f66fd6b-fac1-488f-a35d-b9266d2dd09f",
            "nodePriority": 10
        },
        {
            "instanceId": "6a2e34fe-4823-4b9a-8e0a-6faaf801168b",
            "timeStamp": "2020-01-10T11:32:54.980957Z",
            "state": "Slave",
            "isAlive": true,
            "internalTcpIp": "XXX.XX.164.155",
            "internalTcpPort": 1111,
            "internalSecureTcpPort": 0,
            "externalTcpIp": "XXX.XX.164.155",
            "externalTcpPort": 1112,
            "externalSecureTcpPort": 0,
            "internalHttpIp": "XXX.XX.164.155",
            "internalHttpPort": 2113,
            "externalHttpIp": "XXX.XX.164.155",
            "externalHttpPort": 2114,
            "lastCommitPosition": 5872249752,
            "writerCheckpoint": 5872265841,
            "chaserCheckpoint": 5872265841,
            "epochPosition": 5455549044,
            "epochNumber": 436,
            "epochId": "1f66fd6b-fac1-488f-a35d-b9266d2dd09f",
            "nodePriority": 20
        },
        {
            "instanceId": "9081c7a3-c414-463b-af82-836cfc5b19fa",
            "timeStamp": "2020-01-10T11:32:55.8091464Z",
            "state": "Master",
            "isAlive": true,
            "internalTcpIp": "XXX.XX.164.154",
            "internalTcpPort": 1111,
            "internalSecureTcpPort": 0,
            "externalTcpIp": "XXX.XX.164.154",
            "externalTcpPort": 1112,
            "externalSecureTcpPort": 0,
            "internalHttpIp": "XXX.XX.164.154",
            "internalHttpPort": 2113,
            "externalHttpIp": "XXX.XX.164.154",
            "externalHttpPort": 2114,
            "lastCommitPosition": 5872249752,
            "writerCheckpoint": 5872265841,
            "chaserCheckpoint": 5872265841,
            "epochPosition": 5455549044,
            "epochNumber": 436,
            "epochId": "1f66fd6b-fac1-488f-a35d-b9266d2dd09f",
            "nodePriority": 20
        }
    ],
    "serverIp": "XXX.XX.164.154",
    "serverPort": 2113
}

Here are the connection strings I have tried:

ConnectTo=discover://admin:[email protected]:2113;GossipTimeout=1000;VerboseLogging=True;
GossipSeeds=XXX.XX.164.154:1112,XXX.XX.144.137:1112,XXX.XX.164.155:1112;VerboseLogging=True;GossipTimeout=1000;HeartBeatTimeout=500;

Any help would be appreciated.

Hi,

Your connection string uses the wrong port (2113); you should be using port 1113.

Hope it helps,

Vlad

I tried port 1113 with both the cluster DNS and gossip seed connection strings, and neither could connect.

I am able to connect to the cluster using port 2113 from a non-containerised client.

I resolved the issue by adding the node IP addresses to the no_proxy environment variable inside the container.

The client could connect to the master node with a single-node connection string without any proxy configuration, but the cluster connection string did not work until the node IPs were excluded from the proxy.
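
For anyone hitting the same thing, here is a minimal sketch of how the no_proxy value could be derived from the gossip response rather than typed by hand. One likely explanation, which is an assumption and was not confirmed in this thread, is that cluster discovery fetches gossip over HTTP and so is affected by proxy environment variables in the container, while a single-node connection string goes straight over TCP. The function name no_proxy_from_gossip and the 10.0.0.x addresses are hypothetical, and the sketch is in Python rather than .NET purely for brevity. It assumes the gossip JSON has the "members" / "externalHttpIp" shape shown earlier.

```python
import json


def no_proxy_from_gossip(gossip_json: str, existing: str = "") -> str:
    """Return a no_proxy value containing every member's external HTTP IP.

    Entries already present in `existing` are kept first, in order, and
    duplicate IPs are added only once.
    """
    members = json.loads(gossip_json)["members"]
    hosts = [h.strip() for h in existing.split(",") if h.strip()]
    for member in members:
        ip = member["externalHttpIp"]
        if ip not in hosts:
            hosts.append(ip)
    return ",".join(hosts)


# Placeholder gossip payload; the addresses are made up for illustration.
sample = json.dumps({
    "members": [
        {"externalHttpIp": "10.0.0.1"},
        {"externalHttpIp": "10.0.0.2"},
        {"externalHttpIp": "10.0.0.1"},
    ]
})

print(no_proxy_from_gossip(sample, existing="localhost,127.0.0.1"))
```

The resulting string can then be passed to the container as the no_proxy environment variable, for example through the -e option of docker run. Some tools read only the upper-case NO_PROXY, so setting both spellings is a reasonable precaution.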