Turns out our heartbeat-related issue wasn’t related to System Endpoint Protection i posted about the other day. They’ve disabled network monitoring, same problem. To recap: 50% ram free, cpu at 10-20%, no disk queues.
I increased the hb-timeout by 10 (so now (10/20 sec), no difference.
Server log says:
[PID:03928:009 2016.07.15 07:34:28.108 TRACE TcpConnectionManager] Closing connection ‘external-normal’ [127.0.0.1:56451, L127.0.0.1:1113, {a8ccb238-239b-454d-87bf-44550030b630}] cleanly. Reason: HEARTBEAT TIMEOUT at msgNum 50
[PID:03928:009 2016.07.15 07:34:28.108 INFO TcpConnection ] ES TcpConnection closed [07:34:28.108: N127.0.0.1:56451, L127.0.0.1:1113, {a8ccb238-239b-454d-87bf-44550030b630}]:Received bytes: 1655, Sent bytes: 170281
[PID:03928:009 2016.07.15 07:34:28.108 INFO TcpConnection ] ES TcpConnection closed [07:34:28.108: N127.0.0.1:56451, L127.0.0.1:1113, {a8ccb238-239b-454d-87bf-44550030b630}]:Send calls: 51, callbacks: 51
[PID:03928:009 2016.07.15 07:34:28.108 INFO TcpConnection ] ES TcpConnection closed [07:34:28.108: N127.0.0.1:56451, L127.0.0.1:1113, {a8ccb238-239b-454d-87bf-44550030b630}]:Receive calls: 51, callbacks: 50
[PID:03928:009 2016.07.15 07:34:28.108 INFO TcpConnection ] ES TcpConnection closed [07:34:28.108: N127.0.0.1:56451, L127.0.0.1:1113, {a8ccb238-239b-454d-87bf-44550030b630}]:Close reason: [Success] HEARTBEAT TIMEOUT at msgNum 50
[PID:03928:009 2016.07.15 07:34:28.108 INFO TcpConnectionManager] Connection ‘external-normal’ [127.0.0.1:56451, {a8ccb238-239b-454d-87bf-44550030b630}] closed: Success.
The connections live for 1-10 minutes. Nothing else in the logs. (no slow messages).
Client log says:
ClientAPI TcpConnection closed [07:34:28.108: N127.0.0.1:1113, L127.0.0.1:56419, {8172663d-a994-4170-937c-fc3df71c7276}]: Received bytes: 199201, Sent bytes: 2119 Send calls: 92, callbacks: 92 Receive calls: 175, callbacks: 175 Close reason: [Success] Socket closed
Question: Is this server or client initiated close? How can I debug it further? (will enable client verbose logging)
/Peter