Increasing the cluster size?

Hi All,

I’ve looked through the posts in this Google group and reviewed the documentation, but I’m still unclear about how clustering behaves when the cluster itself changes.

In my project I am seeking to deploy an EventStore cluster via a container orchestration platform (OpenShift / Kubernetes). I can specify the cluster size and the DNS as part of the deployment configuration files, so that seems clear to me. But what about the following situations:

  1. The initial storage I’ve allocated is insufficient and I need to allocate more

  2. The initial number of ES nodes (i.e. “replicas”) is insufficient to handle the processing/requests and I need to add more

  3. I’ve over-provisioned the ES cluster and want to decommission a node to reclaim volume space, etc.

Can anyone point me at any documentation to help me understand what’s actually happening inside the cluster? Should each ES node point at the same file location for its database? Or if I provision individual file locations for each node, will I get a copy of the data across all nodes? Or is there some sharding that happens?

Guidance/documentation on how to scale the processing and/or data persistence in some common use cases would be particularly helpful.

Thank you!

Also, are there differences between the commercial and open-source clusters? Any help with this is highly appreciated.

Hey there! Just to clarify a few things, as there are a lot of variables in this email.

Are you asking how to change things specifically in an orchestration platform, or generally about changing an ES cluster’s settings?

Chris

Many of these are platform-specific...

1. The initial storage I've allocated is insufficient and I need to allocate more

This can be done in multiple ways depending on the platform. Some
platforms support dynamically allocating more space; others do not. If
not, what you would generally do is either take the node down, allocate
more space, and bring the node back up, or take the node out, provision
an entirely new node with more space, and bring the new node up from
scratch. Which of these two you choose depends on how automated you
are: whether you have pets or cattle.
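The rolling, one-node-at-a-time path above can be sketched roughly like this. This is a hypothetical Python sketch, not anything from EventStore itself: the node names, PVC naming scheme, and the idea of driving kubectl are all my assumptions, and the actual commands are stubbed out so the flow is visible.

```python
# Hypothetical sketch of the "take node down -> allocate more space ->
# bring node back up" dance, done one node at a time so the cluster keeps
# quorum. The kubectl calls are stubbed (passed to `run`); in practice you
# would shell out or use a Kubernetes client library.

def resize_node(node: str, new_size_gib: int, run=print) -> None:
    """Grow one node's volume, restart it, wait for it to rejoin."""
    # Request more space on the node's volume claim (illustrative command).
    run(f"kubectl patch pvc data-{node} ... storage: {new_size_gib}Gi")
    # Take the node down; the controller restarts it on the bigger volume.
    run(f"kubectl delete pod {node}")
    # Wait for the node to come back before touching the next one.
    run(f"kubectl wait --for=condition=Ready pod/{node}")

# One node at a time, so the other two keep serving the cluster.
for node in ["eventstore-0", "eventstore-1", "eventstore-2"]:
    resize_node(node, 500)
```

The "cattle" variant is the same loop, except step one provisions a brand-new node with a bigger volume and the old node is simply discarded.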

2. The initial amount of ES nodes (i.e. "replicas") is insufficient to handle the processing/requests and I need to add more

You can just add clones by bringing up more nodes. This will help with
*reads*. For writes it's more complicated. Generally speaking, for
writes, getting onto reasonably fast hardware will give you what you
need (a commodity SSD can pretty easily hit 20-30k writes/second). If
you are sustaining that kind of load, you will quickly have a bigger
problem: the amount of data you need to manage. In testing a while ago
I was running a > 2 TB db :open_mouth:
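A toy model of that read/write asymmetry, with made-up illustrative numbers (these are not EventStore benchmarks):

```python
# Toy capacity model: clones multiply read capacity, but every write is
# serialized through one node, so write throughput is capped by that
# node's disk regardless of cluster size. All numbers are illustrative.

def read_capacity(nodes: int, reads_per_node: int = 15_000) -> int:
    # Any node can serve reads, so capacity scales with node count.
    return nodes * reads_per_node

def write_capacity(disk_writes_per_sec: int = 25_000) -> int:
    # Writes funnel through one node; faster storage, not more nodes,
    # is what raises this number.
    return disk_writes_per_sec

print(read_capacity(3))   # 45000
print(read_capacity(5))   # 75000 - grows as you add clones
print(write_capacity())   # 25000 - unchanged by cluster size
```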

3. I've over-provisioned the ES cluster and want to decommission a node to reclaim volume space, etc.

Turn the node off. It will take a while (I think 24 hours, but don't
quote me; that's off the top of my head and I didn't look it up) to
disappear from gossip as a "known node". Everything else should
continue working; the only artifact is that the other nodes will
occasionally try to gossip with it to see if it has "come back".
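The behaviour described above can be sketched as follows. This is a minimal simulation of the general idea, not EventStore's actual gossip implementation, and the timeout value is just the reply's rough "24 hours, don't quote me" figure:

```python
# Minimal sketch: a stopped node stays in the gossip member list as a
# "known node" until it has been silent longer than a dead-member
# timeout; peers keep trying to gossip with it until then.

from dataclasses import dataclass

DEAD_MEMBER_TIMEOUT = 24 * 3600  # seconds; illustrative, not EventStore's real value

@dataclass
class Member:
    name: str
    last_seen: float  # time (seconds) of last successful gossip

def known_members(members: list[Member], now: float) -> list[str]:
    """Peers still gossip with anything heard from within the window."""
    return [m.name for m in members if now - m.last_seen < DEAD_MEMBER_TIMEOUT]

members = [Member("node-a", last_seen=1000.0), Member("node-b", last_seen=1000.0)]
# node-b is switched off at t=1000 and never responds again.
print(known_members(members, now=2000.0))  # ['node-a', 'node-b'] - b still "known"

members[0].last_seen = 1000.0 + DEAD_MEMBER_TIMEOUT  # node-a kept gossiping
print(known_members(members, now=1001.0 + DEAD_MEMBER_TIMEOUT))  # ['node-a']
```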