Hello, I am just wondering: does enabling clustering improve read performance if I have multiple clients? In theory it sounds like it would.
Yes, since you can read from multiple nodes. What are you seeing that makes you need more read performance?
I don’t need it yet. But from a scalability perspective, it would be good to know whether clustering will improve read performance if I have multiple clients. Is it transparent, or do I have to implement something special?
Nothing special. Are you using HTTP? If so, just put a load balancer in front.
I am using TCP, actually. So if I were to connect to a cluster using multiple connections, they should connect to different nodes, right?
Yes, and ensure that you aren’t specifying "perform on master only".
Yes, over TCP you would have multiple connections (easily instantiated), but this is unlikely to be your bottleneck. Moving internal communications to another network will probably give you more bang for the buck at this point.
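For anyone reading later, here is a minimal sketch of the "multiple connections to different nodes" idea: hand out node endpoints in rotation, one per new connection. It does not use the real client API; the class name `RoundRobinNodes`, the node addresses, and the port are placeholders I made up for illustration.

```python
from itertools import cycle


class RoundRobinNodes:
    """Hands out cluster node endpoints in rotation, one per new connection."""

    def __init__(self, endpoints):
        if not endpoints:
            raise ValueError("at least one endpoint is required")
        self._endpoints = list(endpoints)
        self._cycle = cycle(self._endpoints)

    def next_endpoint(self):
        # Each call returns the next node, wrapping around at the end.
        return next(self._cycle)


# Hypothetical node addresses; replace with your own cluster members.
NODES = [("10.0.0.1", 1113), ("10.0.0.2", 1113), ("10.0.0.3", 1113)]

picker = RoundRobinNodes(NODES)
# Four connections spread over three nodes: the fourth wraps back to the first.
assigned = [picker.next_endpoint() for _ in range(4)]
```

Each connection would then be opened against the endpoint it was given. This only spreads reads if the connections are not restricted to the master.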
That answered my question more or less. Thanks!
If I put the cluster nodes behind a load balancer, what will the returned links in the Atom feed look like? I guess they will always reflect the IP address of the node currently in use? If so, is there a way to tell the nodes which public, load-balanced address they are running on?
Here is why I’m asking: to make use of HTTP caching, every request to the ES cluster should go through the load balancer. So if the Atom feed contained node-specific links, I would have to string.replace every URI before I could make the next request.
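To make the client-side replacement concrete, here is a rough sketch that swaps the scheme and host of a feed link for the load-balanced address while keeping the path and query. The function name `rewrite_to_public`, the node address, the stream path, and the public host `es.example.com` are all assumptions for illustration, not values from a real feed.

```python
from urllib.parse import urlsplit, urlunsplit


def rewrite_to_public(uri, public_scheme, public_netloc):
    """Return uri with its scheme and host replaced by the public address."""
    parts = urlsplit(uri)
    # Keep path, query and fragment; only the node-specific part changes.
    return urlunsplit(
        (public_scheme, public_netloc, parts.path, parts.query, parts.fragment)
    )


# Hypothetical node-specific link as it might appear in a feed.
link = "http://10.0.0.2:2113/streams/orders/0/forward/20"
rewritten = rewrite_to_public(link, "http", "es.example.com")
```

Parsing the URI rather than doing a plain string replace avoids touching a path or query that happens to contain the node address.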
Have the balancer rewrite the links (and/or use an alternate alias for HTTP).
Here’s how you can do it with nginx: http://wiki.nginx.org/HttpSubModule
Thanks, I will have a look at our load balancer. Greg, what did you mean by an alternate alias?
It’s often used for things like Elastic IPs. It’s an alias for what you really are from an outside perspective.