Mastering Elasticsearch Remote Clusters

By Opster Team

Updated: Nov 6, 2023

Understanding Remote Clusters

Elasticsearch remote clusters offer a powerful way to connect multiple clusters and perform cross-cluster operations. This feature is particularly useful in scenarios where data is distributed across different geographical locations, and there is a need to perform search, aggregation, or other operations across these clusters.

A remote cluster is an Elasticsearch cluster that has been registered with a local cluster so that the local cluster can run searches and other operations against it. The two clusters communicate over a special-purpose connection, which is separate from the standard transport connections used for inter-node communication within a cluster.

Configuring Remote Clusters

To configure a remote cluster, you need to add certain settings to the `elasticsearch.yml` configuration file. The `cluster.remote` setting is used to define the remote clusters that the local cluster can connect to.

Here is an example of how to configure a remote cluster:

cluster:
  remote:
    cluster_one:
      seeds: 192.168.1.1:9300
    cluster_two:
      seeds: 192.168.1.2:9300

In this example, `cluster_one` and `cluster_two` are the aliases for the remote clusters, and `seeds` lists the addresses of one or more nodes in each remote cluster. Communication between the local and remote clusters happens at the transport level (i.e., TCP), so each seed must point to the remote node's transport port, not its HTTP port. By default the transport port falls in the 9300-9400 range, unless the `transport.port` setting specifies otherwise.

You can also configure a remote cluster dynamically by updating the cluster settings through the API. Settings applied this way override the static ones set in the `elasticsearch.yml` configuration file:

PUT /_cluster/settings
{
  "persistent" : {
    "cluster" : {
      "remote" : {
        "cluster_one" : {    
          "seeds" : [
            "192.168.1.1:9300" 
          ]
        },
        "cluster_two" : {    
          "seeds" : [
            "192.168.1.2:9300" 
          ]
        }
      }
    }
  }
}
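
When the list of remote clusters lives in code or in a deployment script, the same settings body can be generated rather than written by hand. The Python sketch below is one possible way to do that. The function name `remote_cluster_settings` is hypothetical, and the code only builds the JSON body; sending it with `PUT /_cluster/settings` is left to whichever HTTP client you use.

```python
import json


def remote_cluster_settings(seeds_by_alias):
    """Build a PUT /_cluster/settings body from {alias: [seed, ...]}.

    Hypothetical helper: it only checks that each seed looks like
    host:port and does not contact any cluster.
    """
    remote = {}
    for alias, seeds in seeds_by_alias.items():
        for seed in seeds:
            # Split on the last colon so the port is always the final part.
            host, sep, port = seed.rpartition(":")
            if not sep or not host or not port.isdigit():
                raise ValueError(f"seed {seed!r} is not in host:port form")
        remote[alias] = {"seeds": list(seeds)}
    return {"persistent": {"cluster": {"remote": remote}}}


body = remote_cluster_settings({
    "cluster_one": ["192.168.1.1:9300"],
    "cluster_two": ["192.168.1.2:9300"],
})
print(json.dumps(body, indent=2))
```

The printed body has the same structure as the request shown above, so it can be sent as-is.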

Cross-Cluster Search

Once the remote clusters are configured, you can perform cross-cluster search operations. The cross-cluster search feature allows executing searches on one or more indices of the remote clusters.

Here is an example of a cross-cluster search request:

GET /cluster_one:index_one,cluster_two:index_two/_search
{
  "query": {
    "match_all": {}
  }
}

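In this example, the search runs against `index_one` on `cluster_one` and `index_two` on `cluster_two`. Each target is written as the remote cluster alias, a colon, and the index name, and multiple targets are separated by commas.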
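
The request path in a cross-cluster search is just a comma-separated list of `alias:index` targets, so it is easy to assemble in code. The following Python sketch shows one way to do it. The function name `cross_cluster_search_path` is hypothetical, and treating an alias of `None` as "index on the local cluster" is a convention chosen for this example.

```python
def cross_cluster_search_path(targets):
    """Join (cluster_alias, index) pairs into a _search request path.

    An alias of None means the index lives on the local cluster,
    so it is written without a prefix.
    """
    parts = []
    for alias, index in targets:
        parts.append(f"{alias}:{index}" if alias else index)
    return "/" + ",".join(parts) + "/_search"


path = cross_cluster_search_path([
    ("cluster_one", "index_one"),
    ("cluster_two", "index_two"),
])
print(path)  # /cluster_one:index_one,cluster_two:index_two/_search
```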

Cross-Cluster Replication

Cross-cluster replication is another feature that can be used with remote clusters. It allows replicating indices from a remote cluster to a local cluster. This is particularly useful for creating offsite backups, or for improving search performance by serving search requests from a closer geographical location.

Here is an example of how to create a follower index:

PUT /follower_index/_ccr/follow
{
  "remote_cluster" : "cluster_one",
  "leader_index" : "leader_index"
}

In this example, `follower_index` is the index on the local cluster, `cluster_one` is the remote cluster, and `leader_index` is the index on the remote cluster that is being followed and replicated into `follower_index`.
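
To make the role of each name explicit, the follow request can be expressed as a small Python function. This is only a sketch: `follow_request` is a hypothetical helper that returns the method, path, and body of the request, and it does not send anything to a cluster.

```python
import json


def follow_request(follower_index, remote_cluster, leader_index):
    """Return (method, path, body) for a create-follower request."""
    # The follower index name goes in the path; the remote cluster alias
    # and the leader index name go in the body.
    path = f"/{follower_index}/_ccr/follow"
    body = {"remote_cluster": remote_cluster, "leader_index": leader_index}
    return "PUT", path, body


method, path, body = follow_request("follower_index", "cluster_one", "leader_index")
print(method, path)
print(json.dumps(body, indent=2))
```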

It is worth noting that the cross-cluster replication feature requires at least a Platinum license. However, you can still try this feature by converting your basic license to a trial one, which gives you access to all features for 30 days.

Handling Failures

There are a few things that can go wrong when configuring remote clusters, such as connection issues or security issues. The official Elasticsearch documentation describes how to troubleshoot the known failure cases.
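
A quick first check for connection issues is the remote cluster info API (`GET /_remote/info`), which reports a `connected` flag for each configured alias. The Python sketch below shows one way to pick out the disconnected clusters from an already parsed response. The function name `disconnected_remotes` is hypothetical, and the sample data is made up for illustration rather than taken from a real cluster.

```python
def disconnected_remotes(remote_info):
    """Return the aliases the local cluster is not currently connected to.

    `remote_info` is the parsed JSON body of GET /_remote/info,
    i.e. a dict keyed by remote cluster alias.
    """
    return sorted(
        alias
        for alias, info in remote_info.items()
        if not info.get("connected", False)
    )


# Illustrative sample shaped like a /_remote/info response (values made up).
sample = {
    "cluster_one": {
        "connected": True,
        "num_nodes_connected": 1,
        "seeds": ["192.168.1.1:9300"],
    },
    "cluster_two": {
        "connected": False,
        "num_nodes_connected": 0,
        "seeds": ["192.168.1.2:9300"],
    },
}
print(disconnected_remotes(sample))  # ['cluster_two']
```

For any alias reported here, the seed address, the transport port, and the TLS configuration are reasonable things to check first.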

Security Considerations

When configuring remote clusters, it’s important to consider the security implications. The communication between the local and remote clusters should be secured using features like TLS/SSL encryption, and access to the remote clusters should be controlled using features like role-based access control.

Conclusion

Elasticsearch remote clusters provide a powerful way to connect multiple clusters and run operations across them. With proper configuration and attention to security, they can be a valuable tool in a distributed data environment.
