Elasticsearch Many Empty Shards in Cluster

By Opster Team

Updated: Mar 10, 2024

What does this mean? 

"Many empty shards" refers to a situation where a significant number of shards within an Elasticsearch cluster do not contain any data. This can lead to an unbalanced workload distribution among the nodes in the cluster and to potential hotspots.

Why does this occur?

This occurs because of how the cluster’s shard balancer behaves: it tries to give every data node the same number of shards, regardless of each shard’s size or document count. As a result, some nodes may hold very active shards while others hold mostly empty ones, so the workload is spread unevenly even though the shard counts match.
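The effect can be illustrated with a toy model. The sketch below is not Elasticsearch's actual allocation algorithm; it assumes a simple round-robin assignment that only equalizes shard counts, and the shard names, node names, and document counts are made up for illustration.

```python
from collections import defaultdict


def balance_by_count(shards, nodes):
    """Assign shards round-robin so every node ends up with the same shard count.

    Simplified stand-in for count-based balancing; it ignores size and activity.
    """
    assignment = defaultdict(list)
    for i, shard in enumerate(shards):
        assignment[nodes[i % len(nodes)]].append(shard)
    return dict(assignment)


def docs_per_node(assignment):
    """Sum the document counts of the shards held by each node."""
    return {
        node: sum(shard["docs"] for shard in node_shards)
        for node, node_shards in assignment.items()
    }


# Hypothetical cluster: two busy shards and four empty ones.
shards = [
    {"name": "logs-000001", "docs": 5_000_000},
    {"name": "empty-a", "docs": 0},
    {"name": "logs-000002", "docs": 4_000_000},
    {"name": "empty-b", "docs": 0},
    {"name": "empty-c", "docs": 0},
    {"name": "empty-d", "docs": 0},
]
nodes = ["node-1", "node-2"]

assignment = balance_by_count(shards, nodes)
for node, total in docs_per_node(assignment).items():
    print(node, len(assignment[node]), "shards,", total, "docs")
```

In this contrived ordering both nodes hold three shards, yet one node holds all of the documents. A real cluster will rarely be this extreme, but the same mechanism applies whenever empty shards count toward the balance.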

Possible impact and consequences of many empty shards

Because the balancing heuristic counts shards rather than measuring their load, empty shards can leave some nodes working much harder than others. The overloaded data nodes become hotspots and bottlenecks, which negatively impacts the cluster’s performance.

How to resolve

To resolve the issue of many empty shards in the cluster, you can follow these recommendations:

1. Reduce the number of empty shards in the cluster by deleting the empty indices. Ensure that the indices being deleted will not be needed in the future or can be recreated if needed. Use the following command to delete an index:

curl -X DELETE <elasticsearch_endpoint>/<index_name>


2. Set up Index Lifecycle Management (ILM) or review the existing ILM policy. With ILM, you can specify both the shard size at which an index should roll over and the maximum amount of time an index can stay open to writes before rolling over. Fine-tuning the ILM policy can help prevent the creation of empty shards.

If you’re running versions from 7.17.6 to 7.17.10 or from 8.4 onwards, you can use the min_* conditions in your ILM policy to prevent empty indices from being rolled over. The request below shows a sample ILM policy that only rolls an index over once it contains at least one document. A rollover action must also specify at least one max_* condition, so the sample pairs min_docs with max_age; the 30d value is a placeholder to adjust for your own retention needs:

PUT _ilm/policy/my_policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover" : {
            "max_age": "30d",
            "min_docs": 1
          }
        }
      }
    }
  }
}


3. Use the AutoOps Operator to continuously monitor and optimize shard sizes. When run against the cluster, the Operator applies predefined rules and filters to detect newly created empty indices that are not being used. It deletes them so that the number of empty indices on the cluster does not grow, which helps keep the nodes’ loads balanced. For more information on the Operator, refer to its documentation.
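Before deleting anything as described in step 1, it helps to list which indices are actually empty. The sketch below is one possible approach, not an official tool. It assumes you have already fetched the JSON output of GET _cat/indices?format=json&h=index,docs.count,pri,rep, and the helper name find_empty_indices and the sample rows are hypothetical. Skipping indices whose names start with a dot (typically system indices) is also an assumption you may want to change.

```python
import json


def find_empty_indices(cat_indices_rows):
    """Return (index_name, total_shard_count) for each open index with zero documents."""
    empty = []
    for row in cat_indices_rows:
        docs = row.get("docs.count")
        if docs is None:
            # Closed indices report no document count; leave them out.
            continue
        if row["index"].startswith("."):
            # Assumption: dot-prefixed indices are system indices and are left alone.
            continue
        if int(docs) == 0:
            primaries = int(row["pri"])
            replicas = int(row["rep"])
            # Each primary has `replicas` copies, so total shards = pri * (1 + rep).
            empty.append((row["index"], primaries * (1 + replicas)))
    return empty


# Made-up sample of what the _cat/indices JSON output might look like.
sample = json.loads("""[
  {"index": "logs-000001", "docs.count": "5000000", "pri": "1", "rep": "1"},
  {"index": "logs-000002", "docs.count": "0", "pri": "1", "rep": "1"},
  {"index": "old-test", "docs.count": "0", "pri": "3", "rep": "1"},
  {"index": ".kibana_1", "docs.count": "0", "pri": "1", "rep": "0"},
  {"index": "closed-idx", "docs.count": null, "pri": "1", "rep": "1"}
]""")

empty = find_empty_indices(sample)
total_empty_shards = sum(shard_count for _, shard_count in empty)
for name, shard_count in empty:
    print(f"{name}: {shard_count} empty shards")
print("total empty shards:", total_empty_shards)
```

An index with zero documents is not necessarily unused: the current write index behind a rollover alias or data stream can be empty and still required. Treat the result as a list of candidates to review, not as a list to delete automatically.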

Conclusion

By following this guide, you should be able to understand the meaning of having many empty shards in an Elasticsearch cluster, its causes, impacts, and the steps to resolve the issue. Implementing the recommendations provided will help you maintain a balanced workload distribution among the nodes in your cluster and improve overall performance.
