Elasticsearch: Disk Underutilization on Data Tier

By Opster Team

Updated: Mar 10, 2024

What does this mean?

This event means that the data nodes in your Elasticsearch cluster have more disk space allocated than the data requires. The disk resources are not being used efficiently, and there is potential to reduce costs by optimizing disk utilization.
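
To see how much of the allocated disk is actually in use, you can query the cat allocation API, for example from Kibana Dev Tools. This is a minimal sketch; the column selection is just one reasonable choice and is not prescribed by this guide.

```console
# Per-node shard count and disk usage
GET _cat/allocation?v=true&h=node,shards,disk.indices,disk.used,disk.avail,disk.total,disk.percent
```

A consistently low `disk.percent` across all data nodes suggests the tier is over-provisioned. As a reference point, with default settings Elasticsearch stops allocating new shards to a node at 85% disk usage (low watermark) and starts relocating shards away from it at 90% (high watermark), so some headroom below those thresholds is expected.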

Why does this occur?

This event can occur due to various reasons, such as:

Overestimation of storage requirements during the initial setup of the cluster.
Decrease in data volume over time, leading to unused disk space.
Changes in data management, such as deleting old or unnecessary data, that were not followed by a revision of storage requirements.
Removal of some replica shards that were added to support a high usage peak.

Possible impact and consequences of low disk utilization

The possible impact of disk underutilization in an Elasticsearch cluster includes:

Increased costs: Allocating more disk space than needed leads to higher infrastructure costs. The excess is not limited to storage: if the disk size was provisioned according to a fixed memory-to-disk ratio, you might also be paying for more RAM than you need.
Suboptimal performance: Underutilized disk space is a sign of inefficient resource allocation, which can indirectly affect the overall performance of the cluster, for example when budget is tied up in storage instead of CPU or memory.

How to resolve

To resolve disk underutilization in an Elasticsearch cluster, consider the following recommendations:

1. Move to smaller disk capacity: By moving to smaller disks, you can reduce the allocated disk space and achieve optimal disk utilization. This can be done by resizing the existing disks or replacing them with smaller ones.

2. Reduce the number of data nodes: Fewer data nodes means less overall disk allocation and a lower cluster cost. To do this, first move the shards off the nodes you want to remove by updating the cluster settings, then decommission those nodes.

Example command to drain data from a node (identified here by its IP address) to the other nodes, so that it can be deprovisioned:

PUT /_cluster/settings
{
  "transient": {
    "cluster.routing.allocation.exclude._ip": "10.0.0.1"
  }
}
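
Once the exclusion is set, shards relocate gradually. The following is a sketch of how the drain could be monitored and then cleaned up, assuming the same example IP `10.0.0.1` and the `transient` block used in the command above.

```console
# List shards with the node they live on; the drain is complete
# when no shard is reported on the excluded node
GET _cat/shards?v=true&h=index,shard,prirep,state,node,ip

# After the node has been removed, clear the exclusion
PUT /_cluster/settings
{
  "transient": {
    "cluster.routing.allocation.exclude._ip": null
  }
}
```

Shards can only leave the excluded node if the remaining nodes have enough free disk and the allocation rules permit it, for example if there are still enough nodes to hold every replica on a different node from its primary.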

3. Optimize data management practices: Regularly review your data management practices, especially your index lifecycle management (ILM) policies, to ensure that old or unnecessary data is deleted and that disk space is used efficiently.
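
As an illustration of the ILM point, a policy along the following lines rolls indices over and deletes them after a retention period. The policy name `logs-retention-example` and all thresholds are hypothetical; choose values that match your own retention requirements.

```console
# Hypothetical policy: roll over at 50 GB per primary shard or after 7 days,
# then delete each index 30 days after rollover
PUT _ilm/policy/logs-retention-example
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_primary_shard_size": "50gb",
            "max_age": "7d"
          }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
```

A policy only takes effect on indices that reference it, typically through the `index.lifecycle.name` setting in an index template. Once retention is predictable, storage requirements can be revised to match it.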

Conclusion

By following the recommendations in this guide, you can resolve disk underutilization in your Elasticsearch cluster. This will help you bring disk utilization closer to what your data actually requires, save money, and use the cluster's resources more efficiently.
