Briefly, this error occurs when Elasticsearch fails to finalize a machine learning job while closing it. Possible causes include a job configuration problem, insufficient cluster resources, or network connectivity problems between nodes. To resolve it, check the job configuration for errors, make sure the cluster has enough resources (CPU, memory, disk space), and verify network connectivity between the nodes. The Elasticsearch logs usually contain the underlying exception next to this message, which helps pinpoint the exact cause.
This guide will help you check for common problems that cause the log "[{}] error finalizing job" to appear. To understand the issues related to this log, read the explanation below about the following Elasticsearch concepts: task, plugin.
Overview
A task is an Elasticsearch operation: any request performed on an Elasticsearch cluster, such as a delete by query request or a search request. Elasticsearch provides a dedicated Task API for task management. It supports actions ranging from retrieving the status of currently running tasks to canceling a long-running task.
Examples
Get all currently running tasks on all nodes of the cluster
Among other information, the response to the request below contains the IDs of all running tasks. A task ID can then be used to get detailed information about that particular task.
GET _tasks
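To illustrate where the task IDs sit in that response, here is a small Python sketch that collects them from an already parsed GET _tasks response. The function name list_task_ids and the trimmed sample response are hypothetical and used only for illustration; a real response contains more fields per task.

```python
from typing import Any, Dict, List


def list_task_ids(tasks_response: Dict[str, Any]) -> List[str]:
    """Collect every task ID ("<node_id>:<task_number>") from a parsed GET _tasks response."""
    task_ids: List[str] = []
    for node in tasks_response.get("nodes", {}).values():
        # Each node entry holds a "tasks" object keyed by the full task ID.
        task_ids.extend(node.get("tasks", {}).keys())
    return sorted(task_ids)


# Hypothetical, heavily trimmed response used only for illustration.
sample_response = {
    "nodes": {
        "clQFAL_VRrmnlRyPsu_p8A": {
            "name": "node-1",
            "tasks": {
                "clQFAL_VRrmnlRyPsu_p8A:1132678759": {
                    "node": "clQFAL_VRrmnlRyPsu_p8A",
                    "id": 1132678759,
                    "action": "indices:data/write/delete/byquery",
                    "cancellable": True,
                }
            },
        }
    }
}

print(list_task_ids(sample_response))
```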
Get detailed information of a particular task
In the request below, clQFAL_VRrmnlRyPsu_p8A:1132678759 is the ID of the task:
GET _tasks/clQFAL_VRrmnlRyPsu_p8A:1132678759
Get all the current tasks running on particular nodes
GET _tasks?nodes=nodeId1,nodeId2
Cancel a task
In the request below, clQFAL_VRrmnlRyPsu_p8A:1132678759 is the ID of the task to cancel:
POST /_tasks/clQFAL_VRrmnlRyPsu_p8A:1132678759/_cancel?pretty
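The requests above can also be issued from a script. The following Python sketch uses only the standard library and builds the same Task API URLs. BASE_URL, the helper names (tasks_url, cancel_url, send), and the assumption of an unsecured cluster on localhost:9200 are all illustrative; a secured cluster would additionally need authentication and TLS settings.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict, Optional, Sequence

# Assumption: an unsecured cluster reachable on localhost; adjust for your setup.
BASE_URL = "http://localhost:9200"


def tasks_url(task_id: Optional[str] = None, nodes: Optional[Sequence[str]] = None) -> str:
    """Build the URL for GET _tasks, GET _tasks/<task_id>, or GET _tasks?nodes=..."""
    url = f"{BASE_URL}/_tasks"
    if task_id is not None:
        # Keep the ":" that separates the node ID from the task number.
        url += "/" + urllib.parse.quote(task_id, safe=":")
    if nodes:
        url += "?nodes=" + ",".join(urllib.parse.quote(node) for node in nodes)
    return url


def cancel_url(task_id: str) -> str:
    """Build the URL for POST _tasks/<task_id>/_cancel."""
    return tasks_url(task_id) + "/_cancel"


def send(url: str, method: str = "GET") -> Dict[str, Any]:
    """Send the request and return the parsed JSON body (requires a reachable cluster)."""
    request = urllib.request.Request(url, method=method)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


# Building the URLs needs no cluster; only send() performs network I/O.
print(tasks_url())
print(tasks_url(nodes=["nodeId1", "nodeId2"]))
print(cancel_url("clQFAL_VRrmnlRyPsu_p8A:1132678759"))
```

With a running cluster, send(cancel_url(task_id), method="POST") would issue the cancel request. Keeping URL construction separate from the network call makes the helpers easy to check without a cluster.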
Notes
- The Task API is most useful when you want to investigate a spike in resource utilization in the cluster or cancel a running operation.
Log Context
The log "[{}] error finalizing job" is emitted by the class defined in OpenJobPersistentTasksExecutor.java.
For those seeking more in-depth context, we extracted the following from the Elasticsearch source code:
    // as most of the job's close sequence has executed, just not the finalization step. The job will
    // restart on a different node. If the coordinating node for the close request notices that the job
    // changed nodes while waiting for it to close then it will remove the persistent task, which should
    // stop the job doing anything significant on its new node. However, the finish time of the job will
    // not be set correctly.
    logger.error(new ParameterizedMessage("[{}] error finalizing job", jobId), e);
    Throwable unwrapped = ExceptionsHelper.unwrapCause(e);
    if (unwrapped instanceof DocumentMissingException || unwrapped instanceof ResourceNotFoundException) {
        jobTask.markAsCompleted();
    } else if (autodetectProcessManager.isNodeDying() == false) {
        // In this case we prefer to mark the task as failed, which means the job