Briefly, this error occurs when a sort clause in your Elasticsearch query still uses the old [nested_path] parameter. Elasticsearch has removed this parameter in favour of the [nested] parameter, so the request is rejected at parse time. To resolve this issue, replace [nested_path] with a [nested] object and put the path inside it, for example "nested": { "path": "reviews" }. The related [nested_filter] parameter has been removed in the same way and belongs inside the [nested] object as well. Once the sort clause follows the [nested] structure, the query should parse and execute successfully.
This guide will help you check for common problems that cause the log “[nested_path] has been removed in favour of the [nested] parameter” to appear. To understand the issues related to this log, read the explanation below about the following Elasticsearch concepts: sort, search.
Introduction
Sorting is an essential aspect of Elasticsearch when it comes to presenting search results in a specific order. By default, Elasticsearch sorts the results based on the relevance score, which is calculated using the Lucene scoring formula. However, there are cases where you might want to sort the results based on other criteria, such as a specific field value or a custom sorting logic. In this article, we will explore advanced techniques and best practices for sorting in Elasticsearch.
Advanced techniques and best practices for sorting in Elasticsearch
1. Sorting by Field Values
To sort the search results based on a specific field value, you can use the “sort” parameter in your search query. For example, if you want to sort the results based on the “price” field in ascending order, you can use the following query:
    GET /products/_search
    {
      "query": { "match_all": {} },
      "sort": [
        { "price": { "order": "asc" } }
      ]
    }
2. Sorting by Multiple Fields
You can also sort the search results based on multiple fields by specifying an array of sort objects. For example, if you want to sort the results first by “category” in ascending order and then by “price” in descending order, you can use the following query:
    GET /products/_search
    {
      "query": { "match_all": {} },
      "sort": [
        { "category": { "order": "asc" } },
        { "price": { "order": "desc" } }
      ]
    }
3. Sorting with Missing Values
In some cases, the documents in your index might not have a value for the field you want to sort by. By default, Elasticsearch places these documents last in the results, whatever the sort order, because the “missing” parameter defaults to “_last”. You can control this with the “missing” parameter, which accepts “_last”, “_first”, or a custom value to use in place of the missing one. For example, the following query sorts by “price” in ascending order and states explicitly that documents without a price should come last, as if they had the highest possible price:
    GET /products/_search
    {
      "query": { "match_all": {} },
      "sort": [
        { "price": { "order": "asc", "missing": "_last" } }
      ]
    }
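To make the effect of “missing” easier to reason about, here is a minimal Python sketch that imitates the ordering on a list of in-memory documents. It is only a model of the behaviour described above, not how Elasticsearch implements sorting, and the function name `sort_with_missing` is a hypothetical helper introduced for illustration.

```python
from typing import Any, Dict, List, Union


def sort_with_missing(
    docs: List[Dict[str, Any]],
    field: str,
    order: str = "asc",
    missing: Union[str, float] = "_last",
) -> List[Dict[str, Any]]:
    """Order docs roughly the way a single-field sort with `missing` would."""
    reverse = order == "desc"

    if missing in ("_first", "_last"):
        # "_first" and "_last" are positional: they do not depend on asc/desc.
        present = [d for d in docs if d.get(field) is not None]
        absent = [d for d in docs if d.get(field) is None]
        ranked = sorted(present, key=lambda d: d[field], reverse=reverse)
        return absent + ranked if missing == "_first" else ranked + absent

    # Custom value: documents without the field sort as if they held `missing`.
    return sorted(
        docs,
        key=lambda d: missing if d.get(field) is None else d[field],
        reverse=reverse,
    )
```

Calling `sort_with_missing(docs, "price", order="desc")` still leaves documents without a price at the end, which is the main difference from treating a missing value as "the lowest possible value".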
4. Sorting with Nested Fields
If you have nested fields in your documents, you can sort the search results based on the values of these fields using the “nested” parameter. For example, if you have a “reviews” nested field with a “rating” property, you can sort the products based on the average rating as follows:
    GET /products/_search
    {
      "query": { "match_all": {} },
      "sort": [
        {
          "reviews.rating": {
            "order": "desc",
            "nested": { "path": "reviews" },
            "mode": "avg"
          }
        }
      ]
    }
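If an existing query still uses the removed [nested_path] or [nested_filter] keys, the rewrite into the [nested] form is mechanical. The Python sketch below shows one way to do it for a single sort clause held as a dictionary. The helper name `migrate_sort_clause` is hypothetical, and the sketch assumes the old keys sit directly inside the sort options and map to “path” and “filter” inside “nested”; check the result against your own queries before relying on it.

```python
import copy
from typing import Any, Dict

# Removed key -> key inside the "nested" object (assumed mapping).
LEGACY_KEYS = {"nested_path": "path", "nested_filter": "filter"}


def migrate_sort_clause(clause: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of one sort clause with legacy nested keys folded into `nested`."""
    migrated = copy.deepcopy(clause)  # leave the caller's clause untouched
    for options in migrated.values():
        if not isinstance(options, dict):
            continue  # short form such as {"price": "asc"} has no options object
        nested = options.get("nested", {})
        for old_key, new_key in LEGACY_KEYS.items():
            if old_key in options:
                # An existing nested.path / nested.filter takes precedence.
                nested.setdefault(new_key, options.pop(old_key))
        if nested:
            options["nested"] = nested
    return migrated


legacy_clause = {
    "reviews.rating": {
        "order": "desc",
        "mode": "avg",
        "nested_path": "reviews",
    }
}
print(migrate_sort_clause(legacy_clause))
```

The same function works for a “_script” sort clause, since the legacy keys live in the same position there.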
5. Custom Sorting with Script-Based Sorting
In some cases, you might want to apply custom sorting logic that cannot be achieved using the built-in sorting options. In such cases, you can use script-based sorting to define your custom sorting logic using Painless, Elasticsearch’s scripting language. For example, if you want to sort the products based on the difference between their regular price and discounted price, you can use the following query:
    GET /products/_search
    {
      "query": { "match_all": {} },
      "sort": [
        {
          "_script": {
            "type": "number",
            "script": {
              "source": "doc['regular_price'].value - doc['discounted_price'].value"
            },
            "order": "desc"
          }
        }
      ]
    }
Best Practices for Sorting in Elasticsearch
- Use Doc Values: When sorting by field values, make sure to use doc values, which are the on-disk data structure that Elasticsearch uses for sorting and aggregations. Doc values are enabled by default for most field types, but if not, you can explicitly enable them by setting the “doc_values” parameter to “true” in your field mapping.
- Avoid Sorting by Text Fields: Text fields do not support doc values, so sorting on them is disabled by default and only works if you enable “fielddata”, which loads the field data into heap memory and can be slow and memory-intensive. Instead, use keyword fields or other field types that support doc values for sorting.
- Use Index Sorting: If you have a fixed sorting order that you use frequently, you can improve the sorting performance by using index sorting. Index sorting sorts the documents during indexing, which can speed up the sorting process during search. However, keep in mind that index sorting can increase the indexing time and memory usage.
- Optimize Pagination: When using sorting with pagination, avoid using deep pagination, as it can be slow and memory-intensive. Instead, use the “search_after” parameter to paginate through the search results more efficiently.
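As an illustration of the “search_after” advice, the Python sketch below builds request bodies as plain dictionaries: the next page reuses the same query and sort, and passes the sort values of the last hit on the current page. The function names are hypothetical, and `product_id` is an assumed keyword field used as a tiebreaker so that the sort order is unambiguous; it does not come from the examples above.

```python
import copy
from typing import Any, Dict, List, Optional


def first_page_body(page_size: int) -> Dict[str, Any]:
    """Request body for the first page, sorted by price with a tiebreaker."""
    return {
        "size": page_size,
        "query": {"match_all": {}},
        "sort": [
            {"price": {"order": "asc"}},
            {"product_id": {"order": "asc"}},  # assumed unique keyword field
        ],
    }


def next_page_body(
    body: Dict[str, Any], hits: List[Dict[str, Any]]
) -> Optional[Dict[str, Any]]:
    """Build the body for the following page, or None when there are no more hits."""
    if not hits:
        return None
    following = copy.deepcopy(body)
    following.pop("from", None)  # search_after replaces offset-based paging
    # Each hit of a sorted search carries its sort values in the "sort" array.
    following["search_after"] = hits[-1]["sort"]
    return following
```

In use, you would send `first_page_body(...)` to the `_search` endpoint, pass the returned `hits.hits` array to `next_page_body`, and repeat until it returns `None`.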
Conclusion
By following these advanced techniques and best practices, you can optimize the sorting process in Elasticsearch and ensure that your search results are presented in the desired order.
Overview
Search refers to looking up documents in one index or in multiple indices. The simplest search is a GET request to the _search endpoint. The search query can be provided either in a query string or in a request body.
Examples
If no search parameters are provided, every document in the index is a hit, and by default the first 10 hits are returned.
GET my_documents/_search
A JSON object is returned in response to a search query. A 200 response code means the request was completed successfully.
    {
      "took" : 1,
      "timed_out" : false,
      "_shards" : {
        "total" : 2,
        "successful" : 2,
        "failed" : 0
      },
      "hits" : {
        "total" : 2,
        "max_score" : 1.0,
        "hits" : [ ... ]
      }
    }
Notes and good things to know
- Search is distributed: every shard of the index has to be searched for hits, and the hits from all shards are then combined into a single sorted list as the final result.
- There are two phases of search: the query phase and the fetch phase.
- In the query phase, the query is executed on each shard locally and top hits are returned to the coordinating node. The coordinating node merges the results and creates a global sorted list.
- In the fetch phase, the coordinating node retrieves the actual documents for the hit IDs in that list and returns them to the requesting client.
- A coordinating node needs enough memory and CPU in order to handle the fetch phase.
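The two phases described above can be sketched in a few lines of Python. This is a deliberately simplified model with in-memory dictionaries standing in for shards; the function names are hypothetical and it does not reflect Elasticsearch's actual implementation. It only shows why each shard returns its own top hits and why full documents are fetched for the global winners only.

```python
import heapq
from typing import Any, Dict, List, Tuple

Hit = Tuple[float, str]  # (score, document id)


def query_phase(shard_scores: Dict[str, float], size: int) -> List[Hit]:
    """Each shard returns only its own top `size` (score, id) pairs."""
    return heapq.nlargest(
        size, ((score, doc_id) for doc_id, score in shard_scores.items())
    )


def merge_top_hits(per_shard: List[List[Hit]], size: int) -> List[Hit]:
    """Coordinating node: merge shard-local lists into one global top `size`."""
    return heapq.nlargest(size, (hit for hits in per_shard for hit in hits))


def fetch_phase(
    top_hits: List[Hit], stores: List[Dict[str, Any]]
) -> List[Dict[str, Any]]:
    """Fetch full documents for the winning ids only."""
    results = []
    for score, doc_id in top_hits:
        for store in stores:
            if doc_id in store:
                results.append(
                    {"_id": doc_id, "_score": score, "_source": store[doc_id]}
                )
                break
    return results
```

Because every shard contributes up to `size` candidates, the coordinating node holds `size` multiplied by the number of shards entries before merging, which is one reason deep pagination is costly.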
Log Context
The log “[nested_path] has been removed in favour of the [nested] parameter” is generated in the class ScriptSortBuilder.java. We extracted the following from the Elasticsearch source code for those seeking in-depth context:
    PARSER.declareString((b, v) -> b.order(SortOrder.fromString(v)), ORDER_FIELD);
    PARSER.declareString((b, v) -> b.sortMode(SortMode.fromString(v)), SORTMODE_FIELD);
    PARSER.declareObject(ScriptSortBuilder::setNestedSort, (p, c) -> NestedSortBuilder.fromXContent(p), NESTED_FIELD);

    PARSER.declareObject((b, v) -> {}, (p, c) -> {
        throw new ParsingException(p.getTokenLocation(), "[nested_path] has been removed in favour of the [nested] parameter", c);
    }, NESTED_PATH_FIELD);

    PARSER.declareObject((b, v) -> {}, (p, c) -> {
        throw new ParsingException(p.getTokenLocation(), "[nested_filter] has been removed in favour of the [nested] parameter", c);
    }, NESTED_FILTER_FIELD);