Object Fields vs. Nested Field Types in OpenSearch

By Opster Expert Team

Updated: Sep 20, 2023

Overview 

When mappings are generated dynamically, OpenSearch maps any field that holds an object or an array of objects as the “object” type. This is fine in many cases, but sometimes the mapping needs to be adjusted. Below we cover the different scenarios and how to choose the correct mapping for each one.

Object fields

One of the advantages of document-based structures is that a document’s properties can be grouped hierarchically. These groups are what we call objects.

{
    "name":"I'm an object",
    "category": "single-object"
}

Objects can be embedded inside objects and go as deep as needed.

{
  "name": "Duveteuse",
  "category": "dog",
  "human_partner": {
    "full_name": "Ami Chien",
    "address": {
      "street": "Jolie Rue #1234",
      "city": "Paris",
      "country": {
        "name": "France",
        "code": "FR"
      }
    }
  }
}

It doesn’t matter how deep the object-inside-object nesting goes, because OpenSearch flattens it internally (see the explanation below).
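As a rough illustration of what flattening means, the sketch below collapses nested objects into dotted field paths. It is a conceptual model in plain Python, not OpenSearch code, and the function name `flatten` is an assumption made for this example.

```python
def flatten(obj, prefix=""):
    """Collapse nested dicts into a single level of dotted field paths."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into inner objects, extending the dotted path.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


doc = {
    "name": "Duveteuse",
    "category": "dog",
    "human_partner": {
        "full_name": "Ami Chien",
        "address": {
            "street": "Jolie Rue #1234",
            "city": "Paris",
            "country": {"name": "France", "code": "FR"},
        },
    },
}

for path, value in flatten(doc).items():
    print(f"{path}: {value}")
```

Every leaf ends up addressable by a path such as human_partner.address.country.code, regardless of how many levels sit above it.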

Arrays of objects can be created as property values.

{
  "name": "Father object",
  "age": 50,
  "category": "self-explaining",
  "children": [
    { "name": "Child object1", "age": 1, "category": "learning-objects" },
    { "name": "Child object2", "age": 2, "category": "learning-objects" },
    { "name": "Child object3", "age": 3, "category": "learning-objects" }
  ]
}

In this situation the field type matters, and sometimes we will have to switch from the default object type to a nested type.

Nested field type

What’s the nested field type in OpenSearch?

Nested is a special kind of object field: each object in the array is indexed as a separate hidden document alongside the containing document, so each object can be queried independently of the others.

The problem with using object fields

To demonstrate the use of object fields vs. nested field types, we’ll first index some documents. Examples can be executed in OpenSearch Dashboards.

PUT books_test

PUT books_test/_doc/1
{
  "name": "An Awesome Book",
  "tags": [{ "name": "best-seller" }, { "name": "summer-sale" }],
  "authors": [
    { "name": "Gustavo Llermaly", "age": "32", "country": "Chile" },
    { "name": "John Doe", "age": "20", "country": "USA" }
  ]
}

PUT books_test/_doc/2
{
  "name": "A Regular Book",
  "tags": [{ "name": "free-shipping" }, { "name": "summer-sale" }],
  "authors": [
    { "name": "Regular author", "age": "40", "country": "USA" },
    { "name": "John Doe", "age": "20", "country": "USA" }
  ]
}

OpenSearch will dynamically generate these mappings:

GET books_test/_mapping

{
  "books_test": {
    "mappings": {
      "properties": {
        "authors": {
          "properties": {
            "age": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "country": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "name": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            }
          }
        },
        "name": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "tags": {
          "properties": {
            "name": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            }
          }
        }
      }
    }
  }
}

Let’s focus on the “authors” and “tags” fields. Neither has an explicit type, only “properties”, which means both were mapped as “object” fields. As a result, OpenSearch will flatten their properties. Internally, document 1 will look like this:

{
  "name": "An Awesome Book",
  "tags.name": ["best-seller", "summer-sale"],
  "authors.name": ["Gustavo Llermaly", "John Doe"],
  "authors.age": ["32", "20"],
  "authors.country": ["Chile", "USA"]
}

As you can see, the “tags” field looks like a regular string array, but the “authors” field looks different: it was split into several parallel arrays, one per property.

The issue is that OpenSearch no longer keeps each author’s properties together. Nothing in the flattened document records that “Chile” belongs with “32” rather than with “20”.
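The loss of association can be modeled in a few lines of Python. This is only a conceptual sketch of the flattened view described above, not how Lucene stores data, and the helper name `flatten_object_array` is an assumption made for this example.

```python
from collections import defaultdict


def flatten_object_array(field, objects):
    """Mimic how an array of objects ends up as one value list per property."""
    flat = defaultdict(list)
    for obj in objects:
        for key, value in obj.items():
            flat[f"{field}.{key}"].append(value)
    return dict(flat)


authors = [
    {"name": "Gustavo Llermaly", "age": "32", "country": "Chile"},
    {"name": "John Doe", "age": "20", "country": "USA"},
]

flat = flatten_object_array("authors", authors)
print(flat["authors.country"])  # ['Chile', 'USA']
print(flat["authors.age"])      # ['32', '20']
```

Once the values sit in independent lists, a search that looks at authors.country and authors.age separately has no way to tell which age came from which author.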

To illustrate the problem with this mapping, let’s look at the two following queries.

Query 1: Looking for books with authors from Chile or authors who are 30 years old or younger.

Spoiler: Both books meet these conditions. 

To find books meeting these criteria, we would run the following query:

GET books_test/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "authors.country.keyword": "Chile"
          }
        },
        {
          "range": {
            "authors.age": {
              "lte": 30
            }
          }
        }
      ]
    }
  }
}

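Both books are returned, which is correct: Gustavo Llermaly is from Chile, and John Doe is younger than 30. Note that because the ages were indexed as strings, authors.age was mapped as text, so this range comparison is lexicographic rather than numeric. It gives the right answer here only because every age has two digits.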

Query 2: Books written by authors who are 30 years old or younger AND are from Chile.

Spoiler: No books meet the criteria.

GET books_test/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "authors.country.keyword": "Chile"
          }
        },
        {
          "range": {
            "authors.age": {
              "lte": 30
            }
          }
        }
      ]
    }
  }
}

This query returns “An Awesome Book”, and that’s incorrect. The only author from Chile is 32 years old, so no single author meets both criteria. The book matches anyway because one author satisfies the country condition and a different author satisfies the age condition: OpenSearch didn’t store the relation between each author’s country and age.
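The difference between the two matching behaviours can be sketched in plain Python. This is a simplified model under stated assumptions, not OpenSearch internals: the function names are hypothetical, and ages are converted to integers for clarity.

```python
def matches_flattened(authors, country, max_age):
    """Object-style matching: each condition may be met by a different author."""
    countries = [a["country"] for a in authors]
    ages = [int(a["age"]) for a in authors]
    return country in countries and any(age <= max_age for age in ages)


def matches_nested(authors, country, max_age):
    """Nested-style matching: one author must meet both conditions."""
    return any(
        a["country"] == country and int(a["age"]) <= max_age for a in authors
    )


books = {
    "An Awesome Book": [
        {"name": "Gustavo Llermaly", "age": "32", "country": "Chile"},
        {"name": "John Doe", "age": "20", "country": "USA"},
    ],
    "A Regular Book": [
        {"name": "Regular author", "age": "40", "country": "USA"},
        {"name": "John Doe", "age": "20", "country": "USA"},
    ],
}

flattened_hits = [t for t, a in books.items() if matches_flattened(a, "Chile", 30)]
nested_hits = [t for t, a in books.items() if matches_nested(a, "Chile", 30)]

print(flattened_hits)  # ['An Awesome Book']
print(nested_hits)     # []
```

In this model the object-style check reports a false positive for “An Awesome Book”, while the per-author check correctly finds nothing.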

How to resolve it

To answer the second query accurately, we need to use a different field type called nested.

Nested is a special kind of object field in which each object in the array is indexed as a separate hidden document alongside the containing document, so the properties of one author stay together when we query.

The type of an existing field cannot be changed in place, so we need to create a new index with the correct mapping and reindex our data into it.

First, create a new index that explicitly maps the authors field as nested. Otherwise, the OpenSearch dynamic mappings feature would map it as an object again:

PUT books_test_nested
{
  "mappings": {
    "properties": {
      "authors": {
        "type": "nested"
      }
    }
  }
}

Note: OpenSearch will generate all the other mappings dynamically, based on the documents we index.

Now use the reindex API to move the documents from our old index to the new one:

POST _reindex
{
  "source": {
    "index": "books_test"
  },
  "dest": {
    "index": "books_test_nested"
  }
}

Run this to ensure the documents transferred properly:

GET books_test_nested/_search

Now, if we run the two original queries against the new index, both return 0 results. This is because fields inside a nested field cannot be searched with regular queries; they require a dedicated query type, the nested query.
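Structurally, a nested query is the original query wrapped in a nested clause that names the path of the nested field. A minimal Python sketch of that wrapping follows; the helper name `nested` is an assumption for this example, and the resulting body would still have to be sent to books_test_nested/_search with an HTTP client of your choice.

```python
import json


def nested(path, query):
    """Wrap a query clause so it is evaluated against each nested object."""
    return {"nested": {"path": path, "query": query}}


inner = {
    "bool": {
        "filter": [
            {"term": {"authors.country.keyword": "Chile"}},
            {"range": {"authors.age": {"lte": 30}}},
        ]
    }
}

body = {"query": nested("authors", inner)}
print(json.dumps(body, indent=2))
```

Field names inside the wrapped query keep their full path (authors.country.keyword, not country.keyword).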

Answering the two questions again with nested queries goes as follows:

Query 1: Looking for books with authors from Chile or authors who are 30 years old or younger.

GET books_test_nested/_search
{
  "query": {
    "nested": {
      "path": "authors",
      "query": {
        "bool": {
          "should": [
            {
              "term": {
                "authors.country.keyword": "Chile"
              }
            },
            {
              "range": {
                "authors.age": {
                  "lte": 30
                }
              }
            }
          ]
        }
      }
    }
  }
}

Both books are still returned, as expected.

Query 2: Books written by authors who are 30 years old or younger and are from Chile.

GET books_test_nested/_search
{
  "query": {
    "nested": {
      "path": "authors",
      "query": {
        "bool": {
          "filter": [
            {
              "term": {
                "authors.country.keyword": "Chile"
              }
            },
            {
              "range": {
                "authors.age": {
                  "lte": 30
                }
              }
            }
          ]
        }
      }
    }
  }
}

No books are returned, which is the expected result.

Why this is important

Using the nested field type for every array of objects “just in case we need it later” sounds tempting, but it should be used only when needed. Under the hood, Lucene creates a separate document for each object in the array, which can degrade performance or even cause a mapping explosion.

To guard against this, the number of nested fields per index is limited to 50 by default, and the number of nested objects per document is limited to 10,000 by default.

Both limits are index settings that can be changed, although raising them is not recommended:

index.mapping.nested_fields.limit

index.mapping.nested_objects.limit
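To get a feel for how close a document is to the per-document limit, the nested objects can be counted before indexing. The sketch below is only an approximation of what OpenSearch counts; the function name, the constant, and the idea of passing the nested paths in by hand are all assumptions made for this example.

```python
DEFAULT_NESTED_OBJECTS_LIMIT = 10000  # default value quoted in the text above


def count_nested_objects(doc, nested_paths, prefix=""):
    """Count objects stored under the given nested field paths in one document."""
    total = 0
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        items = value if isinstance(value, list) else [value]
        for item in items:
            if isinstance(item, dict):
                if path in nested_paths:
                    total += 1  # each object becomes one hidden document
                total += count_nested_objects(item, nested_paths, path)
    return total


doc = {
    "name": "An Awesome Book",
    "tags": [{"name": "best-seller"}, {"name": "summer-sale"}],
    "authors": [
        {"name": "Gustavo Llermaly", "age": "32", "country": "Chile"},
        {"name": "John Doe", "age": "20", "country": "USA"},
    ],
}

count = count_nested_objects(doc, {"authors"})
print(count, count <= DEFAULT_NESTED_OBJECTS_LIMIT)  # 2 True
```

Only paths listed in nested_paths are counted, so the two tags objects, which are mapped as plain objects, do not contribute.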

If you need to index a large and unpredictable number of keyword fields on inner objects, you can use the flat_object field type available in recent OpenSearch versions (the counterpart of the Elasticsearch flattened type), which maps the entire object into a single field and supports basic query operations.

Summary

  • Fields that hold objects or arrays of objects are mapped as the object type by default.
  • The object type cannot match several properties against the same inner object, because the properties of all objects in an array are flattened together.
  • Do not use the nested type if there will only be one inner object per outer object.
  • Use the nested type only if you need to query two or more fields within the same inner object; otherwise, use the object type.
  • Too many nested objects can degrade performance or cause a mapping explosion.
  • Use the flat_object field type to map all keyword fields of an inner object into a single field.
