Silo wide fields

Smartsite 7.9 - ...

Purpose

Extra fields make it possible to handle fields that were not anticipated, for example extra_category or extra_entitytype. Enterprise Search supports:

  • including the extra fields at index time
  • using the extra fields at search time
  • retrieving the extra fields of a found document, for presentation or other purposes.

Configuration at silo level

Extra fields are configured silo-wide. Adding extra fields requires recreating the Elasticsearch index associated with the silo. Removing extra fields does not require recreating the index.

Some providers automatically add provider-specific extra fields. Extra fields added by a provider should not be configured at the silo level.

Extra fields are configured in the silo XML configuration. The following XML fragment is an example.

<extrafields>
  <list>
    <item name="extra_entitytype" datatype="keyword"/>
  </list>
</extrafields>

The name of an extra field must start with the prefix extra_.
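The naming rule can be checked before a configuration is deployed. The following is a minimal sketch that uses only Python's standard library; the helper find_invalid_names and the deliberately wrong second item are assumptions made for illustration and are not part of Smartsite.

```python
import xml.etree.ElementTree as ET

# Example silo configuration fragment; the second item is deliberately wrong.
CONFIG = """
<extrafields>
  <list>
    <item name="extra_entitytype" datatype="keyword"/>
    <item name="category" datatype="keyword"/>
  </list>
</extrafields>
""".strip()


def find_invalid_names(xml_text):
    """Return the names of all items that do not start with 'extra_'."""
    root = ET.fromstring(xml_text)
    return [
        item.get("name", "")
        for item in root.iter("item")
        if not item.get("name", "").startswith("extra_")
    ]


print(find_invalid_names(CONFIG))  # ['category']
```

An empty list means that every configured name carries the required prefix.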

The datatype can be selected according to the following table. The datatype dictates the default handling of the field.

ES datatype                                              Normalizer            Analyzer
text                                                     -                     Language-specific analyzer, words analyzer, reverse words analyzer
keyword                                                  Property normalizer   -
bool, byte, date, double, float, integer, long, short    -                     -

Notes:

  • The string datatype is not among the supported datatypes. Use text for full-text analysis, and use keyword to preserve the field text.
  • The text datatype is handled by a language-specific analyzer, using the language of the silo.
  • The text datatype is additionally handled by a language-neutral words analyzer. This analyzer preserves the words of the text and supports exact word searches, exact phrase searches and exact phrase prefix searches.
  • The text datatype is additionally handled by a language-neutral reverse words analyzer. This analyzer preserves the words of the text, reversing each word, and supports exact word searches and exact word suffix searches.
  • The keyword datatype is handled by a property normalizer, which converts values to lowercase and applies ASCII folding (for example, ë becomes e). This normalizer supports searches and aggregations / facets on the exact field content.
  • An extra field can hold at most one value or multiple values. The silo configuration does not need to specify whether a field is single-valued or multi-valued.
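The effect of the property normalizer and the reverse words analyzer can be approximated in a few lines. This is only a sketch of the idea, not the Elasticsearch implementation: the function names are hypothetical, the folding relies on Unicode decomposition (which covers accented letters such as ë but not every character that a full ASCII folding filter handles), and words are split on whitespace only.

```python
import unicodedata


def property_normalize(value):
    """Approximate the keyword normalizer: lowercase plus ASCII folding."""
    # NFKD splits an accented letter into a base letter and a combining mark;
    # dropping the combining marks leaves the base letter.
    decomposed = unicodedata.normalize("NFKD", value.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def reverse_words(text):
    """Approximate the reverse words analyzer: reverse each whitespace-separated word."""
    return [word[::-1] for word in text.split()]


def matches_suffix(text, suffix):
    """A word suffix search becomes a prefix match on the reversed words."""
    reversed_suffix = suffix[::-1]
    return any(token.startswith(reversed_suffix) for token in reverse_words(text))


print(property_normalize("Citroën"))                  # citroen
print(reverse_words("enterprise search"))             # ['esirpretne', 'hcraes']
print(matches_suffix("enterprise search", "prise"))   # True
```

Reversing each word is what turns a suffix search into a prefix search, which an index can typically answer far more cheaply.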

Enterprise Search will include the extra fields when building the mapping for the silo index. It will use the mapping when recreating the index.

Structure of an extra field

An extra field can have one of the following structures:

  • A single value
  • A collection of values.

A value can be of the following types:

  • String
  • Integer
  • Float
  • Boolean
  • Date time
  • Guid
  • The value null

Every field, including an extra field:

  • is sent to the Elasticsearch index for search purposes, which involves analysis and transformation of the field
  • is stored unchanged in the Elasticsearch index as JSON, for purposes such as reindexing and retrieving the original fields of a found document.
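The stored copy of a document can be pictured as a plain JSON object. The sketch below is an illustration only: the field names extra_published, extra_id and extra_rating are hypothetical, and writing date time and Guid values as strings is an assumption of this example, not a statement about how Enterprise Search serializes them.

```python
import json
import uuid
from datetime import datetime, timezone


def to_json_value(value):
    """Convert one field value into something json.dumps accepts."""
    if isinstance(value, (list, tuple)):
        return [to_json_value(v) for v in value]  # collection of values
    if isinstance(value, datetime):
        return value.isoformat()                  # assumption: ISO 8601 string
    if isinstance(value, uuid.UUID):
        return str(value)                         # assumption: Guid as string
    return value  # str, int, float, bool and None pass through unchanged


document = {
    "extra_entitytype": "Article",                # single value
    "extra_category": ["News", "Sports"],         # collection of values
    "extra_published": datetime(2020, 1, 31, 12, 0, tzinfo=timezone.utc),
    "extra_id": uuid.UUID("12345678-1234-5678-1234-567812345678"),
    "extra_rating": None,                         # the value null
}

stored = json.dumps({name: to_json_value(v) for name, v in document.items()})
retrieved = json.loads(stored)

print(retrieved["extra_entitytype"])  # Article
print(retrieved["extra_category"])    # ['News', 'Sports']
print(retrieved["extra_rating"])      # None
```

The retrieved value is still "Article" with its original casing: analysis and normalization affect only how the field is searched, not the stored copy.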

The various integer types (signed and unsigned, small and large) all become an integer for storage in ES. When the field is later retrieved, its original type is not restored, but its value is unchanged for practical purposes. The same is true for small and large floating point types and for the decimal type (with an integral and a fractional part): the type becomes a float in ES and the value remains unchanged for practical purposes.
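The phrase "for practical purposes" can be made concrete with a JSON round trip. The sketch below uses Python's Decimal as a stand-in for a decimal type; the field names extra_stock and extra_price are hypothetical. Python has a single integer type, so the narrowing of signed and unsigned integer types cannot be shown here.

```python
import json
from decimal import Decimal

original = {"extra_stock": 42, "extra_price": Decimal("19.95")}

# JSON has no decimal type, so in this sketch the Decimal is written as a float.
stored = json.dumps(
    {name: float(v) if isinstance(v, Decimal) else v for name, v in original.items()}
)
retrieved = json.loads(stored)

# The integer comes back as an integer with the same value.
print(type(retrieved["extra_stock"]).__name__, retrieved["extra_stock"])   # int 42

# The decimal comes back as a float: the type has changed.
print(type(retrieved["extra_price"]).__name__, retrieved["extra_price"])   # float 19.95

# As a float the value is the same ...
print(retrieved["extra_price"] == float(original["extra_price"]))          # True

# ... but it is no longer the exact decimal 19.95.
print(Decimal(retrieved["extra_price"]) == original["extra_price"])        # False
```

Code that needs exact decimal arithmetic should therefore convert the retrieved value explicitly rather than rely on the original type coming back.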