User

Smartsite 7.9 - ...

The user performs searches using simple search boxes or advanced search pages.

End user search input

The end user can enter a simple search query: a series of words to be searched for. This should yield a meaningful search result.

At the same time it should be possible to specify a more advanced query, for better control of the search result.

The default search query syntax is described below. Facets, which add another level of control, are not described here.

General handling of query text

The end user can enter a query text without considering spacing, case and accents. The search result will be the same.

  • Query words are separated by spacing. Enterprise Search (ES) ignores additional spacing.
  • ES further splits words on special characters, including most punctuation characters and the hyphen '-'.
  • Text may be entered in uppercase, lowercase or a mix of both. ES ignores these differences.
  • Text may be entered with or without accents. ES applies ASCII folding, so it ignores, for example, the difference between é and e.

The following queries yield the same result.

Publication Server
Publication-Server
pûblication   sérver
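
The normalization rules above can be illustrated with a small Python sketch. It is not the Enterprise Search implementation; the function name normalize_query is hypothetical, and the sketch assumes that splitting on every non-alphanumeric character is close enough to the "special characters" rule for illustration.

```python
import re
import unicodedata

def normalize_query(text):
    """Return the lowercase, accent-free words of a query text (illustrative only)."""
    # Decompose accented characters (é becomes e + combining accent), then drop the accents.
    decomposed = unicodedata.normalize("NFKD", text)
    folded = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Split on anything that is not a letter or digit: spacing, hyphens, punctuation.
    return re.findall(r"[a-z0-9]+", folded.lower())

for query in ["Publication Server", "Publication-Server", "pûblication   sérver"]:
    print(normalize_query(query))
```

All three example queries reduce to the same two words, publication and server, which is why they yield the same result.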

Full text query

ES interprets a series of words as a full text query. This is a non-exact search that may involve stemming, removal of stopwords, handling of synonyms and more.

publication server
a publication server
publication servers

These queries may yield the same result:

  • The stopword a is removed.
  • The words server and servers are stemmed to a root form that is the same for both words.

Phrase query

A word or a series of words can be enclosed in double quotes (") to request a search for the exact phrase.

"form"
"address form"
"address-form"

Search explanation:

  • "form": the word form must be present in order to find a document.
  • "address form": the word address followed by the word form must be present in order to find a document. A search for address form is a full text search and may yield many results, whereas a search for "address form" is a phrase search and may yield only a few results.
  • "address-form": this matches both address form and address-form. In both cases the word address is followed by the word form; the hyphen '-' is not considered.

The search is exact; however, the general handling of query text described above still applies:

  • Spacing insensitive
  • Case insensitive
  • Accent insensitive.
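
As a minimal sketch of what "exact but normalized" means, the Python below checks whether the words of a phrase occur consecutively in a document text. The helper names words and phrase_matches are hypothetical, and a real index matches on token positions rather than scanning text, so this only illustrates the matching rule.

```python
import re
import unicodedata

def words(text):
    """Lowercase, accent-fold and split a text into words (illustrative only)."""
    folded = "".join(
        ch for ch in unicodedata.normalize("NFKD", text)
        if not unicodedata.combining(ch)
    )
    return re.findall(r"[a-z0-9]+", folded.lower())

def phrase_matches(phrase, document):
    """True if the phrase's words occur consecutively, in order, in the document."""
    needle, haystack = words(phrase), words(document)
    if not needle:
        return False
    return any(
        haystack[i:i + len(needle)] == needle
        for i in range(len(haystack) - len(needle) + 1)
    )

print(phrase_matches("address-form", "Fill in the Address Form"))
print(phrase_matches("address form", "A form for your address"))
```

The first call matches because the hyphen and case are ignored; the second does not, because the words appear in a different order.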

Must query

The above queries are should queries: searched words should appear in found documents. A document can be found even if not all words are present in the document. This applies to both:

  • Full text queries
  • Phrase queries.

Use + to mark a part of the query as a must query.

+"address form"
+publication server
+publication +server
publication+server

Search explanation:

  • +"address form": the phrase "address form" must be present in order to find a document.
  • +publication server: the full text search for publication must find the document; the full text search for server should find the document.
  • +publication +server: the full text search for publication server must find the document.
  • publication+server: this is a regular full text search for publication and server. To make server a must query, the '+' must be preceded by spacing or stand at the beginning of the query; it must not be attached to the preceding word publication.

Must not query

Use - to mark a part of the query as a must not query.

-"address form" publication server
-publication server
-publication -server address form
publication-server

Search explanation:

  • -"address form" publication server: the phrase "address form" must not be present in order to find a document. Note that a must not query on its own typically finds many documents: all documents in the index except the few that are excluded. The example therefore combines it with a search for publication server: perform a full text search for publication server, then filter out documents that contain the exact phrase address form.
  • -publication server: the full text search for server should find the document; however, documents found by a full text search for publication are filtered out.
  • -publication -server address form: the full text search for address form should find the document; however, documents found by a full text search for publication server are filtered out.
  • publication-server: this is a regular full text search for publication and server. To make server a must not query, the '-' must be preceded by spacing or stand at the beginning of the query; it must not be attached to the preceding word publication.
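
The rule for recognizing '+' and '-' as markers can be sketched as follows. This is an assumption-laden illustration, not the product's parser: the names MARKED_PART and split_occurrences are hypothetical, and the occurrence labels must, must_not and should are chosen only for readability.

```python
import re

# A '+' or '-' counts as a marker only at the start of the query or directly after spacing.
# The marked part is either a quoted phrase or a run of non-space characters.
MARKED_PART = re.compile(r'(?:^|(?<=\s))([+-]?)("[^"]*"|\S+)')

def split_occurrences(query):
    """Split a query into (occurrence, text) pairs (illustrative only)."""
    parts = []
    for marker, text in MARKED_PART.findall(query):
        occurrence = {"+": "must", "-": "must_not"}.get(marker, "should")
        parts.append((occurrence, text))
    return parts

print(split_occurrences("+publication server"))
print(split_occurrences("publication-server"))
print(split_occurrences('-"address form" publication server'))
```

In publication-server the hyphen is attached to the preceding word, so the whole text stays a single should part; the general handling of query text then splits it into the words publication and server.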

Prefix query

A prefix query is a phrase query that searches for a prefix. Use a prefix followed by *.

"publicatio*"
"publication serve*"

Search explanation:

  • "publicatio*": a word starting with publicatio should be present in order to find a document.
  • "publication serve*": the word publication followed by a word starting with serve should be present in order to find a document.

A prefix query is a phrase query in all cases. Strictly speaking, the following unquoted syntax is incorrect, but it is accepted and handled as a prefix query.

publicatio*
publication serve*

Search explanation:

  • publicatio*: this is the same as "publicatio*".
  • publication serve*: this is not the same as "publication serve*". The full text search word publication should be present in the found documents, and a word starting with serve should be present in the found documents; the two need not be adjacent.

Prefix queries are fast. The default search does not support wildcard queries such as "pub*cation" because these may severely impact performance. Enterprise Search will accept a search query "pub*cation", but handles it as a full text query for the words pub and cation.

Suffix query

A suffix query is a phrase query that searches for a suffix. Use a * followed by the suffix.

"*ublication"

Search explanation:

  • "*ublication": a word ending with ublication should be present in order to find a document.

A suffix query is a phrase query in all cases. Strictly speaking, the following unquoted syntax is incorrect, but it is accepted and handled as a suffix query.

*ublication

Search explanation:

  • *ublication: this is the same as "*ublication".

Single word suffix queries are fast. The default search does not support multiword suffix queries such as "*ublication server" because these may severely impact performance. Enterprise Search will accept a search query "*ublication server", but it will probably find no documents or only a few.

All documents query

It is possible to request all documents. Use a * for this.

*

Search explanation:

  • *: returns all documents in the index or indexes configured for the search application. Possibly the user plans to reduce the search result by means of facets.
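
The different roles of '*' described in the prefix, suffix and all documents sections can be summarized in a short sketch. The function classify_term and its result labels are hypothetical; the sketch handles a single term with its quotes already removed and assumes a '*' in the middle of a term merely separates words, as described for "pub*cation".

```python
import re

def classify_term(term):
    """Classify one query term by the placement of '*' (illustrative only)."""
    if term == "*":
        return ("all_documents", "")
    if term.startswith("*") and "*" not in term[1:]:
        return ("suffix", term[1:])
    if term.endswith("*") and "*" not in term[:-1]:
        return ("prefix", term[:-1])
    # A '*' elsewhere is not a wildcard; it only separates words.
    return ("words", re.findall(r"[a-z0-9]+", term.lower()))

for term in ["publicatio*", "*ublication", "*", "pub*cation"]:
    print(term, classify_term(term))
```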

Combined queries

The above queries can be combined. A query can be:

  • A full text query
  • A phrase query. This includes an exact word query, exact phrase query, prefix query and suffix query.

A query has an occurrence type, in any combination with the above full text or phrase:

  • A should query
  • A must query
  • A must not query.

A query can consist of multiple parts:

+publication "address" +server -"test*" more +"mandatory" -"optional*"

Search explanation: each phrase query results in an individual search clause.

  • "address": a phrase query, performed as a should clause.
  • -"test*": a prefix phrase query performed as a must not clause.
  • +"mandatory": a phrase query performed as a must clause.
  • -"optional*": a prefix phrase query performed as a must not clause.

Full text queries are accumulated per occurrence type:

  • +publication +server: accumulation results in a full text search for publication server, performed as a must clause.
  • more: this results in a full text search for more, performed as a should clause.

-"test*" and -"optional*" are not accumulated, being phrase queries, despite both being must not clauses.

+publication +server and more are not accumulated, despite both being full text queries, because they have different occurrence types: must and should respectively.
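
The accumulation rule can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the Enterprise Search implementation: the names PART, OCCURRENCE and build_clauses are hypothetical, only quoted text is treated as a phrase query, and the result is a plain list of (occurrence, kind, text) tuples.

```python
import re

# Optional '+' or '-' marker (only at the start or after spacing),
# followed by a quoted phrase or an unquoted word.
PART = re.compile(r'(?:^|(?<=\s))([+-]?)(?:"([^"]*)"|(\S+))')
OCCURRENCE = {"": "should", "+": "must", "-": "must_not"}

def build_clauses(query):
    """Return one clause per phrase query and one full text clause per occurrence type."""
    phrase_clauses = []
    full_text = {}  # occurrence type -> accumulated words
    for match in PART.finditer(query):
        occurrence = OCCURRENCE[match.group(1)]
        phrase, word = match.group(2), match.group(3)
        if phrase is not None:
            # Each phrase query becomes an individual clause.
            phrase_clauses.append((occurrence, "phrase", phrase))
        else:
            # Full text words are accumulated per occurrence type.
            full_text.setdefault(occurrence, []).append(word)
    full_text_clauses = [
        (occurrence, "full_text", " ".join(found))
        for occurrence, found in full_text.items()
    ]
    return phrase_clauses + full_text_clauses

query = '+publication "address" +server -"test*" more +"mandatory" -"optional*"'
for clause in build_clauses(query):
    print(clause)
```

For this query the sketch produces four separate phrase clauses, one must full text clause for publication server and one should full text clause for more.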