The Haven OnDemand Search APIs operate on the content of text indexes.
The text index is the store of document data, internally organized to make it easy for Haven OnDemand to find information for you.
The text index stores only the extracted text and metadata enrichments from your files, allowing you to search, organize, and filter the content. It is not a repository for document originals; if you want to read the whole document (with any embedded images, and so on), you can refer back to the original repository.
Haven OnDemand provides several standard text indexes, such as Wikipedia in several languages, and News websites. You can also create your own text indexes, to store any data that you want to search with Haven OnDemand APIs.
Haven OnDemand stores content in index documents. An index document is the searchable unit: when you search for something, each result (or hit) is an index document.
The index document can come from a variety of different sources, and might be any length: it could be an email, a Wikipedia page, a product entry from an online catalog, a PDF, or a single tweet.
Regardless of its original format, Haven OnDemand extracts the text into a JSON object. Each attribute in the JSON object is a field in the text index.
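As a rough illustration of this structure, the following Python sketch builds a small index document as a JSON object and lists its fields. The field names and values are assumptions chosen for the example, not a schema that Haven OnDemand requires.

```python
import json

# Hypothetical index document: every name and value here is made up
# for illustration and does not come from a real text index.
document = {
    "title": "Quarterly sales report",
    "reference": "reports/2015-q3.pdf",
    "content": "Sales grew in the third quarter.",
    "author": "J. Smith",
}

# Each top-level attribute of the JSON object corresponds to one field.
field_names = sorted(document)

# Serialize the document to JSON text, the form described above.
serialized = json.dumps(document, indent=2)

print(field_names)
print(serialized)
```

The original file (here a PDF) is not part of the object; only its extracted text and metadata appear as fields.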
See Also: Create Documents
The fields contain the information and content in your document. The fields can contain anything from a single character to the main text of your documents.
The most important fields are:
You can have as many different fields as you like, but Haven OnDemand gives some of them special treatment, according to the field type.
The field type determines how Haven OnDemand processes the content, and what you can do with it. For example, the content field is an index field. Index fields have special text processing, where Haven OnDemand stores information about the terms in the field so that you can easily perform a search for a keyword or phrase. Other fields are set up to allow parametric (faceted) search, or to optimize searching for numeric or date values.
If a field does not have a special field type, it is a store only field. Haven OnDemand keeps store only field content, but does not apply any additional processing. This field type is useful for content that you want to have available with a document, but which is not useful for search. For example, some document metadata (such as the content length) and related file information (such as an image URL) are kept in store only fields.
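The relationship between field names and field types can be pictured as a lookup with a default. The sketch below is only a mental model: the configuration dictionary, the type labels, and the field names are all hypothetical, and it does not show how Haven OnDemand configures field types.

```python
# Hypothetical mapping of field names to field types (illustrative only).
FIELD_TYPES = {
    "content": "index",        # tokenized for keyword and phrase search
    "category": "parametric",  # used for faceted search
    "price": "numeric",        # optimized for numeric ranges
    "published": "date",       # optimized for date values
}


def field_type(name: str) -> str:
    """Return the configured type, or 'store_only' when none is set."""
    return FIELD_TYPES.get(name, "store_only")


# Hypothetical document with two fields that have no special type.
document = {
    "content": "A short product description.",
    "category": "books",
    "price": 12.5,
    "image_url": "https://example.com/cover.png",
    "content_length": 28,
}

by_type = {name: field_type(name) for name in document}
print(by_type)
```

In this model, `image_url` and `content_length` fall through to the store only default: they stay with the document but receive no extra processing.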
For more information about the available field types, see Index Field Types.
Every text index has a flavor, which you specify when you create it. The flavor of a text index determines a few different properties:
For more information about the available flavors, see Index Flavors.
During the indexing process, Haven OnDemand extracts the text from the index document and processes the fields.
For fields with the index field type, it tokenizes the text into terms, removes stop words (words that are too common to add meaning), and processes information about the terms. For example, it stores the stem (the linguistic root of the word), and information about how many times a particular term occurs in a document. For more information, see Text Tokenization and Processing in Text Indexes.
For fields with special field types, it processes the field appropriately. For example, for numeric type fields it optimizes retrieval by numeric range, and for parametric type fields, it optimizes exact string matching and retrieval.
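The processing steps above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Haven OnDemand's implementation: the stop-word list is tiny, `toy_stem` is a crude suffix stripper standing in for a real stemming algorithm, and the sample documents, field values, and helper names are hypothetical.

```python
import bisect
import re
from collections import Counter, defaultdict

# Tiny illustrative stop-word list; a real list would be much larger.
STOP_WORDS = {"the", "a", "an", "and", "of", "in", "to", "is"}


def toy_stem(term: str) -> str:
    """Crude suffix stripping, standing in for a real stemmer."""
    for suffix in ("ing", "s"):
        if term.endswith(suffix) and len(term) - len(suffix) >= 3:
            return term[: -len(suffix)]
    return term


def process_index_field(text: str) -> Counter:
    """Tokenize, drop stop words, stem, and count occurrences per stem."""
    terms = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(toy_stem(t) for t in terms if t not in STOP_WORDS)


def numeric_range(pairs, low, high):
    """Return references whose value lies in [low, high].

    `pairs` must be sorted by value, so the range is found with
    two binary searches instead of a full scan.
    """
    values = [value for value, _ in pairs]
    start = bisect.bisect_left(values, low)
    end = bisect.bisect_right(values, high)
    return [ref for _, ref in pairs[start:end]]


# Index field: term statistics for one document.
counts = process_index_field("Searching documents: the search finds a document")

# Numeric field: (value, document reference) pairs kept in sorted order.
prices = sorted([(19.99, "doc-b"), (5.0, "doc-a"), (42.5, "doc-c")])
in_range = numeric_range(prices, 5.0, 20.0)

# Parametric field: map each exact value to the documents that carry it.
categories = defaultdict(list)
for ref, value in [("doc-a", "books"), ("doc-b", "music"), ("doc-c", "books")]:
    categories[value].append(ref)
books = categories["books"]

print(counts, in_range, books)
```

Here "searching" and "search" collapse to the same stem, so a query for either form can match the same term entry. The sorted list and the value-to-documents map show, in miniature, why numeric and parametric fields are stored differently from free text.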