Introduction to Haven OnDemand Unstructured Text Indexing
Use Haven OnDemand text indexes, add content, and search

Indexing allows you to store data and documents in Haven OnDemand, and then search them, find similar documents, find related concepts, or get parametric values for those documents.

In the same way that an Internet search engine 'indexes' the Internet to make it searchable, the Haven OnDemand indexing APIs let you index whatever data you want so that you can search it.

This page provides a brief introduction to Haven OnDemand text indexes, and the APIs you can use to get a text index up and running. You might also want to look at Text Indexes - Key Concepts.

Note: This page describes some of the text indexing APIs. You can also create text indexes, index documents, and perform basic queries from the Text Indexes section of the Account page for your Haven OnDemand user account. For more information about this method, see Manage Text Indexes.

For more in-depth information about text indexes and their usage, see Advanced Haven OnDemand Unstructured Text Indexing.

Create an Index

First, you must create a text index to store all your data in, by using the Create Text Index API. The text index is the permanent store in Haven OnDemand.

For full documentation, see Create Text Index.

   API path: [sync|async]/createtextindex/v1

This API has two required parameters:

  • index. The name of the new index.
  • flavor. The flavor of the index. The flavor defines the size and configuration of the index. For most uses, such as indexing normal documents for search, you typically use the Standard flavor. For testing, you can use the smaller Explorer flavor. For the full list of flavors, see Index Flavors.

Optionally, you can add:

  • description. A description for your index. The description is returned when you list your indexes, which makes it easier to tell which index is used for what.
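
As a rough illustration of how these parameters fit together, the following Python sketch builds a Create Text Index request URL without sending it. The index, flavor, and description parameters and the [sync|async]/createtextindex/v1 path come from this page. The base URL, the apikey parameter name, and the lowercase flavor value are assumptions that you should check against the Create Text Index documentation.

```python
from urllib.parse import urlencode

# Assumption: this base URL is not stated on this page; confirm it in the API docs.
API_BASE = "https://api.havenondemand.com/1/api"


def create_text_index_url(apikey, index, flavor, description=None, mode="sync"):
    """Build (but do not send) a Create Text Index request URL."""
    if mode not in ("sync", "async"):
        raise ValueError("mode must be 'sync' or 'async'")
    # "apikey" is an assumed parameter name; index and flavor are the required parameters.
    params = {"apikey": apikey, "index": index, "flavor": flavor}
    if description is not None:
        # Optional description, returned when you list your indexes.
        params["description"] = description
    return f"{API_BASE}/{mode}/createtextindex/v1?{urlencode(params)}"


# "standard" is an assumed spelling of the Standard flavor value.
create_url = create_text_index_url(
    "YOUR_API_KEY", "myindex", "standard", description="Product manuals"
)
print(create_url)
```

Building the URL separately from sending it keeps the sketch independent of any particular HTTP client. You can pass the resulting URL to whichever client you prefer.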

Add Data to the Index

After you create your index, you can add data to it, by using the Add to Text Index API.

For full documentation, see Add to Text Index.

   API path: [sync|async]/addtotextindex/v1

This API accepts several different inputs for the data:

  • a document file
  • a JSON object
  • a Haven OnDemand object store reference (that is, something that you have already stored in Haven OnDemand by using the Store Object API)
  • a URL (that is, something from the Internet)

You can use whichever input type you want to. The file, url, and reference input parameters are straightforward, and do not require much customization, so the following example focuses on the JSON object input.

The API has two required parameters:

  • index. The text index that you want to add the data to (that is, the one you created in the previous section).
  • json. The JSON object (or another input parameter, as described above).

The JSON object has the following form:

   {
      "document" : [
         {
            "title" : "This is my document",
            "reference" : "mydoc1",
            "myfield" : ["a value"],
            "content" : "A large block of text, which makes up the main body of the document."
         },
         {
            "title" : "My Other document",
            "reference" : "mydoc2",
            "content" : "This document is about something else."
         }
      ]
   }

  • document is an array of objects, each of which is a document that you want to be able to return individually. You can add multiple documents in the same document array.
  • reference is a document reference, which you can use to identify the document. If you do not include a reference, Haven OnDemand automatically generates one.
  • content is the main part of the data you store. Use this field to store the bulk of the document that you want to be able to search by text matching. For example, typically you would use content to store the body of emails, the text of a book, or the main part of a Wikipedia page.
  • myfield is a custom field name, which allows you to customize your search further. For example, myfield might be some document metadata that you want to store so that it is returned in your search results, but that you do not want to search for directly. You can also use some predefined field names, which have particular properties that allow you to search and filter by values in your fields. For more information about the document fields, see Index Field Types. For a list of the predefined field names, see the documentation for each flavor in Index Flavors.
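
The JSON input described above can also be assembled programmatically. The following Python sketch builds the document array from ordinary dictionaries and serializes it into the json parameter alongside index. It only prepares the request fields and does not send them. The helper name build_add_request is hypothetical, and sending the fields as a form-encoded body is an assumption; see the Add to Text Index documentation for the supported request formats.

```python
import json
from urllib.parse import urlencode


def build_add_request(index, documents):
    """Return the index and json fields for an Add to Text Index request."""
    # Each entry in "document" is one document that can be returned individually.
    payload = {"document": documents}
    return {"index": index, "json": json.dumps(payload)}


docs = [
    {
        "title": "This is my document",
        "reference": "mydoc1",
        "myfield": ["a value"],  # custom field, for example document metadata
        "content": "A large block of text, which makes up the main body of the document.",
    },
    {
        "title": "My Other document",
        "reference": "mydoc2",
        "content": "This document is about something else.",
    },
]

fields = build_add_request("myindex", docs)
# Assumption: the fields are sent as a form-encoded body; other encodings may apply.
body = urlencode(fields)
print(body)
```

Serializing with json.dumps rather than concatenating strings ensures that quotes and other special characters in the content field are escaped correctly.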

This example gives you a very simple idea of how to index documents manually. You can also automate the process by using a connector, which scrapes a repository and indexes the data. For more information, see Connectors.

Search Your Data

After your data is in your index, you can use the Query Text Index API to search through it.

For full documentation, see Query Text Index.

   API path: [sync|async]/querytextindex/v1

This API accepts a few different inputs for the search terms, but the text option is the most like using a standard search engine. You simply pass in the text that you want to search for in your documents, and the API returns a list of documents that match your text.
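
A text query can be sketched in the same way. The following Python example builds a Query Text Index request URL from the text parameter described above, without sending it. The base URL, the apikey parameter name, and the indexes parameter used to select your own text index are assumptions that are not stated on this page; check them against the Query Text Index documentation.

```python
from urllib.parse import urlencode

# Assumption: this base URL is not stated on this page; confirm it in the API docs.
API_BASE = "https://api.havenondemand.com/1/api"


def query_text_index_url(apikey, text, indexes, mode="sync"):
    """Build (but do not send) a Query Text Index request URL."""
    if mode not in ("sync", "async"):
        raise ValueError("mode must be 'sync' or 'async'")
    params = {
        "apikey": apikey,    # assumed parameter name
        "text": text,        # the text to search for in your documents
        "indexes": indexes,  # assumed parameter name for the index to search
    }
    return f"{API_BASE}/{mode}/querytextindex/v1?{urlencode(params)}"


query_url = query_text_index_url("YOUR_API_KEY", "large block of text", "myindex")
print(query_url)
```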

There are also many optional parameters, as well as more advanced query syntax, to enhance the search feature. For more information, see Use Haven OnDemand Search Functionality and the Query Text Index API documentation.

As well as the Query Text Index API, there are several other APIs where you can use your own text indexes. The exact list depends on the flavor of text index, but for the main flavors (including Standard), you can use:

Further Reading

Now that you have the basics of text indexing, the following pages provide more information about the next stages: