Categorization Flavor Index Configuration

The Categorization flavor is a special index flavor for storing categories for use with the Document Categorization API.

Each text index flavor has a static resource unit cost, determined by the maximum index size and specialization. For more information about the resource unit costs for each flavor, see API and Resource Unit Consumption.

Index Size

The Categorization flavor supports a maximum of 1 GB of index data.

You can create up to 5 custom parametric fields.

Standard Fields

For information about the field types, see Index Field Types.

Note: Field names in Haven OnDemand are not case sensitive. For example, the TITLE field name is equivalent to Title or title.
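As a small illustration of the note above, client code that prepares documents can fold field names to a single case before comparing them. The following is a minimal sketch, not part of any Haven OnDemand client library; the function name normalize_field_names is hypothetical.

```python
def normalize_field_names(document: dict) -> dict:
    """Return a copy of `document` with every field name upper-cased.

    Because field names are not case sensitive, "Title" and "title" refer
    to the same field. This sketch treats such a collision as an error
    instead of silently keeping one of the values.
    """
    normalized = {}
    for name, value in document.items():
        key = name.upper()
        if key in normalized:
            raise ValueError(f"duplicate field name after case folding: {name!r}")
        normalized[key] = value
    return normalized


example = normalize_field_names({"Title": "Sports", "content": "football tennis"})
```

After this call, example holds the keys TITLE and CONTENT, so lookups no longer depend on how the caller capitalized the field names.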

Field Name            Field Type                                        Default Print Field
CONTENT               Index field
DATE                  Date field
CREATED_DATE          Date field
MODIFIED_DATE         Date field
LON                   Numeric field
LAT                   Numeric field
CATEGORY              Parametric field                                  yes
REFERENCE             Reference field                                   yes
IODREFERENCE          Reference field
BOOLEANRESTRICTION    Special field type. See Additional Configuration
FIELDTEXTRESTRICTION  Special field type. See Additional Configuration
ALWAYS_MATCH          Special field type. See Additional Configuration

Non-Standard Fields

Haven OnDemand stores any non-standard fields in your documents. You cannot use these fields for retrieval, but you can display them in results by using the Print API parameter.
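To see which fields of a document fall into the non-standard group, you can compare its field names against the standard field list. The sketch below is an assumption-laden illustration: the helper split_fields is hypothetical, and it assumes that any custom parametric fields you have configured should be treated like standard fields rather than as non-standard ones.

```python
# Standard fields of the Categorization flavor, taken from the table above.
STANDARD_FIELDS = {
    "CONTENT", "DATE", "CREATED_DATE", "MODIFIED_DATE", "LON", "LAT",
    "CATEGORY", "REFERENCE", "IODREFERENCE",
    "BOOLEANRESTRICTION", "FIELDTEXTRESTRICTION", "ALWAYS_MATCH",
}


def split_fields(document: dict, custom_parametric=()):
    """Split `document` into (standard, non_standard) field dictionaries.

    Field names are compared case-insensitively. `custom_parametric` lists
    the custom parametric fields configured for the index (an assumption of
    this sketch: they are grouped with the standard fields).
    """
    known = STANDARD_FIELDS | {name.upper() for name in custom_parametric}
    standard, non_standard = {}, {}
    for name, value in document.items():
        target = standard if name.upper() in known else non_standard
        target[name] = value
    return standard, non_standard


standard, non_standard = split_fields(
    {"reference": "cat-1", "Category": "Sports", "owner": "editorial"}
)
```

Here owner ends up in non_standard: it would be stored with the document and could be printed in results, but not used for retrieval.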

Additional Configuration

The Categorization flavor has additional configuration for the following fields:

  • CONTENT. This field contains a list of terms that provide an initial match for documents that you want the category to match. Haven OnDemand indexes the CONTENT field and uses its contents as the initial matching step in queries to the Document Categorization API. It evaluates the BOOLEANRESTRICTION and FIELDTEXTRESTRICTION rules only after it matches at least one term in the CONTENT field, unless the ALWAYS_MATCH field is present.

  • BOOLEANRESTRICTION. This field stores a Boolean or Proximity expression that describes the documents that you want the category to match. See Boolean and Proximity Operators. Haven OnDemand caches the values of this field and uses its contents in queries to the Document Categorization API.

  • FIELDTEXTRESTRICTION. This field stores a field_text expression that describes the documents that you want the category to match. See Field Text Operators. Haven OnDemand caches the values of this field and uses its contents when you send queries to the Document Categorization API with a JSON document as input.

  • ALWAYS_MATCH. When a document has this field, Haven OnDemand always evaluates the document's BOOLEANRESTRICTION and FIELDTEXTRESTRICTION rules in queries to the Document Categorization API, regardless of matches against the CONTENT field. Add this field with the value true in cases where you cannot use CONTENT to match all documents that might match the BOOLEANRESTRICTION, for example because the Boolean expression contains wildcard values.
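The interaction between these fields can be sketched with two example category documents and a deliberately simplified model of the initial matching step. This is an illustration only: the helpers build_category and passes_initial_match are hypothetical, the category names and expressions are made-up example values, and the real service matches CONTENT terms through its index rather than by the plain word comparison used here.

```python
import re


def build_category(reference, category, content_terms,
                   boolean_restriction=None, field_text_restriction=None,
                   always_match=False):
    """Build a category document as a plain dict (hypothetical helper)."""
    doc = {
        "REFERENCE": reference,
        "CATEGORY": category,
        "CONTENT": " ".join(content_terms),
    }
    if boolean_restriction:
        doc["BOOLEANRESTRICTION"] = boolean_restriction
    if field_text_restriction:
        doc["FIELDTEXTRESTRICTION"] = field_text_restriction
    if always_match:
        # The field is added with the value true, as described above.
        doc["ALWAYS_MATCH"] = "true"
    return doc


def passes_initial_match(category_doc, input_text):
    """Simplified model: should the restriction rules be evaluated at all?

    Returns True if ALWAYS_MATCH is present, or if at least one CONTENT
    term appears in the input text. Plain lower-cased word comparison is
    an assumption of this sketch, not the service's actual matching.
    """
    if "ALWAYS_MATCH" in category_doc:
        return True
    words = set(re.findall(r"\w+", input_text.lower()))
    terms = set(category_doc.get("CONTENT", "").lower().split())
    return bool(words & terms)


# CONTENT terms can cover everything the Boolean expression might match.
sports = build_category(
    "cat-1", "Sports", ["football", "league"],
    boolean_restriction="football AND league",
)

# A wildcard expression cannot be covered by a fixed term list,
# so this category sets ALWAYS_MATCH instead.
finance = build_category(
    "cat-2", "Finance", [],
    boolean_restriction="financ*",
    always_match=True,
)
```

In this model, passes_initial_match(sports, text) is true only when the text contains one of the CONTENT terms, whereas the finance category always proceeds to rule evaluation. The trade-off is that a category with ALWAYS_MATCH has its restriction rules evaluated for every query instead of being filtered out by the initial CONTENT match.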