Trend Analysis

Discovers significant changes between two groups of records.

The Trend Analysis API discovers significant changes and trends between two groups of records. You provide a set of structured data that is split into two sets (for example, for two different time periods), and the API lists the changes between them. The API analyzes all combinations of the data that you provide to find the most significant differences. This API uses a novel analytics operation, developed at Hewlett Packard Labs.

Quick Start

The API has several required parameters. You must use an input parameter (file, url, or reference) to specify the file that you want to analyze. You must also specify the name of the column that labels the groups that you want to compare (groups_column), and the names of the two groups in this column that you want to compare (main_group and compared_group).

For example, if your data has a column named date, and you want to compare the data for August with the data for September:

curl -X POST http://api.havenondemand.com/1/api/[async|sync]/trendanalysis/v1 --form "file=@sales.csv" --form "groups_column=date" --form "main_group=August" --form "compared_group=September"

You must provide the data that you want to analyze in CSV format. The CSV file must have the following structure:

  • The first row must be the comma-separated list of column headers.
  • The second row must be a list of the types for each of the columns. The following types are supported:
    • STRING
    • NUMERIC
  • The rest of the rows contain the data, with one record on each row, and column values that correspond to the specified headers and types.
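The layout described above can be produced with Python's standard csv module. The following is a minimal sketch, not an official client: the column names and values (date, region, customer_type, sales) are invented for illustration, and only the STRING and NUMERIC type names come from this page.

```python
import csv
import io

# Hypothetical sample data: column names and values are illustrative only.
HEADERS = ["date", "region", "customer_type", "sales"]
TYPES = ["STRING", "STRING", "STRING", "NUMERIC"]  # one type per column
RECORDS = [
    ["August", "EMEA", "retail", 120],
    ["August", "APAC", "wholesale", 80],
    ["September", "EMEA", "retail", 150],
]


def build_csv(headers, types, records):
    """Return CSV text with a header row, a type row, then one record per row."""
    if len(headers) != len(types):
        raise ValueError("each column needs exactly one type")
    unknown = set(types) - {"STRING", "NUMERIC"}
    if unknown:
        raise ValueError(f"unsupported types: {sorted(unknown)}")
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(headers)   # row 1: column headers
    writer.writerow(types)     # row 2: column types
    writer.writerows(records)  # remaining rows: the data
    return buffer.getvalue()


csv_text = build_csv(HEADERS, TYPES, RECORDS)
```

Writing csv_text to a file such as sales.csv gives an input that has the header row, the type row, and the data rows in the required order.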

By default, the API considers the values in all columns of your data when it searches for trends. It inspects all possible combinations of attributes to find the elements that have the most significant changes between the two groups that you have specified.

You can use the columns parameter to limit the analysis to particular columns in your data. For example, if you have types of trends that you particularly want to consider (or columns that you want to exclude), you can limit the analysis to only these values.

The following example restricts the previous example to only examine the product_names, price, total_sales, and region columns.

curl -X POST http://api.havenondemand.com/1/api/[async|sync]/trendanalysis/v1 --form "file=@sales.csv" --form "groups_column=date" --form "main_group=August" --form "compared_group=September" --form "columns=product_names" --form "columns=price" --form "columns=total_sales" --form "columns=region"

By default, the API uses the number of records in a particular set to compare trends between the different groups. For example, if you have a region column and customer_type column, the API compares the number of records in the first group with a particular combination of region and customer type to the number of records in the second group with the same combination.

You can use the aggregation_column parameter to select a column in your dataset that you want to compare by summing the values, rather than counting records. For example, if you have the columns region, customer_type, and sales, you can set aggregation_column to sales. In this case, the API compares the total of the values in the sales column for records with a particular combination of region and customer type in each group.

curl -X POST http://api.havenondemand.com/1/api/[async|sync]/trendanalysis/v1 --form "file=@sales.csv" --form "groups_column=date" --form "main_group=August" --form "compared_group=September" --form "aggregation_column=sales"
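The difference between counting records and summing an aggregation column can be illustrated locally. The sketch below is only an illustration of the two measures, under the assumption that each record is a dictionary; it does not reproduce the API's scoring, and the records are invented.

```python
from collections import defaultdict

# Hypothetical records; the column names mirror the example in the text.
RECORDS = [
    {"date": "August", "region": "EMEA", "customer_type": "retail", "sales": 100.0},
    {"date": "August", "region": "EMEA", "customer_type": "retail", "sales": 50.0},
    {"date": "September", "region": "EMEA", "customer_type": "retail", "sales": 400.0},
]


def measure_by_group(records, groups_column, key_columns, aggregation_column=None):
    """Map (group, key values) to a record count, or to a sum if a column is given."""
    totals = defaultdict(float)
    for record in records:
        key = (record[groups_column], tuple(record[c] for c in key_columns))
        # Count one per record by default; otherwise add the column's value.
        totals[key] += 1 if aggregation_column is None else record[aggregation_column]
    return dict(totals)


counts = measure_by_group(RECORDS, "date", ["region", "customer_type"])
sums = measure_by_group(RECORDS, "date", ["region", "customer_type"], "sales")
```

With these invented records, the EMEA and retail combination has more records in August than in September (2 versus 1), but a larger sales total in September (400 versus 150), so the choice of measure changes which direction the comparison points.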

You can restrict the results to only trends that you are particularly interested in by using the terms parameter. The API analyzes all the combinations and trends, but returns only the trends that contain at least one of the values that you specify.

curl -X POST http://api.havenondemand.com/1/api/[async|sync]/trendanalysis/v1 --form "file=@sales.csv" --form "groups_column=date" --form "main_group=August" --form "compared_group=September" --form "terms=EMEA"

This example returns only trends that include the value EMEA.
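The curl examples in this section differ only in their form fields, so they can be generated from one helper. The following is a rough sketch; the function names build_form_fields and to_curl_args are hypothetical, and the URL is the synchronous endpoint listed on this page. It assumes, as the columns example does, that array parameters are sent by repeating the field name.

```python
import shlex

SYNC_URL = "https://api.havenondemand.com/1/api/sync/trendanalysis/v1"


def build_form_fields(groups_column, main_group, compared_group,
                      columns=(), terms=(), aggregation_column=None):
    """Return (name, value) pairs; array parameters repeat the field name."""
    fields = [
        ("groups_column", groups_column),
        ("main_group", main_group),
        ("compared_group", compared_group),
    ]
    if aggregation_column is not None:
        fields.append(("aggregation_column", aggregation_column))
    fields.extend(("columns", name) for name in columns)
    fields.extend(("terms", term) for term in terms)
    return fields


def to_curl_args(url, csv_path, fields):
    """Build an argument list equivalent to the curl commands in this section."""
    args = ["curl", "-X", "POST", url, "--form", f"file=@{csv_path}"]
    for name, value in fields:
        args += ["--form", f"{name}={value}"]
    return args


fields = build_form_fields("date", "August", "September",
                           columns=["price", "region"], terms=["EMEA"])
args = to_curl_args(SYNC_URL, "sales.csv", fields)
command = shlex.join(args)  # a shell-quoted string for display or logging
```

Keeping the fields as a list of pairs, rather than a dictionary, preserves repeated names such as columns.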

Get the Results of an Asynchronous Request

Note: Haven OnDemand recommends that you use the asynchronous version of the API for most trend analysis. Typically, you use this API with larger files that could time out if processed synchronously.

The asynchronous mode returns a job-id, which you can then use to extract your results. There are two methods for this:

  • Use /1/job/status/ to get the status of the job, including results if the job is finished.
  • Use /1/job/result/, which waits until the job has finished and then returns the result.

    Note: Because /result has to wait for the job to finish before it can return a response, using it for longer operations such as analyzing a large CSV file can result in an HTTP request timeout response. The /result method returns a response either when the result is available, or after 120 seconds, whichever is sooner. If the job is not complete after 120 seconds, the /result method returns a code 7010 (job result request timeout) response. This means that your asynchronous job is still in progress. To avoid the timeout, use /status instead.
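One way to follow the advice about /status is a small polling loop. The sketch below is an assumption-laden outline rather than a documented client: wait_for_job and fetch_status are hypothetical names, the caller supplies the function that actually requests /1/job/status/<job-id>, and the "status" key with "queued" and "in progress" values is an assumption about the response body that this page does not confirm.

```python
import time


def wait_for_job(fetch_status, job_id, interval_seconds=5.0, max_attempts=60,
                 sleep=time.sleep):
    """Poll until the job leaves the in-progress states or attempts run out.

    fetch_status is a caller-supplied function that requests
    /1/job/status/<job-id> and returns the decoded JSON body as a dict.
    The "status" key and its "queued"/"in progress" values are assumptions
    about that body, not taken from this page.
    """
    for attempt in range(max_attempts):
        body = fetch_status(job_id)
        if body.get("status") not in ("queued", "in progress"):
            return body  # finished or failed: hand the body back to the caller
        if attempt < max_attempts - 1:
            sleep(interval_seconds)  # wait before asking again
    raise TimeoutError(f"job {job_id} still running after {max_attempts} checks")
```

Because each status request returns immediately, the loop controls its own waiting time instead of relying on the 120-second limit of /result.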

Demonstration

Visit the following web pages for a blog post about the Trend Analysis tool, with instructions and sample data for using a demonstration, and commented results.

A pilot version of the demonstration can be accessed here.

Note: The demonstration works best in a browser other than Internet Explorer; the recommended browser is Chrome.

Synchronous
https://api.havenondemand.com/1/api/sync/trendanalysis/v1
Asynchronous
https://api.havenondemand.com/1/api/async/trendanalysis/v1
Authentication

This API requires an authentication token to be supplied in the following parameter:

Parameter Description
apikey The API key to use to authenticate the API request.
Parameters

This API accepts the following parameters:

Required
Name Type Description
file binary The CSV file for analysis.
reference string A Haven OnDemand reference obtained from either the Expand Container or Store Object API.
url string A publicly accessible HTTP URL from which the CSV document to analyze can be retrieved.
groups_column string The name of the column that labels the two groups that you want to compare.
compared_group string The category to compare to the main category (that is, the 'reference' category).
main_group string The main category to use as the basis for the comparison (that is, the 'current' category).
Optional
Name Type Description
aggregation_column string The name of the column in the data set that contains the numeric values to summarize in the analysis. If you do not provide a column, the API counts the different category values and calculates trends based on that.
columns array<string> The list of columns to analyze. Use this parameter to restrict the analysis scope to combinations of the columns that you specify. An exact match is required.
terms array<string> A list of terms to include in the analysis. Use this parameter to restrict the analysis scope to trends that contain the terms that you specify. An exact match is required.

This API returns a JSON response that is described by the model below. This single model is presented both as an easy-to-read abstract definition and as the formal JSON schema.

Asynchronous Use

Additional requests are required to get the result if this API is invoked asynchronously.

You can use /1/job/status/<job-id> to get the status of the job, including results if the job is finished.

You can also use /1/job/result/<job-id>, which waits until the job has finished and then returns the result.

Model
This is an abstract definition of the response that describes each of the properties that might be returned.
Trend Analysis Response {
trend_collections ( array[Trend_collections] , optional) Array of collections of similar trends. The collections returned are associated with the top trends (according to a computed score).
}
Trend Analysis Response:Trend_collections {
trends ( array[object] , optional) The list of trends in a collection. The first trend in the collection is the main trend; the others are trends related to the first one.
}
Model Schema
This is a JSON schema that describes the syntax of the response. See json-schema.org for a complete reference.
{
    "properties": {
        "trend_collections": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "trends": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "measure": {
                                    "type": "array",
                                    "items": {
                                        "type": "object",
                                        "properties": {
                                            "column": {
                                                "type": "string"
                                            },
                                            "value": {
                                                "type": "string"
                                            }
                                        }
                                    }
                                },
                                "trend": {
                                    "type": "string"
                                },
                                "measure_percentage_main_group": {
                                    "type": "number"
                                },
                                "measure_value_main_group": {
                                    "type": "number"
                                },
                                "main_trend": {
                                    "type": "string"
                                },
                                "score": {
                                    "type": "number"
                                },
                                "measure_percentage_compared_group": {
                                    "type": "number"
                                },
                                "category": {
                                    "type": "array",
                                    "items": {
                                        "type": "object",
                                        "properties": {
                                            "column": {
                                                "type": "string"
                                            },
                                            "value": {
                                                "type": "string"
                                            }
                                        }
                                    }
                                },
                                "measure_value_compared_group": {
                                    "type": "number"
                                }
                            }
                        }
                    }
                }
            }
        }
    },
    "required": [],
    "type": "object"
}
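A response that follows this schema can be walked with plain dictionary access. The following is a minimal sketch: the function names are hypothetical, the field names (trend_collections, trends, category, column, value, score) come from the schema, and the sample values are invented. The optional term filter is a client-side convenience and is not the API's terms parameter.

```python
def describe_trend(trend):
    """Format one trend as 'column=value, ... (score N)' using the schema fields."""
    pairs = ", ".join(f"{c['column']}={c['value']}" for c in trend.get("category", []))
    return f"{pairs} (score {trend.get('score')})"


def summarize(response, term=None):
    """Return one line per collection, describing its first (main) trend.

    If term is given, keep only collections whose main trend has that value
    in one of its category entries.
    """
    lines = []
    for collection in response.get("trend_collections", []):
        trends = collection.get("trends", [])
        if not trends:
            continue
        main = trends[0]  # the first trend in a collection is the main trend
        values = {c["value"] for c in main.get("category", [])}
        if term is None or term in values:
            lines.append(describe_trend(main))
    return lines


# Hypothetical response: field names follow the schema, values are invented.
SAMPLE = {
    "trend_collections": [
        {"trends": [
            {"category": [{"column": "region", "value": "EMEA"}], "score": 0.9},
            {"category": [{"column": "region", "value": "EMEA"},
                          {"column": "customer_type", "value": "retail"}],
             "score": 0.7},
        ]},
        {"trends": [
            {"category": [{"column": "region", "value": "APAC"}], "score": 0.4},
        ]},
    ]
}

all_lines = summarize(SAMPLE)
emea_lines = summarize(SAMPLE, term="EMEA")
```

Every property in the model is optional, so the sketch uses .get() with defaults rather than assuming that a key is present.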

To try this API with your own data and use it in your own applications, you need an API Key. You can create an API Key from your account page - API Keys.
