Web pivot tables by ElasticSearch Data Source Setup
appsettings.json
file PivotDataService:Cubes
section.Cube config for ElasticSearch should have
"SourceType": "elasticsearch"
and an additional "SourceElasticSearch"
section:
{ "Id": "elastic-bank-accounts", "Name": "Bank Accounts from sample ElasticSearch dataset", "SourceType": "elasticsearch", "SourceElasticSearch": { "ConnectionUrl": "https://ReadOnly:nxxr8qt7n26c7jwijhwg1ipnarf402yn@thorin-us-east-1.searchly.com", "Index": "bank", // optional: define filtering conditions to limit number of documents to scan //"FilterRelex": " timestamp >= "2019-01-01":datetime " }, "InferSchema": true, /* dimensions and measures are determined automatically by first N documents */ "Dimensions": [], "Measures": [], "Parameters": [] }
"ConnectionURL" specifies base URL of your ElasticSearch API.
"Index" specifies index name to query; it can contain wildcards
if you want to search in multiple indexes, for example: logs-myservice-*
.
"FilterRelex" allows to specify filtering condition with syntax used in NReco.Data relex.
NOTE: "InferSchema" makes additional query to determine columns and their types. In production use it is better to specify list of dimensions and measures explicitely to avoid some overhead (startup delay) caused by probe query.
Dimensions
In case of ElasticSearch data source dimension name refers to the document or sub-document field name:
"Dimensions": [ { "Name": "name.keyword", /* don't forget to specify '.keyword' suffix for text fields */ "LabelText": "Name" }, { "Name": "address.street.keyword", /* use dot to specify sub-document field */ "LabelText": "Street" } ]
It is possible to specify script field by providing "Params" array:
{ "Name" : "registered_year", "Params": ["doc[\"registered\"].empty ? null : doc[\"registered\"].date.year", "number"] }
First parameter is an inline script expression (ElasticSearch "painless" syntax), and second one determines result type ("number" or "string").
Instead of script you can also specify custom ES bucket aggregation JSON:
{ "Name" : "age_ranges", "Params": ["{\"histogram\":{\"field\":\"age\",\"interval\":5}}"] }
Measures
First measure parameter for types Sum
, Average
, Min
, Max
should refer to the document field:
"Measures": [ { "Type": "Average", "Params": [ "age" ], "LabelText": "Average Age" } ]
Also it is possible to define custom ES metric aggregation type with FirstValue
type:
{ "Type": "FirstValue", "Params": ["{\"extended_stats\":{\"field\":\"balance\"}}", "std_deviation"], "Name": "StdDevOfBalance", "LabelText": "StdDev of Balance" }
where in "Params" first value is an aggregation type JSON, and second value is a name of metric value in ES result (used in case of multi-value metric aggregation).
Parameters
Report parameter is used when you need to specify some filtering condition by user-defined vari able and use it in the "FilterRelex" condition:
"Parameters": [ { "Name": "filterByName", "LabelText": "Name", "DataType": "string" } ]
When parameter is defined it can be used in "FilterRelex" as following:
"1"="1" @filterByName[ and name.keyword="{0}":var ]
@filterByName[ ]
defines parameter placeholder: everything between [ ]
is added to the condition
expression only if parameter is defined. "{0}":var
refers to use parameter value.