Load pre-aggregated data PivotData C# Examples
Normally PivotData
class performs input data aggregation by itself (with ProcessData method), but in some
cases data cube should be populated with existing values:
- when data aggregation is performed on database level (with SQL GROUP BY, MongoDb aggregation pipleline, ElasicSearch aggregations etc)
- metrics are loaded from OLAP server, or provided by 3rd party API, or pre-calculated in some other way
It is possible to load pre-aggregated data into PivotData
with help of
GroupedSourceReader
component:
class MyModel { public string Product { get; set; } public string Country { get; set; } public int OrdersCount { get; set; } public int Qty_Min { get; set; } public int Qty_Average { get; set; } } IEnumerable<MyModel> inputData = ... ; var groupedPvtDataReader = new GroupedSourceReader( (processData) => { processData(inputData, new ObjectMember().GetValue); }, "OrdersCount" // field name with aggregated rows count ); var pvtDataFactory = new PivotDataFactory(); var pvtData = groupedPvtDataReader.Read( // instead of using "PivotDataFactory.GetConfiguration" method // you can compose and pass "PivotDataConfiguration" directly pvtDataFactory.GetConfiguration( new PivotData(new[] {"Product", "Country"}, new CompositeAggregatorFactory( new CountAggregatorFactory(), new AverageAggregatorFactory("Qty") ) ) ), pvtDataFactory);
GroupedSourceReader
class has a constructor that accepts IPivotDataSource
implementations:
- For SQL data source
DbCommandSource
: see "ToolkitSqlDbSource" example - For CSV data
CsvSource
, JSON dataJsonSource
- For MongoDb data source: see "ToolkitMongoDbSource" example
- For SSAS OLAP server data source: see "ToolkitAdomdSource" example
NOTE: field name for "count" should be provided even if you don't use CountAggregatorFactory
in the PivotDataConfiguration
provided to GroupedSourceReader.Read
. If your input data doesn't have this value
you can resolve it to "1":
var groupedPvtDataReader = new GroupedSourceReader( (processData) => { var objMemberAccessor = new ObjectMember(); processData(query, (row, field) => { if (field=="Count") return 1; return objMemberAccessor.GetValue(row, field); }); }, "Count");
By default GroupedSourceReader
can load the following aggregator types:
Aggregator Name | Aggregator Factory Type | Required Fields |
---|---|---|
Count | CountAggregatorFactory |
"[count]" (integer) |
Sum | SumAggregatorFactory |
"[field]_Sum" (number) |
Average | AverageAggregatorFactory |
"[field]_Average" (number) |
Min | MinAggregatorFactory |
"[field]_Min" (any comparable type) |
Max | MaxAggregatorFactory |
"[field]_Max" (any comparable type) |
ListUnique | ListUniqueAggregatorFactory |
"[field]_List" (array of values) |
CountUnique | ListUniqueAggregatorFactory |
"[field]_CountUnique" (array of values) |
Variance | VarianceAggregatorFactory |
"[field]_Average" (number) "[field]_Variance" (number) |
This setup may be changed by manipulations with GroupedSourceReader.AggregatorStateComposers
collection.
Generic implementations of GroupedSourceReader.IAggregatorStateComposer
could be used for registering custom aggregator type:
GroupedSourceReader.SingleAggregatorStateComposer
for aggregators with single-object state (likeCountAggregator
)-
GroupedSourceReader.ArrayAggregatorStateComposer
for aggregators with array-based state (likeSumAggregator
)
How to keep dimension keys order
By default GroupedSourceReader.Read
method returns PivotData
instance that doesn't preserve order of
dimensions keys by input data; PivotTable
created by this cube will be ordered A-Z.
In some cases original input data order should be preserved.
This is possible with help of
FixedPivotData
class
(read-only IPivotData
implementation) and GroupedSourceReader.ReadState
method:
IPivotData pvtData = new PivotData(new[] {"Product", "Country"}, new CountAggregatorFactory() ); var pvtDataState = groupedPvtDataReader.ReadState( pvtDataFactory.GetConfiguration(pvtData) ); pvtData = new FixedPivotData(pvtData.Dimensions, pvtData.AggregatorFactory, pvtDataState );
PivotTable
applies A-Z order by default and special override is needed to prevent that and keep keys order as in input data:
public class NoSortPivotTable : PivotTable { public NoSortPivotTable(string[] rows, string[] columns, IPivotData pvtData) : base(rows,columns,pvtData) { } protected override void SortKeys(ValueKey[] keys, string[] dimensions, IComparer<ValueKey> comparer) { /* do nothing */ } } // create pivot table that preserves order of input data var pvtTbl = new NoSortPivotTable(new[]{"Product"}, new[]{"Country"}, pvtData);
How to register additional aggregators for GroupedSourceReader
The following code snippet illustrates how to register "CountUnique" aggregator for input dataset that contains list of unique values:
// for test purposes .NET collection is used as a data source // element object properties are used as fields referenced by dimensions/aggregator configuration var inputData = new object[] { new { dim1 = "A", dim2 = "B", count = 3, col_CountUnique = new string[] {"V1","V2"} } // field with values for CountUniqueAggregator should have array type }; var grpSourceRdr = new GroupedSourceReader( processData => processData(inputData, new ObjectMember().GetValue), "count"); // lets register "CountUnique" for GroupedSourceReader // state object for "CountUniqueAggregator" is an array with 2 elements // 1) count value (number of aggregated rows) // 2) array of unique values grpSourceRdr.AggregatorStateComposers.Add( new GroupedSourceReader.ArrayAggregatorStateComposer( "CountUnique", // matches AggregatorFactoryConfiguration.Name new[] { "count", // field for "count" value "{0}_CountUnique" // field name for unique values array. {0} = first aggregator factory parameter }, new[] { typeof(uint), // field type for "count" typeof(object[]) // field type for unique values })); var pvtDataFactory = new PivotDataFactory(); // custom aggregator should be registered in PivotDataFactory with RegisterAggregator method // no needed to do this for "CountUnique" as PivotDataFactory knows about standard aggregators by default var pvtData = grpSourceRdr.Read( new PivotDataConfiguration() { Dimensions = new[] { "dim1", "dim2" }, Aggregators = new[] { new AggregatorFactoryConfiguration("CountUnique", new[]{"col"}) // matches "col_CountUnique" field } }, pvtDataFactory ); Console.WriteLine("R: {0}", pvtData["A","B"].Value); // outputs: 2
NOTE: "CountUnique" is used for sample purposes, you don't need to register it manually as GroupedSourceReader
has a state composer
for all standard aggregators by default.