Query and filter data cube OLAP operations PivotData C# examples


PivotData provides simple .NET API to make OLAP cube queries in LINQ-style with SliceQuery class: it accepts any IPivotData implementation and produces a new in-memory cube as result of OLAP operation.

Complete example code is here: NReco.PivotData github repository

Filter data cube slice and dice

PivotData pvtData; // lets assume that cube has dimensions: "product", "country", "year"
var whereQuery = new SliceQuery(pvtData)
    .Where(
         "product",  // dimension to filter 
         "Product1", "Product2"  // several values for IN match
    )
    .Where(
         "year",
         2015  // note: value should match actual dimension key type
    );
var slicedPvtData = whereQuery.Execute();  // resulted cube is filtered by product and year

Where method has overloads that accept predicate delegate for complex filtering conditions:

var whereQuery = new SliceQuery(pvtData).Where(
  "product",  // dimension to filter
  (dimKey) => ((string)dimKey).StartsWith("Product")
);

Select specific dimensions roll-up

PivotData pvtData; // lets assume that cube has dimensions: "product", "country", "city"
var selectQuery = new SliceQuery(pvtData).Dimension("product").Dimension("city");
var resultPvtData = selectQuery.Execute(); // resulted cube has only "product" and "city" dimensions

Dimension method has an overload that allows you to define "derived" dimension (that calculated from dimension keys of the source cube). For example, you can calculate quarter (Q1-Q4) by the month or date dimension.

Define derived dimension expand hierarchy

It is possible to define derived dimension (= calculated from existing dimension key or keys) in the following way:

var pvtData = new PivotData(new[]{ "creation_date", ... }, ... );
var byMonthCube = new SliceQuery(pvtData).Dimension("creation_date_month",
	(dimKeys) => {  // array of entry dimension keys
		var creationDateValue = (DateTime)dimKeys[0]; // #0 - index of "creation_date" dimension
		return creationDateValue.Month;
	}
);
var byYearAndQuarter = new SliceQuery(pvtData).Dimension("creation_date_year_and_quarter",
	(dimKeys) => {
		var creationDateValue = (DateTime)dimKeys[0]; // #0 - index of "creation_date" dimension
		return String.Format("{0} Q{1}", 
			creationDateValue.Year, GetQuarter(creationDateValue.Month) );
	}
);

This approach may be used for resolving lookup values by ID-based dimensions.

Select specific measures

If data cube has multiple measures (configured with CompositeAggregatorFactory) it is possible to select only specific metric with Measure method:

PivotData pvtData; // lets assume that cube has 3 measures
var oneMeasurePvtData = new SliceQuery(pvtData).Measure(0).Execute(); // take only measure at index #0

Measure method may be called several times; it is possible to specify the same index more than once to duplicate a measure; This might be usedful if pivot table should display the same metric with different formatting (say, absolute value + percentage or difference value).

Define derived measure

Measure method can be used for defining derived measures (projected from another measures). The following code snipper illustrates how to calculate new metric by existing measures:

var derivedAggrQuery = new SliceQuery(pvtData)
	.Measure(
		new SumAggregatorFactory("delta_fld"),  // factory for new metric
		(aggr) => {
			var compositeAggr = aggr.AsComposite();
			var aggr0val = Convert.ToDecimal( compositeAggr.Aggregators[0].Value );
			var aggr1val = Convert.ToDecimal( compositeAggr.Aggregators[1].Value );
			var diffVal = aggr0val - aggr1val;
			
			// result should correspond specified aggregator factory
			return new SumAggregator("delta_fld", 
				// sum aggregator state
				new object[] {
  					aggr.Count, // elements count
					diffVal // new aggr sum value
				});
		}
	);

Derived measure is calculated when SliceQuery is executed.

Define formula measure

Another Measure overload can be used for defining a formula measure with C# expression:
// average item price = sum of total / sum of quantity
var avgItemPriceByYearQuery = new SliceQuery(pvtData)
    .Dimension("year")
    .Measure("Avg item price",
        (aggrArgs) => {
            // value of first argument (from measure #1)
            var sumOfTotal = ConvertHelper.ConvertToDecimal( aggrArgs[0].Value, 0M);  

            // value of second argument (from measure #0)
            var sumOfQuantity = ConvertHelper.ConvertToDecimal( aggrArgs[1].Value, 0M);
 
            if (sumOfQuantity==0)
                return 0M; // prevent div by zero
            return sumOfTotal/sumOfQuantity;
        },
        new int[] { 1, 0 } // indexes of measures for formula arguments
   );

Unlike projected measure, formula value is calculated "on-the-fly" (when value is accessed).
Also it is possible to define formula dynamically with a string expression (see "DynamicFormulaMeasure" example for more details).

Keyword-based filter Toolkit component

CubeKeywordFilter can be used for filtering the cube by simple search-like query string:

IPivotData pvtData;  // data cube instance
var cubeFilter = new CubeKeywordFilter("2015, Product1");
var filteredPvtData = cubeFilter.Filter(pvtData);
Keyword filter tries to find dimension keys that match specified keywords and performs appropriate filtering. The following search syntax is supported:
product If "product" value is present pefroms exact match; otherwise, works like *product*
country=USA Exact match filtering for dimension with "country" in the name (or resolved label)
product="Product 1" Exact match of value with spaces.
product* Match keys that start with "product"
*es Match keys that end with "es"
year:2015 Filter by "2015" is applied to dimension that contains "year" keyword
-2015 Exclude all keys of dimension that has "2015" value (except "2015")
year>2022 Filter keys of a dimension that has "year" in the name and keep only values that are greater than 2022. Possible conditions are: <, >, >=, <=
count>1000 If cube has "Count" measure this excludes all values of that measure that are less than or equal to 1000.
year:2015+quarter:1, year:2015+quarter:2 Filter by: (year=2015 AND quarter=1) OR (year=2015 and quarter=2)
-"year:2015+quarter:1" Exclude all values for year=2015 AND quarter=1


Next section: render HTML pivot table