PivotData Tool - a command line utility that generates pivot tables from CSV/TSV files or a SQL database
Need to create pivot table reports from C# code? Check out PivotData Toolkit for .NET.
PivotData is a command line utility that aggregates data from CSV/TSV files or a SQL database and
generates pretty pivot table reports (HTML/Excel/CSV/PDF) without an OLAP server or Excel.
It is well suited for automated and scheduled crosstab report generation.
Using the PivotData Tool doesn't require any programming skills.
Installation
- Windows: no special installation procedure is needed; pivotdata.exe is a .NET console utility that requires Microsoft .NET Framework 4.0 (or higher). In most cases it is already installed if you have Windows 7 SP1 / Windows 8 or later.
- Linux/MacOS: please contact us if you are interested in a PivotData Tool version for these platforms
- Web:
- online pivot table builder: SeekTable.com
- embed pivot tables into any web application: PivotData Microservice
Features
PivotData Tool can:
- aggregate large datasets that Excel cannot handle (CSV files larger than 1 GB are processed in several minutes)
- load data directly from MS SQL (or any other DB that has an ODBC driver)
- load pre-aggregated data (for example result of SQL GROUP BY query)
- parse input values (dates, numbers, custom regex rules)
- calculate derived values and formulas
- perform data cube slicing and filtering
- save data cube for fast multiple reports generation
- generate 2-dimensional pivot table reports (HTML, CSV, Excel; PDF can be produced from the HTML output with wkhtmltopdf)
- upload data cube to URL
Synopsis: command line options
pivotdata -i inputHandler inputOptions [dataOptions] -o outputHandler outputOptions
Any pivotdata task performs two basic steps:
- build an in-memory data cube structure (the result of data aggregation, or of loading previously aggregated results)
- do something with the data cube and produce valuable output (like pivot table report generation)
 ______       ________       _________       ________       ______
|      |     |applying|     |PivotData|     |applying|     |      |
|input | --> |mapping | --> |in-memory| --> |slicing | --> |output|
|source|     |handlers|     |  cube   |     |filters |     |result|
|______|     |________|     |_________|     |________|     |______|
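For example, a single command can aggregate a CSV file and print an HTML pivot table to stdout; all flags used here are described in the sections below (the file and column names are purely illustrative):
pivotdata -i csv sales.csv -p "{Dimensions:['Region','Year'],Aggregators:[{Name:'Sum',Params:['Amount']}]}" -t "{Rows:['Region'],Columns:['Year']}" -o pivottablehtml -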
Input Handlers: input sources for building an in-memory data cube
List of supported input handlers (-i option):
-i cubefile <cube_file>
- load a data cube from a serialized data file (a pair of .cfg and .dat files).
The input file may have the .cfg extension or no extension at all (in this case .cfg is appended automatically).
For example:
pivotdata -i cubefile myCube.cfg
This handler doesn't need the cube configuration option (-p) because it is already present in the .cfg file; mapping handlers (-m) are also not applicable with the "cubefile" input handler.
-i csv {csv_file} [{csv_options_json}] -p {pivot_data_cube_config_json}
- aggregate data from a CSV file.
{csv_file} is a path to a local file or "-" (read CSV data from stdin)
{csv_options_json} (optional) is a JSON object with CSV reader settings:
- Delimiter - string used as the separator in the CSV file (optional, ',' by default)
- HasHeaderRecord - true|false (optional, true by default)
- Headers - array of CSV column headers (optional, used when the CSV doesn't have a header row)
- IgnoreBlankLines - true|false (optional, true by default)
- TrimHeaders - true|false (optional, false by default)
- RowsLimit - integer: max number of rows to read (optional). Useful for testing purposes with large input files.
- Encoding - encoding name (optional, UTF-8 by default). Possible values are returned by the Encoding.GetEncodings method.
-p {pivot_data_cube_config_json} is required for the 'csv' input handler.
Example:
pivotdata -i csv myData.csv "{Delimiter:';'}" -p "{Dimensions:['Contact Name', 'Amount', 'Pay Date'],Aggregators:[{Name:'Sum',Params:['Amount']}]}"
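For a CSV file without a header row, column names can be supplied via the Headers option; RowsLimit is handy for quick test runs. A sketch, assuming a headerless file data_noheader.csv with these three columns (all names are hypothetical):
pivotdata -i csv data_noheader.csv "{HasHeaderRecord:false,Headers:['Contact Name','Amount','Pay Date'],RowsLimit:1000}" -p "{Dimensions:['Contact Name'],Aggregators:[{Name:'Sum',Params:['Amount']}]}" -o cubefile contactsCube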
-i <mssql|odbc> {db_source_options_json} -p {pivot_data_cube_config_json}
- aggregate data returned by a SELECT query using the MSSQL/AzureSQL (mssql) or ODBC (odbc) driver.
Cube configuration -p {pivot_data_cube_config_json} is required for this input handler.
{db_source_options_json} is a JSON object with DB reader settings:
- Connection - DB connection string
- SelectSql - SELECT statement that returns the data to aggregate
Example:
pivotdata -i mssql "{Connection:'Data Source=localhost\\SQLEXPRESS;Database=AdventureWorks2014;User=usr;password=pwd;',SelectSql:'select ProductID,OrderQty,UnitPrice from Sales.SalesOrderDetail'}" -p "{Dimensions:['ProductID','OrderQty'],Aggregators:[{Name:'Sum',Params:['UnitPrice']}]}" -o cubefile AdvWorksSalesCube
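The odbc handler works the same way; only the connection string differs. A sketch, assuming a preconfigured ODBC DSN named SalesDb that exposes a hypothetical Orders table:
pivotdata -i odbc "{Connection:'DSN=SalesDb;',SelectSql:'select Region,Amount from Orders'}" -p "{Dimensions:['Region'],Aggregators:[{Name:'Sum',Params:['Amount']}]}" -o cubefile OrdersCube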
Data Options: data mapping rules, slicing and filtering
-p {pivot_data_cube_config_json}
- Define the multidimensional dataset configuration for the input data source. Described by a JSON object:
- Dimensions - array of column names from the input data source, like: ['Contact Name', 'Amount', 'Pay Date']
- Aggregators - array of data cube value aggregators. Supported aggregators:
{Name:'Count'} - number of matched rows
{Name:'Sum',Params:['Salary']} - sum of 'Salary' column values
{Name:'Average',Params:['Salary']} - average of 'Salary' column values
{Name:'Min',Params:['Salary']} - minimum of 'Salary' column values
{Name:'Max',Params:['Salary']} - maximum of 'Salary' column values
{Name:'CountUnique',Params:['Contact Name']} - number of unique 'Contact Name' column values
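A cube may declare several aggregators at once; slicing (-s with AggregatorIndexes) and reports (-t with AggregatorIndex) refer to them later by index. An illustrative configuration (column names are hypothetical):
-p "{Dimensions:['Contact Name','Pay Date'],Aggregators:[{Name:'Count'},{Name:'Sum',Params:['Amount']},{Name:'Average',Params:['Amount']}]}"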
-m {mapping_config_json}
- Apply value mapping rules to each row from the input data source. This handler can be used for parsing string values (dates, numbers) and defining derived values (year/month/quarter/day date parts, formatted values). Mapping is defined by a JSON object of "column_name" => {mapping handler descriptor} pairs, for example:
{ 'create_date' : {Type:'datetime'} }
where 'create_date' is a column that should be parsed as DateTime value.
List of supported mapping handlers:
{Type:'datetime'} - parses the input string as a DateTime value
{Type:'integer'} - parses the input string as an integer value
{Type:'double'} - parses the input string as a floating-point value
{Type:'decimal'} - parses the input string as a decimal value
{Type:'format',Params:['Year: {0}','year_column_name']} - formats a string value; the first parameter is a format specifier (acceptable by .NET String.Format) and the rest of the parameters are column names used for getting the format arguments. It is possible to use more than one argument.
{Type:'getyear',Params:['date_column']} - returns the year component of the specified column's date
{Type:'getmonthnumber',Params:['date_column']} - returns the month component of the specified column's date
{Type:'getquarter',Params:['date_column']} - returns the quarter (1-4) of the specified column's date
{Type:'getdaynumber',Params:['date_column']} - returns the day component of the specified column's date
{Type:'getdateonly',Params:['datetime_column']} - returns the date component (ignores the time part) of the specified column's datetime
{Type:'regexismatch',Params:['regular_expression','input_value_column']} - returns the result of a .NET regular expression match test (true or false)
{Type:'regexmatch',Params:['regular_expression','input_value_column']} - returns the matched value of a .NET regular expression. To return a group value, use the name "value", for example: (?<value>.*)
{Type:'regexreplace',Params:['regular_expression','replacement','input_value_column']} - returns the result of a .NET regular expression replace (string)
{Type:'formula',Params:['expression']} - returns the evaluation result of a lambda expression. Row values can be accessed as Data["column"] or Row["column"]
{Type:'lowercase',Params:['input_value_column']} - returns the lowercase value
Example:
pivotdata -i {input_source} -m "{'OrderDate':{Type:'datetime'},'Total':{Type:'double'}}"
Note that it is possible to apply several mapping rules at once:
pivotdata -i {input_source} -m "{'OrderDate':{Type:'datetime'},'Total':{Type:'double'}}" -m "{'OrderYear':{Type:'getyear',Params:['OrderDate']}}"
(in this case, mapping handlers are applied in the same order as they are specified)
-s {slice_query_json}
- Transform data cube according to the specified query.
{slice_query_json} is a JSON object:
- Dimensions - array of dimensions to select
- AggregatorIndexes - array of aggregator indexes to select
- Filter - formula expression (string) for filtering datapoints
- DerivedDimensions - dimension name → formula expression map (supports merging dimension keys by a regex rule or by Damerau-Levenshtein distance criteria, which is useful for handling data with typos)
-s "{Dimensions:['Date Year'],AggregatorIndexes:[1], Filter:'Dimensions[""Date Year""]==2015'}"
-s "{DerivedDimensions: {'Model Family': 'Merge.ByRegex(Dimensions[""model""], ""(Model_Z1|Model_Z2)"", ""Model Z"")' }"
-s "{DerivedDimensions: {'Status': 'Merge.ByDamerauLevenshteinDistance(Dimensions[""status""], new[] {""open"", ""closed"", ""in progress""})' }"
-g <count field name>
- Load input data as pre-grouped cube datapoints. This option is not applicable with '-i cubefile'.
<count field name> identifies the column with the count aggregate value (required).
Each input row should uniquely define a value key and its aggregated values. This option is useful for aggregating cube data on the DB side with GROUP BY:
pivotdata -i mssql "{Connection:'Data Source=localhost\\SQLEXPRESS;Database=AdventureWorks2014;User=usr;password=pwd;',SelectSql:'select ProductID,OrderQty,SUM(UnitPrice) as UnitPrice,COUNT(UnitPrice) as cnt from Sales.SalesOrderDetail group by ProductID,OrderQty'}" -p "{Dimensions:['ProductID','OrderQty'],Aggregators:[{Name:'Sum',Params:['UnitPrice']}]}" -g cnt -o cubefile AdvWorksAggrOnDbSide
-t {pivot_table_report_config_json}
- Define pivot table report configuration.
{pivot_table_report_config_json} is a JSON object:
- Rows - array of dimensions that should be used for building the rows of the pivot table
- Columns - array of dimensions that should be used for building the columns of the pivot table
- AggregatorIndex - (optional) use the aggregator at the specified index for pivot table values. Applicable only if the data cube has more than one aggregator.
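For instance, assuming a previously saved cube (the name salesCube and its dimensions are hypothetical) that declares more than one aggregator, the aggregator at index 1 can be selected for the report values:
pivotdata -i cubefile salesCube -t "{Rows:['Region'],Columns:['Year'],AggregatorIndex:1}" -o pivottablehtml report.html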
Output Handlers: results produced from the in-memory data cube
List of supported output handlers (-o option):
-o cubefile <output_file>
- write aggregated data to <output_file>. Two files are actually created: .cfg with the data cube configuration (JSON) and .dat with binary data.
For example:
pivotdata -i {input_source_options} -o cubefile Payments_Aggregated
-o uploadpackage {config_json}
- compose a compressed cube package (.zip file) and upload it to the specified HTTP endpoint. {config_json} is a JSON object with the following settings:
- Name - unique cube name (used as the file name for the zip package)
- UploadUrl - HTTP or HTTPS endpoint URL
- Username - HTTP Basic Authentication username (optional)
- Password - HTTP Basic Authentication password (optional)
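Example (the cube name, endpoint URL, and credentials below are placeholders):
pivotdata -i {input_source_options} -o uploadpackage "{Name:'SalesCube',UploadUrl:'https://example.com/cubes/upload',Username:'user',Password:'secret'}"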
-o getvalue {value_key_json}
- write aggregated value for specified dimension keys to stdout. {value_key_json} is a JSON array of dimension keys, for example:
pivotdata -i {input_source_options} -o getvalue "[2007,5]"
A totals value can be obtained by specifying a null key for some or all dimensions. Example of getting the grand total value:
pivotdata -i {input_source_options} -o getvalue "[null,null]"
-o pivottablehtml <output_file>
- generate pivot table report and render it as an HTML table.
Pivot table configuration (-t option) is required for this output handler.
<output_file> is a path to a local file or "-" (write to stdout)
Examples:
pivotdata -i {input_source_options} -t "{Rows:['Year', 'Quarter'],Columns:['Ethnicity']}" -o pivottablehtml -
pivotdata -i {input_source_options} -t "{Rows:['Year'],Columns:[]}" -o pivottablehtml MyPivotTableReport.html
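As mentioned in the features list, a PDF report can be produced by piping the HTML output to wkhtmltopdf (a sketch, assuming wkhtmltopdf is installed and available on PATH):
pivotdata -i {input_source_options} -t "{Rows:['Year'],Columns:[]}" -o pivottablehtml - | wkhtmltopdf - report.pdf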
-o pivottablecsv <output_file> [{csv_opts_json}]
- generate pivot table report in CSV format.
Pivot table configuration (-t option) is required for this output handler.
<output_file> is a path to a local file or "-" (write to stdout)
Optional CSV settings {csv_opts_json} (JSON object):
- CsvDelimiter - string used as the CSV values separator
- TotalsRow - true|false; if true, the totals row is rendered (default)
- TotalsColumn - true|false; if true, the totals column is rendered (default)
- GrandTotal - true|false; if true, the grand total is rendered (default)
- RenderRowHeaderColumn - true|false; if true, row keys are rendered as CSV values (default)
pivotdata -i {input_source_options} -t "{Rows:['Year'],Columns:[]}" -o pivottablecsv report.csv "{CsvDelimiter:';'}"
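A sketch showing how totals can be suppressed with the options listed above:
pivotdata -i {input_source_options} -t "{Rows:['Year'],Columns:[]}" -o pivottablecsv report.csv "{TotalsRow:false,TotalsColumn:false,GrandTotal:false}"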
-o pivottableexcel <output_file> [{excel_opts_json}]
- generate pivot table report in Excel format.
Pivot table configuration (-t option) is required for this output handler.
<output_file> is a path to a local file or "-" (write to stdout)
Optional Excel settings {excel_opts_json} (JSON object):
- WorkSheetName - title of the Excel worksheet with the pivot report data
pivotdata -i {input_source_options} -t "{Rows:['Year'],Columns:[]}" -o pivottableexcel report.xlsx "{WorkSheetName:'Sales 2015'}"
Global Options
-h
- Display help
--silent
- Be less verbose: do not show info messages
--debug
- When an error occurs, show the full exception stack trace (useful for bug reports)
--setculture <cultureName>
- Explicitly set the culture used for parsing input values (dates, numbers). For example:
en-GB - parses dates in the format dd/mm/yyyy
en-US - parses dates in the format mm/dd/yyyy
de-DE - parses dates in the format dd.mm.yyyy
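For example, to parse German-formatted dates and numbers (the file and column names are illustrative, and the option placement shown is one possibility):
pivotdata --setculture de-DE -i csv german_sales.csv -m "{'OrderDate':{Type:'datetime'},'Total':{Type:'double'}}" -p "{Dimensions:['OrderDate'],Aggregators:[{Name:'Sum',Params:['Total']}]}"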
Licensing Conditions
PivotData Tool can be used for FREE by end users and inside companies for internal business purposes (like server-side automation). If you want to redistribute it externally as part of your own solution/product (ISV), you need to purchase a commercial license.