DuckDB pivot tables: Data Source Setup


All necessary DuckDB binaries are already included in the nreco/pivotdataservice Docker image. The PivotDataService zip does not include DuckDB native binaries; they must be downloaded separately for your hosting platform. Please contact us to get step-by-step instructions.

DuckDB is an in-process (built-in) database management system designed for high-performance analytics. It handles complex queries on large datasets well, supports various columnar storage formats and parallel query execution, and offers a fast, reliable solution for data analysis and manipulation.

PivotDataService can use DuckDB as a data source through the SQL-compatible database connector:

{
  "Id": "DuckDB_DS1",
  "Name": "DuckDB DataSource1",
  "SourceType": "SqlDb",
  "SourceDb": {
    "Connector": "duckdb",
    "ConnectionString": "DataSource=:memory:?cache=shared;",
    "SelectSql": "select * from read_csv('https://www.seektable.com/demo/sales.csv')"
  },
  "InferSchema": true
}

With the DuckDB connector you can use SQL to query:

  • Large CSV/JSON/Parquet/Iceberg files (including multiple files at once) stored locally, available by URL, or kept in cloud storage (S3).
  • Local DuckDB columnar data files, which can serve as a serverless data warehouse that 'lives' inside PivotDataService.
  • MySQL/PostgreSQL servers (via DuckDB extensions).
  • Any data source supported by PivotDataService, via the special cube_query (see below).
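
As an illustration of the first item, a data source that reads several Parquet files at once could look like the sketch below. It reuses the connection string from the configuration above and relies on DuckDB's read_parquet function, which accepts glob patterns. The Id, the Name, and the /data/sales/ path are hypothetical placeholders, not values taken from a real setup.

```json
{
  "Id": "DuckDB_Parquet_DS",
  "Name": "DuckDB Parquet files (example)",
  "SourceType": "SqlDb",
  "SourceDb": {
    "Connector": "duckdb",
    "ConnectionString": "DataSource=:memory:?cache=shared;",
    "SelectSql": "select * from read_parquet('/data/sales/*.parquet')"
  },
  "InferSchema": true
}
```

With a glob pattern like this, DuckDB scans every matching file as one table, so new files placed in the folder should be picked up without changing the configuration.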

Note that DuckDB lets you combine data from different sources in a single query.
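
A minimal sketch of such a combination is shown below. It joins the demo CSV file loaded by URL with a local Parquet file. The file /data/customers.parquet and the columns customer_id, id, and segment are assumptions made for illustration only; the actual columns of sales.csv are not described here, so adjust the join to your own data.

```json
{
  "Id": "DuckDB_Combined_DS",
  "Name": "DuckDB CSV + Parquet (example)",
  "SourceType": "SqlDb",
  "SourceDb": {
    "Connector": "duckdb",
    "ConnectionString": "DataSource=:memory:?cache=shared;",
    "SelectSql": "select s.*, c.segment from read_csv('https://www.seektable.com/demo/sales.csv') s left join read_parquet('/data/customers.parquet') c on c.id = s.customer_id"
  },
  "InferSchema": true
}
```

Because SelectSql is a JSON string value, the whole query has to stay on one line.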