Since 1.0.0

Estimate the value at a given percentile, or the percentile rank of a given value, using the t-digest algorithm. This estimation is more memory- and CPU-efficient than an exact calculation using PostgreSQL’s percentile_cont and percentile_disc functions.

tdigest is one of two advanced percentile approximation aggregates provided in TimescaleDB Toolkit. It is a space-efficient aggregation, and it provides more accurate estimates at extreme quantiles than traditional methods.

tdigest is somewhat dependent on input order. If tdigest is run on the same data arranged in a different order, the results should be nearly equal, but they are unlikely to be identical.

The other advanced percentile approximation aggregate is uddsketch, which produces stable estimates within a guaranteed relative error. If you aren’t sure which to use, try the default percentile estimation method, percentile_agg. It uses the uddsketch algorithm with some sensible defaults.
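As a minimal sketch of the two kinds of estimate described above, the query below asks for the value at a percentile and for the percentile rank of a value. It assumes a table named foo with a numeric column named value, the same names the samples on this page use. The threshold 42.0 is a hypothetical value chosen only for illustration, and the first argument to tdigest (100 here, as in the samples) is taken to be the number of buckets in the digest.

```sql
-- Assumes a table foo with a DOUBLE PRECISION column named value.
SELECT
    -- Estimated value at the 90th percentile
    approx_percentile(0.9, tdigest(100, value)) AS p90,
    -- Estimated percentile rank of the (hypothetical) value 42.0
    approx_percentile_rank(42.0, tdigest(100, value)) AS rank_of_42
FROM foo;
```

Because tdigest is somewhat dependent on input order, repeated runs over reordered data may return slightly different numbers.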

Two-step aggregation

This group of functions uses the two-step aggregation pattern. Rather than calculating the final result in one step, you first create an intermediate aggregate by using the aggregate function. Then, use any of the accessors on the intermediate aggregate to calculate a final result. You can also roll up multiple intermediate aggregates with the rollup functions. The two-step aggregation pattern has several advantages:
  1. More efficient because multiple accessors can reuse the same aggregate
  2. Easier to reason about performance, because aggregation is separate from final computation
  3. Easier to understand when calculations can be rolled up into larger intervals, especially in window functions and continuous aggregates
  4. Supports retrospective analysis even when underlying data is dropped, because the intermediate aggregate stores extra information not available in the final result
To learn more, see the blog post on two-step aggregates.
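The pattern can be sketched as a query that builds the intermediate aggregate once and then applies several accessors to it. This is an illustration rather than a canonical form: the table foo and column value are borrowed from the samples on this page, while the CTE name digest, the column alias td, and the threshold 42.0 are hypothetical.

```sql
WITH digest AS (
    -- Step 1: build the intermediate aggregate once
    SELECT tdigest(100, value) AS td
    FROM foo
)
-- Step 2: apply several accessors to the same aggregate
SELECT
    approx_percentile(0.5, td) AS median,
    approx_percentile(0.99, td) AS p99,
    approx_percentile_rank(42.0, td) AS rank_of_42
FROM digest;
```

The data is scanned and aggregated only in the first step. Each accessor in the second step reads the stored t-digest instead of the raw rows.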

Samples

Aggregate and roll up percentile data to calculate daily percentiles

Create an hourly continuous aggregate that contains a percentile aggregate:
CREATE MATERIALIZED VIEW foo_hourly
WITH (timescaledb.continuous)
AS SELECT
    time_bucket('1 h'::interval, ts) AS bucket,
    tdigest(100, value) AS tdigest
FROM foo
GROUP BY 1;
Use accessors to query directly from the continuous aggregate for hourly data. You can also roll the hourly data up into daily buckets, then calculate approximate percentiles:
SELECT
    time_bucket('1 day'::interval, bucket) AS bucket,
    approx_percentile(0.95, rollup(tdigest)) AS p95,
    approx_percentile(0.99, rollup(tdigest)) AS p99
FROM foo_hourly
GROUP BY 1;
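For the hourly case mentioned above, a query along these lines applies the accessor directly to the stored aggregate, with no rollup step. It assumes the foo_hourly continuous aggregate defined earlier, including its bucket and tdigest columns.

```sql
-- Hourly percentiles read straight from the continuous aggregate
SELECT
    bucket,
    approx_percentile(0.95, tdigest) AS p95,
    approx_percentile(0.99, tdigest) AS p99
FROM foo_hourly
ORDER BY bucket;
```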

Available functions

Aggregate

  • tdigest(): aggregate data in a t-digest for percentile calculation

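Accessors

  • approx_percentile(): estimate the value at a given percentile from a t-digest
  • approx_percentile_rank(): estimate the percentile rank of a given value from a t-digest
  • max_val(): get the maximum value from a t-digest
  • mean(): calculate the mean of the values in a t-digest
  • min_val(): get the minimum value from a t-digest
  • num_vals(): get the number of values contained in a t-digest
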
Rollup

  • rollup(): combine multiple t-digest aggregates