Since 1.0.0

Aggregate data in a uddsketch for further calculation of percentile estimates. This is the first step for calculating approximate percentiles with the uddsketch algorithm. Use uddsketch to create an intermediate aggregate from your raw data. This intermediate form can then be used by one or more accessors in this group to compute final results. Optionally, multiple such intermediate aggregate objects can be combined using rollup() before an accessor is applied.

If you aren’t sure what values to set for size and max_error, try using the alternate aggregate function, percentile_agg(). percentile_agg also creates a UddSketch, but it sets sensible default values for size and max_error that should work for many use cases.

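The two-step pattern described above could look roughly like the following sketch. It assumes a table called samples with a column called data, and it uses the approx_percentile accessor from this function group. The bucket count and error target are illustrative values, not recommendations.

```sql
-- Step 1: uddsketch() builds the intermediate UddSketch aggregate.
-- Step 2: an accessor (here approx_percentile) computes the final result.
SELECT approx_percentile(0.5, uddsketch(100, 0.01, data)) AS median_estimate
FROM samples;

-- The same query using percentile_agg(), which picks size and max_error itself.
SELECT approx_percentile(0.5, percentile_agg(data)) AS median_estimate
FROM samples;
```

The two queries may return slightly different estimates, because the sketches are built with different size and max_error settings.
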
uddsketch(
    size INTEGER,
    max_error DOUBLE PRECISION,
    value DOUBLE PRECISION
) RETURNS UddSketch

Arguments

| Name | Type | Default | Required | Description |
|---|---|---|---|---|
| size | INTEGER | - | Yes | Maximum number of buckets in the uddsketch. Providing a larger value here makes it more likely that the aggregate is able to maintain the desired error, but potentially increases the memory usage |
| max_error | DOUBLE PRECISION | - | Yes | The desired maximum relative error of the sketch. The true error may exceed this if too few buckets are provided for the data distribution. You can get the true error using the error function |
| value | DOUBLE PRECISION | - | Yes | The column to aggregate for further calculation |
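
As a rough illustration of how size and max_error interact, the query below compares the requested error with the error the sketch actually ended up with. It assumes a table called samples with a column called data; the values 100 and 0.01 are placeholders.

```sql
-- Request a maximum relative error of 0.01 with at most 100 buckets,
-- then read back the error the sketch can actually guarantee.
SELECT error(uddsketch(100, 0.01, data)) AS actual_max_error
FROM samples;
```

If actual_max_error comes back larger than 0.01, the sketch did not have enough buckets for this data distribution, and a larger size may be worth trying.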

Returns

| Column | Type | Description |
|---|---|---|
| uddsketch | UddSketch | A percentile estimator object created to calculate percentiles using the uddsketch algorithm |

Samples

Build a uddsketch using a column called data from a table called samples. Use a maximum of 100 buckets and a relative error of 0.01.
SELECT uddsketch(100, 0.01, data) FROM samples;
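
A possible extension of this sample shows the optional rollup() step: build one sketch per hour, then combine the hourly sketches before applying accessors. This is a sketch under assumptions: the timestamp column ts is hypothetical and not part of the sample above, and time_bucket is the TimescaleDB bucketing function.

```sql
-- Assumption: samples has a timestamp column named ts (hypothetical).
WITH hourly AS (
    -- One intermediate UddSketch per hour of data.
    SELECT
        time_bucket('1 hour'::interval, ts) AS bucket,
        uddsketch(100, 0.01, data) AS sketch
    FROM samples
    GROUP BY bucket
)
-- Combine the hourly sketches, then apply accessors to the combined sketch.
SELECT
    approx_percentile(0.95, rollup(sketch)) AS p95_estimate,
    error(rollup(sketch)) AS actual_max_error
FROM hourly;
```

Keeping the per-hour sketches around means percentiles over longer periods can be estimated without rescanning the raw data.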