- freq_agg: Get the most common elements and their relative frequency using the SpaceSaving algorithm
- count_min_sketch: Estimate the absolute number of times a specific value appears using the count-min sketch data structure
Two-step aggregation
This group of functions uses the two-step aggregation pattern. Rather than calculating the final result in one step, you first create an intermediate aggregate by using the aggregate function. Then, use any of the accessors on the intermediate aggregate to calculate a final result. You can also roll up multiple intermediate aggregates with the rollup functions. The two-step aggregation pattern has several advantages:
- More efficient because multiple accessors can reuse the same aggregate
- Easier to reason about performance, because aggregation is separate from final computation
- Easier to understand when calculations can be rolled up into larger intervals, especially in window functions and continuous aggregates
- Able to support retrospective analysis even when the underlying data is dropped, because the intermediate aggregate stores extra information not available in the final result
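The pattern can be sketched outside the database. The following Python example is a conceptual illustration only, not the library's implementation: it uses exact counts (`collections.Counter`) as a stand-in for the intermediate aggregate, and the names `make_aggregate`, `rollup`, `top_n`, and `relative_frequency` are hypothetical helpers chosen for this sketch.

```python
from collections import Counter


def make_aggregate(values):
    """Step 1: build an intermediate aggregate (here, exact counts) from raw values."""
    return Counter(values)


def rollup(aggregates):
    """Combine several intermediate aggregates into one that covers a larger interval."""
    total = Counter()
    for agg in aggregates:
        total.update(agg)
    return total


def top_n(agg, n):
    """Accessor (step 2): the n most common values."""
    return [value for value, _ in agg.most_common(n)]


def relative_frequency(agg, value):
    """Accessor (step 2): fraction of all inputs equal to `value`."""
    total = sum(agg.values())
    return agg[value] / total if total else 0.0


# One aggregate per hour. The raw rows are not needed after this point.
hour_1 = make_aggregate(["a", "b", "a", "c"])
hour_2 = make_aggregate(["a", "b", "b", "b"])

# Roll the hourly aggregates up into a daily one, then reuse it for two accessors.
day = rollup([hour_1, hour_2])
print(top_n(day, 2))
print(relative_frequency(day, "a"))
```

Both accessors read the same `day` aggregate, so the input is scanned only once, and the rollup works from the hourly aggregates alone.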
Samples
Find the most common values
Get the 5 most common values from a dataset:
Get frequency information for common values
Return values that represent more than 5% of the input, along with their frequency bounds:
Estimate absolute counts
Use count-min sketch to estimate how many times specific values appear:
Available functions
Frequency aggregation
freq_agg(): track the most common values using a minimum frequency cutoff
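To give a feel for how a bounded-memory frequency tracker can work, here is a minimal Python sketch of the textbook SpaceSaving algorithm. It is an illustration under stated assumptions, not the code behind `freq_agg()`: the class name, the `capacity` parameter, and the `top` method are hypothetical, and how the real aggregate derives its size from the minimum frequency cutoff is not covered here.

```python
class SpaceSaving:
    """Track approximate counts for at most `capacity` distinct items."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.total = 0      # number of items seen so far
        self.counts = {}    # item -> estimated count (an upper bound)
        self.errors = {}    # item -> maximum possible overestimate

    def add(self, item):
        self.total += 1
        if item in self.counts:
            self.counts[item] += 1
        elif len(self.counts) < self.capacity:
            self.counts[item] = 1
            self.errors[item] = 0
        else:
            # Evict the item with the smallest count. The newcomer inherits
            # that count as its possible error, then counts itself once.
            victim = min(self.counts, key=self.counts.get)
            floor = self.counts.pop(victim)
            del self.errors[victim]
            self.counts[item] = floor + 1
            self.errors[item] = floor

    def top(self, n):
        """Return (item, min_freq, max_freq) for the n largest counters."""
        ranked = sorted(self.counts, key=self.counts.get, reverse=True)[:n]
        return [
            (
                item,
                (self.counts[item] - self.errors[item]) / self.total,
                self.counts[item] / self.total,
            )
            for item in ranked
        ]


ss = SpaceSaving(capacity=3)
for value in ["a"] * 6 + ["b"] * 3 + ["c", "d"]:
    ss.add(value)

for item, low, high in ss.top(2):
    print(item, round(low, 3), round(high, 3))
```

In this sketch, the gap between the lower and upper bound comes from evictions: an item that replaced another counter may have been seen fewer times than its counter suggests, but never more.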
Count-min sketch
count_min_sketch(): estimate absolute counts using the count-min sketch data structure
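As a rough illustration of the data structure named above, the following Python sketch implements a textbook count-min sketch. It is not the library's implementation. The class and method names are hypothetical, the sizing formulas are the standard ones from the literature, and `probability` is assumed here to mean the chance that an estimate exceeds the error bound.

```python
import hashlib
import math


class CountMinSketch:
    """Estimate how often each string appears, using fixed memory."""

    def __init__(self, error, probability):
        # Textbook sizing: wider tables reduce the error, more rows reduce
        # the chance of exceeding it (assumed parameter semantics).
        self.width = math.ceil(math.e / error)
        self.depth = max(1, math.ceil(math.log(1 / probability)))
        self.table = [[0] * self.width for _ in range(self.depth)]

    def _index(self, row, item):
        # One hash function per row, derived by salting BLAKE2b with the row number.
        digest = hashlib.blake2b(
            item.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item):
        for row in range(self.depth):
            self.table[row][self._index(row, item)] += 1

    def approx_count(self, item):
        # Collisions can only inflate a cell, so the minimum across rows is
        # the tightest estimate and never undercounts.
        return min(
            self.table[row][self._index(row, item)] for row in range(self.depth)
        )


sketch = CountMinSketch(error=0.01, probability=0.01)
for value in ["a", "b", "a", "c", "a"]:
    sketch.add(value)

print(sketch.approx_count("a"))
print(sketch.approx_count("zzz"))
```

Unlike a top-values tracker, this structure answers questions about any value you ask for, at the cost of occasionally overestimating when different values hash to the same cells.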