Distribution analysis overview

Distribution analysis functions help you understand how data is distributed across your datasets and perform fast approximate row counting on large tables.

Samples

Histogram distribution

Create a histogram of battery levels for each device:

SELECT device_id, histogram(battery_level, 20, 60, 5)
FROM readings
GROUP BY device_id
LIMIT 10;

The histogram divides the range from 20 to 60 into 5 equal-width buckets, each 8 units wide. The result also includes an underflow bucket (values < 20) and an overflow bucket (values >= 60), so each device's histogram has 7 buckets in total.
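To make the bucket boundaries concrete, the following Python sketch mimics how values could be assigned to buckets for the arguments used in the query (20, 60, 5). It is an illustration only, not the database's implementation: the function name `histogram_sketch` and the sample readings are hypothetical, and it assumes each bucket includes its lower bound and excludes its upper bound.

```python
def histogram_sketch(values, lo, hi, nbuckets):
    """Count values into nbuckets equal-width buckets between lo and hi,
    plus one underflow bucket (index 0) and one overflow bucket (last index).

    Hypothetical helper for illustration; assumes lower-inclusive,
    upper-exclusive bucket bounds.
    """
    counts = [0] * (nbuckets + 2)
    width = (hi - lo) / nbuckets
    for v in values:
        if v < lo:
            idx = 0                      # underflow: values below the minimum
        elif v >= hi:
            idx = nbuckets + 1           # overflow: values at or above the maximum
        else:
            # Regular buckets are numbered 1..nbuckets; clamp to guard
            # against floating-point rounding near the upper bound.
            idx = min(int((v - lo) // width) + 1, nbuckets)
        counts[idx] += 1
    return counts


# Made-up battery levels for a single device.
levels = [12, 20, 27.9, 28, 44, 59.9, 60, 95]
print(histogram_sketch(levels, 20, 60, 5))  # [1, 2, 1, 0, 1, 1, 2]
```

In this sketch, 12 lands in the underflow bucket, 20 and 27.9 share the first regular bucket (20 up to but not including 28), and both 60 and 95 land in the overflow bucket.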

Approximate row count

Get a fast approximate count of rows in a table without a full table scan:

ANALYZE conditions;

SELECT * FROM approximate_row_count('conditions');

The estimate comes from database statistics, which the preceding ANALYZE refreshes, rather than from scanning the table. This is particularly useful for very large tables where an exact count would be expensive.

Available functions