time_bucket_gapfill works similarly to time_bucket, but adds
gapfilling capabilities. The other functions in this group must be used in the
same query as time_bucket_gapfill. They control how missing values are treated.
time_bucket_gapfill must be used as a top-level expression in a query or
subquery. You cannot, for example, nest time_bucket_gapfill in another
function (such as round(time_bucket_gapfill(...))), or cast the result of the
gapfilling call. If you need to cast, you can use time_bucket_gapfill in a
subquery, and let the outer query do the type cast.
Samples
Use time_bucket_gapfill without a gapfilling algorithm
Get the daily average metric value. Use time_bucket_gapfill without specifying
a gapfilling algorithm. This leaves the missing values as NULL:
SELECT time_bucket_gapfill('1 day', time) AS day,
avg(value) as value
FROM metrics
WHERE time > '2021-12-31 00:00:00+00'::timestamptz
AND time < '2022-01-10 00:00:00-00'::timestamptz
GROUP BY day
ORDER BY day desc;
day | value
-----------------------+--------------------
2022-01-09 00:00:00+00 |
2022-01-08 00:00:00+00 | 48.61293155993108
2022-01-07 00:00:00+00 | 54.388267525986485
2022-01-06 00:00:00+00 |
2022-01-05 00:00:00+00 | 58.257520634785266
2022-01-04 00:00:00+00 | 46.09172424261765
2022-01-03 00:00:00+00 | 42.53498707820027
2022-01-02 00:00:00+00 |
2022-01-01 00:00:00+00 | 47.84420001415975
2021-12-31 00:00:00+00 |
(10 rows)
Use time_bucket_gapfill and carry last value forward
Get the daily average metric value. Use locf to carry the last value forward
if a value is missing. Note that avg is nested inside locf, and not the
other way around.
SELECT time_bucket_gapfill('1 day', time) AS day,
locf(avg(value)) as value
FROM metrics
WHERE time > '2021-12-31 00:00:00+00'::timestamptz
AND time < '2022-01-10 00:00:00-00'::timestamptz
GROUP BY day
ORDER BY day desc;
day | value
-----------------------+--------------------
2022-01-09 00:00:00+00 | 48.61293155993108
2022-01-08 00:00:00+00 | 48.61293155993108
2022-01-07 00:00:00+00 | 54.388267525986485
2022-01-06 00:00:00+00 | 58.257520634785266
2022-01-05 00:00:00+00 | 58.257520634785266
2022-01-04 00:00:00+00 | 46.09172424261765
2022-01-03 00:00:00+00 | 42.53498707820027
2022-01-02 00:00:00+00 | 47.84420001415975
2022-01-01 00:00:00+00 | 47.84420001415975
2021-12-31 00:00:00+00 |
(10 rows)
Use time_bucket_gapfill and use linear interpolation
Get the daily average metric value. Use interpolate to linearly interpolate
the value if it is missing. Note that avg is nested inside interpolate.
SELECT time_bucket_gapfill('1 day', time) AS day,
interpolate(avg(value)) as value
FROM metrics
WHERE time > '2021-12-31 00:00:00+00'::timestamptz
AND time < '2022-01-10 00:00:00-00'::timestamptz
GROUP BY day
ORDER BY day desc;
day | value
-----------------------+--------------------
2022-01-09 00:00:00+00 |
2022-01-08 00:00:00+00 | 48.61293155993108
2022-01-07 00:00:00+00 | 54.388267525986485
2022-01-06 00:00:00+00 | 56.32289408038588
2022-01-05 00:00:00+00 | 58.257520634785266
2022-01-04 00:00:00+00 | 46.09172424261765
2022-01-03 00:00:00+00 | 42.53498707820027
2022-01-02 00:00:00+00 | 45.189593546180014
2022-01-01 00:00:00+00 | 47.84420001415975
2021-12-31 00:00:00+00 |
(10 rows)
Use time_bucket_gapfill with a timezone argument
Get the daily average metric value, using Europe/Berlin as the timezone. Note
that daily time buckets now start at 23:00 UTC, which is equivalent to
midnight in Berlin for the selected dates:
SELECT time_bucket_gapfill('1 day', time, 'Europe/Berlin') AS day,
interpolate(avg(value)) as value
FROM metrics
WHERE time > '2021-12-31 00:00:00+00'::timestamptz
AND time < '2022-01-10 00:00:00-00'::timestamptz
GROUP BY day
ORDER BY day desc;
day | value
-----------------------+--------------------
2022-01-09 23:00:00+00 |
2022-01-08 23:00:00+00 | 48.65079127913703
2022-01-07 23:00:00+00 | 47.31847777099154
2022-01-06 23:00:00+00 | 55.98845740343859
2022-01-05 23:00:00+00 | 55.61667401777108
2022-01-04 23:00:00+00 | 58.74115574522012
2022-01-03 23:00:00+00 | 45.77993635988273
2022-01-02 23:00:00+00 | 41.78689923453202
2022-01-01 23:00:00+00 | 24.324313477743974
2021-12-31 23:00:00+00 | 48.86680377661261
2021-12-30 23:00:00+00 |
(11 rows)
Available functions
Bucket function
Interpolators
locf(): fill in missing values by carrying the last observed value forward
interpolate(): fill in missing values by linear interpolation