In a PostgreSQL database, I have a measurements table that looks like this:
| sensor_group_id | ts | value_1 | value_2 | etc... |
|-----------------|---------------------------|---------|---------|--------|
| 1 | 2021-07-21T00:20:00+00:00 | 15 | NULL | |
| 1 | 2021-07-15T00:20:00+00:00 | NULL | 23 | |
| 2 | 2021-07-17T00:20:00+00:00 | NULL | 11 | |
| 1 | 2021-07-13T00:20:00+00:00 | 9 | 4 | |
| 2 | 2021-07-10T00:20:00+00:00 | 99 | 36 | |
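The table has many columns, each holding a different type of measurement. Each sensor group produces several types of measurements at the same time, but not always all of them, so we end up with partially filled rows.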
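For anyone who wants to reproduce this, the sample rows above can be loaded roughly as follows. This is only a sketch: the column types (`integer`, `timestamptz`) and the `NOT NULL` constraints are assumptions for illustration, and the real table has more `value_N` columns than shown here.

```sql
-- Assumed schema: the column types and constraints are guesses,
-- not taken from the real table definition.
CREATE TABLE measurements (
    sensor_group_id integer     NOT NULL,
    ts              timestamptz NOT NULL,
    value_1         integer,
    value_2         integer
    -- further value_N columns omitted
);

-- The five sample rows from the table above.
INSERT INTO measurements (sensor_group_id, ts, value_1, value_2) VALUES
    (1, '2021-07-21T00:20:00+00:00', 15,   NULL),
    (1, '2021-07-15T00:20:00+00:00', NULL, 23),
    (2, '2021-07-17T00:20:00+00:00', NULL, 11),
    (1, '2021-07-13T00:20:00+00:00', 9,    4),
    (2, '2021-07-10T00:20:00+00:00', 99,   36);
```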
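What I want to do:
- For each distinct sensor_group_id
- For each distinct column (measurement type)
- Get the latest timestamp at which that column is NOT NULL, along with the value of that measurement at that timestamp
My current solution looks quite cumbersome: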
WITH
latest_value_1 AS (
    -- Latest non-NULL value_1 per sensor group
    SELECT DISTINCT ON (sensor_group_id) sensor_group_id, ts, value_1
    FROM measurements
    WHERE value_1 IS NOT NULL
    ORDER BY sensor_group_id, ts DESC
),
latest_value_2 AS (
    -- Latest non-NULL value_2 per sensor group
    SELECT DISTINCT ON (sensor_group_id) sensor_group_id, ts, value_2
    FROM measurements
    WHERE value_2 IS NOT NULL
    ORDER BY sensor_group_id, ts DESC
),
latest_value_3 AS (
    -- Latest non-NULL value_3 per sensor group
    SELECT DISTINCT ON (sensor_group_id) sensor_group_id, ts, value_3
    FROM measurements
    WHERE value_3 IS NOT NULL
    ORDER BY sensor_group_id, ts DESC
),
etc...
SELECT latest_value_1.sensor_group_id,
       latest_value_1.ts AS latest_value_1_ts, value_1,
       latest_value_2.ts AS latest_value_2_ts, value_2,
       latest_value_3.ts AS latest_value_3_ts, value_3,
       etc...
FROM latest_value_1
JOIN latest_value_2
  ON latest_value_1.sensor_group_id = latest_value_2.sensor_group_id
JOIN latest_value_3
  ON latest_value_1.sensor_group_id = latest_value_3.sensor_group_id
etc...
This produces the following result:
| sensor_group_id | latest_value_1_ts | value_1 | latest_value_2_ts | value_2 | etc... |
|-----------------|---------------------------|---------|---------------------------|---------|--------|
| 1 | 2021-07-21T00:20:00+00:00 | 15 | 2021-07-15T00:20:00+00:00 | 23 | |
| 2 | 2021-07-10T00:20:00+00:00 | 99 | 2021-07-17T00:20:00+00:00 | 11 | |
This seems really convoluted, but I'm not sure whether there is a better way. Any help would be appreciated!