How to compare two dictionaries and find matching values

2024-05-03 • Q&A

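I'm pulling data from a weather system's API. The API returns a single JSON object in which the sensors are split into sub-nodes, one list of readings per sensor. I'm trying to associate two (or more) sensors by their timestamps. Unfortunately, not every sensor is polled every time (although it should be).

In practice, I have a JSON object that looks like this: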

{
    "sensor_data": {
        "mbar": [{
            "value": 1012, "timestamp": "2019-10-31T00:15:00"
        }, {
            "value": 1011, "timestamp": "2019-10-31T00:30:00"
        }, {
            "value": 1010, "timestamp": "2019-10-31T00:45:00"
        }],
        "temperature": [{
            "value": 10.3, "timestamp": "2019-10-31T00:15:00"
        }, {
            "value": 10.2, "timestamp": "2019-10-31T00:30:00"
        }, {
            "value": 10.0, "timestamp": "2019-10-31T00:45:00"
        }, {
            "value": 9.8, "timestamp": "2019-10-31T01:00:00"
        }]
    }
}

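This example is deliberately small; it shows that I have one extra temperature reading.

How can I take this data and associate one reading per timestamp, collecting as much sensor data as possible from the matching timestamps? Ultimately, I want to export the data to a CSV file in which each row represents a time slice across the sensors, for graphing or further analysis.

For lists of exactly the same length, I have a solution: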

import json


def read_json(path):
    # wrapper function for open and load json
    with open(path) as f:
        return json.load(f)


def pair_perfect_sensor_lists(sensor_id, list_a, list_b):
    # in this case, list_a will be mbar, list_b will be temperature
    matches = []
    if len(list_a) != len(list_b):
        return None  # this approach only handles lists of equal length
    for idx, reading in enumerate(list_a):
        mbar_value = reading['value']
        timestamp = reading['timestamp']
        t_reading = list_b[idx]
        t_time = t_reading['timestamp']
        temp_value = t_reading['value']
        print(t_time == timestamp)

        if t_time == timestamp:
            match = {
                'sensor_id': sensor_id,
                'mbar_index': idx,
                'time_index': idx,
                'mbar_value': mbar_value,
                'temp_value': temp_value,
                'mbar_time': timestamp,
                'temp_time': t_time,
            }
            print('here is your match:')
            print(match)
            matches.append(match)
        else:
            print("IMPERFECT!")
            print(t_time)
            print(timestamp)
    return matches


sensor_id = '007_OHMSS'
# the readings sit under the top-level "sensor_data" key shown above
sensor_data = read_json('sensor_data.json')['sensor_data']
list_a = sensor_data['mbar']
list_b = sensor_data['temperature']

matches = pair_perfect_sensor_lists(sensor_id, list_a, list_b)

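When there is no match, I want to skip the reading for the sensor that is missing (in this case, the last mbar reading) and simply record it as N/A.

In most cases the offset is only one node, meaning that temp has one extra reading somewhere in the middle.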

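I'm using the idx index to speed up the process, so that I don't have to loop through the second (or third, or nth) dict to see whether the timestamp exists in it. I know this isn't necessarily the preferred approach either, because dicts aren't ordered. In this case, though, every sub-node sensor list appears to be sorted by timestamp, so I'm trying to take advantage of that.

Is this a common problem? If so, please point me to the terminology. I have searched, but apart from "loop through each sub-dict and look for a match" I can't find a reasonable, efficient answer.

Any ideas are welcome, since I have to do this often on large JSON objects (25 MB files or larger). The full dump is over 300 MB, but I've sliced it by sensor ID so the pieces are more manageable.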
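
If the per-sensor lists really are sorted by timestamp, one way to use that ordering without requiring equal lengths is a two-pointer merge. The following is only a sketch under that assumption: `merge_sorted_readings` is a hypothetical helper name, and it assumes every timestamp uses the same ISO 8601 format, so that plain string comparison orders them chronologically.

```python
def merge_sorted_readings(list_a, list_b):
    """Pair readings from two timestamp-sorted lists in a single pass.

    Returns (timestamp, value_a, value_b) tuples; a value is None when
    that sensor has no reading at the timestamp.
    """
    rows = []
    i = j = 0
    while i < len(list_a) and j < len(list_b):
        ta = list_a[i]["timestamp"]
        tb = list_b[j]["timestamp"]
        if ta == tb:
            # Both sensors were polled at this timestamp.
            rows.append((ta, list_a[i]["value"], list_b[j]["value"]))
            i += 1
            j += 1
        elif ta < tb:
            # Only list_a has a reading here.
            rows.append((ta, list_a[i]["value"], None))
            i += 1
        else:
            # Only list_b has a reading here.
            rows.append((tb, None, list_b[j]["value"]))
            j += 1
    # Whatever is left over in either list has no partner.
    rows.extend((r["timestamp"], r["value"], None) for r in list_a[i:])
    rows.extend((r["timestamp"], None, r["value"]) for r in list_b[j:])
    return rows


# Sample data taken from the JSON in the question.
mbar = [
    {"value": 1012, "timestamp": "2019-10-31T00:15:00"},
    {"value": 1011, "timestamp": "2019-10-31T00:30:00"},
    {"value": 1010, "timestamp": "2019-10-31T00:45:00"},
]
temperature = [
    {"value": 10.3, "timestamp": "2019-10-31T00:15:00"},
    {"value": 10.2, "timestamp": "2019-10-31T00:30:00"},
    {"value": 10.0, "timestamp": "2019-10-31T00:45:00"},
    {"value": 9.8, "timestamp": "2019-10-31T01:00:00"},
]

for row in merge_sorted_readings(mbar, temperature):
    print(row)
```

Each list is walked exactly once, so the cost grows with the total number of readings rather than with the product of the list lengths. If the sort order cannot be guaranteed, a lookup keyed by timestamp is the safer choice.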

st = yourjsonabove  # the parsed JSON object from the question
mbar = {}
for item in st['sensor_data']['mbar']:
    mbar[item['timestamp']] = item['value']
temperature = {}
for item in st['sensor_data']['temperature']:
    temperature[item['timestamp']] = item['value']
for timestamp in temperature:
    print("Timestamp:", timestamp, "Sensor Reading: ", mbar.get(timestamp), "Temperature Reading: ", temperature[timestamp])

Timestamp: 2019-10-31T00:15:00 Sensor Reading: 1012 Temperature Reading: 10.3
Timestamp: 2019-10-31T00:30:00 Sensor Reading: 1011 Temperature Reading: 10.2
Timestamp: 2019-10-31T00:45:00 Sensor Reading: 1010 Temperature Reading: 10.0
Timestamp: 2019-10-31T01:00:00 Sensor Reading: None Temperature Reading: 9.8
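
The same index-by-timestamp idea can probably be extended to any number of sensors and to the CSV export the question asks for. Below is a minimal sketch using only the standard library; `readings_by_timestamp` and `write_csv` are hypothetical names, and the code assumes the input is the dict found under the "sensor_data" key, with every reading carrying a "timestamp" and a "value".

```python
import csv
import io


def readings_by_timestamp(sensor_data):
    """Map each timestamp to {sensor_name: value} across all sensors."""
    table = {}
    for name, readings in sensor_data.items():
        for reading in readings:
            table.setdefault(reading["timestamp"], {})[name] = reading["value"]
    return table


def write_csv(sensor_data, out):
    """Write one row per timestamp; missing readings become empty cells."""
    names = list(sensor_data)
    writer = csv.DictWriter(out, fieldnames=["timestamp", *names], restval="")
    writer.writeheader()
    table = readings_by_timestamp(sensor_data)
    # ISO 8601 timestamps in one format sort chronologically as strings.
    for timestamp in sorted(table):
        writer.writerow({"timestamp": timestamp, **table[timestamp]})


# Sample data taken from the JSON in the question.
sensor_data = {
    "mbar": [
        {"value": 1012, "timestamp": "2019-10-31T00:15:00"},
        {"value": 1011, "timestamp": "2019-10-31T00:30:00"},
        {"value": 1010, "timestamp": "2019-10-31T00:45:00"},
    ],
    "temperature": [
        {"value": 10.3, "timestamp": "2019-10-31T00:15:00"},
        {"value": 10.2, "timestamp": "2019-10-31T00:30:00"},
        {"value": 10.0, "timestamp": "2019-10-31T00:45:00"},
        {"value": 9.8, "timestamp": "2019-10-31T01:00:00"},
    ],
}

buf = io.StringIO()  # stands in for a real file opened with newline=""
write_csv(sensor_data, buf)
print(buf.getvalue())
```

Building the table touches every reading once, and each timestamp lookup is a dictionary access, so no list has to be scanned repeatedly. In database terms this is an outer join on the timestamp; keeping only the timestamps where every sensor has a value would be an inner join.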


Answer by leoyyc1987
