I have a vector with more than 10 million elements.
I need to find all elements that satisfy a given condition A (for example X < 2, which holds for rows i %in% c(6,10)), together with all elements that satisfy a condition B and sit in an unbroken run directly before an element satisfying A.
For example, given the X column below, I want the final result to be the flag2 column, which is 1 for rows i %in% c(8:10) and c(5:6). I am not interested in elements for which B is true if they do not precede an element satisfying A, so row i == 2 has flag2 == 0.
i | X | flag1 | flag2
---------------------------
1 | 4 | 0 | 0
2 | 3 | 0 | 0
3 | 6 | 0 | 0
4 | 9 | 0 | 0
5 | 3 | 0 | 1
6 | 1 | 1 | 1
7 | 9 | 0 | 0
8 | 3 | 0 | 1
9 | 2 | 0 | 1
10 | 1 | 1 | 1
The first operation, which produces flag1, is very simple and very fast:
# locate all occurrences of X < 2
my_data$flag1 = dplyr::case_when(my_data$X < 2 ~ 1, TRUE ~ 0)
I implemented the second operation with a for loop. It gives the desired result, but it is very slow given the amount of data.
Is there a more efficient way to do this?
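
For reference, here is a minimal sketch of one possible loop-free approach in base R. It is only an illustration under assumptions: the threshold for condition B is not stated in the question, so `X < 4` is a guess that happens to be consistent with the example table, and the names `cond_a`, `cond_b`, `next_a` and `next_not_b` are made up for this sketch. The idea is to compute, for every row, the position of the next row where A holds and the position of the next row where B fails, and to flag a row when B holds and A is reached first.

```r
# Example data copied from the table in the question
my_data <- data.frame(i = 1:10, X = c(4, 3, 6, 9, 3, 1, 9, 3, 2, 1))

cond_a <- my_data$X < 2   # condition A, as in the flag1 code
cond_b <- my_data$X < 4   # ASSUMPTION: the real condition B is not given

idx <- seq_len(nrow(my_data))

# Index of the nearest row at or after each row where A holds
# (Inf if there is none), via a reversed cumulative minimum.
next_a <- rev(cummin(rev(ifelse(cond_a, idx, Inf))))

# Index of the nearest row at or after each row where B fails.
next_not_b <- rev(cummin(rev(ifelse(!cond_b, idx, Inf))))

# Flag a row when B holds and an A row is reached before the run of B breaks.
my_data$flag2 <- as.integer(cond_b & next_a < next_not_b)
```

Every step is a single vectorized pass over the data, so the cost should grow linearly with the number of rows. The sketch also assumes that every row satisfying A satisfies B as well, which is true for the thresholds used here.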