How do I merge a list of dictionaries on the same field while summing another field?

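I'm trying to merge a list of dictionaries by the url field: where the list contains entries with the same URL, they should be merged into a single entry, and another field should be summed across them.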

I tried using 'setdefault', but it doesn't always work as expected. After running the loop I still end up with duplicates.

This is the list I want to merge, summing the second field wherever the URLs are identical:

[
  ['https://www.website.com/directory/link-1',21,'Long Text Field 1','String 1',{'url': 'https://www.website.com/images/image-1.jpg'},255],
  ['https://www.website.com/directory/link-1',185,'Long Text Field 1','String 1',{'url': 'https://www.website.com/images/image-1.jpg'},255],
  ['https://www.website.com/directory/link-2',296,'Long Text Field 2','String 2',{'url': 'https://www.website.com/images/image-2.jpg'},303],
  ['https://www.website.com/directory/link-3',354,'Long Text Field 3','String 3',{'url': 'https://www.website.com/images/image-3.jpg'},388],
  ['https://www.website.com/directory/link-4',606,'Long Text Field 4','String 4',{'url': 'https://www.website.com/images/image-4.jpg'},624]
]

This is the result I am trying to get:

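[
  ['https://www.website.com/directory/link-1',206,'Long Text Field 1','String 1',{'url': 'https://www.website.com/images/image-1.jpg'},255],
  ['https://www.website.com/directory/link-2',296,'Long Text Field 2','String 2',{'url': 'https://www.website.com/images/image-2.jpg'},303],
  ['https://www.website.com/directory/link-3',354,'Long Text Field 3','String 3',{'url': 'https://www.website.com/images/image-3.jpg'},388],
  ['https://www.website.com/directory/link-4',606,'Long Text Field 4','String 4',{'url': 'https://www.website.com/images/image-4.jpg'},624]
]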

Answer from shaliang8:

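You can try the following. It groups the sublists in lst by their first item (the URL) into a defaultdict of lists, then builds a new result in which only the second item of each group is summed; the remaining fields are taken from the first sublist in the group.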

from collections import defaultdict
from pprint import pprint

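lst = ...  # the list of lists from the question

# Group the sublists by their first item (the URL).
d = defaultdict(list)
for item in lst:
    d[item[0]].append(item)

# Per group: the URL, the sum of the second items, then the rest of the first sublist.
result = [[v[0][0]] + [sum(x[1] for x in v)] + v[0][2:] for v in d.values()]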

pprint(result)

Output:

[['https://www.website.com/directory/link-1',206,'Long Text Field 1','String 1',{'url': 'https://www.website.com/images/image-1.jpg'},255],
 ['https://www.website.com/directory/link-2',296,'Long Text Field 2',{'url': 'https://www.website.com/images/image-2.jpg'},303],
 ['https://www.website.com/directory/link-3',354,'Long Text Field 3',{'url': 'https://www.website.com/images/image-3.jpg'},388],
 ['https://www.website.com/directory/link-4',606,'Long Text Field 4',{'url': 'https://www.website.com/images/image-4.jpg'},624]]
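
Since the question mentions 'setdefault', here is a minimal sketch of the same merge done in a single pass with a plain dict keyed by URL. The function name `merge_by_url` and the shortened sample rows are made up for illustration. It assumes every row has the URL at index 0 and a number at index 1, and that the remaining fields can be taken from the first row seen for each URL.

```python
def merge_by_url(rows):
    """Merge rows that share row[0], summing row[1]; other fields come from the first row seen."""
    merged = {}
    for row in rows:
        # setdefault stores a zero-count copy of the first row for a new URL
        # and returns the stored list for both new and already-seen URLs.
        target = merged.setdefault(row[0], [row[0], 0] + list(row[2:]))
        target[1] += row[1]
    return list(merged.values())


# Shortened, hypothetical sample rows (URL, count, text, number).
sample = [
    ['https://www.website.com/directory/link-1', 21, 'Long Text Field 1', 255],
    ['https://www.website.com/directory/link-1', 185, 'Long Text Field 1', 255],
    ['https://www.website.com/directory/link-2', 296, 'Long Text Field 2', 303],
]
merged_rows = merge_by_url(sample)
print(merged_rows)
```

Because the sum is accumulated on a copy stored in the dict, the input rows are left unchanged. On Python 3.7 and later, dicts keep insertion order, so the merged rows come out in the order in which each URL first appears.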

If you want to use pandas, you can get something like the following:

                                       Page  Count               Text    String                                         Url  Magic
0  https://www.website.com/directory/link-1     21  Long Text Field 1  String 1  https://www.website.com/images/image-1.jpg    255
1  https://www.website.com/directory/link-1    185  Long Text Field 1  String 1  https://www.website.com/images/image-1.jpg    255
2  https://www.website.com/directory/link-2    296  Long Text Field 2      None  https://www.website.com/images/image-2.jpg    303
3  https://www.website.com/directory/link-3    354  Long Text Field 3      None  https://www.website.com/images/image-3.jpg    388
4  https://www.website.com/directory/link-4    606  Long Text Field 4      None  https://www.website.com/images/image-4.jpg    624

----

                                       Page  Count  Magic    String                                         Url               Text
0  https://www.website.com/directory/link-1    206    255  String 1  https://www.website.com/images/image-1.jpg  Long Text Field 1
1  https://www.website.com/directory/link-2    296    303      None  https://www.website.com/images/image-2.jpg  Long Text Field 2
2  https://www.website.com/directory/link-3    354    388      None  https://www.website.com/images/image-3.jpg  Long Text Field 3
3  https://www.website.com/directory/link-4    606    624      None  https://www.website.com/images/image-4.jpg  Long Text Field 4

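This comes from running the code below. Note that because your data format is somewhat inconsistent, I had to add a placeholder value for the missing strings.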

import pandas as pd

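data = [
  ['https://www.website.com/directory/link-1',21,'Long Text Field 1','String 1',{'url': 'https://www.website.com/images/image-1.jpg'},255],
  ['https://www.website.com/directory/link-1',185,'Long Text Field 1','String 1',{'url': 'https://www.website.com/images/image-1.jpg'},255],
  ['https://www.website.com/directory/link-2',296,'Long Text Field 2',{'url': 'https://www.website.com/images/image-2.jpg'},303],
  ['https://www.website.com/directory/link-3',354,'Long Text Field 3',{'url': 'https://www.website.com/images/image-3.jpg'},388],
  ['https://www.website.com/directory/link-4',606,'Long Text Field 4',{'url': 'https://www.website.com/images/image-4.jpg'},624]
]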
columns = ['Page','Count','Text','String','Url','Magic']

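# Pad rows that lack the String field, then replace the {'url': ...} dict with its URL string.
for d in data:
    if len(d) != 6:
        d.insert(3,None)
    d[4] = d[4]['url']
df = pd.DataFrame(data,columns=columns)

# Keep the first value of every column within a group, except Count, which is summed.
agg = dict.fromkeys(columns,'first')
agg.update({'Count': 'sum'})
del agg['Page']
df2 = df.groupby(['Page'],as_index=False).agg(agg)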

pd.options.display.width = 0
print(df)
print('\n----\n')
print(df2)
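
One thing to watch in the code above: the padding loop changes `data` in place, so running it a second time on the same list fails at `d[4]['url']`, because the dict has already been replaced by a string. A non-mutating variant could look like the sketch below. `normalize_row` is a hypothetical helper, and it assumes, as the code above does, that a five-element row is missing only the String field.

```python
def normalize_row(row, width=6):
    """Return a new row: pad a missing String with None and unwrap the url dict."""
    fixed = list(row)  # copy, so the caller's row is left untouched
    if len(fixed) != width:
        fixed.insert(3, None)  # placeholder for the missing String field
    if isinstance(fixed[4], dict):
        fixed[4] = fixed[4]['url']  # keep only the URL string
    return fixed


raw = ['https://www.website.com/directory/link-2', 296, 'Long Text Field 2',
       {'url': 'https://www.website.com/images/image-2.jpg'}, 303]
clean = normalize_row(raw)
print(clean)
```

The list `[normalize_row(d) for d in data]` can then be passed to `pd.DataFrame` with the same `columns`, and calling the helper again on an already normalized row returns it unchanged.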