Writing a csv, then checking the values in a column and writing additional data

2024-05-06 • Q&A

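So far I have written a long list of ID codes (about 45,000 rows), together with a reference value for each, to a csv file. The data is structured like this: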

12345678 | 2
56789012 | 10
90123456 | 46
...

The code I have written to do this so far is:

import csv
import os

def list_writer():
    # csv_dir, csv_filename and ID_list are defined elsewhere in the script
    with open(os.path.join(csv_dir, csv_filename), mode="w", newline="") as csvfile:
        writer = csv.writer(csvfile, lineterminator="\n", delimiter=";")
        for row in ID_list:
            writer.writerow(row)

list_writer()

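Each ID number (left column) is associated with a reference number (right column) in the range 1-100. I have several other lists that map each reference number to further information (price, quantity, etc.).

My goal now is to go through all the reference numbers in the second column of the long csv file I wrote and write the other attributes into the following columns. I have done some digging on StackExchange, but nothing has worked so far. Thanks in advance!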

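This sounds like something I would do in a relational (i.e. SQL) database, where plenty of tools are available for validating your data and keeping it all consistent.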
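For illustration, here is a minimal sketch of that relational approach using Python's built-in sqlite3 module and an in-memory database. The table names (`ids`, `prices`), the column names and the sample values are made up for this example and do not come from the question.

```python
import sqlite3

# Hypothetical sample data: (id_code, reference) pairs and (reference, price) pairs.
id_rows = [("12345678", 2), ("56789012", 10), ("90123456", 46)]
price_rows = [(2, 1.50), (10, 3.25)]  # reference 46 deliberately has no price

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ids (id_code TEXT, reference INTEGER)")
conn.execute("CREATE TABLE prices (reference INTEGER PRIMARY KEY, price REAL)")
conn.executemany("INSERT INTO ids VALUES (?, ?)", id_rows)
conn.executemany("INSERT INTO prices VALUES (?, ?)", price_rows)

# LEFT JOIN keeps every ID row; a reference with no price yields None.
joined = conn.execute(
    "SELECT ids.id_code, ids.reference, prices.price "
    "FROM ids LEFT JOIN prices ON prices.reference = ids.reference "
    "ORDER BY ids.rowid"
).fetchall()
conn.close()
```

The PRIMARY KEY on `prices.reference` is one example of the consistency checks mentioned above: the database would reject a second price for the same reference number.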

If you want to do it in Python, you could do something like this:

import csv

# Put each per-reference list into a dictionary keyed by the reference number.
# Assumes PRICE_list has the form [(ref1, price1), (ref2, price2), ...]
ref_prices = {}
for ref, price in PRICE_list:
    ref_prices[ref] = price

# Do the same for each additional list, using the shorter
# dict-comprehension syntax:
ref_quantity = {ref: qty for ref, qty in QTY_list}

# Combine all of the above and write the result to a file.
# newline='' stops the csv module from writing blank lines on Windows.
with open(filename, 'w', newline='') as fd:
    out = csv.writer(fd, delimiter=';')
    for id_code, ref in ID_list:
        out.writerow((id_code, ref, ref_prices[ref], ref_quantity[ref]))
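If ID_list is no longer in memory and only the csv file exists, the same dictionary lookup can be applied while copying the file. The following is a rough sketch under a few assumptions: the file is `;`-delimited with the reference number in the second column, the lookup dictionaries use integer keys, and the function name `add_columns` and its parameters are hypothetical.

```python
import csv

def add_columns(in_path, out_path, lookups, missing=""):
    """Copy a ';'-delimited csv, appending one looked-up value per dict in `lookups`.

    The second column of each input row is treated as the reference number.
    """
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.reader(src, delimiter=";")
        writer = csv.writer(dst, delimiter=";", lineterminator="\n")
        for row in reader:
            ref = int(row[1])  # csv.reader yields strings; the lookups use int keys
            # dict.get avoids a KeyError when a reference has no entry
            extras = [lookup.get(ref, missing) for lookup in lookups]
            writer.writerow(row + extras)
```

It could be called as `add_columns("ids.csv", "ids_extended.csv", [ref_prices, ref_quantity])`, where both file names are placeholders. Writing to a second file rather than overwriting the input keeps the original intact if something goes wrong halfway through.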

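This is a perfect use case for SQL. If you want SQL-like operations in Python, pandas is usually the best choice: it is convenient, easy to read and write, and fast. For your case, assuming the other values are stored in a list of tuples or in a dictionary: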

import pandas as pd


rows = [
    (1, 10), (2, 20), (3, 30),
]

csv_df = pd.DataFrame(rows, columns=["id", "reference"])

# This would be the data you have in your csv. To actually load it from
# the csv located at `filepath`, use
#
#      pd.read_csv(filepath)

additional_data = [
    (1, "a"), (2, "b"), (3, "c"),
]  # This could also be a dictionary

additional_df = pd.DataFrame(additional_data, columns=["id", "name"])

final_df = csv_df.merge(additional_df, on="id")

Then we get:

>>> final_df
   id  reference name
0   1         10    a
1   2         20    b
2   3         30    c
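The toy example above merges on `id`, whereas in the question the extra attributes are keyed by the reference number. Below is a hedged sketch of that variant, merging on `reference` and writing the result back out. The sample rows, the `price` column and the use of `io.StringIO` in place of real files are assumptions for illustration only.

```python
import io
import pandas as pd

# Stand-in for the real file; pass a file path to pd.read_csv instead of StringIO.
csv_text = "12345678;2\n56789012;10\n90123456;46\n"

# The file has no header row, so the column names are supplied explicitly.
ids_df = pd.read_csv(io.StringIO(csv_text), sep=";", header=None,
                     names=["id", "reference"])

# Hypothetical per-reference attributes; reference 46 is deliberately missing.
prices_df = pd.DataFrame([(2, 1.50), (10, 3.25)], columns=["reference", "price"])

# how="left" keeps every ID row, leaving price as NaN where no match exists.
merged = ids_df.merge(prices_df, on="reference", how="left")

# Write the result in the same ';'-delimited, header-less layout.
out = io.StringIO()
merged.to_csv(out, sep=";", index=False, header=False)
result = out.getvalue()
```

The default `how="inner"` would silently drop any ID whose reference number has no entry, so a left merge is arguably the safer choice for a 45,000-row file.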
