Splitting list items into CSV columns

2024-05-14 • Q&A

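I'm trying to create a transmittal application that takes file names and turns them into CSV records of issued documents. At the moment, Python reads all the file names in a given folder, builds a list, and splits each file name into document number, revision, and title.

So far I have been able to grab the file names with Python, build a list of them, and split each one into a new list of separate fields, i.e. documentnumber,revision,title.pdf becomes [documentnumber, revision, title].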

    def getFiles():

    i = 0
    path = input("Paste in path for outgoing folder: ")
    numTitleRev = os.listdir(path)
    issueRec = []
    fileData = []
    totalList = len(numTitleRev)
    listNumber = str(totalList)
    print('\n' + "The total amount of documents in this folder is: " + listNumber + '\n')

    csvOutput = []

    while i < totalList:
        for item in numTitleRev:
            fileSplit = item.split(',',2)
            fileTitle = fileSplit.pop(2)
            fileRev = fileSplit.pop(1)
            fileNum = fileSplit.pop(0)

        csvOutput.append([fileNum,fileRev,fileTitle])


        with open('output.csv','a') as writeCSV:
            writer = csv.writer(writeCSV)
            for row in csvOutput:
                writer.writerow(row)

        i += 1

    writeCSV.close()

    print("Writing complete")

The output I'm looking for is like so:
Number - Revision - Title
File1  - 01       - Title 1
File2  - 03       - Title 2 etc.

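The code above is the process that splits the list, with the fields delimited by ",", which is how the file names are stored in the folder.

I believe the problem with this code is that csvOutput only sends one result to the CSV: the last one in the list.

That result is then written to the CSV once for every file in the folder, instead of splitting record one, sending it to the CSV, and repeating for record two.

The difficulty is that I can't work out how to store this information in variables when the total number of files is not constant.

Any help would be much appreciated.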

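The main problem is the nested while/for loop. I restructured the code a little so that it can be tested locally (and run with a simple copy/paste). This should also give you an idea of how to structure code so that it is easier to ask for help with.

I added a lot of comments to explain the changes I made.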

    import csv
    import os

    # This part has been extracted from the main logic, to make the code runnable
    # with sample data (see main() below)
    def getFiles():
        path = input("Paste in path for outgoing folder: ")
        numTitleRev = os.listdir(path)
        print("\nThe total amount of documents in this folder is: %s\n" % len(numTitleRev))
        return numTitleRev


    # This piece of logic contained the core error. The nested "while" loop was
    # unnecessary. Additionally, the ".append" call was on the wrong indent-level.
    # Removing the unnecessary while-loop makes this much clearer
    def process_files(filenames):
        parsed = []
        for item in filenames:
            # Using "pop()" is a destructive operation (it modifies the list
            # in-place which may lead to bugs). In this case it's absolutely fine,
            # but I replaced it with a different syntax which in turn also makes
            # the code a bit nicer to read.
            fileNum, fileRev, fileTitle = item.split(',', 2)
            parsed.append([fileNum, fileRev, fileTitle])
        return parsed


    # Similarly to "getFiles", I extracted this to make testing easier. Both
    # "getFiles" and "write_output" are functions with "side-effects" which rely on
    # external resources (the disk in this case). Extracting the main logic into
    # "process_files" makes that function easily testable without the need of
    # having the files really exist on the disk.
    def write_output(parsed_data):
        # newline='' keeps the csv module from emitting blank lines on Windows
        with open('output.csv', 'a', newline='') as writeCSV:
            writer = csv.writer(writeCSV)
            for row in parsed_data:
                writer.writerow(row)
        print("Writing complete")


    # This is just a simple main function to illustrate how the new functions are
    # called.
    def main():
        filenames = [  # <-- Some example data to make the SO answer runnable
            '0,1,this is an example.txt',
            '2,200,this is an example,with a comma in the name.txt',
        ]
        # filenames = getFiles()  <-- This needs to be enabled for the real code
        converted = process_files(filenames)
        write_output(converted)

    # This special block prevents "import side-effects" when this Python file would
    # be imported somewhere else.
    if __name__ == '__main__':
        main()
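The testability point made in the comments can be tried out without touching the disk. The following is only a minimal sketch: split_filename is a hypothetical helper name, not part of the answer above, and the sample names are illustrative. It shows that passing 2 as the second argument to split means only the first two commas act as separators, so a comma inside the title survives.

```python
def split_filename(name):
    """Split 'number,revision,title' into three fields.

    Hypothetical helper for illustration. Because maxsplit is 2, only the
    first two commas are treated as separators; later commas stay in the title.
    """
    number, revision, title = name.split(',', 2)
    return [number, revision, title]


sample_names = [  # illustrative data only
    '0,1,this is an example.txt',
    '2,200,this is an example,with a comma in the name.txt',
]
rows = [split_filename(name) for name in sample_names]
for row in rows:
    print(row)
```

A name with fewer than two commas would raise a ValueError on the unpacking line, so real folder contents may need an extra check.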

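You should initialize csvOutput = [] before the loop and update it on every iteration with csvOutput.append([fileNum, fileRev, fileTitle]). That should fix the problem of only the last iteration's data being stored.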
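As a rough sketch of that suggestion (the sample file names and the snake_case variable names are assumptions, and an in-memory buffer stands in for the real output.csv so the snippet runs anywhere): the list is created once, one row is appended per file inside the loop, and everything is written after the loop has finished.

```python
import csv
import io

# Made-up sample data in the 'number,revision,title' form from the question
filenames = ['File1,01,Title 1.pdf', 'File2,03,Title 2.pdf']

csv_output = []  # created once, before the loop
for name in filenames:
    file_num, file_rev, file_title = name.split(',', 2)
    # Appending inside the loop stores one row per file
    csv_output.append([file_num, file_rev, file_title])

# io.StringIO stands in for open('output.csv', 'w', newline='')
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(['Number', 'Revision', 'Title'])  # header row, as in the desired output
writer.writerows(csv_output)  # write all collected rows in one call

print(buffer.getvalue())
```

Writing once after the loop also avoids reopening the file on every pass.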

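I assume that while i < totalList: is meant to loop over the data, but you never use the counter i to pull out the matching piece of data. Instead, you run the inner loop over the same data again and again.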
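The effect can be reproduced in a few lines. This is a simplified sketch of the loop structure from the question, using made-up file names and leaving out the CSV writing: the inner for loop runs to completion on every pass of the while loop, so by the time append is reached the variables always hold the values from the last file.

```python
# Made-up sample data
filenames = ['File1,01,Title 1.pdf', 'File2,03,Title 2.pdf', 'File3,02,Title 3.pdf']

buggy = []
i = 0
while i < len(filenames):
    for item in filenames:  # walks over ALL names on every pass
        file_num, file_rev, file_title = item.split(',', 2)
    # This line sits outside the for loop, so it only ever sees the last name
    buggy.append([file_num, file_rev, file_title])
    i += 1

for row in buggy:
    print(row)
```

Every row printed is the File3 record, once per file in the list.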

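If the amount of data is not constant, you can iterate over it the way you do in the inner loop. That is only a guess, though: to get a better answer you would need to provide the exact data structure and the problem you are running into with it.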
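A for loop needs no knowledge of the list length, so a varying number of files is not a problem in itself. The sketch below is one possible shape for this, under the assumption that some names in the folder might not follow the number,revision,title pattern; parse_names and the sample names are hypothetical.

```python
def parse_names(names):
    """Return (rows, skipped) for any number of names, including none.

    Hypothetical helper: names without at least two commas are collected
    in 'skipped' instead of raising an error.
    """
    rows, skipped = [], []
    for name in names:  # no counter needed, works for any list length
        parts = name.split(',', 2)
        if len(parts) != 3:
            skipped.append(name)
            continue
        rows.append(parts)
    return rows, skipped


rows, skipped = parse_names(['File1,01,Title 1.pdf', 'notes.txt'])
print(rows)
print(skipped)
```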
