Handling special characters when converting Excel to CSV with Python

Hello, I'm having trouble with special characters when converting an Excel sheet to CSV with Python. When I use

else:
                    # Encode strings into format to preserve content of cell
                    row_values.append(cell.value.encode("UTF-8").strip())

my special character comes out as 'Â'

and when I use

  else:
                    # Encode strings into ISO-8859-1 format to preserve content of cell
                    row_values.append(cell.value.encode("iso-8859-1").strip())

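I get '�' as the special character (a question mark inside a diamond).

I think this is related to encoding, but I'm not sure which encoding to use. These characters appear when the Excel sheet is converted to CSV.

Here is the code I'm using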
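Both symptoms are typical of bytes being written in one encoding and read back in another. A minimal sketch of that mismatch follows. The cell value is hypothetical (the thread never says which character is affected); a non-breaking space is used here only because it commonly ends up in spreadsheet cells.

```python
# -*- coding: utf-8 -*-
# Hypothetical cell value containing a non-breaking space (U+00A0).
text = u"10\u00a0kg"

utf8_bytes = text.encode("utf-8")         # U+00A0 becomes the two bytes C2 A0
latin1_bytes = text.encode("iso-8859-1")  # U+00A0 becomes the single byte A0

# UTF-8 bytes opened by a program that assumes ISO-8859-1:
# the lead byte C2 is displayed as its own character, 'Â'.
misread_utf8 = utf8_bytes.decode("iso-8859-1")

# ISO-8859-1 bytes opened by a program that assumes UTF-8:
# a lone A0 byte is invalid UTF-8 and is shown as the replacement character.
misread_latin1 = latin1_bytes.decode("utf-8", "replace")
```

If this is what is happening, neither encoding call is wrong by itself; the program used to open the CSV is assuming a different encoding from the one the file was written in.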

def convert_to_csv(excel_file,input_dir,output_dir):
    """Convert an excel file to a CSV file by removing irrelevant data"""
    try:
        sheet = read_excel(excel_file)
    except UnicodeDecodeError:
        print 'File %s is possibly corrupt. Please check again.' % (excel_file)
        sys.exit(1)
    row_num = sheet.get_highest_row()  # Number of rows
    col_num = sheet.get_highest_column()  # Number of columns
    all_rows = []
    # Loop through rows and columns
    for row in range(row_num):
        row_values = []
        for column in range(col_num):
            # Get cell element
            cell = sheet.cell(row=row,column=column)
            # Ignore empty cells
            if cell.value is not None:
                if type(cell.value) == int or type(cell.value) == float:
                    # String encoding not applicable for integers and floating point numbers
                    row_values.append(cell.value)
                else:
                    # Encode strings into ISO-8859-1 format to preserve content of cell
                    row_values.append(cell.value.encode("iso-8859-1").strip())
            else:
                row_values.append('')
        # Append rows only having more than three values each
        if len(set(row_values)-{''}) > 3:
            # print row_values
            all_rows.append(row_values)
    # Saving the data to a csv extension with the same name as the given excel file
    output_path = os.path.join(output_dir,excel_file.split('.')[0] + '.csv')
    with open(output_path,'wb') as f:
        writer = csv.writer(f,delimiter=";",quoting=csv.QUOTE_ALL)

        writer.writerows(all_rows[1:])

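I'm using Python 2.6.9. I'd like to know whether a regular expression can be applied before writing to the CSV. Is there any way to handle this?

Thanks in advance.

bigege answered: Handling special characters when converting Excel to CSV with Python

Fixed it: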

                else:
                    # Strip every non-ASCII character from the cell value
                    # (requires `import re` at the top of the module)
                    row_values.append(
                        re.sub(r'[^\x00-\x7f]', r'', cell.value).strip())
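
Note that this substitution removes the characters rather than converting them, so accented letters are lost as well. The sketch below shows that effect and, as an alternative, one way to keep the characters by writing the file as UTF-8 with a byte order mark, which Excel generally uses to detect UTF-8. It assumes Python 3 (not the 2.6.9 mentioned in the question); `strip_non_ascii` and `write_rows_utf8` are hypothetical helper names, not part of the original code.

```python
import csv
import re

NON_ASCII = re.compile(r"[^\x00-\x7f]")


def strip_non_ascii(value):
    """Drop every character outside the ASCII range, then trim whitespace."""
    return NON_ASCII.sub("", value).strip()


def write_rows_utf8(path, rows):
    """Write rows to a semicolon-delimited CSV as UTF-8 with a BOM.

    Assumes Python 3, where csv.writer accepts text and the file object
    handles the encoding. newline="" lets the csv module control line endings.
    """
    with open(path, "w", newline="", encoding="utf-8-sig") as f:
        writer = csv.writer(f, delimiter=";", quoting=csv.QUOTE_ALL)
        writer.writerows(rows)
```

For example, `strip_non_ascii(u" Stra\u00dfe\u00a0")` returns `"Strae"`: the non-breaking space is gone, but so is the `ß`. Writing the untouched value with `write_rows_utf8` keeps both characters in the file.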
