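I have several Excel sheets that are being converted to CSV, but I am getting an invalid syntax error. I am trying to keep only the rows that contain more than 3 values. Here is my code: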
import csv
import os
import sys


def convert_to_csv(excel_file, input_dir, output_dir):
    """Convert an excel file to a CSV file by removing irrelevant data"""
    try:
        sheet = read_excel(excel_file)
    except UnicodeDecodeError:
        print 'File %s is possibly corrupt. Please check again.' % (excel_file)
        sys.exit(1)
    row_num = sheet.get_highest_row()  # Number of rows
    col_num = sheet.get_highest_column()  # Number of columns
    all_rows = []
    # Loop through rows and columns
    for row in range(row_num):
        row_values = []
        for column in range(col_num):
            # Get cell element
            cell = sheet.cell(row=row, column=column)
            # Ignore empty cells
            if cell.value is not None:
                if type(cell.value) == int or type(cell.value) == float:
                    # String encoding not applicable for integers and floating point numbers
                    row_values.append(cell.value)
                else:
                    # Encode strings into ISO-8859-1 format to preserve content of cell
                    row_values.append(cell.value.encode("iso-8859-1").strip())
            else:
                row_values.append('')
        # Append rows only having more than three values each
        if len(set(row_values)-{''}) > 3:
            # print row_values
            all_rows.append(row_values)
    # Saving the data to a csv extension with the same name as the given excel file
    output_path = os.path.join(output_dir, excel_file.split('.')[0] + '.csv')
    with open(output_path, 'wb') as f:
        writer = csv.writer(f)
        writer.writerows(all_rows[1:])
    print 'File %s saved to %s ' % (excel_file, output_path)
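I am looping through multiple Excel sheets, and in one particular sheet there is some unwanted data at the end that I want to remove.
I am getting this error: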
19/11/12 03:13:33 WARN SparkConf: The configuration key 'spark.yarn.applicationmaster.waitTries' has been deprecated as of Spark 1.3 and and may be removed in the future. Please use the new key 'spark.yarn.am.waitTime' instead.
  File "/u/kim/excel_to_csv.py", line 49
if len(set(row_values)-{''}) > 3:
^
SyntaxError: invalid syntax
This works in PyCharm but fails in the terminal. I am trying to get all rows that have more than 3 values in them. Is there a way to fix this?
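One possible explanation, assuming the terminal runs an older interpreter than PyCharm does: set literals such as `{''}` were only added in Python 2.7 and 3.0, so Python 2.6 or earlier rejects that line with `SyntaxError: invalid syntax`. The sketch below expresses the same filter without a set literal. The helper name `has_enough_values`, its `minimum` parameter, and the sample rows are made up for illustration and are not part of the original script.

```python
def has_enough_values(row_values, minimum=3):
    """Return True if the row has more than `minimum` distinct non-empty values."""
    # set(['']) builds the same one-element set as {''}, but it also parses
    # on interpreters that predate set-literal syntax.
    distinct_values = set(row_values) - set([''])
    return len(distinct_values) > minimum


# Hypothetical sample rows, shaped like row_values in the script above.
rows = [
    ['a', 'b', 'c', 'd', ''],   # four distinct non-empty values
    ['a', '', '', '', ''],      # one non-empty value
    ['x', 'x', 'x', 'x', 'x'],  # five cells, but only one distinct value
]
kept_rows = [row for row in rows if has_enough_values(row)]
```

Running `python --version` in the terminal and comparing it with the project interpreter configured in PyCharm should confirm or rule out this guess. Like the original expression, this counts distinct values, so a row that repeats the same value many times counts only once.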