My Hive table is defined with:

PARTITIONED BY (ds STRING, model STRING)
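
For reference, the full table definition is presumably along the lines of the sketch below; the column list, schema name, and the EXTERNAL / STORED AS clauses are placeholders of mine, while the PARTITIONED BY clause and the S3 location are the real ones from this post:

# Hedged sketch only: columns, schema name, and storage clauses are
# hypothetical; PARTITIONED BY and LOCATION are the ones described here.
spark.sql("""
    CREATE EXTERNAL TABLE some_schema.drv_projection_table (
        col_a STRING,
        col_b DOUBLE
    )
    PARTITIONED BY (ds STRING, model STRING)
    STORED AS PARQUET
    LOCATION 's3://some_path/qubole/table_name=drv_projection_table'
""")

Note that the location itself ends in a key=value segment (table_name=drv_projection_table), which turns out to matter below.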

When writing to the table from PySpark, I did:

# Repartition, then write as a Parquet Hive table partitioned by ds and model.
output_df \
    .repartition(250) \
    .write \
    .mode('overwrite') \
    .format('parquet') \
    .partitionBy('ds', 'model') \
    .saveAsTable('{table_schema}.{table_name}'.format(table_schema=table_schema, table_name=table_name))

But I run into the following error:

org.apache.hadoop.hive.ql.metadata.Table.ValidationFailureSemanticException: Partition spec {ds=2019-10-06,model=p1kr,table_name=drv_projection_table} contains non-partition columns

My table's S3 path is s3://some_path/qubole/table_name=drv_projection_table, but table_name is not specified as part of the partitioning. Spark or Hive seems to mistake the table_name=drv_projection_table segment of the path for a partition.
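
If I understand Spark's Hive-style partition discovery correctly, it treats every key=value directory segment under a base path as a partition column, which would explain where the extra table_name in the partition spec comes from. A minimal sketch of that behavior against the prefix above (illustrative only, not a fix):

# Partition discovery parses 'key=value' directory names into partition
# columns, so a read rooted at the parent prefix should surface an inferred
# 'table_name' column alongside the declared partitions ds and model.
df = spark.read.parquet('s3://some_path/qubole/')
df.printSchema()  # expect an inferred 'table_name' string column

One workaround I am considering, assuming the location can be changed: recreate the table under a path with no '=' in any directory name, e.g. s3://some_path/qubole/drv_projection_table, so that no path segment can be parsed as a partition.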