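I am loading data from a CSV file whose fields are separated by spaces. After the data is loaded into the final table, extra rows of NULLs appear alongside the rows of actual data.
Actual data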
id first_name last_name email gender ip_address
1 James Coleman jcoleman0@cam.ac.uk Male 136.90.241.52
2 Lillian Lawrence llawrence1@statcounter.com Female 101.177.15.130
3 Theresa Hall thall2@sohu.com Female 114.123.153.64
4 Samuel Tucker stucker3@sun.com Male 89.60.227.31
5 Emily Dixon edixon4@surveymonkey.com Female 119.92.21.19
Table creation
create table serde_sample(id int,first_name string,last_name string,email string,gender string,ip_address string)
row format serde 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
with serdeproperties (
"separatorChar" = "\t"
)
tblproperties('skip.header.line.count'='1')
;
LOAD DATA LOCAL INPATH '/home/cloudera/Desktop/files/serde.csv' into table serde_sample;
Output obtained
NULL NULL NULL NULL NULL NULL
1 James Coleman jcoleman0@cam.ac.uk Male 136.90.241.52
NULL NULL NULL NULL NULL
NULL NULL NULL NULL NULL NULL
2 Lillian Lawrence llawrence1@statcounter.com Female 101.177.15.130
NULL NULL NULL NULL NULL
NULL NULL NULL NULL NULL NULL
3 Theresa Hall thall2@sohu.com Female 114.123.153.64
NULL NULL NULL NULL NULL
NULL NULL NULL NULL NULL NULL
4 Samuel Tucker stucker3@sun.com Male 89.60.227.31
NULL NULL NULL NULL NULL
NULL NULL NULL NULL NULL NULL
5 Emily Dixon edixon4@surveymonkey.com Female 119.92.21.19
NULL NULL NULL NULL NULL
NULL NULL NULL NULL NULL NULL
I am not sure where the problem is. Why do the extra NULL rows appear? Can anyone help resolve this issue?