I am trying to read a CSV file with PySpark, but I get errors. Can you tell me the correct way to read a CSV file?
Python code:
from pyspark.sql import *
df = spark.read.csv("D:\Users\SPate233\Downloads\iMedical\query1.csv",inferSchema = True,header = True)
I also tried the following approach:
sqlContext = SQLContext
df = sqlContext.load(source="com.databricks.spark.csv",header="true",path = "D:\Users\SPate233\Downloads\iMedical\query1.csv")
Errors:
Traceback (most recent call last):
  File "<pyshell#18>", line 1, in <module>
df = spark.read.csv("D:\Users\SPate233\Downloads\iMedical\query1.csv",header = True)
NameError: name 'spark' is not defined
and
Traceback (most recent call last):
File "<pyshell#26>",in <module>
df = sqlContext.load(source="com.databricks.spark.csv",path = "D:\Users\SPate233\Downloads\iMedical\query1.csv")
AttributeError: type object 'SQLContext' has no attribute 'load'
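
For reference, here is a minimal sketch of what usually resolves both errors. It assumes Spark 2.0 or later and a plain Python shell rather than the `pyspark` shell, which predefines `spark`. The `NameError` suggests that no `SparkSession` named `spark` was ever created, and the `AttributeError` suggests that `sqlContext = SQLContext` bound the class itself instead of an instance. The helper names `to_spark_path` and `read_csv` and the app name are hypothetical, chosen only for illustration. The path is written as a raw string and converted to forward slashes because sequences such as `\U` in a normal string literal can be read as escape sequences in Python 3.

```python
from pathlib import PureWindowsPath


def to_spark_path(windows_path):
    """Return a Windows path with forward slashes (hypothetical helper)."""
    return PureWindowsPath(windows_path).as_posix()


def read_csv(path):
    """Create (or reuse) a SparkSession and read a CSV file with a header row."""
    # Imported here so the path helper above can be used without pyspark installed.
    from pyspark.sql import SparkSession

    # The entry point must be built explicitly outside the pyspark shell.
    spark = SparkSession.builder.appName("read-csv-example").getOrCreate()

    # header=True uses the first row as column names;
    # inferSchema=True asks Spark to guess column types from the data.
    return spark.read.csv(path, header=True, inferSchema=True)
```

It could be used like this, assuming the file exists at that location: `df = read_csv(to_spark_path(r"D:\Users\SPate233\Downloads\iMedical\query1.csv"))`, followed by `df.show(5)` to inspect the first rows.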