This question is a follow-up to How to store custom objects in Dataset?

Spark version: 3.0.1

A non-nested custom type can be encoded:
import spark.implicits._
import org.apache.spark.sql.{Encoder, Encoders}

class AnObj(val a: Int, val b: String)
implicit val myEncoder: Encoder[AnObj] = Encoders.kryo[AnObj]
val d = spark.createDataset(Seq(new AnObj(1, "a")))

d.printSchema
root
 |-- value: binary (nullable = true)

However, if the custom type is nested inside a product type (i.e. a case class), it gives an error:

java.lang.UnsupportedOperationException: No Encoder found for InnerObj
import spark.implicits._
import org.apache.spark.sql.{Encoder, Encoders}

class InnerObj(val a: Int, val b: String)
case class myobj(val i: Int, val j: InnerObj)

implicit val myEncoder: Encoder[InnerObj] = Encoders.kryo[InnerObj]

// error
val d = spark.createDataset(Seq(new myobj(1, new InnerObj(0, "a"))))
// it gives Runtime error: java.lang.UnsupportedOperationException: No Encoder found for InnerObj

How can we create a Dataset with a nested custom type?
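For context on why the error occurs: Spark derives a product (case class) encoder at compile time and resolves an encoder for each field independently, so the implicit Kryo encoder for InnerObj is not picked up inside the derived encoder for myobj. One workaround that sidesteps the per-field lookup is to Kryo-encode the whole outer type, which collapses the row to a single binary column exactly as in the non-nested AnObj example. A minimal sketch, assuming a live SparkSession named spark (not part of the original question):

```scala
import org.apache.spark.sql.{Encoder, Encoders}

class InnerObj(val a: Int, val b: String)
case class myobj(val i: Int, val j: InnerObj)

// Kryo-serialize the entire outer object, so Spark never needs
// a per-field encoder for InnerObj. A locally defined implicit
// takes precedence over the product encoder from spark.implicits._.
implicit val outerEncoder: Encoder[myobj] = Encoders.kryo[myobj]

val d = spark.createDataset(Seq(myobj(1, new InnerObj(0, "a"))))
d.printSchema  // the schema is a single binary "value" column
```

The trade-off is that the columnar structure is lost (no i or j columns to query), which may or may not be acceptable depending on how the Dataset is used downstream.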