Kryo setWarnUnregisteredClasses为true在spark配置中什么也没有显示

 val conf = new SparkConf()
    .setappName("example")
    .setMaster("local[*]")
    .set("spark.serializer","org.apache.spark.serializer.KryoSerializer")
    .set("setWarnUnregisteredClasses","true")

registrationRequired设置为true时,它会引发class Person is not registered and also "org.apache.spark.internal.io.FileCommitProtocol$TaskCommitMessage" is not registered的异常

因此,现在默认情况下它为false,因此将setWarnUnregisteredClasses设置为true,它应显示文档https://github.com/EsotericSoftware/kryo#serializer-framework中所提供的未注册类的警告消息吗?但是,有关序列化的日志中未显示任何内容。

我想要做的是通过设置此属性.set("setWarnUnregisteredClasses","true")

将所有未注册类的列表保存到我的日志中

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties 19/12/10 15:56:09 WARN Utils: Your hostname,knoldus-Vostro-3546 resolves to a loopback address: 127.0.1.1; using 192.168.1.113 instead (on interface enp7s0) 19/12/10 15:56:09 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address 19/12/10 15:56:10 INFO SparkContext: Running Spark version 2.4.4 19/12/10 15:56:11 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 19/12/10 15:56:12 INFO SparkContext: Submitted application: kyroExample 19/12/10 15:56:14 INFO SecurityManager: Changing view acls to: knoldus 19/12/10 15:56:14 INFO SecurityManager: Changing modify acls to: knoldus 19/12/10 15:56:14 INFO SecurityManager: Changing view acls groups to: 19/12/10 15:56:14 INFO SecurityManager: Changing modify acls groups to: 19/12/10 15:56:14 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(knoldus); groups with view permissions: Set(); users with modify permissions: Set(knoldus); groups with modify permissions: Set() 19/12/10 15:56:17 INFO Utils: Successfully started service 'sparkDriver' on port 36235. 19/12/10 15:56:17 INFO SparkEnv: Registering MapOutputTracker 19/12/10 15:56:18 INFO SparkEnv: Registering BlockManagerMaster 19/12/10 15:56:18 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information 19/12/10 15:56:18 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up 19/12/10 15:56:18 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-956a186e-cfbd-4ad2-b531-9f46bff96984 19/12/10 15:56:18 INFO MemoryStore: MemoryStore started with capacity 870.9 MB 19/12/10 15:56:18 INFO SparkEnv: Registering OutputCommitCoordinator 19/12/10 15:56:19 INFO Utils: Successfully started service 'SparkUI' on port 4040. 19/12/10 15:56:19 INFO SparkUI: Bound SparkUI to 0.0.0.0,and started at http://192.168.1.113:4040 19/12/10 15:56:19 INFO Executor: Starting executor ID driver on host localhost 19/12/10 15:56:19 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 41737. 19/12/10 15:56:19 INFO NettyBlockTransferService: Server created on 192.168.1.113:41737 19/12/10 15:56:19 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy 19/12/10 15:56:19 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver,192.168.1.113,41737,None) 19/12/10 15:56:19 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.1.113:41737 with 870.9 MB RAM,BlockManagerId(driver,None) 19/12/10 15:56:19 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver,None) 19/12/10 15:56:19 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver,None) 19/12/10 15:56:21 INFO SparkContext: Starting job: take at KyroExample.scala:28 19/12/10 15:56:21 INFO DAGScheduler: Got job 0 (take at KyroExample.scala:28) with 1 output partitions 19/12/10 15:56:21 INFO DAGScheduler: Final stage: ResultStage 0 (take at KyroExample.scala:28) 19/12/10 15:56:21 INFO DAGScheduler: Parents of final stage: List() 19/12/10 15:56:21 INFO DAGScheduler: Missing parents: List() 19/12/10 15:56:21 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at filter at KyroExample.scala:24),which has no missing parents 19/12/10 15:56:21 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 3.0 KB,free 870.9 MB) 19/12/10 15:56:22 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1730.0 B,free 870.9 MB) 19/12/10 15:56:22 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.1.113:41737 (size: 1730.0 B,free: 870.9 MB) 19/12/10 15:56:22 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1161 19/12/10 15:56:22 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at filter at KyroExample.scala:24) (first 15 tasks are for partitions Vector(0)) 19/12/10 15:56:22 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks 19/12/10 15:56:22 WARN TaskSetManager: Stage 0 contains a task of very large size (243 KB). The maximum recommended task size is 100 KB. 19/12/10 15:56:22 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0,localhost,executor driver,partition 0,PROCESS_LOCAL,249045 bytes) 19/12/10 15:56:22 INFO Executor: Running task 0.0 in stage 0.0 (TID 0) 19/12/10 15:56:23 INFO MemoryStore: Block rdd_1_0 stored as values in memory (estimated size 293.3 KB,free 870.6 MB) 19/12/10 15:56:23 INFO BlockManagerInfo: Added rdd_1_0 in memory on 192.168.1.113:41737 (size: 293.3 KB,free: 870.6 MB) 19/12/10 15:56:23 INFO Executor: 1 block locks were not released by TID = 0: [rdd_1_0] 19/12/10 15:56:23 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 1132 bytes result sent to driver 19/12/10 15:56:23 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 924 ms on localhost (executor driver) (1/1) 19/12/10 15:56:23 INFO TaskSchedulerImpl: Removed TaskSet 0.0,whose tasks have all completed,from pool 19/12/10 15:56:23 INFO DAGScheduler: ResultStage 0 (take at KyroExample.scala:28) finished in 1.733 s 19/12/10 15:56:23 INFO DAGScheduler: Job 0 finished: take at KyroExample.scala:28,took 1.895530 s

没有未注册的类遇到日志。为什么?

liangshiming1980 回答:Kryo setWarnUnregisteredClasses为true在spark配置中什么也没有显示

我有同样的问题。 问题是setWarnUnregisteredClasses是一个Kryo配置,当前(我实际使用Spark 2.4.4)没有通过Spark公开。 您必须在Kryo中设置特定的配置。 我使用的解决方法是使用自定义KryoRegistrator。 然后我以这种方式使用它:

class MyKryoRegistrator extends KryoRegistrator {

  override def registerClasses(kryo: Kryo): Unit = {
    kryo.setRegistrationRequired(false)
    kryo.setWarnUnregisteredClasses(true)
...
,

您正在使用kryo注册,因此custom和其他类都需要向kryo注册,并且这两个类都应实现序列化接口。

setWarnUnregisteredClasses将发出警告,而conf.set(“ spark.kryo.registrationRequired”,“ true”)会引发未注册类的异常。

您必须像这样注册人员和TaskCommitMessage

conf.registerKryoClasses(Array(classOf[Person]))
本文链接:https://www.f2er.com/2950460.html

大家都在问