Possible resource limit problem with a PySpark job

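I am running an AWS Glue job, which essentially runs PySpark code inside AWS Glue. The job connects to several EC2 instances. It runs fine for a small number of instances, but when I scale it up to a larger number it fails, and the final error log messages are shown below. I would like to know whether the job fails because of a problem with one of the instances or some part of the code, or because the default Glue job settings impose a resource limit.

I found a Stack Overflow post about the SIGTERM error which suggests that the problem may be related to memory or to dynamic allocation. That may be the issue here; if so, which parameters can I change to test this?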

SO post: Spark Error : executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL TERM

Error log:

2019-11-06 09:21:18,189 INFO  [Executor task launch worker for task 26635] memory.MemoryStore (Logging.scala:logInfo(54)) - Block broadcast_477 stored as values in memory (estimated size 9.1 KB, free 2.8 GB)
2019-11-06 09:21:18,190 INFO  [dispatcher-event-loop-0] executor.CoarseGrainedExecutorBackend (Logging.scala:logInfo(54)) - Got assigned task 26637
2019-11-06 09:21:18,191 INFO  [Executor task launch worker for task 26637] executor.Executor (Logging.scala:logInfo(54)) - Running task 0.0 in stage 477.0 (TID 26637)
2019-11-06 09:21:18,191 INFO  [Executor task launch worker for task 26637] broadcast.TorrentBroadcast (Logging.scala:logInfo(54)) - Started reading broadcast variable 479
2019-11-06 09:21:18,193 INFO  [Executor task launch worker for task 26637] memory.MemoryStore (Logging.scala:logInfo(54)) - Block broadcast_479_piece0 stored as bytes in memory (estimated size 5.1 KB, ...
2019-11-06 09:21:18,194 INFO  [Executor task launch worker for task 26637] broadcast.TorrentBroadcast (Logging.scala:logInfo(54)) - Reading broadcast variable 479 took 3 ms
2019-11-06 09:21:18,194 INFO  [Executor task launch worker for task 26637] memory.MemoryStore (Logging.scala:logInfo(54)) - Block broadcast_479 stored as values in memory (estimated size 9.1 KB, ...
2019-11-06 09:21:18,640 INFO  [Executor task launch worker for task 26629] codegen.CodeGenerator (Logging.scala:logInfo(54)) - Code generated in 13.337938 ms
2019-11-06 09:21:18,841 INFO  [Executor task launch worker for task 26629] glue.JDBCRDD (Logging.scala:logInfo(54)) - closed connection
2019-11-06 09:21:18,884 INFO  [Executor task launch worker for task 26629] executor.Executor (Logging.scala:logInfo(54)) - Finished task 0.0 in stage 469.0 (TID 26629). 1366 bytes result sent to driver
2019-11-06 09:21:19,156 INFO  [Executor task launch worker for task 26637] glue.JDBCRDD (Logging.scala:logInfo(54)) - closed connection
2019-11-06 09:21:19,230 INFO  [Executor task launch worker for task 26637] executor.Executor (Logging.scala:logInfo(54)) - Finished task 0.0 in stage 477.0 (TID 26637). 1366 bytes result sent to driver
2019-11-06 09:21:23,308 INFO  [Executor task launch worker for task 26635] glue.JDBCRDD (Logging.scala:logInfo(54)) - closed connection
2019-11-06 09:21:23,790 INFO  [Executor task launch worker for task 26635] executor.Executor (Logging.scala:logInfo(54)) - Finished task 0.0 in stage 475.0 (TID 26635). 1366 bytes result sent to driver
2019-11-06 09:21:23,940 INFO  [Executor task launch worker for task 26624] glue.JDBCRDD (Logging.scala:logInfo(54)) - closed connection
2019-11-06 09:21:24,279 INFO  [Executor task launch worker for task 26624] executor.Executor (Logging.scala:logInfo(54)) - Finished task 0.0 in stage 464.0 (TID 26624). 1366 bytes result sent to driver
2019-11-06 09:22:26,134 ERROR [SIGTERM handler] executor.CoarseGrainedExecutorBackend (SignalUtils.scala:apply$mcZ$sp(43)) - RECEIVED SIGNAL TERM
2019-11-06 09:22:26,139 INFO  [pool-7-thread-1] storage.DiskBlockManager (Logging.scala:logInfo(54)) - Shutdown hook called
2019-11-06 09:22:26,139 INFO  [pool-7-thread-1] util.ShutdownHookManager (Logging.scala:logInfo(54)) - Shutdown hook called
End of LogType:stdout
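
One thing that can be read straight off the log is how long the executor sat idle before it was told to stop. The last task finishes at 09:21:24 and the SIGTERM arrives at 09:22:26, roughly 62 seconds later. Spark's `spark.dynamicAllocation.executorIdleTimeout` defaults to 60s, so a gap of about a minute is consistent with an idle executor being released by dynamic allocation rather than killed for exceeding memory. It does not prove it, and whether dynamic allocation is active in a given Glue job is an assumption here. The following is a minimal sketch for measuring that gap in a larger log; the function names and the `SAMPLE` excerpt are illustrative, not part of any Spark or Glue API.

```python
import re
from datetime import datetime

# Matches the log4j-style prefix used in the executor log above, e.g.
# "2019-11-06 09:22:26,134 ERROR [SIGTERM handler] ..."
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3})\s+(\w+)\s+\[([^\]]*)\]\s+(.*)$"
)


def parse_line(line):
    """Return (timestamp, level, message) for a log line, or None if it does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    stamp, millis, level, _thread, message = m.groups()
    when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    when = when.replace(microsecond=int(millis) * 1000)
    return when, level, message


def idle_seconds_before_sigterm(log_text):
    """Seconds between the last 'Finished task' line and the first SIGTERM line.

    Returns None if the log has no SIGTERM line or no finished task before it.
    """
    last_finished = None
    for line in log_text.splitlines():
        parsed = parse_line(line)
        if parsed is None:
            continue
        when, _level, message = parsed
        if "RECEIVED SIGNAL TERM" in message:
            if last_finished is None:
                return None
            return (when - last_finished).total_seconds()
        if "Finished task" in message:
            last_finished = when
    return None


# Two lines copied from the log above.
SAMPLE = """\
2019-11-06 09:21:24,279 INFO  [Executor task launch worker for task 26624] executor.Executor (Logging.scala:logInfo(54)) - Finished task 0.0 in stage 464.0 (TID 26624). 1366 bytes result sent to driver
2019-11-06 09:22:26,134 ERROR [SIGTERM handler] executor.CoarseGrainedExecutorBackend (SignalUtils.scala:apply$mcZ$sp(43)) - RECEIVED SIGNAL TERM
"""

print(idle_seconds_before_sigterm(SAMPLE))  # 61.855
```

If the gap is consistently close to the idle timeout, the SIGTERM on the executor is probably a symptom of normal scale-down, and the actual failure is more likely to show up in the driver log than in this executor log.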