Apache Spark - ModuleNotFoundError: No module named 'mysql'

I am trying to submit an Apache Spark driver to a remote cluster. I am having trouble with a Python package called mysql: I installed it on all of the Spark nodes, yet the import still fails. The cluster runs inside docker-compose, and the images are based on bde2020.

$ docker-compose logs  impressions-agg
impressions-agg_1  | Submit application /app/app.py to Spark master spark://spark-master:7077
impressions-agg_1  | Passing arguments 
impressions-agg_1  | 19/11/13 18:45:20 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
impressions-agg_1  | Traceback (most recent call last):
impressions-agg_1  |   File "/app/app.py", line 6, in <module>
impressions-agg_1  |     from mysql.connector import connect
impressions-agg_1  | ModuleNotFoundError: No module named 'mysql'
impressions-agg_1  | log4j:WARN No appenders could be found for logger (org.apache.spark.util.ShutdownHookManager).
impressions-agg_1  | log4j:WARN Please initialize the log4j system properly.
impressions-agg_1  | log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

The mysql module is installed via pip on all of the nodes.

$ docker-compose exec spark-master pip list
Package         Version            
--------------- -------------------
mysql-connector 2.2.9              
pip             18.1               
setuptools      40.8.0.post20190503

$ docker-compose exec spark-worker pip list
Package         Version            
--------------- -------------------
mysql-connector 2.2.9              
pip             18.1               
setuptools      40.8.0.post20190503

How can I fix this? Thanks for any pointers.

tangshuo4444 answered: Apache Spark - ModuleNotFoundError: No module named 'mysql'

The Spark nodes have mysql installed, but the container that runs your driver does not. The log tells you that impressions-agg_1 runs a script at /app/app.py, and that script tries to import mysql and cannot find it. You can check the driver container the same way you checked the nodes, as shown below.
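For example, if the impressions-agg container is still running, the same pip list check you ran against spark-master and spark-worker should show that mysql-connector is missing there (the service name is taken from your compose logs above):

$ docker-compose exec impressions-agg pip list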

Did you build the impressions-agg_1 image yourself? Add a RUN pip install mysql-connector step (the same package your nodes already have) to its Dockerfile, then rebuild the image; a sketch follows.
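A minimal sketch of what that Dockerfile might look like. It assumes the driver image extends a bde2020 spark-submit image and that app.py lives in /app; the base image tag and paths are illustrative, not taken from your setup:

# Dockerfile for the impressions-agg service (base image tag is an example)
FROM bde2020/spark-submit:2.4.4-hadoop2.7

# Install the Python MySQL connector so "from mysql.connector import connect" resolves
RUN pip install mysql-connector

# Copy the driver script into the image
COPY app.py /app/app.py

After editing the Dockerfile, rebuild and restart just that service:

$ docker-compose build impressions-agg
$ docker-compose up impressions-agg

The key point is that pip install has to happen in the image of the container that executes the driver script, not only on the Spark master and worker nodes.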
