I have 5 Spark nodes running in production:
Node1: Worker
Node2: Worker
Node3: Worker
Node4: Worker
Node5: Master
The 5 nodes are on the same LAN, and only the Master node (Node5) has a public IP.
Case 1: The firewall on my SQL Server node allows connections only from Node5. When I run a job with Spark, I get this message:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, 10.158.6.95, executor 2): com.microsoft.sqlserver.jdbc.SQLServerException: The TCP/IP connection to the host 42.113.207.214, port 1433 has failed. Error: "Connection timed out: no further information. Verify the connection properties. Make sure that an instance of SQL Server is running on the host and accepting TCP/IP connections at the port. Make sure that TCP connections to the port are not blocked by a firewall.".
Case 2: When I disable the firewall on the SQL Server node, the Spark job runs successfully.
When the firewall is enabled, every Worker fails to connect to SQL Server.
How can I set up all the Workers to connect to SQL Server through Node5 over SSH?
PS: I cannot allow connections from the Workers on the SQL Server side, because they are on a different network.
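
To make the "through Node5 over SSH" idea concrete, here is a minimal sketch of one possible approach: each Worker opens an SSH local port forward to Node5 and the Spark job points its JDBC URL at localhost instead of the SQL Server address. This is only an illustration under assumptions, not a tested setup. The SSH login `spark@node5`, the database name `mydb`, and the helper names `tunnel_command` and `jdbc_url` are hypothetical. Only the SQL Server host and port come from the error message above. The sketch just builds the command and the URL; it does not start the tunnel.

```python
import shlex

# Taken from the error message in the question.
SQL_SERVER_HOST = "42.113.207.214"
SQL_SERVER_PORT = 1433


def tunnel_command(gateway, local_port=1433,
                   remote_host=SQL_SERVER_HOST, remote_port=SQL_SERVER_PORT):
    """Build the ssh command a Worker would run to forward local_port
    to the SQL Server through the gateway (Node5).

    -N : do not run a remote command, only forward ports
    -L : local port forward, local_port -> remote_host:remote_port
    """
    forward = f"{local_port}:{remote_host}:{remote_port}"
    return ["ssh", "-N", "-L", forward, gateway]


def jdbc_url(database, local_port=1433):
    """Build a JDBC URL that targets the local end of the tunnel."""
    return f"jdbc:sqlserver://localhost:{local_port};databaseName={database}"


# "spark@node5" and "mydb" are placeholders (assumptions), not real values.
print(shlex.join(tunnel_command("spark@node5")))
print(jdbc_url("mydb"))
```

With such a tunnel running on every Worker, the SQL Server firewall would only ever see connections coming from Node5, which is the address it already allows. Whether this fits depends on SSH access from the Workers to Node5 and on Node5 being allowed to forward TCP connections.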