Connectivity issue between the EKS cluster API and the nodes

My EKS cluster is unhealthy: every pod is stuck in "ContainerCreating", which may be related to a CNI problem.

When I launch new worker nodes, they never reach the "Ready" state and report the following error:

"couldn't get current server API group list; will keep using cached value. (Get https://172.20.0.1:443/api?timeout=32s: dial tcp
172.20.0.1:443: i/o timeout) Failed to communicate with K8S Server. Please check instance security groups or http proxy setting"

I am not using an HTTP proxy, and the security groups allow the private CIDR (telnet to the API server on port 443 works).
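The telnet test mentioned above can be repeated against the exact address from the error message. Below is a minimal Python sketch of such a TCP reachability check. The function name `can_reach` and the 5-second default timeout are illustrative choices, not part of any EKS tooling. Note that 172.20.0.1 appears to be the ClusterIP of the `kubernetes` service rather than the API server endpoint itself (an assumption worth verifying), so a successful telnet to the endpoint does not necessarily mean this address is reachable from the node.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection performs the TCP handshake; the socket is
        # closed again as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and unreachable networks.
        return False


# Example call on a worker node (address taken from the error above):
#   can_reach("172.20.0.1", 443)
```

Running it on a node that is "NotReady" and on one that is healthy, if any exists, would show whether the timeout is specific to that address.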

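My CNI version is 1.5.5. Following some threads about this issue, I tried downgrading the CNI to 1.5.3, but the nodes still did not join. With 1.5.1 the nodes did join, because the /etc/cni/net.d/10-aws.conflist file was present, but the pods still could not connect to them.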

In version 1.5.5 the location of the conflist file changed to /etc/cni/10-aws.conflist, but the nodes still stay in the "NotReady" state.

My EKS version is 1.14 and the platform version is eks.2.

Ipamd log:

2019-11-27T09:09:13.446Z [INFO] Starting L-IPAMD v1.5.5  ...
2019-11-27T09:09:43.447Z [INFO] Testing communication with server
2019-11-27T09:10:13.448Z [INFO] Failed to communicate with K8S Server. Please check instance security groups or http proxy setting
2019-11-27T09:10:13.448Z [ERROR]        Failed to create client: error communicating with apiserver: Get https://172.20.0.1:443/version?timeout=32s: dial tcp 172.20.0.1:443: i/o timeout

The errors from the pods are:

Warning  FailedCreatepodSandBox  17m                   kubelet,ip-10-1-1-144.eu-west-1.compute.internal  Failed create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "b02f175d5e68011332655e0d6e6aa3ae226bbd7bf447c7461c0140a7e026d831" network for pod "coredns-759d6fc95f-zx292": NetworkPlugin cni failed to set up pod "coredns-759d6fc95f-zx292_kube-system" network: failed to find plugin "aws-cni" in path [/opt/cni/bin],failed to clean up sandbox container "b02f175d5e68011332655e0d6e6aa3ae226bbd7bf447c7461c0140a7e026d831" network for pod "coredns-759d6fc95f-zx292": NetworkPlugin cni failed to teardown pod "coredns-759d6fc95f-zx292_kube-system" network: failed to find plugin "aws-cni" in path [/opt/cni/bin]]
  Normal   SandboxChanged          2m47s (x70 over 17m)  kubelet,ip-10-1-1-144.eu-west-1.compute.internal  pod sandbox changed,it will be killed and re-created.

CNI image: 602401143452.dkr.ecr.eu-west-1.amazonaws.com/amazon-k8s-cni:v1.5.5

Output of the /opt/cni/bin/aws-cni-support.sh script:

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (7) Failed to connect to localhost port 61679: Connection refused
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (7) Failed to connect to localhost port 61679: Connection refused
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (7) Failed to connect to localhost port 61679: Connection refused
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (7) Failed to connect to localhost port 61679: Connection refused
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (7) Failed to connect to localhost port 61679: Connection refused
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (7) Failed to connect to localhost port 61678: Connection refused
tar: Removing leading `/' from member names
/var/log/aws-routed-eni/
/var/log/aws-routed-eni/ipamd.log.2019-11-27-09
/var/log/aws-routed-eni/ipamd.log.2019-11-27-10
/var/log/aws-routed-eni/eni.out
/var/log/aws-routed-eni/pod.out
/var/log/aws-routed-eni/networkutils-env.out
/var/log/aws-routed-eni/ipamd-env.out
/var/log/aws-routed-eni/eni-configs.out
/var/log/aws-routed-eni/metrics.out
/var/log/aws-routed-eni/ifconfig.out
/var/log/aws-routed-eni/iprule.out
/var/log/aws-routed-eni/iptables-save.out
/var/log/aws-routed-eni/iptables.out
/var/log/aws-routed-eni/iptables-nat.out
/var/log/aws-routed-eni/iptables-mangle.out
/var/log/aws-routed-eni/cni/
/var/log/aws-routed-eni/cni/10-aws.conflist
/var/log/aws-routed-eni/messages
/var/log/aws-routed-eni/route.out
/var/log/aws-routed-eni/sysctls.out

In addition, /var/log/aws-routed-eni/messages contains many errors like the following: network: failed to find plugin "aws-cni" in path [/opt/cni/bin]

There is no /opt/cni/bin/aws-cni file.
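To confirm on each node which CNI files are actually missing, a small check along these lines may help. It is only a sketch: the two paths are the ones named in this question, and `missing_files` is a hypothetical helper name.

```python
from pathlib import Path
from typing import Iterable, List

# Paths mentioned in this question; adjust them if your CNI version
# places its files elsewhere.
EXPECTED_CNI_FILES = [
    "/opt/cni/bin/aws-cni",
    "/etc/cni/net.d/10-aws.conflist",
]


def missing_files(paths: Iterable[str]) -> List[str]:
    """Return the subset of paths that do not exist as regular files."""
    return [p for p in paths if not Path(p).is_file()]


# Example call on a worker node:
#   missing_files(EXPECTED_CNI_FILES)
```

An empty list means both files are present; any path returned is one the node does not have.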

Does anyone have a clue what might be causing this?

A7726A answered: Connectivity issue between the EKS cluster API and the nodes

I had the same problem, and the cause turned out to be kube-proxy.

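The aws-cni plugin is actually installed onto the node by the aws-node pods, so if they cannot reach the master, that never happens, which is why the config file and the binary are missing. What fixed it for me was correcting the kube-proxy configuration (it was broken by the --resource-container flag, which is no longer supported). This may not be your exact problem, but I would definitely check the kube-proxy pods and look for anything wrong in their logs. Those logs are not available through kubectl logs ..., but are stored on the node in /var/log/kube-proxy.log.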
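As a rough way to search that log, the following Python sketch prints every line mentioning the flag named above. The function names are hypothetical, and the only search term used is `resource-container` from this answer; the exact wording of the kube-proxy error is not shown here, so add other search terms as needed.

```python
from typing import Iterable, List, Sequence


def find_suspect_lines(
    lines: Iterable[str],
    needles: Sequence[str] = ("resource-container",),
) -> List[str]:
    """Return the lines that contain any of the given substrings."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(needle in line for needle in needles)
    ]


def scan_log(
    path: str = "/var/log/kube-proxy.log",
    needles: Sequence[str] = ("resource-container",),
) -> List[str]:
    """Scan the kube-proxy log on the node for the given substrings."""
    # errors="replace" keeps the scan going if the log has odd bytes.
    with open(path, errors="replace") as handle:
        return find_suspect_lines(handle, needles)


# Example call on a worker node:
#   for hit in scan_log():
#       print(hit)
```

If nothing matches, reading the tail of the file directly is still worthwhile, since the failure on your cluster may come from a different setting.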
