节点池群集未自动扩展

我们已经创建了一个GKE集群,并将其设置为区域A和B中的europe-west2。该集群设置为:

节点数:1(总共2个) 自动缩放:是(每个区域1-4个节点)

我们正在尝试测试自动伸缩,并且群集无法调度任何pod,并且不添加任何其他节点。

W 2019-11-11T14:03:17Z unable to get metrics for resource cpu: no metrics returned from resource metrics API 
W 2019-11-11T14:03:20Z unable to get metrics for resource cpu: no metrics returned from resource metrics API 
I 2019-11-11T14:04:42Z pod didn't trigger scale-up (it wouldn't fit if a new node is added): 2 max cluster cpu,memory limit reached 
I 2019-11-11T14:04:42Z pod didn't trigger scale-up (it wouldn't fit if a new node is added): 2 max cluster cpu,memory limit reached 
W 2019-11-11T14:04:44Z 0/4 nodes are available: 4 Insufficient cpu. 
W 2019-11-11T14:04:44Z 0/4 nodes are available: 4 Insufficient cpu. 
W 2019-11-11T14:04:44Z 0/4 nodes are available: 4 Insufficient cpu. 
W 2019-11-11T14:04:44Z 0/4 nodes are available: 4 Insufficient cpu. 
W 2019-11-11T14:04:44Z 0/4 nodes are available: 4 Insufficient cpu. 
W 2019-11-11T14:04:44Z 0/4 nodes are available: 4 Insufficient cpu. 
W 2019-11-11T14:04:44Z 0/4 nodes are available: 4 Insufficient cpu. 
W 2019-11-11T14:04:44Z 0/4 nodes are available: 4 Insufficient cpu. 
W 2019-11-11T14:04:44Z 0/4 nodes are available: 4 Insufficient cpu. 
W 2019-11-11T14:04:45Z 0/4 nodes are available: 4 Insufficient cpu. 
W 2019-11-11T14:04:45Z 0/4 nodes are available: 4 Insufficient cpu. 
W 2019-11-11T14:04:45Z 0/4 nodes are available: 4 Insufficient cpu. 
W 2019-11-11T14:04:45Z 0/4 nodes are available: 4 Insufficient cpu. 
W 2019-11-11T14:04:51Z unable to get metrics for resource cpu: no metrics returned from resource metrics API 
I 2019-11-11T14:04:53Z pod didn't trigger scale-up (it wouldn't fit if a new node is added): 2 max cluster cpu,memory limit reached 
I 2019-11-11T14:05:03Z pod didn't trigger scale-up (it wouldn't fit if a new node is added): 2 max cluster cpu,memory limit reached

大约有80%的豆荚处于无法计划的状态,并且显示为处于错误状态。但是我们从来没有看到群集的大小增加(不是物理的也不是水平的)。

我们从2节点设置开始,并进行了负载测试以使其达到最大。两个节点上的CPU达到100%,两个节点上的RAM达到95%。我们收到此错误消息:

I 2019-11-11T16:01:21Z pod didn't trigger scale-up (it wouldn't fit if a new node is added): 2 max cluster cpu,memory limit reached 
I 2019-11-11T16:01:21Z pod didn't trigger scale-up (it wouldn't fit if a new node is added): 2 max cluster cpu,memory limit reached 
I 2019-11-11T16:01:21Z Ensuring load balancer 
W 2019-11-11T16:01:24Z Error creating load balancer (will retry): failed to ensure load balancer for service istio-system/istio-ingressgateway: failed to ensure a static IP for load balancer (a72c616b7f5cf11e9b4694201ac10480(istio-system/istio-ingressgateway)): error getting static IP address: googleapi: Error 404: The resource 'projects/gc-lotto-stage/regions/europe-west2/addresses/a72c616b7f5cf11e9b4694201ac10480' was not found,notFound 
W 2019-11-11T16:01:25Z missing request for cpu 
W 2019-11-11T16:01:25Z missing request for cpu 
W 2019-11-11T16:01:26Z missing request for cpu 
I 2019-11-11T16:01:31Z pod didn't trigger scale-up (it wouldn't fit if a new node is added): 2 max cluster cpu,memory limit reached 
W 2019-11-11T16:01:35Z missing request for cpu 
W 2019-11-11T16:01:44Z 0/2 nodes are available: 2 Insufficient cpu. 
W 2019-11-11T16:01:44Z 0/2 nodes are available: 2 Insufficient cpu. 
zx705 回答:节点池群集未自动扩展

这也取决于配置的节点大小:

首先查看节点可分配资源:

Kubectl describe node <node>
Allocatable:
  cpu:                4
  ephemeral-storage:  17784772Ki
  hugepages-2Mi:      0
  memory:             4034816Ki
  pods:               110

还要检查已分配的资源:

Allocated resources:
  Kubectl describe node <node>
  (Total limits may be over 100 percent,i.e.,overcommitted.)
  Resource           Requests      Limits
  --------           --------      ------
  cpu                1505m (37%)   3 (75%)
  memory             2750Mi (69%)  6484Mi (164%)
  ephemeral-storage  0 (0%)        0 (0%)

然后查看资源请求:

如果CPU请求/内存请求多于节点可分配的资源,则该节点可能无法自动伸缩。节点具有足够的能力来处理pod请求。

理想情况下,可分配资源小于实际容量,因为系统会将部分容量分配给系统守护程序。

,

一段时间以来,我遇到了同样的问题,经过大量研究和跟踪发现,如果要在GKE中实现群集自动扩展,则必须牢记一些事情。

  1. 设置资源请求并限制每个可能的工作负载

  2. 自动缩放可按要求工作,而不受限制。因此,如果您的工作负载所有请求的总和仅超过节点池中可用的总资源,那么您将看到它正在扩展。

这帮了我大忙。

希望有帮助。

本文链接:https://www.f2er.com/3123315.html

大家都在问