bash – Kubernetes: how to debug CrashLoopBackOff

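I have the following setup:

A Docker image omg/telperion on Docker Hub
A Kubernetes cluster (4 nodes, each with ~50GB RAM) with plenty of resources

I followed a tutorial to pull the image from Docker Hub into Kubernetes: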

    SERVICE_NAME=telperion
    DOCKER_SERVER="https://index.docker.io/v1/"
    DOCKER_USERNAME=username
    DOCKER_PASSWORD=password
    DOCKER_EMAIL="omg@whatever.com"

    # Create secret
    kubectl create secret docker-registry dockerhub --docker-server=$DOCKER_SERVER --docker-username=$DOCKER_USERNAME --docker-password=$DOCKER_PASSWORD --docker-email=$DOCKER_EMAIL

    # Create service yaml
    echo "apiVersion: v1 \n\
    kind: Pod \n\
    metadata: \n\
      name: ${SERVICE_NAME} \n\
    spec: \n\
      containers: \n\
        - name: ${SERVICE_NAME} \n\
          image: omg/${SERVICE_NAME} \n\
          imagePullPolicy: Always \n\
          command: [ \"echo\", \"done deploying $SERVICE_NAME\" ] \n\
      imagePullSecrets: \n\
        - name: dockerhub" > $SERVICE_NAME.yaml

    # Deploy to kubernetes
    kubectl create -f $SERVICE_NAME.yaml
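
A side note on the manifest step in the script above: whether `echo` turns `\n` into a newline depends on the shell (bash's builtin does not by default, dash's does). As a minimal sketch, not the asker's method, the same Pod manifest could be written with a heredoc, which behaves the same in both. The helper name `write_pod_manifest` and its optional output-path argument are assumptions made for this illustration.

```shell
# Sketch: write the Pod manifest with a heredoc instead of echo + "\n".
# write_pod_manifest is a hypothetical helper name, not part of kubectl.
write_pod_manifest() {
  local service_name="$1"
  # Output path defaults to <service_name>.yaml, as in the original script.
  local out_file="${2:-${service_name}.yaml}"
  # Unquoted EOF: ${service_name} is expanded, double quotes stay literal.
  cat > "$out_file" <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: ${service_name}
spec:
  containers:
    - name: ${service_name}
      image: omg/${service_name}
      imagePullPolicy: Always
      command: [ "echo", "done deploying ${service_name}" ]
  imagePullSecrets:
    - name: dockerhub
EOF
}
```

With this in place, `write_pod_manifest "$SERVICE_NAME"` followed by `kubectl create -f "$SERVICE_NAME.yaml"` would replace the echo step.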

This results in the pod going into CrashLoopBackOff.

docker run -it -p8080:9546 omg/telperion works fine.

So my question is:
Is this debuggable? If so, how do I debug it?

Some logs:

    kubectl get nodes
    NAME                    STATUS                     AGE   VERSION
    k8s-agent-adb12ed9-0    Ready                      22h   v1.6.6
    k8s-agent-adb12ed9-1    Ready                      22h   v1.6.6
    k8s-agent-adb12ed9-2    Ready                      22h   v1.6.6
    k8s-master-adb12ed9-0   Ready,SchedulingDisabled   22h   v1.6.6


    kubectl get pods
    NAME        READY   STATUS             RESTARTS   AGE
    telperion   0/1     CrashLoopBackOff   10         28m


    kubectl describe pod telperion
    Name:           telperion
    Namespace:      default
    Node:           k8s-agent-adb12ed9-2/10.240.0.4
    Start Time:     Wed, 21 Jun 2017 10:18:23 +0000
    Labels:         <none>
    Annotations:    <none>
    Status:         Running
    IP:             10.244.1.4
    Controllers:    <none>
    Containers:
      telperion:
        Container ID:   docker://c2dd021b3d619d1d4e2afafd7a71070e1e43132563fdc370e75008c0b876d567
        Image:          omg/telperion
        Image ID:       docker-pullable://omg/telperion@sha256:c7e3beb0457b33cd2043c62ea7b11ae44a5629a5279a88c086ff4853828a6d96
        Port:
        Command:
          echo
          done deploying telperion
        State:          Waiting
          Reason:       CrashLoopBackOff
        Last State:     Terminated
          Reason:       Completed
          Exit Code:    0
          Started:      Wed, 21 Jun 2017 10:19:25 +0000
          Finished:     Wed, 21 Jun 2017 10:19:25 +0000
        Ready:          False
        Restart Count:  3
        Environment:    <none>
        Mounts:
          /var/run/secrets/kubernetes.io/serviceaccount from default-token-n7ll0 (ro)
    Conditions:
      Type          Status
      Initialized   True
      Ready         False
      PodScheduled  True
    Volumes:
      default-token-n7ll0:
        Type:       Secret (a volume populated by a Secret)
        SecretName: default-token-n7ll0
        Optional:   false
    QoS Class:      BestEffort
    Node-Selectors: <none>
    Tolerations:    <none>
    Events:
      FirstSeen  LastSeen  Count  From  SubObjectPath  Type  Reason  Message
      ---------  --------  -----  ----  -------------  --------  ------  -------
      1m  1m  1  default-scheduler  Normal  Scheduled  Successfully assigned telperion to k8s-agent-adb12ed9-2
      1m  1m  1  kubelet, k8s-agent-adb12ed9-2  spec.containers{telperion}  Normal  Created  Created container with id d9aa21fd16b682698235e49adf80366f90d02628e7ed5d40a6e046aaaf7bf774
      1m  1m  1  kubelet, k8s-agent-adb12ed9-2  spec.containers{telperion}  Normal  Started  Started container with id d9aa21fd16b682698235e49adf80366f90d02628e7ed5d40a6e046aaaf7bf774
      1m  1m  1  kubelet, k8s-agent-adb12ed9-2  spec.containers{telperion}  Normal  Started  Started container with id c6c8f61016b06d0488e16bbac0c9285fed744b933112fd5d116e3e41c86db919
      1m  1m  1  kubelet, k8s-agent-adb12ed9-2  spec.containers{telperion}  Normal  Created  Created container with id c6c8f61016b06d0488e16bbac0c9285fed744b933112fd5d116e3e41c86db919
      1m  1m  2  kubelet, k8s-agent-adb12ed9-2  Warning  FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "telperion" with CrashLoopBackOff: "Back-off 10s restarting failed container=telperion pod=telperion_default(f4e36a12-566a-11e7-99a6-000d3aa32f49)"

      1m  1m  1  kubelet, k8s-agent-adb12ed9-2  spec.containers{telperion}  Normal  Started  Started container with id 3b911f1273518b380bfcbc71c9b7b770826c0ce884ac876fdb208e7c952a4631
      1m  1m  1  kubelet, k8s-agent-adb12ed9-2  spec.containers{telperion}  Normal  Created  Created container with id 3b911f1273518b380bfcbc71c9b7b770826c0ce884ac876fdb208e7c952a4631
      1m  1m  2  kubelet, k8s-agent-adb12ed9-2  Warning  FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "telperion" with CrashLoopBackOff: "Back-off 20s restarting failed container=telperion pod=telperion_default(f4e36a12-566a-11e7-99a6-000d3aa32f49)"

      1m  50s  4  kubelet, k8s-agent-adb12ed9-2  spec.containers{telperion}  Normal  Pulling  pulling image "omg/telperion"
      47s  47s  1  kubelet, k8s-agent-adb12ed9-2  spec.containers{telperion}  Normal  Started  Started container with id c2dd021b3d619d1d4e2afafd7a71070e1e43132563fdc370e75008c0b876d567
      1m  47s  4  kubelet, k8s-agent-adb12ed9-2  spec.containers{telperion}  Normal  Pulled  Successfully pulled image "omg/telperion"
      47s  47s  1  kubelet, k8s-agent-adb12ed9-2  spec.containers{telperion}  Normal  Created  Created container with id c2dd021b3d619d1d4e2afafd7a71070e1e43132563fdc370e75008c0b876d567
      1m  9s  8  kubelet, k8s-agent-adb12ed9-2  spec.containers{telperion}  Warning  BackOff  Back-off restarting failed container
      46s  9s  4  kubelet, k8s-agent-adb12ed9-2  Warning  FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "telperion" with CrashLoopBackOff: "Back-off 40s restarting failed container=telperion pod=telperion_default(f4e36a12-566a-11e7-99a6-000d3aa32f49)"
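
The fields that matter most in the describe output above are the ones under Last State. As a rough sketch, they could be pulled out of the `kubectl describe pod` text with a small awk filter. The function name `last_termination` is made up for this illustration, and the filter only reports the first container that has a Last State section.

```shell
# Sketch: read `kubectl describe pod` text on stdin and print the last
# termination details of the first container that reports them.
# last_termination is a hypothetical helper name, not a kubectl feature.
last_termination() {
  awk '
    # Start of the "Last State" section; the value is the last field.
    /^[[:space:]]*Last State:/ { in_last = 1; print "state=" $NF; next }
    # Only report Reason / Exit Code once we are inside that section,
    # so the Reason under the current State is skipped.
    in_last && /^[[:space:]]*Reason:/    { print "reason=" $NF }
    in_last && /^[[:space:]]*Exit Code:/ { print "exit_code=" $NF; exit }
  '
}
```

On a live cluster this would be used as `kubectl describe pod telperion | last_termination`. An exit code of 0 with reason Completed points to a command that finished normally (here, `echo`) rather than an application crash; since a Pod's default restartPolicy is Always, the kubelet keeps restarting such a container.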

Edit 1:
Errors reported by the kubelet on the master

    journalctl -u kubelet


    Jun 21 10:28:49 k8s-master-ADB12ED9-0 docker[1622]: E0621 10:28:49.798140 1809 fsHandler.go:121] failed to collect filesystem stats - rootDiskErr: du command failed on /var/lib/docker/overlay/5cfff16d670f2df6520360595d7858fb5d16607b6999a88e5dcbc09e1e7ab9ce with output
    Jun 21 10:28:49 k8s-master-ADB12ED9-0 docker[1622]: , stderr: du: cannot access '/var/lib/docker/overlay/5cfff16d670f2df6520360595d7858fb5d16607b6999a88e5dcbc09e1e7ab9ce/merged/proc/13122/task/13122/fd/4': No such file or directory
    Jun 21 10:28:49 k8s-master-ADB12ED9-0 docker[1622]: du: cannot access '/var/lib/docker/overlay/5cfff16d670f2df6520360595d7858fb5d16607b6999a88e5dcbc09e1e7ab9ce/merged/proc/13122/task/13122/fdinfo/4': No such file or directory
    Jun 21 10:28:49 k8s-master-ADB12ED9-0 docker[1622]: du: cannot access '/var/lib/docker/overlay/5cfff16d670f2df6520360595d7858fb5d16607b6999a88e5dcbc09e1e7ab9ce/merged/proc/13122/fd/3': No such file or directory
    Jun 21 10:28:49 k8s-master-ADB12ED9-0 docker[1622]: du: cannot access '/var/lib/docker/overlay/5cfff16d670f2df6520360595d7858fb5d16607b6999a88e5dcbc09e1e7ab9ce/merged/proc/13122/fdinfo/3': No such file or directory
    Jun 21 10:28:49 k8s-master-ADB12ED9-0 docker[1622]: - exit status 1, rootInodeErr: <nil>, extraDiskErr: <nil>

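Edit 2: more logs

    kubectl logs $SERVICE_NAME -p
    done deploying telperion

You can access the logs of your pod using

    kubectl logs [podname] -p

The -p option reads the logs of the previous (crashed) instance.

If the crash comes from the application, you should find useful logs there.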
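
One way to wrap the command from this answer is sketched below. It uses `--previous`, the long form of `-p`, and falls back to the current container's logs when `kubectl` reports no previous instance. The helper name `crash_logs` is an assumption for illustration only.

```shell
# Sketch: print the logs of the previous (terminated) container instance,
# falling back to the current instance if there is no previous one.
# crash_logs is a hypothetical helper name, not part of kubectl.
crash_logs() {
  local pod="$1"
  # --previous is the long form of -p; its error output is hidden so the
  # fallback below can take over cleanly.
  kubectl logs "$pod" --previous 2>/dev/null || kubectl logs "$pod"
}
```

For the pod in the question, `crash_logs telperion` would be expected to print the same `done deploying telperion` line shown in Edit 2.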
