Preface
When node resources run short, the kubelet automatically garbage-collects unused images and dead containers.
The key parameters affecting image garbage collection are:
--image-gc-high-threshold: the disk usage percentage that triggers image garbage collection; valid range [0-100], default 85
--image-gc-low-threshold: the disk usage percentage that garbage collection tries to free space down to; valid range [0-100], default 80
The main parameters controlling container garbage collection are:
--maximum-dead-containers-per-container: the maximum number of old (exited) instances to retain per container, default 1
--maximum-dead-containers: the maximum number of dead containers the node may retain globally, default -1, which means there is no limit
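To confirm what is actually in effect on a node, the image GC thresholds can be read back from the kubelet's read-only configz debug endpoint through the API server proxy (the dead-container limits are plain command-line flags and only show up on the kubelet process itself). A minimal sketch, assuming kubectl access to the cluster, that the node is named node02, and that the configz endpoint has not been disabled:
# Dump the kubelet's effective configuration for node02 and pick out the image GC thresholds;
# imageGCHighThresholdPercent / imageGCLowThresholdPercent are the config-file equivalents
# of --image-gc-high-threshold / --image-gc-low-threshold.
kubectl get --raw "/api/v1/nodes/node02/proxy/configz" | grep -o '"imageGC[A-Za-z]*Percent":[0-9]*'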
Parameter configuration
# vim /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
......
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS --image-gc-high-threshold=90 ......
# Restart kubelet
systemctl daemon-reload && systemctl restart kubelet
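As a quick sanity check after the restart, confirm that the running kubelet actually picked up the new flag (a sketch; the pattern simply matches the flag set above):
# The flag should appear on the kubelet command line once the unit has been restarted.
ps -ef | grep '[k]ubelet' | grep -o -- '--image-gc-high-threshold=[0-9]*'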
1. Image garbage collection
1.1 Pull some images
cat > images.list << EOF
docker.io/nginx:1.14.0
docker.io/nginx:1.18.0
docker.io/redis:5.0.5
docker.io/redis:6.0.6
docker.io/mysql:5.7.31
docker.io/mysql:5.7.27
......
EOF
cat > images.sh << EOF
#!/bin/bash
while read line
do
docker pull \$line
done < images.list
EOF
bash images.sh
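If you prefer to skip the helper script, an equivalent one-liner works as well (a sketch; the -P 3 option pulls up to three images in parallel and can be dropped):
xargs -n 1 -P 3 docker pull < images.list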
1.2 Put the disk under pressure and check whether images are reclaimed
None of these images is referenced by any container, so once disk usage climbs above the configured high threshold (90% here), kubelet should start reclaiming them; watch the kubelet log.
[root@node02 ~]$docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
wctdevops/zcache v9.0.21 89341370bfa6 3 weeks ago 280MB
whalecloud/zcache v9.0.21 89341370bfa6 3 weeks ago 280MB
eureka v1 e6ee27304b44 3 weeks ago 578MB
openresty 1.21.4.1 0d69e03d75d5 3 weeks ago 830MB
rabbitmq latest 31b721acc90a 3 weeks ago 228MB
mysql 5.7.37 82d2d47667cf 4 months ago 450MB
registry.aliyuncs.com/google_containers/kube-proxy v1.22.5 8f8fdd6672d4 9 months ago 104MB
registry.aliyuncs.com/google_containers/coredns v1.8.4 8d147537fb7d 15 months ago 47.6MB
quay.io/coreos/flannel v0.14.0 8522d622299c 16 months ago 67.9MB
nginx 1.18.0 c2c45d506085 17 months ago 133MB
busybox 1.32 388056c9a683 17 months ago 1.23MB
registry.aliyuncs.com/google_containers/pause 3.5 ed210e3e4a5b 18 months ago 683kB
harbor.junengcloud.com/pinpoint/zookeeper 3.4 721354d41dae 23 months ago 257MB
mysql 5.7.31 42cdba9f1b08 23 months ago 448MB
redis 6.0.6 1319b1eaa0b7 2 years ago 104MB
dk.uino.cn/java/java8 1.0 6e569a031a68 2 years ago 488MB
192.168.10.88/redis/redis 5.0.5 63130206b0fa 3 years ago 98.2MB
redis 5.0.5 63130206b0fa 3 years ago 98.2MB
mysql 5.7.27 383867b75fd2 3 years ago 373MB
nginx 1.17.1 98ebf73aba75 3 years ago 109MB
nginx 1.14.0 ecc98fc2f376 3 years ago 109MB
quay.io/external_storage/nfs-client-provisioner latest 16d2f904b0d8 4 years ago 45.5MB
registry.cn-hangzhou.aliyuncs.com/k8s-image01/kubernetes-zookeeper 1.0-3.4.10 d3d6696b345b 4 years ago 273MB
Write a large amount of data to the disk:
[root@node02 ~]$df -h | grep /dev/sda2
/dev/sda2 30G 8.5G 22G 29% /
[root@node02 ~]$dd if=/dev/zero of=/root/bigfile.txt bs=1G count=20
......
[root@node02 ~]$df -h | grep /dev/sda2
/dev/sda2 30G 26G 4.3G 91% /
After a short while (roughly 1-2 minutes), check whether images have been reclaimed and how disk usage has changed.
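A simple way to watch the reclaim happen is to poll the image count and disk usage (a sketch; adjust the interval to taste):
# Re-run every 15 seconds until the image count drops and usage falls back under the low threshold.
watch -n 15 'docker images | wc -l; df -h / | tail -1'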
[root@node02 ~]$docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
wctdevops/zcache v9.0.21 89341370bfa6 3 weeks ago 280MB
whalecloud/zcache v9.0.21 89341370bfa6 3 weeks ago 280MB
eureka v1 e6ee27304b44 3 weeks ago 578MB
openresty 1.21.4.1 0d69e03d75d5 3 weeks ago 830MB
rabbitmq latest 31b721acc90a 3 weeks ago 228MB
mysql 5.7.37 82d2d47667cf 4 months ago 450MB
registry.aliyuncs.com/google_containers/kube-proxy v1.22.5 8f8fdd6672d4 9 months ago 104MB
registry.aliyuncs.com/google_containers/coredns v1.8.4 8d147537fb7d 15 months ago 47.6MB
quay.io/coreos/flannel v0.14.0 8522d622299c 16 months ago 67.9MB
nginx 1.18.0 c2c45d506085 17 months ago 133MB
busybox 1.32 388056c9a683 17 months ago 1.23MB
registry.aliyuncs.com/google_containers/pause 3.5 ed210e3e4a5b 18 months ago 683kB
harbor.junengcloud.com/pinpoint/zookeeper 3.4 721354d41dae 23 months ago 257MB
mysql 5.7.31 42cdba9f1b08 23 months ago 448MB
redis 6.0.6 1319b1eaa0b7 2 years ago 104MB
dk.uino.cn/java/java8 1.0 6e569a031a68 2 years ago 488MB
192.168.10.88/redis/redis 5.0.5 63130206b0fa 3 years ago 98.2MB
redis 5.0.5 63130206b0fa 3 years ago 98.2MB
mysql 5.7.27 383867b75fd2 3 years ago 373MB
nginx 1.17.1 98ebf73aba75 3 years ago 109MB
nginx 1.14.0 ecc98fc2f376 3 years ago 109MB
quay.io/external_storage/nfs-client-provisioner latest 16d2f904b0d8 4 years ago 45.5MB
registry.cn-hangzhou.aliyuncs.com/k8s-image01/kubernetes-zookeeper 1.0-3.4.10 d3d6696b345b 4 years ago 273MB
[root@node02 ~]$docker images | wc -l
24
......
......
# A large number of images have been cleaned up
[root@node02 ~]$docker images | wc -l
4
[root@node02 ~]$docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
registry.aliyuncs.com/google_containers/kube-proxy v1.22.5 8f8fdd6672d4 9 months ago 104MB
quay.io/coreos/flannel v0.14.0 8522d622299c 16 months ago 67.9MB
registry.aliyuncs.com/google_containers/pause 3.5 ed210e3e4a5b 18 months ago 683kB
# Disk usage has fallen back below the low threshold (80%)
[root@node02 ~]$df -h | grep /dev/sda2
/dev/sda2 30G 24G 6.9G 78% /
The kubelet log shows the corresponding image_gc_manager entries.
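To follow these entries live, filter the kubelet unit logs for the GC and eviction managers (assuming kubelet runs as a systemd unit, as it does with kubeadm):
journalctl -u kubelet -f | grep -E 'image_gc_manager|eviction_manager'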
......
Sep 14 13:41:36 node02 kubelet: I0914 13:41:36.368154 102143 image_gc_manager.go:375] "Removing image to free bytes" imageID="sha256:63130206b0fa808e4545a0cb4a1f14f6d40b8a7e2e6fda0a31fd326c2ac0971c" size=98185582
Sep 14 13:41:36 node02 kubelet: I0914 13:41:36.537365 102143 image_gc_manager.go:375] "Removing image to free bytes" imageID="sha256:6e569a031a681fe9eb5355245369e9772128ae382a175e56b4f12dbd23f7c405" size=488344977
Sep 14 13:41:37 node02 kubelet: I0914 13:41:37.019139 102143 image_gc_manager.go:375] "Removing image to free bytes" imageID="sha256:383867b75fd22e6c8ca3ef2a1042339ec2d5b655365107547eac94e918309b91" size=373308853
Sep 14 13:41:37 node02 kubelet: I0914 13:41:37.701344 102143 image_gc_manager.go:375] "Removing image to free bytes" imageID="sha256:ecc98fc2f376d6560311b66d6958e4350a5a485ee07aa2d1235842d0bce440da" size=108935718
Sep 14 13:41:37 node02 kubelet: I0914 13:41:37.880128 102143 image_gc_manager.go:375] "Removing image to free bytes" imageID="sha256:1319b1eaa0b7bcebae63af321fa67559b9517e8494060403d083bb3508fe52c8" size=104160794
Sep 14 13:41:38 node02 kubelet: I0914 13:41:38.051093 102143 image_gc_manager.go:375] "Removing image to free bytes" imageID="sha256:42cdba9f1b0840cd63254898edeaf6def81a503a6a53d57301c3b38e69cd8f15" size=448489007
Sep 14 13:41:38 node02 kubelet: I0914 13:41:38.319540 102143 image_gc_manager.go:375] "Removing image to free bytes" imageID="sha256:c2c45d506085d300b72a6d4b10e3dce104228080a2cf095fc38333afe237e2be" size=132899597
Sep 14 13:41:38 node02 kubelet: I0914 13:41:38.543702 102143 eviction_manager.go:346] "Eviction manager: able to reduce resource pressure without evicting pods." resourceName="ephemeral-storage"
1.3 Test whether an image that is in use can be deleted
Run two pods:
kubectl create deployment nginx-1.14.0 --image=nginx:1.14.0 --port=8888 --replicas=1
kubectl create deployment nginx-1.18.0 --image=nginx:1.18.0 --port=18888 --replicas=1
#------------------------------------------------------------------------------------
[root@master ~]#kubectl get pods -o wide | grep nginx
nginx-1.14.0-767cb6d45b-t84pj 1/1 Running 0 4m10s 10.244.2.91 node02 <none> <none>
nginx-1.18.0-5b568cd4d9-fdsm5 1/1 Running 0 74s 10.244.2.92 node02 <none> <none>
Check the current images and disk usage:
[root@node02 ~]$df -h | grep /dev/sda2
/dev/sda2 30G 25G 5.6G 82% /
[root@node02 ~]$docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
registry.aliyuncs.com/google_containers/kube-proxy v1.22.5 8f8fdd6672d4 9 months ago 104MB
quay.io/coreos/flannel v0.14.0 8522d622299c 16 months ago 67.9MB
nginx 1.18.0 c2c45d506085 17 months ago 133MB
registry.aliyuncs.com/google_containers/pause 3.5 ed210e3e4a5b 18 months ago 683kB
mysql 5.7.31 42cdba9f1b08 23 months ago 448MB
redis 6.0.6 1319b1eaa0b7 2 years ago 104MB
redis 5.0.5 63130206b0fa 3 years ago 98.2MB
mysql 5.7.27 383867b75fd2 3 years ago 373MB
nginx 1.14.0 ecc98fc2f376 3 years ago 109MB
[root@node02 ~]$
[root@node02 ~]$docker images | wc -l
10
Put the disk under pressure again:
[root@node02 ~]$dd if=/dev/zero of=/root/bigfile4.txt bs=1G count=3
[root@node02 ~]$df -h | grep /dev/sda2
/dev/sda2 30G 27G 3.9G 88% /
Check the images and pods:
# The nginx images on node02 have been deleted
[root@node02 ~]$docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
registry.aliyuncs.com/google_containers/kube-proxy v1.22.5 8f8fdd6672d4 9 months ago 104MB
quay.io/coreos/flannel v0.14.0 8522d622299c 16 months ago 67.9MB
registry.aliyuncs.com/google_containers/pause 3.5 ed210e3e4a5b 18 months ago 683kB
# The nginx pods were evicted from node02 and their replacements were scheduled onto node01, so the nginx images on node02 were no longer in use and could be deleted
[root@master ~]#kubectl get pods -o wide | grep nginx
nginx-1.14.0-767cb6d45b-24ckm 1/1 Running 0 2m13s 10.244.1.80 node01 <none> <none>
nginx-1.14.0-767cb6d45b-7h764 0/1 Evicted 0 2m16s <none> node02 <none> <none>
nginx-1.14.0-767cb6d45b-92vwc 0/1 Evicted 0 2m14s <none> node02 <none> <none>
nginx-1.14.0-767cb6d45b-9vng8 0/1 Evicted 0 2m16s <none> node02 <none> <none>
nginx-1.14.0-767cb6d45b-qk9v6 0/1 Evicted 0 2m16s <none> node02 <none> <none>
nginx-1.14.0-767cb6d45b-t84pj 0/1 ContainerStatusUnknown 1 147m 10.244.2.91 node02 <none> <none>
nginx-1.14.0-767cb6d45b-v5hkv 0/1 Evicted 0 2m15s <none> node02 <none> <none>
nginx-1.14.0-767cb6d45b-v958c 0/1 Evicted 0 2m16s <none> node02 <none> <none>
nginx-1.18.0-5b568cd4d9-82fvl 1/1 Running 0 2m18s 10.244.1.79 node01 <none> <none>
nginx-1.18.0-5b568cd4d9-fdsm5 0/1 Completed 0 144m 10.244.2.92 node02 <none> <none>
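To confirm that the evictions were driven by disk pressure on node02, the node conditions and recent events can be checked from the master (a sketch, assuming kubectl access):
# DiskPressure flips to True while the node is under pressure.
kubectl describe node node02 | grep -i pressure
# Recent events, newest last; the evictions show up with reason Evicted.
kubectl get events --sort-by=.lastTimestamp | grep -i evict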
2. Container garbage collection
2.1 Produce some containers in the Exited state
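One way to produce some, if there are none lying around already (a sketch, assuming Docker is the container runtime and the affected pods can tolerate a restart): kill a few running pod containers. kubelet restarts them according to the pod's restartPolicy, and the killed instances are left behind in the Exited state.
# Kill three running application containers (skip the pause/sandbox containers named k8s_POD_*).
docker ps --format '{{.ID}} {{.Names}}' | grep 'k8s_' | grep -v 'k8s_POD' | head -n 3 | awk '{print $1}' | xargs -r docker kill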
[root@node02 ~]$docker ps -a | grep Exited
881c8a177e36 82d2d47667cf "docker-entrypoint.s…" 2 minutes ago Exited (137) 4 seconds ago k8s_mysql-headless_mysql-headless-ccc69c947-wvztz_default_abd3c6fd-62d8-4d9e-a6ac-b633500a6df3_2
1c6208af3a4c d445c0adc9a5 "docker-entrypoint.s…" 5 minutes ago Exited (137) 2 minutes ago k8s_rabbitmq-headless_rabbitmq-headless-7bc7c4846b-q7w7j_default_590a8769-2d6e-44e5-9953-653a9a6e7fb8_1
375f9c34a54c e6ee27304b44 "sh -c 'java -jar /…" 2 weeks ago Exited (137) 2 minutes ago k8s_eureka-nodeport_eureka-nodeport-5ff6ddb946-mknsr_default_a302d324-9bfc-4a12-8a21-49d15fa941fb_0
ebf8db871e43 721354d41dae "/docker-entrypoint.…" 2 weeks ago Exited (137) 4 seconds ago k8s_zk2-headless_zk2-headless-7ffc5f4d6d-rqzsg_default_402a8d34-500e-4e21-8697-d6cf2dd0e69b_0
3e996d8f23fc 721354d41dae "/docker-entrypoint.…" 2 weeks ago Exited (137) 4 seconds ago k8s_zk3-headless_zk3-headless-5d97cb7479-d5hbf_default_94fd52af-9713-4584-b055-c507b07f1756_0
2.2 Configure the container garbage collection parameters
# vim /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
......
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS --maximum-dead-containers-per-container=1 --maximum-dead-containers=2
-------------------------------------------------------------
systemctl daemon-reload && systemctl restart kubelet
2.3 Check whether the Exited containers are reclaimed
After waiting about a minute, most of the Exited containers have been garbage-collected and only one remains.
[root@node02 ~]$docker ps -a | grep Exited
81c9cecc3f37 8522d622299c "cp -f /etc/kube-fla…" 2 weeks ago Exited (0) 2 weeks ago k8s_install-cni_kube-flannel-ds-7dxd9_kube-system_0f38f608-aff7-4e06-8956-eb0fa64b130f_0
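A quick way to count what is left, instead of eyeballing the grep output:
# Per the result above, this should now report a single remaining Exited container.
docker ps -a --filter status=exited -q | wc -l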