When a PDB is in place, Karpenter evicts pods using an eviction strategy with exponential backoff rather than force-deleting them, so a PDB that would be violated blocks the node deletion. A PDB specifies the minimum number of pods that must stay running for a Deployment, ReplicationController, ReplicaSet, or StatefulSet, which prevents mass evictions and keeps production applications running smoothly.
In this section we set a PDB that requires at least 4 pods to stay available, and observe how Karpenter behaves when it runs into this constraint.
First, delete the resources created in the previous section:
kubectl delete deployment inflate
kubectl delete provisioners.karpenter.sh default
kubectl delete awsnodetemplates.karpenter.k8s.aws default
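If you want to confirm the cleanup went through (an optional check, not part of the original steps), verify that the Deployment no longer exists; the command should return a NotFound error:
kubectl get deployment inflate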
Set environment variables for two AZs:
export AZ1="$AWS_REGION"b
export AZ2="$AWS_REGION"c
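As a quick sanity check (optional), print the two variables to confirm they point at the expected availability zones:
echo $AZ1 $AZ2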
Deploy the Provisioner and node template. The Provisioner restricts instances to fewer than 3 vCPUs and places the nodes in AZ2:
mkdir -p ~/environment/karpenter
cd ~/environment/karpenter
cat <<EoF> pdb-provisioner.yaml
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  # References cloud provider-specific custom resource, see your cloud provider specific documentation
  providerRef:
    name: default
  ttlSecondsAfterEmpty: 30
  # Labels are arbitrary key-values that are applied to all nodes
  labels:
    eks-immersion-team: my-team
  # Requirements that constrain the parameters of provisioned nodes.
  # These requirements are combined with pod.spec.affinity.nodeAffinity rules.
  # Operators { In, NotIn } are supported to enable including or excluding values
  requirements:
    - key: "karpenter.k8s.aws/instance-category"
      operator: In
      values: ["c", "m"]
    - key: "karpenter.k8s.aws/instance-cpu"
      operator: Lt
      values: ["3"]
    - key: "topology.kubernetes.io/zone"
      operator: In
      values: ["$AZ2"]
    - key: "kubernetes.io/arch"
      operator: In
      values: ["amd64"]
    - key: "karpenter.sh/capacity-type" # If not included, the webhook for the AWS cloud provider will default to on-demand
      operator: In
      values: ["on-demand"]
  limits:
    resources:
      cpu: "20"
---
apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: default
spec:
  subnetSelector:
    alpha.eksctl.io/cluster-name: ${CLUSTER_NAME}
  securityGroupSelector:
    aws:eks:cluster-name: ${CLUSTER_NAME}
  tags:
    managed-by: "karpenter"
    intent: "apps"
EoF
kubectl apply -f pdb-provisioner.yaml
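To confirm that both resources were created (an optional check), list them:
kubectl get provisioners.karpenter.sh
kubectl get awsnodetemplates.karpenter.k8s.aws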
First create a PDB that matches the app: inflate label. It forbids further evictions once fewer than 4 pods would remain available, which blocks node reclamation:
cd ~/environment/karpenter
cat <<EoF> pdb.yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: inflate-pdb
spec:
  minAvailable: 4
  selector:
    matchLabels:
      app: inflate
EoF
kubectl apply -f pdb.yaml
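You can inspect the PDB at any time (an optional check). Before the Deployment is created, ALLOWED DISRUPTIONS should show 0; once all 6 replicas are running it should rise to 2 (6 healthy pods minus minAvailable 4):
kubectl get pdb inflate-pdb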
Create a Deployment with 6 replicas; each container requests 1Gi of memory and 1 CPU:
cd ~/environment/karpenter
cat <<EoF> pdb-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 6
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      terminationGracePeriodSeconds: 0
      containers:
        - name: app
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
          resources:
            requests:
              memory: 1Gi
              cpu: 1
      nodeSelector:
        eks-immersion-team: my-team
EoF
kubectl apply -f pdb-deploy.yaml
You can see that Karpenter creates 6 nodes in AZ2:
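One way to confirm this from the command line (using the eks-immersion-team label set by the Provisioner and the standard zone label) is to list the provisioned nodes together with their zones:
kubectl get nodes -l eks-immersion-team=my-team -L topology.kubernetes.io/zone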
Running the command below also confirms that the 6 pods are spread across 6 nodes:
kongpingfan:~/environment/karpenter $ kubectl get pod -o=custom-columns=NODE:.spec.nodeName,NAME:.metadata.name
NODE                                           NAME
ip-192-168-49-6.us-west-2.compute.internal     inflate-754f46b654-4bl42
ip-192-168-141-71.us-west-2.compute.internal   inflate-754f46b654-62mld
ip-192-168-128-83.us-west-2.compute.internal   inflate-754f46b654-8h7bl
ip-192-168-133-59.us-west-2.compute.internal   inflate-754f46b654-gwjj5
ip-192-168-49-199.us-west-2.compute.internal   inflate-754f46b654-h7qd4
ip-192-168-52-131.us-west-2.compute.internal   inflate-754f46b654-k669m
Now we drain one of the nodes from the EKS cluster. Karpenter lets the eviction proceed and launches a new node:
kubectl drain --ignore-daemonsets $(kubectl get nodes -l "eks-immersion-team" -o name | tail -n1)
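If you want to follow what happens (optional), watch the inflate pods: the evicted pod goes Pending and is then scheduled onto the node Karpenter launches (press Ctrl+C to stop watching):
kubectl get pods -l app=inflate -o wide -w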
Check the Karpenter logs:
kubectl -n karpenter logs -l app.kubernetes.io/name=karpenter
You can see that Karpenter first brings up a new node, then removes the drained one:
Now we try to drain three nodes at once. Because that would drop the number of available pods below the PDB minimum of 4, the drain reports an error:
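One possible way to issue the three drains, reusing the label selector from the previous drain (a sketch; which nodes are picked depends on ordering in your cluster), is shown below. kubectl drain keeps retrying evictions that are blocked by the PDB:
for node in $(kubectl get nodes -l "eks-immersion-team" -o name | tail -n3); do
  kubectl drain --ignore-daemonsets $node
done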
The first two nodes drain without any problem, but the third one fails. After several retries, once the PDB is satisfied again (Karpenter has launched new nodes and the pods are running on them), the third node is finally drained.
In the end the cluster returns to a steady state. During this process, Karpenter first launched two nodes (while the first two drained nodes were removed), then launched one more (while the last drained node was removed):
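To confirm the cluster has settled (an optional final check), make sure all 6 inflate pods are running again and the PDB is satisfied:
kubectl get pods -l app=inflate -o wide
kubectl get pdb inflate-pdb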