In Kubernetes, taints and tolerations control which nodes a pod can be scheduled onto. A taint is set on a node, and only pods that declare a matching toleration can be scheduled onto that node. In this section we will add a taint to the Provisioner.
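As a side note, a taint can also be applied to an existing node directly with kubectl. The commands below are only an illustration of the concept; the node name and the example value are placeholders and not part of this lab:
kubectl taint nodes <node-name> systemnodes=example:NoSchedule
# remove the taint again with the trailing minus
kubectl taint nodes <node-name> systemnodes=example:NoSchedule-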
First, delete the resources created earlier:
kubectl delete deployment inflate
kubectl delete provisioners.karpenter.sh default
kubectl delete awsnodetemplates.karpenter.k8s.aws default
Create the Provisioner. Note that it defines a taint with the key systemnodes:
mkdir ~/environment/karpenter
cd ~/environment/karpenter
cat <<EoF> taint.yaml
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  # References cloud provider-specific custom resource, see your cloud provider specific documentation
  providerRef:
    name: default
  ttlSecondsAfterEmpty: 30

  # Labels are arbitrary key-values that are applied to all nodes
  labels:
    eks-immersion-team: my-team

  # Requirements that constrain the parameters of provisioned nodes.
  # These requirements are combined with pod.spec.affinity.nodeAffinity rules.
  # Operators { In, NotIn } are supported to enable including or excluding values
  requirements:
    - key: "karpenter.k8s.aws/instance-category"
      operator: In
      values: ["c", "m", "r"]
    - key: "kubernetes.io/arch"
      operator: In
      values: ["amd64"]
    - key: "karpenter.sh/capacity-type" # If not included, the webhook for the AWS cloud provider will default to on-demand
      operator: In
      values: ["on-demand"]
  taints:
    - key: systemnodes
      effect: NoSchedule
  limits:
    resources:
      cpu: "10"
---
apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: default
spec:
  subnetSelector:
    alpha.eksctl.io/cluster-name: ${CLUSTER_NAME}
  securityGroupSelector:
    aws:eks:cluster-name: ${CLUSTER_NAME}
  tags:
    managed-by: "karpenter"
    intent: "apps"
EoF
kubectl apply -f taint.yaml
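To confirm the taint has been registered on the Provisioner, you can inspect its spec (a quick sanity check; the jsonpath simply mirrors the spec.taints field defined above):
kubectl get provisioners.karpenter.sh default -o jsonpath='{.spec.taints}'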
Deploy an application. This Deployment does not declare a toleration for that taint:
cd ~/environment/karpenter
cat <<EoF> taint-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 3
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      terminationGracePeriodSeconds: 0
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
          resources:
            requests:
              cpu: 1
      nodeSelector:
        eks-immersion-team: my-team
EoF
kubectl apply -f taint-deploy.yaml
After waiting for a while, we can see that the pods are still in Pending state:
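The pod status can be checked with the Deployment's app=inflate label:
kubectl get pods -l app=inflate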
Checking the Karpenter logs shows that Karpenter cannot launch a node for these pods, because they do not tolerate the taint:
kongpingfan:~/environment/karpenter $ kubectl -n karpenter logs -l app.kubernetes.io/name=karpenter
2023-10-28T02:50:59.607Z DEBUG controller.provisioner 444 out of 743 instance types were excluded because they would breach limits {"commit": "61b3e1e-dirty", "provisioner": "default"}
2023-10-28T02:50:59.607Z ERROR controller.provisioner Could not schedule pod, incompatible with provisioner "default", daemonset overhead={"cpu":"155m","memory":"120Mi","pods":"4"}, did not tolerate systemnodes=:NoSchedule {"commit": "61b3e1e-dirty", "pod": "default/inflate-6c7f87df76-jhz5g"}
2023-10-28T02:50:59.607Z ERROR controller.provisioner Could not schedule pod, incompatible with provisioner "default", daemonset overhead={"cpu":"155m","memory":"120Mi","pods":"4"}, did not tolerate systemnodes=:NoSchedule {"commit": "61b3e1e-dirty", "pod": "default/inflate-6c7f87df76-d2p57"}
2023-10-28T02:50:59.607Z ERROR controller.provisioner Could not schedule pod, incompatible with provisioner "default", daemonset overhead={"cpu":"155m","memory":"120Mi","pods":"4"}, did not tolerate systemnodes=:NoSchedule {"commit": "61b3e1e-dirty", "pod": "default/inflate-6c7f87df76-glvjm"}
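The same reason should also show up in the events of the pending pods (the exact event text varies by cluster):
kubectl describe pods -l app=inflate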
Now we add a toleration to the application and deploy it again:
cd ~/environment/karpenter
cat <<EoF> taint-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 3
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      terminationGracePeriodSeconds: 0
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
          resources:
            requests:
              cpu: 1
      tolerations:
        - key: "systemnodes"
          operator: "Exists"
          effect: "NoSchedule"
      nodeSelector:
        eks-immersion-team: my-team
EoF
kubectl apply -f taint-deploy.yaml
After a short wait, Karpenter launches a new node and the pods are scheduled onto it:
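To verify, list the nodes carrying the Provisioner's label and confirm the pods are now Running (both labels come from the manifests above):
kubectl get nodes -l eks-immersion-team=my-team
kubectl get pods -l app=inflate -o wide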