Pod affinity is the ability to schedule a new pod onto a node that already runs one or more pods matching given criteria. Karpenter takes the pod affinity rules defined on your pods into account when it provisions nodes. In this section we will use podAffinity to make sure the frontend pods land in the same Availability Zone as the backend pods. The example shown here uses podAffinity, but podAntiAffinity works the same way (it schedules a new pod only onto nodes that do not already run any pod matching the given criteria).
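For contrast, here is a minimal sketch (not used in this lab) of the equivalent podAntiAffinity rule; it would keep a new pod out of any zone that already runs a pod labeled app: backend, the label used later in this lab:
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app: backend          # keep away from zones that already run backend pods
      topologyKey: topology.kubernetes.io/zone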
First, delete the resources created in the previous sections:
kubectl delete deployment inflate
kubectl delete nodepools.karpenter.sh default
kubectl delete ec2nodeclasses.karpenter.k8s.aws default
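Optionally, confirm the old resources are gone before continuing (each command should report that no resources were found):
kubectl get deployment inflate
kubectl get nodepools.karpenter.sh
kubectl get ec2nodeclasses.karpenter.k8s.aws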
Deploy the NodePool:
mkdir ~/environment/karpenter
cd ~/environment/karpenter
cat <<EoF> podaffinity.yaml
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
name: default
spec:
disruption:
consolidationPolicy: WhenUnderutilized
expireAfter: Never
limits:
cpu: "20"
template:
metadata:
labels:
eks-immersion-team: my-team
spec:
nodeClassRef:
name: default
# Requirements that constrain the parameters of provisioned nodes.
# These requirements are combined with pod.spec.affinity.nodeAffinity rules.
# Operators { In, NotIn } are supported to enable including or excluding values
requirements:
- key: "karpenter.k8s.aws/instance-category"
operator: In
values: ["c", "m", "r"]
- key: "kubernetes.io/arch"
operator: In
values: ["amd64"]
- key: "karpenter.sh/capacity-type" # If not included, the webhook for the AWS cloud provider will default to on-demand
operator: In
values: ["on-demand"]
---
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
name: default
spec:
amiFamily: AL2
role: "KarpenterNodeRole-${CLUSTER_NAME}"
securityGroupSelectorTerms:
- tags:
alpha.eksctl.io/cluster-name: $CLUSTER_NAME
subnetSelectorTerms:
- tags:
alpha.eksctl.io/cluster-name: $CLUSTER_NAME
tags:
intent: apps
managed-by: karpenter
EoF
kubectl apply -f podaffinity.yaml
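You can verify that both objects were created; their names match the metadata.name fields in the manifest above:
kubectl get nodepools.karpenter.sh,ec2nodeclasses.karpenter.k8s.aws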
export AZ1="$AWS_REGION"b
export AZ2="$AWS_REGION"c
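These two variables are substituted into the manifests below, so it is worth confirming that they expand to real zone names (for example us-west-2b and us-west-2c when AWS_REGION is us-west-2):
echo "AZ1=$AZ1 AZ2=$AZ2"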
We will deploy two applications, backend and inflate. The backend application uses nodeAffinity to require AZ2, while the inflate application requires AZ1:
cd ~/environment/karpenter
cat <<EoF> nodeaffinity-pod-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: backend
spec:
replicas: 2
selector:
matchLabels:
app: backend
template:
metadata:
labels:
app: backend
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: "topology.kubernetes.io/zone"
operator: "In"
values: ["$AZ2"]
terminationGracePeriodSeconds: 0
containers:
- name: backend
image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
resources:
requests:
cpu: 1
nodeSelector:
eks-immersion-team: my-team
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: inflate
spec:
replicas: 2
selector:
matchLabels:
app: inflate
template:
metadata:
labels:
app: inflate
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: "topology.kubernetes.io/zone"
operator: "In"
values: ["$AZ1"]
terminationGracePeriodSeconds: 0
containers:
- name: inflate
image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
resources:
requests:
cpu: 1
nodeSelector:
eks-immersion-team: my-team
EoF
kubectl apply -f nodeaffinity-pod-deploy.yaml
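While the pods are pending, you can watch Karpenter provision capacity for them; a simple check (assuming the v1beta1 NodeClaim API that matches the NodePool above) is:
kubectl get nodeclaims.karpenter.sh
kubectl get pods -l 'app in (backend, inflate)' -o wide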
Karpenter then launches one node in each of the two AZs, which you can confirm as shown below.
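To see which zone each new node landed in, list the nodes together with their zone label (node names and zones will differ in your cluster):
kubectl get nodes -L topology.kubernetes.io/zone -l eks-immersion-team=my-team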
Each of the two nodes runs two of the pods:
kongpingfan:~/environment/karpenter $ kubectl get pod -o=custom-columns=NODE:.spec.nodeName,NAME:.metadata.name
NODE NAME
ip-192-168-130-42.us-west-2.compute.internal backend-7fb8544cf9-6n5mq
ip-192-168-130-42.us-west-2.compute.internal backend-7fb8544cf9-z547g
ip-192-168-189-213.us-west-2.compute.internal inflate-7c56688b5d-bwm76
ip-192-168-189-213.us-west-2.compute.internal inflate-7c56688b5d-cwmxz
Now deploy the frontend application. The frontend must run in the same AZ as the backend (to reduce cross-AZ traffic), so we add a podAffinity rule to it:
cd ~/environment/karpenter
cat <<EoF> podaffinity-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: frontend
spec:
replicas: 2
selector:
matchLabels:
app: frontend
template:
metadata:
labels:
app: frontend
spec:
affinity:
podAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchLabels:
app: backend
topologyKey: topology.kubernetes.io/zone
terminationGracePeriodSeconds: 0
containers:
- name: frontend
image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
resources:
requests:
cpu: 1
nodeSelector:
eks-immersion-team: my-team
EoF
kubectl apply -f podaffinity-deploy.yaml
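Once the frontend pods are scheduled, you can check where they landed (-o wide also shows the node each pod runs on):
kubectl get pods -l app=frontend -o wide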
All of the frontend pods end up in the same AZ as the backend, spread across two nodes (the first node no longer had enough free CPU, so Karpenter created an additional node):
kongpingfan:~/environment/karpenter $ kubectl get pod -o=custom-columns=NODE:.spec.nodeName,NAME:.metadata.name
NODE NAME
ip-192-168-130-42.us-west-2.compute.internal backend-7fb8544cf9-6n5mq
ip-192-168-130-42.us-west-2.compute.internal backend-7fb8544cf9-z547g
ip-192-168-36-213.us-west-2.compute.internal frontend-8547476cdc-krdtn
ip-192-168-130-42.us-west-2.compute.internal frontend-8547476cdc-tv59n
ip-192-168-189-213.us-west-2.compute.internal inflate-7c56688b5d-bwm76
ip-192-168-189-213.us-west-2.compute.internal inflate-7c56688b5d-cwmxz