In this section, we will see how Karpenter runs workloads on both on-demand and Spot instances, guaranteeing a baseline of on-demand capacity at a desired ratio while using Spot instances to optimize cost.
First, clean up the resources created earlier:
kubectl delete deployment inflate
kubectl delete nodepool.karpenter.sh default
kubectl delete ec2nodeclass.karpenter.k8s.aws default
Let's deploy two NodePools that use Karpenter's scheduling capabilities to split the workload between on-demand (OD) and Spot instances at a desired ratio.
mkdir ~/environment/karpenter
cd ~/environment/karpenter
cat <<EoF> ratiosplit.yaml
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: on-demand
spec:
  disruption:
    consolidationPolicy: WhenUnderutilized
    expireAfter: Never
  limits:
    cpu: "100"
  template:
    metadata:
      labels:
        eks-immersion-team: my-team
    spec:
      nodeClassRef:
        name: default
      requirements:
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values:
            - c
            - m
            - r
        - key: capacity-spread
          operator: In
          values:
            - "1"
        - key: kubernetes.io/arch
          operator: In
          values:
            - amd64
        - key: karpenter.sh/capacity-type
          operator: In
          values:
            - on-demand
        - key: kubernetes.io/os
          operator: In
          values:
            - linux
---
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: spot
spec:
  disruption:
    consolidationPolicy: WhenUnderutilized
    expireAfter: Never
  limits:
    cpu: "100"
  template:
    metadata:
      labels:
        eks-immersion-team: my-team
    spec:
      nodeClassRef:
        name: default
      requirements:
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values:
            - c
            - m
            - r
        - key: capacity-spread
          operator: In
          values:
            - "2"
            - "3"
            - "4"
            - "5"
        - key: kubernetes.io/arch
          operator: In
          values:
            - amd64
        - key: karpenter.sh/capacity-type
          operator: In
          values:
            - spot
        - key: kubernetes.io/os
          operator: In
          values:
            - linux
---
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: AL2
  role: "KarpenterNodeRole-${CLUSTER_NAME}"
  securityGroupSelectorTerms:
    - tags:
        alpha.eksctl.io/cluster-name: $CLUSTER_NAME
  subnetSelectorTerms:
    - tags:
        alpha.eksctl.io/cluster-name: $CLUSTER_NAME
  tags:
    intent: apps
    managed-by: karpenter
    eks-immersion-team: my-team
EoF
kubectl apply -f ratiosplit.yaml
nodepool.karpenter.sh/on-demand created
nodepool.karpenter.sh/spot created
ec2nodeclass.karpenter.k8s.aws/default created
Let's deploy an application with 5 replicas. Using the capacity-spread label as a topology key, pods are spread evenly across the label's values (one value owned by the on-demand NodePool, four by the Spot NodePool), which yields a 4:1 ratio of Spot to on-demand nodes:
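The spread arithmetic behind that ratio can be sketched in plain shell. This is a sketch of the math only, assuming (as in the manifests above) 5 capacity-spread domain values, of which 1 belongs to the on-demand NodePool:

```shell
# Topology spread math sketch (assumed values from the manifests above).
# capacity-spread has 5 domain values: "1" (on-demand) and "2".."5" (spot).
replicas=5
domains=5
od_domains=1
# With maxSkew: 1 and whenUnsatisfiable: DoNotSchedule, the scheduler keeps
# the pod count per domain as even as possible.
pods_per_domain=$(( replicas / domains ))
od_pods=$(( pods_per_domain * od_domains ))
spot_pods=$(( replicas - od_pods ))
echo "on-demand pods: ${od_pods}, spot pods: ${spot_pods}"   # 1 and 4
```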
cd ~/environment/karpenter
cat <<EoF> capacity-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 5
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: capacity-spread
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: inflate
      terminationGracePeriodSeconds: 0
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
          resources:
            requests:
              cpu: 1
      nodeSelector:
        eks-immersion-team: my-team
EoF
kubectl apply -f capacity-deploy.yaml
Karpenter schedules the 5 application pods onto 5 nodes: 1 on-demand node and 4 Spot nodes.
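You can verify the split by listing the nodes with their karpenter.sh/capacity-type label (the -L flag adds it as a column) and counting nodes per type. The live command is shown as a comment; below it, the same counting pipeline runs against a sample of the expected output (hypothetical node names) so the counting step is concrete:

```shell
# Live check (requires the cluster from this lab):
#   kubectl get nodes -L karpenter.sh/capacity-type -l eks-immersion-team=my-team --no-headers \
#     | awk '{print $NF}' | sort | uniq -c
# The same pipeline applied to a sample of the expected output
# (node names are hypothetical; capacity type is the last column):
sample='node-a Ready 1m on-demand
node-b Ready 1m spot
node-c Ready 1m spot
node-d Ready 1m spot
node-e Ready 1m spot'
echo "$sample" | awk '{print $NF}' | sort | uniq -c
```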
Let's scale the deployment to 10 replicas:
kubectl scale deployment inflate --replicas 10
deployment.apps/inflate scaled
Karpenter creates 5 additional nodes, for a total of 10. The new nodes again follow the 4:1 Spot-to-on-demand ratio, so across all 10 nodes we should see 2 on-demand nodes and 8 Spot nodes:
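The same spread arithmetic, applied to the scaled deployment (a sketch under the same assumptions as before: 5 capacity-spread domains, 1 of them on-demand):

```shell
# Spread math for the scaled deployment (assumed domain counts as above).
replicas=10
domains=5
od_domains=1
pods_per_domain=$(( replicas / domains ))
od_pods=$(( pods_per_domain * od_domains ))
spot_pods=$(( replicas - od_pods ))
echo "on-demand pods: ${od_pods}, spot pods: ${spot_pods}"   # 2 and 8
```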
Delete the deployment, NodePool, and EC2NodeClass resources:
kubectl delete deployment inflate
kubectl delete nodepool on-demand
kubectl delete nodepool spot
kubectl delete ec2nodeclasses.karpenter.k8s.aws default