First, delete the Deployment, NodePool, and EC2NodeClass created earlier:
kubectl delete deployment inflate
kubectl delete nodepools.karpenter.sh default
kubectl delete ec2nodeclasses.karpenter.k8s.aws default
Documentation reference: https://karpenter.sh/docs/concepts/nodeclasses/
You will find that only the securityGroupSelectorTerms, subnetSelectorTerms, and amiFamily fields are required:
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  subnetSelectorTerms: [ ... ]        # required, discovers tagged subnets to attach to instances
  securityGroupSelectorTerms: [ ... ] # required, discovers tagged security groups to attach to instances
  instanceProfile: "..."              # optional, overrides the node's identity from global settings
  amiFamily: "..."                    # required, resolves a default ami and userdata
  amiSelectorTerms: [ ... ]           # optional, discovers tagged amis to override the amiFamily's default
  userData: "..."                     # optional, overrides autogenerated userdata with a merge semantic
  tags: { ... }                       # optional, propagates tags to underlying EC2 resources
  metadataOptions: { ... }            # optional, configures IMDS for the instance
  blockDeviceMappings: [ ... ]        # optional, configures storage devices for the instance
  detailedMonitoring: "..."           # optional, configures detailed monitoring for the instance
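If the Karpenter CRDs are installed in your cluster, the same schema can also be inspected directly with kubectl explain; a quick optional check (output formatting varies by kubectl version):
kubectl explain ec2nodeclasses.spec --api-version=karpenter.k8s.aws/v1beta1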
When we created the NodeClass in the first chapter, it was configured like this:
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: AL2
  role: karpenterNodeRole-$CLUSTER_NAME
  securityGroupSelectorTerms:
    - tags:
        alpha.eksctl.io/cluster-name: $CLUSTER_NAME
  subnetSelectorTerms:
    - tags:
        alpha.eksctl.io/cluster-name: $CLUSTER_NAME
  tags:
    intent: apps
    managed-by: karpenter
This section focuses on the other parameters of the NodeClass.
Start by creating a custom NodeClass that uses the AL2023 AMI:
cd ~/environment/karpenter
cat << EoF > custom_ami_node_class.yaml
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: custom
spec:
  amiFamily: AL2023
  role: "KarpenterNodeRole-${CLUSTER_NAME}"
  securityGroupSelectorTerms:
    - tags:
        alpha.eksctl.io/cluster-name: $CLUSTER_NAME
  subnetSelectorTerms:
    - tags:
        alpha.eksctl.io/cluster-name: $CLUSTER_NAME
  tags:
    ami-type: al2023
    intent: apps
    managed-by: karpenter
EoF
kubectl create -f custom_ami_node_class.yaml
All supported AMI families can be found at https://karpenter.sh/docs/concepts/nodeclasses/#specamifamily
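Before wiring this NodeClass into a NodePool, you can optionally verify that it was accepted and inspect its status, where Karpenter records the resolved AMIs, subnets, and security groups (the exact status fields may vary by Karpenter version):
kubectl get ec2nodeclasses.karpenter.k8s.aws custom
kubectl describe ec2nodeclasses.karpenter.k8s.aws custom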
Also create a NodePool that uses this NodeClass:
cd ~/environment/karpenter
cat << EoF > custom_karpenter_nodepool.yaml
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: custom
spec:
  disruption:
    consolidateAfter: 30s
    consolidationPolicy: WhenEmpty
    expireAfter: Never
  limits:
    cpu: "50"
  template:
    metadata:
      labels:
        eks-immersion-team: default
    spec:
      nodeClassRef:
        name: custom
      requirements:
        - key: node.kubernetes.io/instance-type
          operator: In
          values:
            - m5.large
            - m5.xlarge
        - key: kubernetes.io/arch
          operator: In
          values:
            - amd64
        - key: kubernetes.io/os
          operator: In
          values:
            - linux
        - key: karpenter.sh/capacity-type
          operator: In
          values:
            - on-demand
EoF
kubectl create -f custom_karpenter_nodepool.yaml
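As an optional sanity check, confirm that the NodePool exists and is ready before deploying the test workload:
kubectl get nodepools.karpenter.sh custom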
To test the NodeClass, create a Deployment with 5 replicas:
cd ~/environment/karpenter
cat <<EOF > basic-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 5
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      terminationGracePeriodSeconds: 0
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
          resources:
            requests:
              cpu: 1
      nodeSelector:
        eks-immersion-team: default
EOF
kubectl apply -f basic-deploy.yaml
Karpenter will launch a node to run these five pods.
In the EC2 console, confirm that the newly launched node is running AL2023.
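If you prefer the command line to the console, the same information is available from kubectl; the OS-IMAGE column should show Amazon Linux 2023 (this assumes the eks-immersion-team=default label applied by the NodePool above):
kubectl get nodeclaims
kubectl get nodes -l eks-immersion-team=default -o wide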
After the test, delete the Deployment, NodePool, and NodeClass together:
kubectl delete -f basic-deploy.yaml
kubectl delete -f custom_karpenter_nodepool.yaml
kubectl delete -f custom_ami_node_class.yaml
The blockDeviceMappings field of the NodeClass specifies the EBS settings for the node.
In a real production environment you may, for example, need a larger root volume, or a separate data volume for container images and logs.
These scenarios can all be handled through blockDeviceMappings, for example:
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
spec:
  blockDeviceMappings:
    - deviceName: /dev/xvda
      ebs:
        volumeSize: 20Gi
        volumeType: gp3
Below we use blockDeviceMappings to create two EBS volumes, and userData to run a script that detects the data volumes at node startup and mounts them under /data$N. Create a file named basic-deploy-blockdevice.yaml with the following content; note that the $CLUSTER_NAME references in the role, securityGroupSelectorTerms, and subnetSelectorTerms fields are expanded from your shell, so make sure the CLUSTER_NAME environment variable is set (or replace them manually):
cd ~/environment/karpenter || mkdir -p ~/environment/karpenter && cd ~/environment/karpenter
cat > basic-deploy-blockdevice.yaml <<EoF
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  disruption:
    consolidateAfter: 30s
    consolidationPolicy: WhenEmpty
    expireAfter: Never
  limits:
    cpu: "10"
  template:
    metadata:
      labels:
        eks-immersion-team: my-team
    spec:
      nodeClassRef:
        name: default
      requirements:
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: karpenter.sh/capacity-type # If not included, the webhook for the AWS cloud provider will default to on-demand
          operator: In
          values: ["on-demand"]
        - key: kubernetes.io/os
          operator: In
          values: ["linux"]
---
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: AL2
  role: karpenterNodeRole-$CLUSTER_NAME
  securityGroupSelectorTerms:
    - tags:
        alpha.eksctl.io/cluster-name: $CLUSTER_NAME
  subnetSelectorTerms:
    - tags:
        alpha.eksctl.io/cluster-name: $CLUSTER_NAME
  blockDeviceMappings:
    - deviceName: /dev/xvda # root Volume to store OS Binaries
      ebs:
        volumeType: gp3 # EBS Volume Type
        volumeSize: 20Gi # Size of the disk
        deleteOnTermination: true # Disk Retention Policy
    - deviceName: /dev/xvdb # Data Volume to store Images, Logs etc
      ebs:
        volumeType: gp3 # EBS Volume Type
        volumeSize: 100Gi # Size of the disk
        deleteOnTermination: true # Disk Retention Policy
  userData: |
    #!/bin/bash
    # Mount data volumes to /data\$N directories on an Amazon Linux worker node (excluding the OS volume)
    # Identify the device name of the root volume
    root_device=\$(mount | awk '\$3 == "/" {print \$1}')
    # Identify the device names of all attached block devices (excluding the root volume)
    device_names=\$(lsblk -d -n -o NAME | grep -v "\$root_device")
    # Loop through each device name and mount the corresponding volume to a directory named /data\$N
    i=1
    for device_name in \$device_names; do
      if ! grep -qs "/dev/\$device_name" /proc/mounts; then
        sudo mkfs.xfs "/dev/\$device_name"
        sudo mkdir -p "/data\${i}"
        sudo mount "/dev/\$device_name" "/data\${i}"
        echo "Mounted /dev/\$device_name to /data\${i}"
        ((i++))
      fi
    done
  tags:
    intent: apps
    managed-by: karpenter
EoF
Deploy the NodePool and NodeClass:
kubectl apply -f basic-deploy-blockdevice.yaml
Once the NodePool and NodeClass have been created, deploy the inflate application. Because its nodeSelector is eks-immersion-team: my-team, it will trigger the NodePool to launch a new node:
cd ~/environment/karpenter || mkdir -p ~/environment/karpenter && cd ~/environment/karpenter
cat > basic-app-deploy.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 1
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      terminationGracePeriodSeconds: 0
      containers:
        - name: inflate
          image: nginx
          volumeMounts:
            - name: my-app-volume
              mountPath: /var/log/nginx
          resources:
            requests:
              cpu: 1
      volumes:
        - name: my-app-volume
          hostPath:
            path: /data1
      nodeSelector:
        eks-immersion-team: my-team
EOF
kubectl apply -f basic-app-deploy.yaml
Note that the pod above mounts /var/log/nginx onto the node's local disk.
Wait a moment and confirm that the new node has been launched:
kubectl get nodes -l eks-immersion-team=my-team
Using this NAME, find the instance in the EC2 console and confirm that it indeed has two EBS volumes attached.
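The same check can be done from the AWS CLI; the sketch below assumes the Kubernetes node name equals the instance's private DNS name, which is the EKS default:
NODE_NAME=$(kubectl get nodes -l eks-immersion-team=my-team -o jsonpath='{.items[0].metadata.name}')
INSTANCE_ID=$(aws ec2 describe-instances \
  --filters "Name=private-dns-name,Values=${NODE_NAME}" \
  --query 'Reservations[0].Instances[0].InstanceId' --output text)
aws ec2 describe-instances --instance-ids "${INSTANCE_ID}" \
  --query 'Reservations[0].Instances[0].BlockDeviceMappings[*].[DeviceName,Ebs.VolumeId]' \
  --output table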
Connect to the EC2 instance with Session Manager.
Run df -h and confirm that data1 is mounted.
Change into the /data1 directory and run ls; you will find two files, error.log and access.log, which are the logs of the nginx pod. View their contents.
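On the node, the checks look roughly like this (the /data1 path assumes the userData script above numbered the first data volume as 1):
df -h | grep data1                 # confirm the data volume is mounted at /data1
ls /data1                          # access.log and error.log written by the nginx pod
sudo tail -n 5 /data1/access.log   # entries appear after the pod serves requests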
Finally, clean up the Deployment, NodePool, and NodeClass:
kubectl delete -f basic-app-deploy.yaml
kubectl delete -f basic-deploy-blockdevice.yaml
By default, Karpenter adds the following tags to the EC2 instances and EBS volumes it creates:
Name: karpenter.sh/nodepool/<nodepool-name>
karpenter.sh/nodepool: <nodepool-name>
kubernetes.io/cluster/<cluster-name>: owned
Additional tags can be added with spec.tags:
spec:
  tags:
    InternalAccountingTag: 1234
    dev.corp.net/app: Calculator
    dev.corp.net/team: MyTeam
The Name tag can be overridden, but tags in the karpenter.sh, karpenter.k8s.aws, and kubernetes.io/cluster domains cannot, because Karpenter relies on them to discover the machines it manages.
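For example, overriding the Name tag through spec.tags is allowed (the value below is just an illustration):
spec:
  tags:
    Name: my-karpenter-node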