Node Template

Each Provisioner must reference an AWSNodeTemplate, which defines the AWS-specific configuration. In this section we will explore the features of the Node Template.

First, delete the deployment, provisioner, and awsnodetemplate created earlier:

kubectl delete deployment inflate
kubectl delete provisioners.karpenter.sh default
kubectl delete awsnodetemplates.karpenter.k8s.aws default
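
Optionally, confirm that the cleanup succeeded; each of the commands below should report that no matching resources are found:

kubectl get deployment inflate
kubectl get provisioners.karpenter.sh
kubectl get awsnodetemplates.karpenter.k8s.aws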

The securityGroupSelector and subnetSelector parameters

Refer to the Node Template documentation: https://karpenter.sh/docs/concepts/node-templates/

You will find that only the securityGroupSelector and subnetSelector parameters are required:

apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: default
spec:
  subnetSelector: { ... }        # required, discovers tagged subnets to attach to instances
  securityGroupSelector: { ... } # required, discovers tagged security groups to attach to instances
  instanceProfile: "..."         # optional, overrides the node's identity from global settings
  amiFamily: "..."               # optional, resolves a default ami and userdata
  amiSelector: { ... }           # optional, discovers tagged amis to override the amiFamily's default
  userData: "..."                # optional, overrides autogenerated userdata with a merge semantic
  tags: { ... }                  # optional, propagates tags to underlying EC2 resources
  metadataOptions: { ... }       # optional, configures IMDS for the instance
  blockDeviceMappings: [ ... ]   # optional, configures storage devices for the instance
  detailedMonitoring: "..."      # optional, configures detailed monitoring for the instance

  • The Provisioner uses the values defined in subnetSelector to discover subnets; when a new node needs to be created, one of the matching subnets is chosen to launch it in.
  • The Provisioner uses securityGroupSelector to discover security groups; when a new node is created, the matching security groups are attached to it (see the AWS CLI check after this list).
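
If you want to preview what these selectors will discover, you can query the tagged resources directly with the AWS CLI. The commands below assume CLUSTER_NAME is exported and use the same tag keys as the template from the first chapter:

# Subnets matched by the subnetSelector
aws ec2 describe-subnets \
  --filters "Name=tag:alpha.eksctl.io/cluster-name,Values=${CLUSTER_NAME}" \
  --query "Subnets[].SubnetId"

# Security groups matched by the securityGroupSelector
aws ec2 describe-security-groups \
  --filters "Name=tag:kubernetes.io/cluster/${CLUSTER_NAME},Values=owned" \
  --query "SecurityGroups[].GroupId"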

When we created the Node Template in the first chapter, it was configured like this:

apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: default
spec:
  subnetSelector:
    alpha.eksctl.io/cluster-name: ${CLUSTER_NAME}
  securityGroupSelector:
    kubernetes.io/cluster/${CLUSTER_NAME}: owned


This section focuses on the other parameters of the Node Template.

Using a custom AMI - amiFamily

First create a custom Node Template that uses the Ubuntu AMI:

cd ~/environment/karpenter
cat << EOF > custom_ami_node_template.yaml
apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: custom

spec:
  subnetSelector:
    alpha.eksctl.io/cluster-name: ${CLUSTER_NAME}
  securityGroupSelector:
    kubernetes.io/cluster/${CLUSTER_NAME}: owned
    
  amiFamily: Ubuntu # Use the Ubuntu AMI family for this template
  tags:
    managed-by: "karpenter"
    ami-type: "ubuntu"
    intent: "apps"
EOF
kubectl create -f custom_ami_node_template.yaml 

Also create a Provisioner:

cd ~/environment/karpenter
cat << EOF > custom_karpenter_provisioner.yaml
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: custom
spec:
  providerRef:
    name: custom
  ttlSecondsAfterEmpty: 30
  labels:
    eks-immersion-team: default
  requirements:
    - key: "node.kubernetes.io/instance-type"
      operator: In
      values: ["m5.large", "m5.xlarge"]
    - key: "kubernetes.io/arch"
      operator: In
      values: ["amd64"]
  limits:
    resources:
      cpu: "50"
EOF
kubectl create -f custom_karpenter_provisioner.yaml
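
Confirm that both objects were created before deploying the test workload:

kubectl get awsnodetemplates.karpenter.k8s.aws custom
kubectl get provisioners.karpenter.sh custom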

To test the Node Template, create a deployment with 5 replicas:

cd ~/environment/karpenter
cat <<EOF > basic-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 5
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      terminationGracePeriodSeconds: 0
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
          resources:
            requests:
              cpu: 1
EOF

kubectl apply -f basic-deploy.yaml
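
Some of the replicas will not fit on the existing nodes and stay Pending; while Karpenter adds capacity you can watch the pods, and the nodes labeled by the custom provisioner, with:

kubectl get pods -l app=inflate -o wide
kubectl get nodes -l eks-immersion-team=default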

Check the Karpenter logs; the custom provisioner launches an EC2 instance:

kongpingfan:~/environment/karpenter $ kubectl -n karpenter logs -l app.kubernetes.io/name=karpenter | grep controller.provisioner
2023-10-27T14:37:05.237Z        DEBUG   controller.provisioner  209 out of 743 instance types were excluded because they would breach limits  {"commit": "61b3e1e-dirty", "provisioner": "custom"}
2023-10-27T14:37:05.266Z        INFO    controller.provisioner  found provisionable pod(s)      {"commit": "61b3e1e-dirty", "pods": "default/inflate-5947bd9774-np687, default/inflate-5947bd9774-2dwj2, default/inflate-5947bd9774-nn5x6", "duration": "29.025095ms"}
2023-10-27T14:37:05.266Z        INFO    controller.provisioner  computed new machine(s) to fit pod(s)   {"commit": "61b3e1e-dirty", "machines": 1, "pods": 3}
2023-10-27T14:37:05.278Z        INFO    controller.provisioner  created machine {"commit": "61b3e1e-dirty", "provisioner": "custom", "machine": "custom-kfb6s", "requests": {"cpu":"3155m","memory":"120Mi","pods":"7"}, "instance-types": "m5.xlarge"}

Check the EC2 console and confirm that the newly launched node is running Ubuntu:

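You can also verify the operating system from the Kubernetes side; the new node should report an Ubuntu image:

kubectl get nodes -l eks-immersion-team=default \
  -o custom-columns=NAME:.metadata.name,OS-IMAGE:.status.nodeInfo.osImage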

The pods that could not previously be scheduled are now placed on this node:


After the test, delete the deployment, provisioner, and node template together:

kubectl delete -f basic-deploy.yaml
kubectl delete -f custom_ami_node_template.yaml
kubectl delete -f custom_karpenter_provisioner.yaml

The blockDeviceMappings parameter

The blockDeviceMappings field in a Node Template specifies the EBS settings for the node.

In production environments you may run into scenarios such as:

  • Attaching an extra volume to store application logs or container images
  • Adjusting the size, IOPS, or encryption settings of the EBS root volume

Both scenarios can be handled through blockDeviceMappings, for example:

apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
spec:
  blockDeviceMappings:
    - deviceName: /dev/xvda
      ebs:
        volumeSize: 20Gi
        volumeType: gp3
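
IOPS, throughput, and encryption can be tuned on the same ebs block. The snippet below is a sketch based on the EBS fields documented for the v1alpha1 AWSNodeTemplate API (iops, throughput, encrypted, kmsKeyID); the KMS key ARN is only a placeholder, and you should check the Node Template documentation for your Karpenter version before relying on these fields:

apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
spec:
  blockDeviceMappings:
    - deviceName: /dev/xvda
      ebs:
        volumeSize: 100Gi
        volumeType: gp3
        iops: 10000                 # provisioned IOPS for the gp3 volume
        throughput: 250             # throughput in MiB/s
        encrypted: true             # encrypt the volume
        kmsKeyID: "arn:aws:kms:REGION:ACCOUNT_ID:key/EXAMPLE"   # placeholder KMS key ARN
        deleteOnTermination: true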

Testing

Change into the karpenter project directory:

cd ~/environment/karpenter  || mkdir -p ~/environment/karpenter  && cd ~/environment/karpenter 

We will use the blockDeviceMappings parameter to create two EBS volumes, and use userData to run a script at node startup that detects the data volumes and mounts them under /data$N. Create a file named basic-deploy-blockdevice.yaml with the content below; note that ${CLUSTER_NAME} on lines 34 and 36 of the file must be replaced with your cluster name:

apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  providerRef:
    name: default
  ttlSecondsAfterEmpty: 30

  labels:
    eks-immersion-team: my-team

  requirements:
    - key: "karpenter.k8s.aws/instance-category"
      operator: In
      values: ["c", "m", "r"]
    - key: "kubernetes.io/arch"
      operator: In
      values: ["amd64"]
    - key: "karpenter.sh/capacity-type" # If not included, the webhook for the AWS cloud provider will default to on-demand
      operator: In
      values: ["on-demand"]
  limits:
    resources:
      cpu: "10"
    
---
apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: default
spec:
  subnetSelector:
    alpha.eksctl.io/cluster-name: ${CLUSTER_NAME}
  securityGroupSelector:
    kubernetes.io/cluster/${CLUSTER_NAME}: owned
    
  blockDeviceMappings:
    - deviceName: /dev/xvda # root Volume to store OS Binaries
      ebs:
        volumeType: gp3 # EBS Volume Type
        volumeSize: 20Gi # Size of the disk
        deleteOnTermination: true # Disk Retention Policy
    - deviceName: /dev/xvdb # Data Volume to store Images, Logs etc
      ebs:
        volumeType: gp3 # EBS Volume Type
        volumeSize: 100Gi # Size of the disk
        deleteOnTermination: true # Disk Retention Policy
  userData: |
    #!/bin/bash -xe
    exec > >(tee /var/log/user-data.log|logger -t user-data -s 2>/dev/console) 2>&1

    # Mount data volumes to /data$N directories on an Amazon Linux worker node (excluding the OS volume)
    
    # Identify the device name of the root volume
    root_device=$(mount | awk '$3 == "/" {print $1}')
    
    # Identify the device names of all attached block devices (excluding root volume)
    device_names=$(lsblk -d -n -o NAME | grep -v "$root_device")
    
    # Loop through each device name and mount the corresponding volume to a directory named /data$N
    i=1
    for device_name in $device_names; do
      if ! grep -qs "/dev/$device_name" /proc/mounts; then
        sudo mkfs.xfs "/dev/$device_name"
        sudo mkdir -p "/data${i}"
        sudo mount "/dev/$device_name" "/data${i}"
        echo "Mounted /dev/$device_name to /data${i}"
        ((i++))
      fi
    done
        
  tags:
    managed-by: "karpenter"
    intent: "apps"

Deploy the Node Template and the Provisioner:

kubectl apply -f basic-deploy-blockdevice.yaml
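
Before creating the workload, you can confirm that the template carries both block device mappings; the command below should print /dev/xvda and /dev/xvdb:

kubectl get awsnodetemplates.karpenter.k8s.aws default \
  -o jsonpath='{.spec.blockDeviceMappings[*].deviceName}'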

Once the provisioner and node template are created, create the inflate application. Because its nodeSelector is eks-immersion-team: my-team, it will trigger the provisioner to launch a new node:

cd ~/environment/karpenter  || mkdir -p ~/environment/karpenter  && cd ~/environment/karpenter 
cat > basic-app-deploy.yaml <<'EOF' 
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 1
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      terminationGracePeriodSeconds: 0
      containers:
        - name: inflate
          image: nginx
          volumeMounts:
          - name: my-app-volume
            mountPath: /var/log/nginx
          resources:
            requests:
              cpu: 1
      volumes:
      - name: my-app-volume
        hostPath:
          path: /data1
      nodeSelector:
        eks-immersion-team: my-team
EOF

kubectl apply -f basic-app-deploy.yaml

Note that the pod above mounts /var/log/nginx onto the local disk via a hostPath volume.
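
You can watch the pod transition from Pending to Running as the new capacity comes online:

kubectl get pods -l app=inflate -w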

After a while, confirm that a new node has been launched:

kubectl get nodes -l eks-immersion-team=my-team


Using this NAME, find the instance in the EC2 console and confirm that it indeed has two EBS volumes:


Connect to the EC2 instance using Session Manager:


Run df -h and confirm that /data1 has been mounted:


Change into the /data1 directory and run ls; you will find two files, error.log and access.log. These are the nginx pod's logs. View their contents:
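
For example, from the Session Manager shell you can list the directory and tail the logs (the file names come from nginx's default log configuration):

cd /data1
ls
sudo tail -n 20 access.log error.log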



Finally, clean up the deployment, node template, and provisioner:

kubectl delete -f basic-app-deploy.yaml

kubectl delete -f basic-deploy-blockdevice.yaml

spec.tags

By default, Karpenter adds the following tags to the EC2 instances and EBS volumes it creates:

Name: karpenter.sh/provisioner-name/<provisioner-name>
karpenter.sh/provisioner-name: <provisioner-name>
kubernetes.io/cluster/<cluster-name>: owned

Additional tags can be added with spec.tags:

spec:
  tags:
    InternalAccountingTag: 1234
    dev.corp.net/app: Calculator
    dev.corp.net/team: MyTeam

The Name tag can be overridden, but tags prefixed with "karpenter.sh", "karpenter.k8s.aws", or "kubernetes.io/cluster" cannot, because Karpenter relies on them to discover the machines it manages.
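
Because these tags end up on the EC2 resources, they also give you a convenient way to find everything a given provisioner has launched, for example with the AWS CLI (assuming a provisioner named default):

aws ec2 describe-instances \
  --filters "Name=tag:karpenter.sh/provisioner-name,Values=default" \
            "Name=instance-state-name,Values=running" \
  --query "Reservations[].Instances[].InstanceId"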