為什麼 EKS worker node 可以自動加入 EKS cluster(一)

為什麼 EKS worker node 可以自動加入 EKS cluster(一)
Photo by Bogdan Farca / Unsplash

如果是一個自建的 Kubernetes cluster,我們會需要使用 Certificate Authority 憑證給予 Kubernetes Components [1] 所使用,如 Kubernetes The Hard Way [2] 或是以下 kubeadm join [3] 命令則可以使 Kubernetes node 加入 cluster:

$ kubeadm join --discovery-token abcdef.1234567890abcdef --discovery-token-ca-cert-hash sha256:1234..cdef 1.2.3.4:6443

在 EKS 環境中,nodegroup 一詞代表了 Kubernetes nodes 集合,而我們也可以使用 eksctl scale [4] 命令 scale up Nodes。

$ eksctl scale ng --nodes=3 --nodes-max=10 --name=ng1-public-ssh --cluster=ironman-2022
2022-09-18 14:14:18 [ℹ]  scaling nodegroup "ng1-public-ssh" in cluster ironman-2022
2022-09-18 14:14:18 [ℹ]  waiting for scaling of nodegroup "ng1-public-ssh" to complete
2022-09-18 14:14:48 [ℹ]  nodegroup successfully scaled

而在短短幾分鐘內,我們並無需要手動設定 TLS 憑證,EKS node 則可以順利自動加入集群。

$ kubectl get node
NAME                                           STATUS   ROLES    AGE     VERSION
...
ip-192-168-40-16.eu-west-1.compute.internal    Ready    <none>   34s     v1.22.12-eks-ba74326
...

故將探討「為什麼 EKS 節點可以自動加入集群」,希望理解 EKS node 預設了哪些設置允許 node 可以自動加入 cluster。本文將著重於:

  • 何謂 EKS node group?
  • EKS AMI bootstrap script 預設了哪些設定?

EKS node group

一個完整的 Kubernetes Components [1] 可以分成為 Control Plane 端及 Node 端。其中 在 EKS cluster 環境中,AWS 服務整合了 EC2、Auto Scaling groups 等服務的 worker node 資源稱之為 nodegroup(節點組)。根據 EKS Node 文件,又可以分成 Managed node groups(託管節點組) [6] 及 Self-managed nodes(自行管理節點組) [7]。

第一天eksctl ClusterConfig 文件定義使用了 Managed node groups ng1-public-ssh

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: ironman-2022
  region: eu-west-1

managedNodeGroups:
  - name: "ng1-public-ssh"
    desiredCapacity: 2
    ssh:
      # Enable ssh access (via the admin container)
      allow: true
      publicKeyName: "ironman-2022"
    iam:
      withAddonPolicies:
        ebs: true
        fsx: true
        efs: true
        awsLoadBalancerController: true
        autoScaler: true

iam:
  withOIDC: true

cloudWatch:
  clusterLogging:
    enableTypes: ["*"]

在建立完環境後,我們會有名為 ng1-public-ssh 的 Managed node groups:

$ eksctl get ng --cluster=ironman-2022
CLUSTER NODEGROUP       STATUS  CREATED                 MIN SIZE        MAX SIZE        DESIRED CAPACITY        INSTANCE TYPE   IMAGE ID        ASG NAME                    TYPE
ironman-2022 ng1-public-ssh  ACTIVE  2022-09-14T09:54:36Z    0               10              3                       m5.large        AL2_x86_64      eks-ng1-public-ssh-62c19db5-f965-bdb7-373a-147e04d9f124      managed

因此我們也可以透過 aws eks describe-nodegroup [8] 命令查看 nodegroup 資源資訊:

$ aws eks describe-nodegroup --cluster-name ironman-2022 --nodegroup-name ng1-public-ssh
{
    "nodegroup": {
        "nodegroupName": "ng1-public-ssh",
        "nodegroupArn": "arn:aws:eks:eu-west-1:111111111111:nodegroup/ironman-2022/ng1-public-ssh/62c19db5-f965-bdb7-373a-147e04d9f124",
        "clusterName": "ironman-2022",
        "version": "1.22",
        "releaseVersion": "1.22.12-20220824",
        "createdAt": "2022-09-14T09:54:36.211000+00:00",
        "modifiedAt": "2022-09-18T13:45:07.322000+00:00",
        "status": "ACTIVE",
        "capacityType": "ON_DEMAND",
        "scalingConfig": {
            "minSize": 0,
            "maxSize": 2,
            "desiredSize": 2
        },
        "instanceTypes": [
            "m5.large"
        ],
        "subnets": [
            "subnet-0e863f9fbcda592a3",
            "subnet-00ebeb2e8903fb3f9",
            "subnet-02d98be342d8ab2a7"
        ],
        "amiType": "AL2_x86_64",
        "nodeRole": "arn:aws:iam::111111111111:role/eksctl-ironman-2022-nodegroup-ng1-publ-NodeInstanceRole-HN27OZ18JS6U",
        "labels": {
            "alpha.eksctl.io/cluster-name": "ironman-2022",
            "alpha.eksctl.io/nodegroup-name": "ng1-public-ssh"
        },
        "resources": {
            "autoScalingGroups": [
                {
                    "name": "eks-ng1-public-ssh-62c19db5-f965-bdb7-373a-147e04d9f124"
                }
            ]
        },
        "health": {
            "issues": []
        },
        "updateConfig": {
            "maxUnavailable": 1
        },
        "launchTemplate": {
            "name": "eksctl-ironman-2022-nodegroup-ng1-public-ssh",
            "version": "1",
            "id": "lt-000b4417a7baebbf8"
        },
        "tags": {
            "aws:cloudformation:stack-name": "eksctl-ironman-2022-nodegroup-ng1-public-ssh",
            "alpha.eksctl.io/cluster-name": "ironman-2022",
            "alpha.eksctl.io/nodegroup-name": "ng1-public-ssh",
            "aws:cloudformation:stack-id": "arn:aws:cloudformation:eu-west-1:111111111111:stack/eksctl-ironman-2022-nodegroup-ng1-public-ssh/2b5ad730-3413-11ed-9adb-0296792ac05b",
            "auto-delete": "never",
            "eksctl.cluster.k8s.io/v1alpha1/cluster-name": "ironman-2022",
            "aws:cloudformation:logical-id": "ManagedNodeGroup",
            "alpha.eksctl.io/nodegroup-type": "managed",
            "alpha.eksctl.io/eksctl-version": "0.111.0"
        }
    }
}

由上述輸出內容 nodegroup 資源包含了此 Node 使用的 IAM role、Auto Scaling Group 名稱、Launch Template 及 CloudFormation 資訊。其中我們可以得知 EKS 預設使用 AMI 為: AL2_x86_64,此為 AWS 基於 Amazon Linux 2 AMI 維護 Amazon EKS optimized Amazon Linux AMIs [9] 。

Amazon EKS AMI Build Specification

使用 aws ec2 describe-instance-attribute 命令查看 EC2 userdata [11] 並透過 base64 命令解析:

$ aws ec2 describe-instance-attribute --instance-id i-1234567890abcdef0 --attribute userData | jq -r .UserData.Value | base64 -d
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="//"

--//
Content-Type: text/x-shellscript; charset="us-ascii"
#!/bin/bash
set -ex
B64_CLUSTER_CA=LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUMvakNDQWVhZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHV... SKIP    ...
... CA      ...
... CONTENT ...
kZtcnI5V1lZRExMUDdLCm0xVUJQUWdzTzRQQlREUjlaLzhpbnZDV0FiT0szM2Z6OVZqU3dBbjlhQ0lXbU5FY2dVMkFUWm1FN0N4WEUrbFkKOFhjPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==
API_SERVER_URL=https://1234567890ABCDEFGHIJKLMNOPQRSTUV.gr7.eu-west-1.eks.amazonaws.com
K8S_CLUSTER_DNS_IP=10.100.0.10
/etc/eks/bootstrap.sh ironman-2022 --kubelet-extra-args '--node-labels=eks.amazonaws.com/sourceLaunchTemplateVersion=1,alpha.eksctl.io/nodegroup-name=ng1-public-ssh,alpha.eksctl.io/cluster-name=ironman-2022,eks.amazonaws.com/nodegroup-image=ami-0ec9e1727a24fb788,eks.amazonaws.com/capacityType=ON_DEMAND,eks.amazonaws.com/nodegroup=ng1-public-ssh,eks.amazonaws.com/sourceLaunchTemplateId=lt-000b4417a7baebbf8 --max-pods=29' --b64-cluster-ca $B64_CLUSTER_CA --apiserver-endpoint $API_SERVER_URL --dns-cluster-ip $K8S_CLUSTER_DNS_IP --use-max-pods false

--//--

我們可以得知 bash script 定義:

  • B64_CLUSTER_CAAPI_SERVER_URLK8S_CLUSTER_DNS_IP 環境變數。
  • 執行了於 /etc/eks/bootstrap.sh 路徑 script。

根據 Amazon EKS optimized Amazon Linux AMI build script [12] 提及了 GitHub Amazon EKS AMI Build Specification [13] 定義了 EKS node bootstrap script 包含 certificate data、control plane endpoint、cluster 名稱等資訊。

而預設的 userdata script 則定義了以下幾個參數:

  • cluster 名稱。
  • --kubelet-extra-args:定義額外 kubelet 參數,方便增加設定 labels 或 taints。
  • --b64-cluster-ca:EKS cluster based64 編碼後的內容。此部分透過 AWS CLI  aws eks describe-cluster 命令取得後設定儲存於 /etc/kubernetes/pki/ca.crt
  • --apiserver-endpoint: EKS cluster API Server endpoint。與 --b64-cluster-ca 一樣皆是透過 aws eks describe-cluster 命令,並設定 kubelet 所使用的 kubeconfig 文件(/var/lib/kubelet/kubeconfig )。
  • --dns-cluster-ip:設定 EKS cluster 內部使用的 DNS IP 地址於 kubelet 設定文件  /etc/kubernetes/kubelet/kubelet-config.json,正是預設 CoreDNS(kube-dns)Service Cluster IP。預設使用 10.100.0.10,若使用 CIDR 10. 為 prefix 則會使用 172.20.0.10[14]
$ kubectl -n kube-system get svc
NAME       TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)         AGE
kube-dns   ClusterIP   10.100.0.10   <none>        53/UDP,53/TCP   4d5h
  • --use-max-pods:設置  kubelet 參數 --max-pods 於 kubelet 設定文件 /etc/kubernetes/kubelet/kubelet-config.json

總結

以上是 EKS worker node 啟用前在預設 EKS optimized Amazon Linux AMI 內預設 script 及針對 kubelet 參數設定,在經過初步了解之後,下一篇我們將會探討 kubelet TLS bootstrapping 啟動過程,來了解實際自動加入 cluster 過程 kubelet 及 control plane 所需設置。


上述資訊透過 EKS 所提供 Logs 來驗證上游 Kubernetes 運作原理,倘若上述內文有所錯誤,隨時可以留言或是私訊我。

參考文件

  1. Kubernetes Components - https://kubernetes.io/docs/concepts/overview/components/#node-components
  2. Provisioning a CA and Generating TLS Certificates | Kubernetes The Hard Way - https://github.com/kelseyhightower/kubernetes-the-hard-way/blob/master/docs/04-certificate-authority.md
  3. kubeadm join - https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-join/
  4. Managing nodegroups | eksctl - https://eksctl.io/usage/managing-nodegroups/
  5. Amazon EKS nodes - https://docs.aws.amazon.com/eks/latest/userguide/eks-compute.html
  6. Managed node groups - https://docs.aws.amazon.com/eks/latest/userguide/managed-node-groups.html
  7. Self-managed nodes - https://docs.aws.amazon.com/eks/latest/userguide/worker.html
  8. aws eks describe-nodegroup - https://docs.aws.amazon.com/cli/latest/reference/eks/describe-nodegroup.html
  9. Amazon EKS optimized Amazon Linux AMIs - https://docs.aws.amazon.com/eks/latest/userguide/eks-optimized-ami.html
  10. aws ec2 describe-instance-attribute - https://docs.aws.amazon.com/cli/latest/reference/ec2/describe-instance-attribute.html
  11. Run commands on your Linux instance at launch - https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/user-data.html
  12. Amazon EKS optimized Amazon Linux AMI build script - https://docs.aws.amazon.com/eks/latest/userguide/eks-ami-build-scripts.html
  13. Amazon EKS AMI Build Specification - https://github.com/awslabs/amazon-eks-ami
  14. https://github.com/awslabs/amazon-eks-ami/blob/master/files/bootstrap.sh#L462-L485
  15. TLS bootstrapping - https://kubernetes.io/docs/reference/access-authn-authz/kubelet-tls-bootstrapping/