為什麼 EKS cluster 知道預設 CNI Plugin 為 Amazon VPC CNI plugin

預設 EKS cluster 使用 VPC CNI Plugin 作為 CNI。本文將探討「為什麼 EKS cluster 知道預設 CNI Plugin 為 VPC CNI plugin」,希望理解 EKS CNI plugin 設定過程。

為什麼 EKS cluster 知道預設 CNI Plugin 為 Amazon VPC CNI plugin
Photo by Kari Shea / Unsplash

預設 EKS cluster 使用 Amazon VPC Container Network Interface(CNI)Plugin 作為 CNI。EKS 集群建置後,我們即可以透過 kubectl 命令查看 DaemonSet aws-node 而無需手動安裝 CNI plugin。

$ kubectl -n kube-system get ds
NAME         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
aws-node     3         3         3       3            3           <none>          5d11h
kube-proxy   3         3         3       3            3           <none>          5d11h

本文將探討「為什麼 EKS cluster 知道預設 CNI Plugin 為 Amazon VPC CNI plugin」,希望理解 EKS CNI plugin 設定過程。

work node

與上一篇查看 kubelet systemd unit 設定檔時,我們其實已經看過 kubelet 啟用 --network-plugin=cni。kubelet 將會讀取 --network-plugin-dir 目錄設定檔,並使用 CNI 設定檔設置每一個 Pod network。

  • kubelet systemd unit
[ec2-user@ip-192-168-65-212 ~]$ systemctl cat kubelet
# /etc/systemd/system/kubelet.service
[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/kubernetes/kubernetes
After=docker.service iptables-restore.service
Requires=docker.service

[Service]
ExecStartPre=/sbin/iptables -P FORWARD ACCEPT -w 5
ExecStart=/usr/bin/kubelet --cloud-provider aws \
    --config /etc/kubernetes/kubelet/kubelet-config.json \
    --kubeconfig /var/lib/kubelet/kubeconfig \
    --container-runtime docker \
    --network-plugin cni $KUBELET_ARGS $KUBELET_EXTRA_ARGS
...
... 
...
  • 登入 EKS worker node 檢視 kubelet 確認預設 CNI plugin 參數 [2]:
  • --cni-bin-dir="/opt/cni/bin":kubelet 啟用時會查看此目錄
  • --cni-conf-dir="/etc/cni/net.d"
[ec2-user@ip-192-168-65-212 ~]$ journalctl -u kubelet | grep -e "--cni"
Sep 19 09:46:56 ip-192-168-65-212.eu-west-1.compute.internal kubelet[3347]: I0919 09:46:56.765003    3347 flags.go:59] FLAG: --cni-bin-dir="/opt/cni/bin"
Sep 19 09:46:56 ip-192-168-65-212.eu-west-1.compute.internal kubelet[3347]: I0919 09:46:56.765008    3347 flags.go:59] FLAG: --cni-cache-dir="/var/lib/cni/cache"
Sep 19 09:46:56 ip-192-168-65-212.eu-west-1.compute.internal kubelet[3347]: I0919 09:46:56.765014    3347 flags.go:59] FLAG: --cni-conf-dir="/etc/cni/net.d"

不免俗的來驗證一下,我們可以直接透過 EC2 啟用 Amazon EKS optimized Amazon Linux AMI 來比對:

透過 EC2 啟用 EKS AMI 的 node

  • 使用 AMI:amazon/amazon-eks-node-1.22-v20220824
[ec2-user@ip-172-31-44-117 ~]$ ls /opt/cni/bin
bandwidth  bridge  dhcp  firewall  flannel  host-device  host-local  ipvlan  loopback  macvlan  portmap  ptp  sbr  static  tuning  vlan

[ec2-user@ip-172-31-44-117 ~]$ ls -al /etc/cni/net.d/
ls: cannot access /etc/cni/net.d/: No such file or directory

已經有 VPC CNI plugin Pod 的 node

[ec2-user@ip-192-168-65-212 ~]$ ls /opt/cni/bin
aws-cni  aws-cni-support.sh  bandwidth  bridge  dhcp  egress-v4-cni  firewall  flannel  host-device  host-local  ipvlan  loopback  macvlan  portmap  ptp  sbr  static  tuning  vlan
[ec2-user@ip-192-168-65-212 ~]$ sudo cat /etc/cni/net.d/10-aws.conflist
{
  "cniVersion": "0.3.1",
  "name": "aws-cni",
  "plugins": [
    {
      "name": "aws-cni",
      "type": "aws-cni",
      "vethPrefix": "eni",
      "mtu": "9001",
      "pluginLogFile": "/var/log/aws-routed-eni/plugin.log",
      "pluginLogLevel": "DEBUG"
    },
    {
      "name": "egress-v4-cni",
      "type": "egress-v4-cni",
      "mtu": 9001,
      "enabled": "false",
      "nodeIP": "192.168.65.212",
      "ipam": {
         "type": "host-local",
         "ranges": [[{"subnet": "169.254.172.0/22"}]],
         "routes": [{"dst": "0.0.0.0/0"}],
         "dataDir": "/run/cni/v6pd/egress-v4-ipam"
      },
      "pluginLogFile": "/var/log/aws-routed-eni/egress-v4-plugin.log",
      "pluginLogLevel": "DEBUG"
    },
    {
      "type": "portmap",
      "capabilities": {"portMappings": true},
      "snat": true
    }
  ]
}

/opt/cni/bin 路徑正如同 CNI 參數名稱為 CNI binary 路徑。預設 EKS 在製作 worker node AMI 時,透過 script [3] 安裝 常見 CNI plugin binary [4]。上述比對可以了解,在 aws-node Pod 尚未執行之前,aws-cni binary 及設定檔 /etc/cni/net.d/10-aws.conflist 皆尚未被安裝或建立。

接續,我們也可以透過 kubectl 檢視 aws-node Pod logs,可以看到由 Amazon VPC CNI plugin 預設使用的 initContainers amazon-k8s-cni-init 安裝 loopbackportmapbandwidthaws-cni-support.sh plugin。init container 執行結束後,aws-node 也有執行複製設定檔及 binary。以下為 logs:

$ kubectl -n kube-system logs aws-node-5c6w5 --all-containers --timestamps
2022-09-19T09:48:57.333672597Z Copying CNI plugin binaries ...
2022-09-19T09:48:57.333810096Z + PLUGIN_BINS='loopback portmap bandwidth aws-cni-support.sh'
2022-09-19T09:48:57.333831182Z + for b in '$PLUGIN_BINS'
2022-09-19T09:48:57.333835571Z + '[' '!' -f loopback ']'
2022-09-19T09:48:57.333838091Z + for b in '$PLUGIN_BINS'
2022-09-19T09:48:57.333840648Z + '[' '!' -f portmap ']'
2022-09-19T09:48:57.333843085Z + for b in '$PLUGIN_BINS'
2022-09-19T09:48:57.333845490Z + '[' '!' -f bandwidth ']'
2022-09-19T09:48:57.333847874Z + for b in '$PLUGIN_BINS'
2022-09-19T09:48:57.333850503Z + '[' '!' -f aws-cni-support.sh ']'
2022-09-19T09:48:57.333908878Z + HOST_CNI_BIN_PATH=/host/opt/cni/bin
2022-09-19T09:48:57.333911360Z + echo 'Copying CNI plugin binaries ... '
2022-09-19T09:48:57.333914044Z + for b in '$PLUGIN_BINS'
2022-09-19T09:48:57.333916695Z + install loopback /host/opt/cni/bin
2022-09-19T09:48:57.358632542Z + for b in '$PLUGIN_BINS'
2022-09-19T09:48:57.358667213Z + install portmap /host/opt/cni/bin
2022-09-19T09:48:57.371180218Z + for b in '$PLUGIN_BINS'
2022-09-19T09:48:57.371397188Z + install bandwidth /host/opt/cni/bin
2022-09-19T09:48:57.386902686Z + for b in '$PLUGIN_BINS'
2022-09-19T09:48:57.387089749Z + install aws-cni-support.sh /host/opt/cni/bin
2022-09-19T09:48:57.389252423Z + echo 'Configure rp_filter loose... '
2022-09-19T09:48:57.389584940Z Configure rp_filter loose...
2022-09-19T09:48:57.389945375Z ++ get_metadata local-ipv4
2022-09-19T09:48:57.391264947Z +++ curl -X PUT http://169.254.169.254/latest/api/token -H 'X-aws-ec2-metadata-token-ttl-seconds: 60'
2022-09-19T09:48:57.473320665Z   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
2022-09-19T09:48:57.473705220Z                                  Dload  Upload   Total   Spent    Left  Speed
100    56  100    56    0     0  28000      0 --:--:-- --:--:-- --:--:-- 56000
2022-09-19T09:48:57.478526895Z ++ TOKEN=AQAEAJKJMlzdfIbRn8BQh4U5I2i0dMhKrW4DxRx64XuuP_g4rPzuOw==
2022-09-19T09:48:57.478907025Z ++ attempts=60
2022-09-19T09:48:57.478914505Z ++ false
2022-09-19T09:48:57.478917627Z ++ '[' 1 -gt 0 ']'
2022-09-19T09:48:57.478920837Z ++ '[' 60 -eq 0 ']'
2022-09-19T09:48:57.479064447Z +++ curl -H 'X-aws-ec2-metadata-token: AQAEAJKJMlzdfIbRn8BQh4U5I2i0dMhKrW4DxRx64XuuP_g4rPzuOw==' http://169.254.169.254/latest/meta-data/local-ipv4
2022-09-19T09:48:57.486678298Z   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
2022-09-19T09:48:57.486692175Z                                  Dload  Upload   Total   Spent    Left  Speed
100    14  100    14    0     0  14000      0 --:--:-- --:--:-- --:--:-- 14000
2022-09-19T09:48:57.488090438Z ++ meta=192.168.65.212
2022-09-19T09:48:57.488210992Z ++ '[' 0 -gt 0 ']'
2022-09-19T09:48:57.488331224Z ++ '[' 0 -gt 0 ']'
2022-09-19T09:48:57.488430469Z ++ echo 192.168.65.212
2022-09-19T09:48:57.488788529Z + HOST_IP=192.168.65.212
2022-09-19T09:48:57.489299361Z ++ get_metadata mac
2022-09-19T09:48:57.489716539Z +++ curl -X PUT http://169.254.169.254/latest/api/token -H 'X-aws-ec2-metadata-token-ttl-seconds: 60'
2022-09-19T09:48:57.497186993Z   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
2022-09-19T09:48:57.497200745Z                                  Dload  Upload   Total   Spent    Left  Speed
100    56  100    56    0     0  56000      0 --:--:-- --:--:-- --:--:-- 56000
2022-09-19T09:48:57.499248303Z ++ TOKEN=AQAEAJKJMlxeFjZlUL_0INYoCXkf7UWmVm4nIKV6nnDeG_VvdZ9-Ig==
2022-09-19T09:48:57.499358341Z ++ attempts=60
2022-09-19T09:48:57.499440488Z ++ false
2022-09-19T09:48:57.499594430Z ++ '[' 1 -gt 0 ']'
2022-09-19T09:48:57.499599969Z ++ '[' 60 -eq 0 ']'
2022-09-19T09:48:57.500015110Z +++ curl -H 'X-aws-ec2-metadata-token: AQAEAJKJMlxeFjZlUL_0INYoCXkf7UWmVm4nIKV6nnDeG_VvdZ9-Ig==' http://169.254.169.254/latest/meta-data/mac
2022-09-19T09:48:57.511676682Z   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
2022-09-19T09:48:57.511709736Z                                  Dload  Upload   Total   Spent    Left  Speed
100    17  100    17    0     0  17000      0 --:--:-- --:--:-- --:--:-- 17000
2022-09-19T09:48:57.515247164Z ++ meta=06:1b:3e:40:af:fd
2022-09-19T09:48:57.515648955Z ++ '[' 0 -gt 0 ']'
2022-09-19T09:48:57.515656468Z ++ '[' 0 -gt 0 ']'
2022-09-19T09:48:57.515659528Z ++ echo 06:1b:3e:40:af:fd
2022-09-19T09:48:57.515827055Z + PRIMARY_MAC=06:1b:3e:40:af:fd
2022-09-19T09:48:57.516696407Z ++ grep -F 'link/ether 06:1b:3e:40:af:fd'
2022-09-19T09:48:57.516908826Z ++ awk '-F[ :]+' '{print $2}'
2022-09-19T09:48:57.517104498Z ++ ip -o link show
2022-09-19T09:48:57.529715651Z + PRIMARY_IF=eth0
2022-09-19T09:48:57.529881853Z + sysctl -w net.ipv4.conf.eth0.rp_filter=2
2022-09-19T09:48:57.549511061Z net.ipv4.conf.eth0.rp_filter = 2
2022-09-19T09:48:57.550035252Z + cat /proc/sys/net/ipv4/conf/eth0/rp_filter
2022-09-19T09:48:57.552013538Z 2
2022-09-19T09:48:57.552296409Z + '[' false == true ']'
2022-09-19T09:48:57.552638323Z + sysctl -e -w net.ipv4.tcp_early_demux=1
2022-09-19T09:48:57.557945154Z net.ipv4.tcp_early_demux = 1
2022-09-19T09:48:57.559523168Z + '[' false == true ']'
2022-09-19T09:48:57.560537081Z + echo 'CNI init container done'
2022-09-19T09:48:57.560663349Z CNI init container done
2022-09-19T09:48:58.240326675Z {"level":"info","ts":"2022-09-19T09:48:58.237Z","caller":"entrypoint.sh","msg":"Validating env variables ..."}
2022-09-19T09:48:58.241282124Z {"level":"info","ts":"2022-09-19T09:48:58.240Z","caller":"entrypoint.sh","msg":"Install CNI binaries.."}
2022-09-19T09:48:58.316896574Z {"level":"info","ts":"2022-09-19T09:48:58.316Z","caller":"entrypoint.sh","msg":"Starting IPAM daemon in the background ... "}
2022-09-19T09:48:58.318121492Z {"level":"info","ts":"2022-09-19T09:48:58.317Z","caller":"entrypoint.sh","msg":"Checking for IPAM connectivity ... "}
2022-09-19T09:49:00.401119692Z {"level":"info","ts":"2022-09-19T09:49:00.397Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}
2022-09-19T09:49:00.441859850Z {"level":"info","ts":"2022-09-19T09:49:00.441Z","caller":"entrypoint.sh","msg":"Copying config file ... "}
2022-09-19T09:49:00.450179414Z {"level":"info","ts":"2022-09-19T09:49:00.449Z","caller":"entrypoint.sh","msg":"Successfully copied CNI plugin binary and config file."}
2022-09-19T09:49:00.451045960Z {"level":"info","ts":"2022-09-19T09:49:00.450Z","caller":"entrypoint.sh","msg":"Foregrounding IPAM daemon ..."}

Amazon VPC CNI plugin entrypoint [5],可以確認其 aws-cni binary 為自身啟用時會複製至本地端,同時也能查看到 volumeMount mount 相應目錄。

$ kubectl -n kube-system get ds aws-node -o yaml
...
...
		image: 602401143452.dkr.ecr.eu-west-1.amazonaws.com/amazon-k8s-cni:v1.10.1-eksbuild.1
...
...
        volumeMounts:
        - mountPath: /host/opt/cni/bin
          name: cni-bin-dir
        - mountPath: /host/etc/cni/net.d
          name: cni-net-dir
        - mountPath: /host/var/log/aws-routed-eni
          name: log-dir
        - mountPath: /var/run/aws-node
          name: run-dir
        - mountPath: /var/run/dockershim.sock
          name: dockershim
        - mountPath: /run/xtables.lock
          name: xtables-lock
...
...
      volumes:
      - hostPath:
          path: /opt/cni/bin
          type: ""
        name: cni-bin-dir
      - hostPath:
          path: /etc/cni/net.d
          type: ""
        name: cni-net-dir
      - hostPath:
          path: /var/run/dockershim.sock
          type: ""
        name: dockershim
      - hostPath:
          path: /run/xtables.lock
          type: ""
        name: xtables-lock
      - hostPath:
          path: /var/log/aws-routed-eni
          type: DirectoryOrCreate
        name: log-dir
      - hostPath:
          path: /var/run/aws-node
          type: DirectoryOrCreate
        name: run-dir

那為什麼需要這樣設定一個 entrypoint script 安裝 CNI Binary 呢?

VPC CNI plugin 主要有兩個元件:

  • CNI Plugin:主要整合 host 及 Pod 網路設定。
  • ipamd:長期執行於節點本地端的 IP Address Management (IPAM) daemon 主要負責以下兩件事:
  • 維持一個可用的 IP 地址 warm-pool
  • 指派 IP 地址給 Pod

在  Amazon VPC CNI plugin entrypoint [5] 的註解裡解釋:一般來說,kubelet 在查看 well-known directory(這邊指的是 /opt/cni/bin/etc/cni/net.d)時,就會將 CNI plugin 視為 Ready 狀態。然而因為 VPC CNI plugin 提供了上述兩個元件,希望在成功啟動 IPAM daemon 可以確保連接上 Kubernetes 及本地端 EC2 metadata service,並將 aws-cni binary 複製到 well-known directory。

API server

透過 CloudWatch Logs insight syntax 檢視 kube-apiserver-audit logs,並且過濾 verb 為 create。

filter @logStream like /^kube-apiserver-audit/
 | fields @timestamp, @message
 | sort @timestamp asc
 | filter objectRef.name == 'aws-node' AND objectRef.resource == 'daemonsets' AND verb == 'create'
 | limit 10000

能發現 username 為 eks: cluster-bootstrap 透過 kubectl 方式部署 create 了 aws-node DaemonSet。

  • userAgent: kubectl/v1.22.12 (linux/amd64) kubernetes/dade57b
  • user.username: eks: cluster-bootstrap

總結

由上述驗證後,我們可以了解 EKS 在 cluster bootstrap 過程中透過 kubectl 方式部署 VPC CNI plugin。而在 worker node 層級,設定了 kubelet CNI 相關參數,並由 VPC CNI plugin 複製 binary 檔案至主機上使用。


上述資訊透過 EKS 所提供 Logs 來驗證上游 Kubernetes 運作原理,倘若上述內文有所錯誤,隨時可以留言或是私訊我。

參考文件

  1. Pod networking in Amazon EKS using the Amazon VPC CNI plugin for Kubernetes - https://docs.aws.amazon.com/eks/latest/userguide/pod-networking.html
  2. Network Plugins | Kubernetes 1.22 - https://v1-22.docs.kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/network-plugins/#cni
  3. https://github.com/awslabs/amazon-eks-ami/blob/master/scripts/install-worker.sh#L240
  4. The Container Network Interface - https://www.cni.dev/plugins/current/
  5. https://github.com/aws/amazon-vpc-cni-k8s/blob/master/scripts/entrypoint.sh