Last updated: 15 May 26 12:21:44 (Asia/Shanghai)

Initializing k8s

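Note: Docker Engine has never implemented the CRI itself, so Kubernetes dropped Docker support completely in 1.24 and no longer provides dockershim.sock.
On 1.24 or later, either add cri-dockerd to keep using Docker, or use containerd as the runtime.
For details, see: https://github.com/Mirantis/cri-dockerd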
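
The version rule above can be turned into a small check. This is only a sketch: needs_cri_dockerd is a hypothetical helper name, and it assumes the version string looks like 1.25.0 or v1.25.0.

```shell
# Hypothetical helper: succeeds (exit status 0) when the given Kubernetes
# version is 1.24 or later, i.e. dockershim is gone and Docker would need
# cri-dockerd. Assumes a MAJOR.MINOR[.PATCH] version string.
needs_cri_dockerd() {
  local version="${1#v}"        # accept "v1.25.0" as well as "1.25.0"
  local major="${version%%.*}"  # text before the first dot
  local rest="${version#*.}"    # text after the first dot
  local minor="${rest%%.*}"     # text up to the next dot
  [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 24 ]; }
}

# Example:
#   if needs_cri_dockerd 1.25.0; then echo "install cri-dockerd or use containerd"; fi
```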

Official documentation: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/
In practice you will run into a few problems, most of them caused by Google and k8s domains being unreachable.

kubeadm installation steps:
Start by following the official documentation:

sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl

After that, switch to a mirror in mainland China:

curl -s https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | sudo apt-key add -
echo 'deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main' | sudo tee -a /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update
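
Appending the deb line adds another copy every time the step is repeated. A minimal sketch of an idempotent variant follows; add_line_once is a hypothetical helper name, and it needs write access to the target file, so it would have to run in a root shell for /etc/apt/sources.list.d/kubernetes.list.

```shell
# Hypothetical helper: append a line to a file only if that exact line
# is not already present, so re-running the setup adds no duplicates.
add_line_once() {
  local file="$1" line="$2"
  touch "$file"   # make sure the file exists before grep reads it
  # -x: match the whole line, -F: treat the line as a fixed string
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example (as root):
#   add_line_once /etc/apt/sources.list.d/kubernetes.list \
#     'deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main'
```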

Note:
This step may fail with: The following signatures couldn’t be verified because the public key is not available: NO_PUBKEY FEEA9169307EA071 NO_PUBKEY 8B57C5C2836F4BEB
To fix it:

sudo gpg --keyserver keyserver.ubuntu.com --recv FEEA9169307EA071
sudo gpg --export --armor FEEA9169307EA071 | sudo apt-key add -
sudo apt-get update
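
The error message can name more than one key ID. As a sketch, the IDs can be pulled out of the apt-get output so that the same gpg commands are repeated for each of them; missing_apt_keys is a hypothetical helper name.

```shell
# Hypothetical helper: read apt-get output on stdin and print each
# distinct key ID that follows a NO_PUBKEY marker, one per line.
missing_apt_keys() {
  grep -oE 'NO_PUBKEY [0-9A-F]+' | awk '{print $2}' | LC_ALL=C sort -u
}

# Example (needs network access and root):
#   sudo apt-get update 2>&1 | missing_apt_keys | while read -r key; do
#     sudo gpg --keyserver keyserver.ubuntu.com --recv "$key"
#     sudo gpg --export --armor "$key" | sudo apt-key add -
#   done
```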

The update should now succeed.

Once the steps above are done, install the packages:

sudo apt-get install -y kubelet kubeadm kubectl --allow-unauthenticated
sudo apt-mark hold kubelet kubeadm kubectl
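
Before moving on it can be worth confirming that all three binaries actually ended up on PATH. A minimal sketch, with missing_commands as a hypothetical helper name:

```shell
# Hypothetical helper: print the name of every given command that is
# not found on PATH; prints nothing when all of them are available.
missing_commands() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# Example:
#   missing_commands kubelet kubeadm kubectl
```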

Startup steps:
Before initializing, you can run the preflight checks:

sudo kubeadm init phase preflight

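This step tries to pull images from registry.k8s.io, which is hard to reach from mainland China, so the preflight check can be very slow.
If it hangs, there is no need to wait for it to finish; press Ctrl + C to stop it.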

List the images that need to be pulled:

sudo kubeadm config images list

When pulling the images, remember to point at the mirror repository in mainland China:

sudo kubeadm config images pull --image-repository registry.aliyuncs.com/google_containers
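
To see what the mirrored names would look like, the default image references can be rewritten by hand. This sketch assumes the mirror stores every image under a flat name (so registry.k8s.io/coredns/coredns becomes coredns under the mirror); to_mirror_image is a hypothetical helper name, and the result should be checked against what the mirror actually hosts.

```shell
# Hypothetical helper: map an image reference to the same name:tag under
# a mirror repository, keeping only the part after the last slash.
# Assumption: the mirror uses flat image names.
to_mirror_image() {
  local image="$1"
  local mirror="${2:-registry.aliyuncs.com/google_containers}"
  printf '%s/%s\n' "$mirror" "${image##*/}"
}

# Example:
#   sudo kubeadm config images list | while read -r img; do
#     to_mirror_image "$img"
#   done
```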

The preflight check may report this problem: [ERROR CRI]: container runtime is not running
Reference: https://github.com/containerd/containerd/issues/4581
It can usually be fixed like this:

sudo rm -rf /etc/containerd/config.toml
sudo service containerd restart
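
Deleting the file works, but it can help to look at it first. One likely cause (an assumption here; these notes only link the issue) is a packaged config.toml that lists "cri" under disabled_plugins. A sketch of that check, with cri_disabled as a hypothetical helper name:

```shell
# Hypothetical helper: succeed when a containerd config file lists "cri"
# in disabled_plugins, which would keep the CRI plugin from loading.
cri_disabled() {
  grep -Eq '^[[:space:]]*disabled_plugins[[:space:]]*=.*"cri"' "$1"
}

# Example:
#   if cri_disabled /etc/containerd/config.toml; then
#     echo "the CRI plugin is disabled in config.toml"
#   fi
```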

Deploying the node:
Use this command to dump kubeadm's default init configuration to a file:

sudo kubeadm config print init-defaults > kubeadm.yaml

The content looks like this:
(If in doubt, see: https://v1-23.docs.kubernetes.io/zh/docs/reference/config-api/kubeadm-config.v1beta3/)

apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef # changing this is recommended
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 1.2.3.4 # change to the machine's internal IP
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/containerd/containerd.sock
  # To use Docker, change the line above to dockershim.sock (before 1.24) or unix:///var/run/cri-dockerd.sock (1.24 and later)
  imagePullPolicy: IfNotPresent
  name: node
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.k8s.io # change to registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: 1.25.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
scheduler: {}

Note: if the K8S version is 1.24 or later, cri-dockerd must be installed for Docker to work; see "Initializing cri-dockerd".
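
The two edits that are almost always needed in kubeadm.yaml (advertiseAddress and imageRepository) can be scripted. This is a rough sketch that assumes the file still has the layout printed by kubeadm config print init-defaults; customize_kubeadm_config is a hypothetical helper name, and the IP in the example is a placeholder.

```shell
# Hypothetical helper: rewrite advertiseAddress and imageRepository in a
# kubeadm config file. Everything after the key on those two lines
# (including trailing comments) is replaced.
customize_kubeadm_config() {
  local file="$1" ip="$2" repo="$3"
  sed \
    -e "s|^\(  advertiseAddress:\).*|\1 ${ip}|" \
    -e "s|^\(imageRepository:\).*|\1 ${repo}|" \
    "$file" > "${file}.new" && mv "${file}.new" "$file"
}

# Example (10.0.0.5 is a placeholder for the machine's internal IP):
#   customize_kubeadm_config kubeadm.yaml 10.0.0.5 registry.aliyuncs.com/google_containers
```

Other fields, such as the token and criSocket, still need to be reviewed by hand.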

Customize this configuration, then run:

sudo kubeadm init --config kubeadm.yaml

If initialization fails, you can roll back:

sudo kubeadm reset

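After a successful initialization, kubeadm prints a list of follow-up steps, and it is best to run them all. It then prints the join command, including the token, for adding nodes. If you lose it later, you can print a new one: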

sudo kubeadm token create --print-join-command
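
Bootstrap tokens, both the one set in kubeadm.yaml and the one inside the printed join command, have a fixed shape: six characters, a dot, then sixteen characters, all lowercase letters or digits. A small sketch for checking a token before putting it into the configuration; is_bootstrap_token is a hypothetical helper name.

```shell
# Hypothetical helper: succeed when the argument has the bootstrap token
# shape, i.e. 6 characters, a dot, 16 characters, each from [a-z0-9].
is_bootstrap_token() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

# Example:
#   is_bootstrap_token abcdef.0123456789abcdef && echo "format ok"
```

A fresh token of the right shape can also be produced with kubeadm token generate.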

For the record, these are the steps kubeadm asks for after a successful initialization:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

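Note that this only takes effect for the current user, so it helps only if the current user is the one who runs kubectl.
If you have to run sudo kubectl, switch to root first and then repeat the steps above.

Without this step, later commands such as kubectl apply will fail.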
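
A quick way to tell whether the current user is set up is to check that the kubeconfig file is readable and non-empty. This is a sketch only: kubeconfig_ready is a hypothetical helper name, and it treats KUBECONFIG as a single path even though kubectl also accepts a colon-separated list there.

```shell
# Hypothetical helper: succeed when the kubeconfig that kubectl would use
# for the current user exists, is readable, and is not empty.
# Assumption: KUBECONFIG, if set, holds a single path.
kubeconfig_ready() {
  local cfg="${KUBECONFIG:-$HOME/.kube/config}"
  [ -r "$cfg" ] && [ -s "$cfg" ]
}

# Example:
#   kubeconfig_ready || echo "copy /etc/kubernetes/admin.conf first"
```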

Once initialization is complete, try this command:

sudo kubectl get nodes

Right after installation the status is NotReady, because no network plugin has been installed yet.
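
To keep an eye on this while the network plugin comes up, the node list can be filtered down to the nodes that are not Ready yet. A minimal sketch, with not_ready_nodes as a hypothetical helper name; it expects the output of kubectl get nodes --no-headers on stdin.

```shell
# Hypothetical helper: read `kubectl get nodes --no-headers` output on
# stdin and print the name of every node whose STATUS column is not
# exactly "Ready". A cordoned node ("Ready,SchedulingDisabled") is
# therefore listed as well.
not_ready_nodes() {
  awk '$2 != "Ready" {print $1}'
}

# Example:
#   sudo kubectl get nodes --no-headers | not_ready_nodes
```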

Inspect a node in detail (replace master with the node name shown by kubectl get nodes):

sudo kubectl describe node master

You will see that several pods are still in the Pending state.

Next, install a network plugin:

curl -L https://docs.projectcalico.org/manifests/calico.yaml -O
sudo kubectl apply -f calico.yaml

After that, everything should be up and running.