Installing a Highly Available Kubernetes 1.26 Cluster on Ubuntu 22.04 with kubeadm

Kubernetes network planning:
podSubnet (pod CIDR): 10.244.0.0/16
serviceSubnet (service CIDR): 10.96.0.0/12

Lab environment planning:
OS: Ubuntu 22.04
Specs: 4 GiB RAM / 4 vCPU / 60 GB disk per node
Network: bridged

Role    IP           Hostname  Components
master  10.168.1.61  master1   apiserver, controller-manager, scheduler, kubelet, etcd, kube-proxy, container runtime, calico, keepalived, nginx
master  10.168.1.62  master2   apiserver, controller-manager, scheduler, kubelet, etcd, kube-proxy, container runtime, calico, keepalived, nginx
master  10.168.1.63  master3   apiserver, controller-manager, scheduler, kubelet, etcd, kube-proxy, container runtime, calico, keepalived, nginx
worker  10.168.1.64  node1     kube-proxy, calico, coredns, container runtime, kubelet
worker  10.168.1.65  node2     kube-proxy, calico, coredns, container runtime, kubelet
VIP     10.168.1.60

1. Initialize the lab environment for installing the k8s cluster

1.1 Configure hostnames

On 10.168.1.61, run:

hostnamectl set-hostname master1 && bash

On 10.168.1.62, run:

hostnamectl set-hostname master2 && bash

On 10.168.1.63, run:

hostnamectl set-hostname master3 && bash

On 10.168.1.64, run:

hostnamectl set-hostname node1 && bash

On 10.168.1.65, run:

hostnamectl set-hostname node2 && bash

1.2 Configure the /etc/hosts file so the hosts can reach each other by hostname

Edit the /etc/hosts file and append the following at the end:

10.168.1.61 master1
10.168.1.62 master2
10.168.1.63 master3
10.168.1.64 node1
10.168.1.65 node2

The file should look like this afterwards:

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
10.168.1.61 master1
10.168.1.62 master2
10.168.1.63 master3
10.168.1.64 node1
10.168.1.65 node2

1.3 Configure passwordless SSH login between the hosts
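The heading calls for passwordless SSH, but the key-setup commands themselves are not shown. A minimal sketch, run as root on master1 (it assumes you can still authenticate with the root password for the initial key copy):

ssh-keygen -t rsa -N "" -f /root/.ssh/id_rsa
for i in 10.168.1.{61..65};do ssh-copy-id root@$i;done

After this, the scp/ssh loops used throughout the rest of the document will not prompt for passwords.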

Distribute the hosts file to the other nodes:

scp /etc/hosts master2:/etc/hosts
scp /etc/hosts master3:/etc/hosts
scp /etc/hosts node1:/etc/hosts
scp /etc/hosts node2:/etc/hosts

1.1.6 Disable the swap partition to improve performance

#Disable temporarily:

for i in 10.168.1.{61..65};do ssh $i swapoff -a ;done

#Disable permanently: comment out the swap line in /etc/fstab:

for i in 10.168.1.{61..65};do ssh $i "sed -i 's/.*swap.*/#&/g' /etc/fstab";done

Question 1: why disable the swap partition?
Swap is the swap partition: when the machine runs short of memory it spills over to swap, but swap is far slower than RAM. For performance reasons Kubernetes does not allow swap by default, and kubeadm checks during initialization that swap is off; if it is not, the init fails. If you really want to keep swap enabled, pass --ignore-preflight-errors=Swap when installing.

1.1.7 Adjust kernel parameters

for i in 10.168.1.{61..65};do ssh $i modprobe br_netfilter;done
cat > /etc/sysctl.d/k8s.conf <<EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
for i in 10.168.1.{61..65};do scp -r /etc/sysctl.d/k8s.conf $i:/etc/sysctl.d/k8s.conf;done
for i in 10.168.1.{61..65};do ssh $i sysctl -p /etc/sysctl.d/k8s.conf;done

Question 1: what does sysctl do?
It configures kernel parameters at runtime.
-p loads settings from the specified file; without a file, /etc/sysctl.conf is used.

Question 2: why run modprobe br_netfilter?
Because /etc/sysctl.d/k8s.conf adds these three parameters:
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1

Without the br_netfilter module loaded, sysctl -p /etc/sysctl.d/k8s.conf fails with:

sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-ip6tables: No such file or directory
sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-iptables: No such file or directory

Fix:
modprobe br_netfilter
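modprobe only loads the module until the next reboot. A common way to make br_netfilter (and overlay, which containerd needs) load automatically at boot is to list them in /etc/modules-load.d; a sketch, distributed the same way as the other files in this document (the file name containerd.conf is arbitrary):

cat > /etc/modules-load.d/containerd.conf <<EOF
overlay
br_netfilter
EOF
for i in 10.168.1.{61..65};do scp /etc/modules-load.d/containerd.conf $i:/etc/modules-load.d/containerd.conf;done
for i in 10.168.1.{61..65};do ssh $i "modprobe overlay && modprobe br_netfilter";done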

Question 3: why enable the net.bridge.bridge-nf-call-iptables kernel parameter?
After installing docker (on CentOS, for example), docker info shows these warnings:
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled

Fix:
vim /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1

Question 4: why set net.ipv4.ip_forward = 1?
If kubeadm init fails its preflight checks complaining that /proc/sys/net/ipv4/ip_forward is not set to 1, IP forwarding is disabled and has to be enabled.

net.ipv4.ip_forward controls packet forwarding:
For security reasons Linux disables packet forwarding by default. Forwarding means that when a host has more than one network interface, a packet received on one interface can be sent out through another interface according to the routing table, which is normally a router's job.
To give Linux this routing capability, set the kernel parameter net.ipv4.ip_forward: 0 disables IP forwarding, 1 enables it.

1.1.9 Configure the Kubernetes apt repository (Tsinghua mirror)
First import the GPG key:

sudo curl -fsSLo /usr/share/keyrings/kubernetes-archive-keyring.gpg https://packages.cloud.google.com/apt/doc/apt-key.gpg

Create /etc/apt/sources.list.d/kubernetes.list with the content below, distribute both the keyring and the list file to every node, and refresh the package index:

echo "deb [signed-by=/usr/share/keyrings/kubernetes-archive-keyring.gpg] https://mirrors.tuna.tsinghua.edu.cn/kubernetes/apt kubernetes-xenial main" >>/etc/apt/sources.list.d/kubernetes.list
for i in 10.168.1.{61..65};do scp -r /usr/share/keyrings/kubernetes-archive-keyring.gpg root@$i:/usr/share/keyrings/kubernetes-archive-keyring.gpg;done
for i in 10.168.1.{61..65};do scp -r /etc/apt/sources.list.d/kubernetes.list root@$i:/etc/apt/sources.list.d/kubernetes.list;done
for i in 10.168.1.{61..65};do ssh $i apt update;done
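To confirm the repository is usable, you can list the kubeadm versions it provides (the exact 1.26.x patch release available may differ over time):

apt-cache madison kubeadm | grep 1.26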

1.1.11 Configure time synchronization

for i in 10.168.1.{61..65};do ssh $i "apt install -y ntpdate cron && (crontab -l 2>/dev/null; echo '01 */3 * * * /usr/sbin/ntpdate cn.pool.ntp.org >/dev/null 2>&1') | crontab - && /usr/sbin/ntpdate cn.pool.ntp.org";done

1.1.12 Install base packages

for i in 10.168.1.{61..65};do ssh $i apt install -y net-tools nfs-common unzip vim ipvsadm ipset;done

1.2. Install the containerd service
1.2.1 Install containerd
If docker was installed in the past, remove it first:

for i in 10.168.1.{61..65};do ssh $i sudo apt-get remove -y docker docker-engine docker.io containerd runc;done

First install the dependencies:

for i in 10.168.1.{61..65};do ssh $i sudo apt-get install -y apt-transport-https ca-certificates curl gnupg2 software-properties-common;done

Trust Docker's GPG public key and distribute it to every node:

sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
for i in 10.168.1.{61..65};do ssh $i install -m 0755 -d /etc/apt/keyrings;done
for i in 10.168.1.{61..65};do scp /etc/apt/keyrings/docker.gpg root@$i:/etc/apt/keyrings/docker.gpg;done

Add the apt repository:

echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://mirrors.tuna.tsinghua.edu.cn/docker-ce/linux/ubuntu \
  $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
for i in 10.168.1.{61..65};do scp -r /etc/apt/sources.list.d/docker.list root@$i:/etc/apt/sources.list.d/docker.list;done

Finally, install containerd on every node and generate its default configuration:

for i in 10.168.1.{61..65};do ssh $i sudo apt-get update;done
for i in 10.168.1.{61..65};do ssh $i sudo apt-get install containerd.io -y;done
for i in 10.168.1.{61..65};do ssh $i mkdir -p /etc/containerd;done
for i in 10.168.1.{61..65};do ssh $i "containerd config default > /etc/containerd/config.toml";done

Modify the configuration: use the systemd cgroup driver and point the sandbox (pause) image at a reachable mirror:

for i in 10.168.1.{61..65};do ssh $i sed -i '/SystemdCgroup/s/false/true/g' /etc/containerd/config.toml;done
sed -i '/sandbox_image/s#"registry.k8s.io/pause:3.6"#"registry.aliyuncs.com/google_containers/pause:3.7"#g' /etc/containerd/config.toml
for i in 10.168.1.{61..65};do scp -r /etc/containerd/config.toml root@$i:/etc/containerd/config.toml;done
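Before moving on, it is worth spot-checking that both edits (SystemdCgroup and sandbox_image) actually landed on every node:

for i in 10.168.1.{61..65};do ssh $i "grep -E 'SystemdCgroup|sandbox_image' /etc/containerd/config.toml";done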

Enable containerd at boot and start it:

for i in 10.168.1.{61..65};do ssh $i systemctl enable containerd  --now;done
for i in 10.168.1.{61..65};do ssh $i systemctl start containerd;done
for i in 10.168.1.{61..65};do ssh $i systemctl status containerd;done

Modify the /etc/crictl.yaml file (this part is not covered in the video, but follow the document when doing the lab):

cat > /etc/crictl.yaml <<EOF
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 10
debug: false
EOF
for i in 10.168.1.{61..65};do scp -r /etc/crictl.yaml root@$i:/etc/crictl.yaml;done
for i in 10.168.1.{61..65};do ssh $i systemctl restart  containerd;done
for i in 10.168.1.{61..65};do ssh $i systemctl status containerd;done

Configure a registry mirror for containerd; apply the following on all k8s nodes.
Point config_path in /etc/containerd/config.toml at the certs.d directory:

sed -i '/config_path/s#""#"/etc/containerd/certs.d"#g' /etc/containerd/config.toml
for i in 10.168.1.{61..65};do scp -r /etc/containerd/config.toml root@$i:/etc/containerd/config.toml;done
for i in 10.168.1.{61..65};do ssh $i mkdir /etc/containerd/certs.d/docker.io/ -p;done
cat >/etc/containerd/certs.d/docker.io/hosts.toml<<EOF
server = "https://docker.io"

[host."https://vh3bm52y.mirror.aliyuncs.com"]
  capabilities = ["pull", "resolve"]

[host."https://registry.docker-cn.com"]
  capabilities = ["pull", "resolve"]
EOF
for i in 10.168.1.{61..65};do scp -r /etc/containerd/certs.d/docker.io/hosts.toml root@$i:/etc/containerd/certs.d/docker.io/hosts.toml;done

Restart containerd:

for i in 10.168.1.{61..65};do ssh $i systemctl restart containerd;done
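To verify that containerd picks up the mirror configuration, try pulling a small public image with crictl on any node (mirror availability changes over time, so a failure here usually points at the hosts.toml entries or the mirror itself):

crictl pull docker.io/library/busybox:1.28
crictl images | grep busybox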

1.3. Install the packages needed to initialize Kubernetes

for i in 10.168.1.{61..65};do ssh $i apt install -y kubelet kubeadm kubectl;done
for i in 10.168.1.{61..65};do ssh $i systemctl enable kubelet;done

Note: what each package does
kubeadm: the tool used to bootstrap (initialize) the k8s cluster.
kubelet: installed on every node; it starts and manages Pods. In a kubeadm-built cluster both the control-plane and worker components run as Pods, so kubelet is required everywhere.

kubectl: the CLI for deploying and managing applications, inspecting resources, and creating, deleting and updating components.
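Because the repository may already carry releases newer than 1.26, it is safer to install an explicit version and then hold the packages so a routine apt upgrade cannot move the cluster unexpectedly. A sketch, assuming the patch release 1.26.1-00 shown by apt-cache madison:

for i in 10.168.1.{61..65};do ssh $i "apt install -y kubelet=1.26.1-00 kubeadm=1.26.1-00 kubectl=1.26.1-00 && apt-mark hold kubelet kubeadm kubectl";done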

1.4. Use keepalived + nginx to make the kube-apiserver highly available

1. Install nginx and keepalived on the three master nodes:

for i in 10.168.1.{61..63};do ssh $i apt install -y nginx keepalived libnginx-mod-stream;done

2. Edit the nginx configuration (identical on the primary and the backups; the heredoc is quoted so the nginx $variables are not expanded by the shell):

cat >/etc/nginx/nginx.conf<<'EOF'
user root;
worker_processes auto;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;
load_module /usr/lib/nginx/modules/ngx_stream_module.so;
include /usr/share/nginx/modules/*.conf;

events {
    worker_connections 1024;
}

# Layer-4 load balancing for the apiserver on the three master nodes
stream {

    log_format  main  '$remote_addr $upstream_addr - [$time_local] $status $upstream_bytes_sent';

    access_log  /var/log/nginx/k8s-access.log  main;

    upstream k8s-apiserver {
            server 10.168.1.61:6443 weight=5 max_fails=3 fail_timeout=30s;
            server 10.168.1.62:6443 weight=5 max_fails=3 fail_timeout=30s;
            server 10.168.1.63:6443 weight=5 max_fails=3 fail_timeout=30s;

    }
    server {
       listen 16443; # nginx runs on the master nodes alongside the apiserver, so it must not listen on 6443 or the ports would conflict
       proxy_pass k8s-apiserver;
    }
}

http {
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    sendfile            on;
    tcp_nopush          on;
    tcp_nodelay         on;
    keepalive_timeout   65;
    types_hash_max_size 2048;

    include             /etc/nginx/mime.types;
    default_type        application/octet-stream;

    server {
        listen       80 default_server;
        server_name  _;

        location / {
        }
    }
}
EOF
for i in 10.168.1.{61..63};do scp -r /etc/nginx/nginx.conf $i:/etc/nginx/nginx.conf;done
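After distributing the file, validate the syntax on each master before starting nginx:

for i in 10.168.1.{61..63};do ssh $i "nginx -t";done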

3. Configure keepalived
Primary (MASTER) keepalived configuration:

cat >/etc/keepalived/keepalived.conf<<EOF
global_defs { 
   notification_email { 
     acassen@firewall.loc 
     failover@firewall.loc 
     sysadmin@firewall.loc 
   } 
   notification_email_from Alexandre.Cassen@firewall.loc  
   smtp_server 127.0.0.1 
   smtp_connect_timeout 30 
   router_id NGINX_MASTER
} 

vrrp_script check_nginx {
    script "/etc/keepalived/check_nginx.sh"
}

vrrp_instance VI_1 { 
    state MASTER 
    interface enp1s0  # change to the actual interface name
    virtual_router_id 51 # VRRP router ID; must be the same within this VRRP instance and unique per instance
    priority 100    # priority; set 90 on the backup servers
    advert_int 1    # VRRP advertisement (heartbeat) interval, default 1 second
    authentication { 
        auth_type PASS      
        auth_pass 1111 
    }  
    # Virtual IP (VIP)
    virtual_ipaddress { 
        10.168.1.60/24
    } 
    track_script {
        check_nginx
    } 
}

#vrrp_script: the script that checks nginx health (used to decide whether to fail over)
#virtual_ipaddress: the virtual IP (VIP)
EOF
for i in 10.168.1.{61..63};do scp -r /etc/keepalived/keepalived.conf $i:/etc/keepalived/keepalived.conf;done

On master2 and master3, change state MASTER to state BACKUP and lower the priority (for example 90 and 80), as in the sketch below.
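A quick way to adjust the two backup nodes after copying the file (assuming master2 gets priority 90 and master3 gets priority 80):

ssh 10.168.1.62 "sed -i 's/state MASTER/state BACKUP/;s/priority 100/priority 90/;s/NGINX_MASTER/NGINX_BACKUP/' /etc/keepalived/keepalived.conf"
ssh 10.168.1.63 "sed -i 's/state MASTER/state BACKUP/;s/priority 100/priority 80/;s/NGINX_MASTER/NGINX_BACKUP/' /etc/keepalived/keepalived.conf"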
cat>/etc/keepalived/check_nginx.sh<<'EOF'
#!/bin/bash
#1. Check whether nginx is running
counter=$(ps -ef |grep nginx | grep sbin | egrep -cv "grep|$$" )
if [ $counter -eq 0 ]; then
    #2. If it is not running, try to start it
    service nginx start
    sleep 2
    #3. Wait 2 seconds, then check again
    counter=$(ps -ef |grep nginx | grep sbin | egrep -cv "grep|$$" )
    #4. If nginx is still down, stop keepalived so the VIP fails over
    if [ $counter -eq 0 ]; then
        service keepalived stop
    fi
fi
EOF
for i in 10.168.1.{61..63};do scp -r /etc/keepalived/check_nginx.sh $i:/etc/keepalived/check_nginx.sh;done
for i in 10.168.1.{61..63};do ssh $i chmod +x  /etc/keepalived/check_nginx.sh;done
for i in 10.168.1.{61..63};do ssh $i "systemctl daemon-reload && systemctl restart nginx && systemctl restart keepalived && systemctl enable nginx keepalived";done
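Once keepalived is running, the VIP should be bound on the current MASTER node and nginx should be listening on 16443 everywhere; a quick check (enp1s0 is the interface name used in keepalived.conf above):

ip addr show enp1s0 | grep 10.168.1.60
for i in 10.168.1.{61..63};do ssh $i "ss -lntp | grep 16443";done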

1.5. Initialize the k8s cluster with kubeadm

#Set the container runtime endpoint for crictl and load the ipvs modules needed by kube-proxy

for i in 10.168.1.{61..65};do ssh $i crictl config runtime-endpoint /run/containerd/containerd.sock;done
cat >/etc/modules-load.d/ipvs.modules<<EOF
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack
EOF

chmod 755 /etc/modules-load.d/ipvs.modules
bash /etc/modules-load.d/ipvs.modules
lsmod |grep -e ip_vs -e nf_conntrack
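Note that systemd's modules-load.d mechanism expects bare module names, not modprobe commands, so the file above only takes effect because it is executed with bash. To also have the ipvs modules loaded automatically after a reboot on every node, a plain module list can be used instead; a sketch:

cat > /etc/modules-load.d/ipvs.conf <<EOF
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
EOF
for i in 10.168.1.{61..65};do scp /etc/modules-load.d/ipvs.conf $i:/etc/modules-load.d/ipvs.conf;done
for i in 10.168.1.{61..65};do ssh $i systemctl restart systemd-modules-load;done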

#Initialize the cluster with kubeadm

kubeadm config print init-defaults > kubeadm.yaml

Adjust the generated configuration to our needs: change imageRepository, set the kube-proxy mode to ipvs, and, because containerd is the container runtime, set cgroupDriver to systemd.

The kubeadm.yaml configuration file looks like this:

apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
#localAPIEndpoint  # comment out this line
#advertiseAddress   # comment out this line
#bindPort         # comment out this line
nodeRegistration:
  criSocket: unix:///run/containerd/containerd.sock  #use the containerd runtime
  imagePullPolicy: IfNotPresent
  #name: node  # comment out this line
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
#use the Aliyun mirror registry
kind: ClusterConfiguration
kubernetesVersion: 1.26.1
#add the following line (the keepalived VIP and the nginx port):
controlPlaneEndpoint: 10.168.1.60:16443
networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16 #pod CIDR
  serviceSubnet: 10.96.0.0/12
scheduler: {}
#append the following:
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
If a previous initialization attempt failed (or you want to re-run the init from scratch), clean the node up first:

kubeadm reset
rm -rf /etc/kubernetes/
rm -rf ~/.kube/
rm -rf /var/lib/etcd/
ipvsadm --clear
iptables -F
iptables -X

Note: where does k8s_1.25.0.tar.gz come from?
The tarball bundles all the images needed to install that k8s release. When I first installed 1.25.0 I pulled the required images and exported them with ctr images export into k8s_1.25.0.tar.gz. If you install a different version (such as the 1.26 used here), you do not need this offline bundle; simply let kubeadm pull the images from the network.
ctr is the CLI that ships with containerd and is namespace-aware: images used by Kubernetes live in the k8s.io namespace, so that namespace must be specified when importing.

#import the images into the k8s.io namespace with ctr
ctr -n=k8s.io images import k8s_1.25.0.tar.gz

#list the images to confirm they are now visible
crictl images

kubeadm init --config=kubeadm.yaml --ignore-preflight-errors=SystemVerification

Special note: imageRepository is set to the Aliyun mirror (registry.cn-hangzhou.aliyuncs.com/google_containers) so that images do not have to be pulled from registries outside China; by default kubeadm pulls from registry.k8s.io. If images were imported locally beforehand, the local copies are used first.
mode: ipvs sets the kube-proxy mode to ipvs. If it is not specified, kube-proxy defaults to iptables, which is less efficient at scale, so ipvs is recommended for production; the managed Kubernetes offerings of Alibaba Cloud and Huawei Cloud also provide ipvs mode.

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 10.168.1.61:6443 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:dc08fb46387f8d7e75de415b1d5819ec694f6ae6cf556be6fae2620d4549f2d3
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
root@master01:~# kubectl get nodes
NAME       STATUS     ROLES           AGE    VERSION
master01   NotReady   control-plane   108s   v1.26.1
root@master01:~# kubectl get pods -n kube-system
NAME                               READY   STATUS    RESTARTS       AGE
coredns-5bbd96d687-7sj78           0/1     Pending   0              51s
coredns-5bbd96d687-bk8ml           0/1     Pending   0              51s
etcd-master01                      1/1     Running   1 (113s ago)   27s
kube-apiserver-master01            1/1     Running   1 (83s ago)    115s
kube-controller-manager-master01   0/1     Running   2 (32s ago)    32s
kube-proxy-r8dj9                   1/1     Running   1 (51s ago)    51s
kube-scheduler-master01            1/1     Running   1 (113s ago)   113s

1.6. Scale out the control plane: add master2 and master3 to the cluster

#copy the certificates from master1 to master2 and master3
Create the certificate directories on master2 and master3:

cd /root && mkdir -p /etc/kubernetes/pki/etcd && mkdir -p ~/.kube/

#on master1, copy the certificates to master2 and master3:

scp /etc/kubernetes/pki/ca.crt master2:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/ca.key master2:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/sa.key master2:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/sa.pub master2:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/front-proxy-ca.crt master2:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/front-proxy-ca.key master2:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/etcd/ca.crt master2:/etc/kubernetes/pki/etcd/
scp /etc/kubernetes/pki/etcd/ca.key master2:/etc/kubernetes/pki/etcd/
scp /etc/kubernetes/pki/ca.crt master3:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/ca.key master3:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/sa.key master3:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/sa.pub master3:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/front-proxy-ca.crt master3:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/front-proxy-ca.key master3:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/etcd/ca.crt master3:/etc/kubernetes/pki/etcd/
scp /etc/kubernetes/pki/etcd/ca.key master3:/etc/kubernetes/pki/etcd/

#After the certificates are copied, generate the join command on master1 (use the one printed for your own cluster); running it on master2 and master3 with --control-plane added joins them as control-plane nodes:

kubeadm token create --print-join-command
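The command above prints a worker join line; for a control-plane join the --control-plane flag must be appended (the certificates copied above are then reused). On master2 and master3 run something like the following, substituting the token and hash printed by your own cluster:

kubeadm join 10.168.1.60:16443 --token TOKEN_FROM_MASTER1 \
    --discovery-token-ca-cert-hash sha256:HASH_FROM_MASTER1 \
    --control-plane --ignore-preflight-errors=SystemVerification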

1.8. Scale out the cluster: add the first worker node
On xianchaomaster1, print the join command:
[root@xianchaomaster1 ~]# kubeadm token create --print-join-command

The output looks like:
kubeadm join 192.168.40.180:6443 --token vulvta.9ns7da3saibv4pg1 --discovery-token-ca-cert-hash sha256:72a0896e27521244850b8f1c3b600087292c2d10f2565adb56381f1f4ba7057a

Join xianchaonode1 to the cluster:
[root@xianchaonode1~]# kubeadm join 192.168.40.180:6443 --token vulvta.9ns7da3saibv4pg1 --discovery-token-ca-cert-hash sha256:72a0896e27521244850b8f1c3b600087292c2d10f2565adb56381f1f4ba7057a --ignore-preflight-errors=SystemVerification

#The output above shows that xianchaonode1 has joined the cluster as a worker node.

#Check the cluster nodes on xianchaomaster1:
[root@xianchaomaster1 ~]# kubectl get nodes
NAME              STATUS   ROLES                  AGE   VERSION
xianchaomaster1   Ready    control-plane,master   49m   v1.25.0
xianchaomaster2   Ready    control-plane,master   39s   v1.25.0
xianchaomaster3   Ready    control-plane,master   39s   v1.25.0
xianchaonode1     Ready    <none>                 39s   v1.25.0

#You can label xianchaonode1 so that its ROLES column shows work:
[root@xianchaomaster1 ~]# kubectl label nodes xianchaonode1 node-role.kubernetes.io/work=work
[root@xianchaomaster1 ~]# kubectl get nodes
NAME              STATUS     ROLES           AGE     VERSION
xianchaomaster1   NotReady   control-plane   10m     v1.25.0
xianchaomaster2   NotReady   control-plane   7m33s   v1.25.0
xianchaomaster3   NotReady   control-plane   6m33s   v1.25.0
xianchaonode1     NotReady   work            27s     v1.25.0

1.9. Install the Kubernetes network component: Calico

Install helm, then download the tigera-operator chart for Calico:

wget https://get.helm.sh/helm-v3.10.3-linux-amd64.tar.gz
tar -zxvf helm-v3.10.3-linux-amd64.tar.gz
mv linux-amd64/helm  /usr/local/bin/
wget https://github.com/projectcalico/calico/releases/download/v3.24.5/tigera-operator-v3.24.5.tgz

helm show values tigera-operator-v3.24.5.tgz

The values above can be customized, for example to pull the calico images from a private registry.
Since this is just a local test of a new k8s version, values.yaml contains only the following lines:

apiServer:
  enabled: false

helm install calico tigera-operator-v3.24.5.tgz -n kube-system --create-namespace -f values.yaml
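After the helm install it can take a few minutes for the operator to roll Calico out; a quick way to watch progress and confirm the nodes turn Ready (the operator creates the calico-system namespace itself):

kubectl get pods -n kube-system | grep tigera-operator
kubectl get pods -n calico-system
kubectl get nodes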


Problem: when I deployed with the calico.yaml manifest instead, the calico-node pods never became Ready: Readiness probe failed: calico/node is not ready: BIRD is not ready: Error querying BIRD: unable to connect to BIRDv4 socket. This is often caused by Calico auto-detecting the wrong network interface; see the IP_AUTODETECTION_METHOD setting described below.

Note: the manifest can be downloaded online from:

https://docs.projectcalico.org/manifests/calico.yaml


1.9.1 Calico architecture diagram


The main components of the Calico network model:
1. Felix: an agent running on every host; it manages network interfaces, routes, ARP and ACLs, and reports status, ensuring that containers on different hosts can reach each other.
2. etcd: a distributed key-value store (the cluster's database) that holds Calico network metadata such as IP addresses, keeping the Calico network state consistent and accurate.
3. BGP Client (BIRD): Calico deploys one BGP client (BIRD) per host. It distributes the routes that Felix writes into the kernel to the rest of the Calico network, so that traffic between workloads can be routed.
4. BGP Route Reflector: in large deployments a full BGP mesh does not scale, because every pair of nodes needs a session (on the order of N^2 connections). One or more BGP Route Reflectors can be used instead to distribute routes centrally.


1.9.2 Notes on the calico network plugin configuration
1. DaemonSet configuration
……
containers:
        # Runs calico-node container on each Kubernetes node. This
        # container programs network policy and routes on each
        # host.
        - name: calico-node
          image: docker.io/calico/node:v3.18.0
……
          env:
            # Use Kubernetes API as the backing datastore.
            - name: DATASTORE_TYPE
              value: "kubernetes"
            # Cluster type to identify the deployment type
            - name: CLUSTER_TYPE
              value: "k8s,bgp"
            # Auto-detect the BGP IP address.
            - name: IP
              value: "autodetect"
            #pod CIDR
            - name: CALICO_IPV4POOL_CIDR
              value: "10.244.0.0/16"
            # Enable IPIP
            - name: CALICO_IPV4POOL_IPIP
              value: "Always"

The main calico-node parameters are:
CALICO_IPV4POOL_IPIP: whether to enable IPIP mode. With IPIP enabled, Calico creates a virtual tunnel interface named tunl0 on every node. An IP pool can use one of two modes, BGP or IPIP: set CALICO_IPV4POOL_IPIP="Always" for IPIP, or CALICO_IPV4POOL_IPIP="Off" to use BGP mode instead.

IP_AUTODETECTION_METHOD: how the node IP address is detected. By default the IP of the first network interface is used; on nodes with several NICs a regular expression can select the right one, e.g. "interface=eth.*" picks an interface whose name starts with eth:
- name: IP_AUTODETECTION_METHOD
  value: "interface=ens33"

Digression: comparing Calico's IPIP and BGP modes
1) IPIP
IPIP wraps an IP packet inside another IP packet, i.e. an IP-in-IP tunnel. It acts roughly like a bridge at the IP layer: an ordinary bridge works at the MAC layer and needs no IP at all, whereas IPIP builds a point-to-point tunnel between the routes at both ends, connecting two networks that otherwise could not reach each other.

After Calico is deployed in IPIP mode, every node has a tunl0 interface, which is used for the IPIP tunnel encapsulation; this is an overlay network. Even after the node is taken out of service and all calico containers are stopped, the device remains; it can be removed with rmmod ipip.

2) BGP
BGP mode uses the physical host itself as a virtual router (vRouter) and creates no extra tunnel.

The Border Gateway Protocol (BGP) is the core decentralized routing protocol of the Internet. It achieves reachability between autonomous systems (AS) by maintaining IP routing or prefix tables and is a path-vector protocol. BGP does not use the metrics of traditional interior gateway protocols (IGP); it decides routes based on paths, network policies and rule sets. Colloquially, a BGP data centre merges several carrier uplinks (China Telecom, China Unicom, China Mobile, etc.) so that a single IP is reachable over all of them.

Advantage of a BGP data centre: the server needs only one IP address; the best path is chosen by the backbone routers based on hop count and other metrics, without consuming any resources on the server.

The official calico.yaml template enables IP-in-IP by default, which creates the tunl0 device on every node; container traffic is encapsulated with an extra IP header before being forwarded. The switch is the calico-node environment variable CALICO_IPV4POOL_IPIP: the default Always enables IPIP, Off disables it and uses BGP.
            - name: CLUSTER_TYPE
              value: "k8s,bgp"
            # Auto-detect the BGP IP address.
            - name: IP
              value: "autodetect"
            # Enable IPIP
            - name: CALICO_IPV4POOL_IPIP
              value: "Always"

Summary:
Calico's BGP communication runs over TCP, so layer-3 connectivity between nodes is enough for BIRD to establish sessions and generate routes to its neighbours. However, like flannel's host-gateway mode, those routes are only usable when the nodes are also layer-2 adjacent. So if BGP mode generates routes but pods on different nodes still cannot reach each other, check whether the nodes share a layer-2 segment.
To handle cross-node traffic when the nodes are not layer-2 adjacent, Calico has its own solution: IPIP mode.



1.10. Configure etcd for high availability
Edit the etcd.yaml file on xianchaomaster1, xianchaomaster2 and xianchaomaster3:
vim /etc/kubernetes/manifests/etcd.yaml
Change
- --initial-cluster=xianchaomaster1=https://192.168.40.180:2380
into:
- --initial-cluster=xianchaomaster1=https://192.168.40.180:2380,xianchaomaster2=https://192.168.40.181:2380,xianchaomaster3=https://192.168.40.182:2380

After the change, restart kubelet on each master:
[root@xianchaomaster1 ~]# systemctl restart kubelet
[root@xianchaomaster2 ~]# systemctl restart kubelet
[root@xianchaomaster3 ~]# systemctl restart kubelet

Verify that the etcd cluster is configured correctly:
[root@xianchaomaster1 ~]# docker run --rm -it --net host -v /etc/kubernetes:/etc/kubernetes  registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.5.4-0 etcdctl --cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key --cacert /etc/kubernetes/pki/etcd/ca.crt member list

Output like the following means the etcd cluster is configured correctly:
1203cdd3ad75e761, started, xianchaomaster1, https://192.168.40.180:2380, https://192.168.40.180:2379, false
5c9f58513f7f9d01, started, xianchaomaster2, https://192.168.40.181:2380, https://192.168.40.181:2379, false
e4a737a7dcdd6fb5, started, xianchaomaster3, https://192.168.40.182:2380, https://192.168.40.182:2379, false

[root@xianchaomaster1 ~]# docker run --rm -it --net host -v /etc/kubernetes:/etc/kubernetes  registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.5.4-0 etcdctl --cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key --cacert /etc/kubernetes/pki/etcd/ca.crt --endpoints=https://192.168.40.180:2379,https://192.168.40.181:2379,https://192.168.40.182:2379 endpoint health  --cluster
Output like the following means all etcd endpoints are healthy:
https://192.168.40.181:2379 is healthy: successfully committed proposal: took = 10.808798ms
https://192.168.40.182:2379 is healthy: successfully committed proposal: took = 11.179877ms
https://192.168.40.180:2379 is healthy: successfully committed proposal: took = 12.32604ms

[root@xianchaomaster1 ~]# docker run --rm -it --net host -v /etc/kubernetes:/etc/kubernetes  registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.5.4-0 etcdctl -w table --cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key --cacert /etc/kubernetes/pki/etcd/ca.crt --endpoints=https://192.168.40.180:2379,https://192.168.40.181:2379,https://192.168.40.182:2379 endpoint status --cluster
The output is a table showing the status of each endpoint (ID, version, DB size, leader, raft term, etc.).
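Since the nodes in this document run containerd rather than docker, the same checks can also be done by exec-ing into one of the static etcd pods, which already contains etcdctl and mounts the certificates; a sketch (the pod name is etcd- followed by the node hostname):

kubectl -n kube-system exec etcd-xianchaomaster1 -- etcdctl \
  --cert /etc/kubernetes/pki/etcd/peer.crt \
  --key /etc/kubernetes/pki/etcd/peer.key \
  --cacert /etc/kubernetes/pki/etcd/ca.crt \
  member list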


1.11. Verify that a pod created in k8s can reach the network
#upload busybox-1-28.tar.gz to xianchaonode1 and import it manually (into the k8s.io namespace)
[root@xianchaonode1 ~]# ctr -n=k8s.io images import busybox-1-28.tar.gz
[root@xianchaomaster1 ~]# kubectl run busybox --image docker.io/library/busybox:1.28  --image-pull-policy=IfNotPresent --restart=Never --rm -it busybox -- sh

/ # ping www.baidu.com
PING www.baidu.com (39.156.66.18): 56 data bytes
64 bytes from 39.156.66.18: seq=0 ttl=127 time=39.3 ms
#the ping works, so the pod has network access and the calico network plugin is installed correctly

/ # nslookup kubernetes.default.svc.cluster.local
Server:    10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

Name:      kubernetes.default.svc.cluster.local
Address 1: 10.96.0.1 kubernetes.default.svc.cluster.local

/ # exit #leave the pod

10.96.0.10 is the clusterIP of coreDNS, so coreDNS is configured correctly.
Internal Service names are resolved through coreDNS.

#Note:
use busybox 1.28 specifically; with the latest busybox image, nslookup cannot resolve the DNS name and IP

1.12. The difference between ctr and crictl

Background: while deploying k8s you constantly operate on images (pull, delete, list, and so on).

Question: ctr and crictl overlap in many functions but also differ; where exactly is the difference?

Explanation:

1. ctr is the CLI that ships with containerd; crictl is the client for the CRI (Container Runtime Interface), which Kubernetes uses to talk to containerd.
[root@xianchaonode1 ~]# cat /etc/crictl.yaml 
runtime-endpoint: "/run/containerd/containerd.sock"
image-endpoint: ""
timeout: 0
debug: false
pull-image-on-create: false
disable-pull-on-run: false

#restart containerd after changing the file:
systemctl restart  containerd

2. The concrete command differences between ctr and crictl can be explored with --help. crictl lacks some image-management capabilities, probably because at the Kubernetes level image management is left to the user: the containers in a pod can be configured to use a unified registry, and image management can be handled by tooling such as Harbor.
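A few roughly equivalent operations, for reference (ctr needs the k8s.io namespace to see the images kubelet uses, while crictl is already scoped to it):

#list images
ctr -n k8s.io images ls
crictl images
#pull an image
ctr -n k8s.io images pull docker.io/library/busybox:1.28
crictl pull docker.io/library/busybox:1.28
#list running containers
ctr -n k8s.io containers ls
crictl ps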


1.13 Simulate a cluster failure and recover quickly

Interview question:
A k8s cluster has 3 control-plane nodes and 1 worker node. One control-plane node, xianchaomaster1, went down and could not be repaired, so it was removed with kubectl delete nodes xianchaomaster1. Later the machine was repaired and racked again, and we want to add it back to the cluster as a control-plane node. How do we do that?
Step 1: remove xianchaomaster1's member from the etcd cluster
[root@xianchaomaster2 ~]# docker run --rm -it --net host -v /etc/kubernetes:/etc/kubernetes  registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.5.4-0 etcdctl --cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key --cacert /etc/kubernetes/pki/etcd/ca.crt --endpoints=https://192.168.40.181:2379,https://192.168.40.182:2379 member list

The output looks like:
71f4d437fea07af5, started, xianchaomaster3, https://192.168.40.182:2380, https://192.168.40.182:2379, false
75e64910a4405073, started, xianchaomaster1, https://192.168.40.180:2380, https://192.168.40.180:2379, false
f1cb1755ef047040, started, xianchaomaster2, https://192.168.40.181:2380, https://192.168.40.181:2379, false


The output shows that the etcd member ID of xianchaomaster1 is 75e64910a4405073.

Remove xianchaomaster1's etcd member with the following command:
[root@xianchaomaster2 ~]# docker run --rm -it --net host -v /etc/kubernetes:/etc/kubernetes  registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.5.4-0 etcdctl --cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key --cacert /etc/kubernetes/pki/etcd/ca.crt --endpoints=https://192.168.40.181:2379,https://192.168.40.182:2379 member remove 75e64910a4405073

Step 2: on xianchaomaster1, create the directories for the certificates
cd /root && mkdir -p /etc/kubernetes/pki/etcd &&mkdir -p ~/.kube/

Step 3: copy the certificates from one of the other control-plane nodes to xianchaomaster1
scp /etc/kubernetes/pki/ca.crt xianchaomaster1:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/ca.key xianchaomaster1:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/sa.key xianchaomaster1:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/sa.pub xianchaomaster1:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/front-proxy-ca.crt xianchaomaster1:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/front-proxy-ca.key xianchaomaster1:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/etcd/ca.crt xianchaomaster1:/etc/kubernetes/pki/etcd/
scp /etc/kubernetes/pki/etcd/ca.key xianchaomaster1:/etc/kubernetes/pki/etcd/

Step 4: join xianchaomaster1 back into the cluster
Print the join command on one of the remaining control-plane nodes (for example xianchaomaster2):

kubeadm token create --print-join-command

The output looks like (add --control-plane for a control-plane join):

kubeadm join 192.168.40.199:16443 --token zwzcks.u4jd8lj56wpckcwv \
    --discovery-token-ca-cert-hash sha256:1ba1b274090feecfef58eddc2a6f45590299c1d0624618f1f429b18a064cb728 \
    --control-plane


Run on xianchaomaster1:
[root@xianchaomaster1 ~]#kubeadm join 192.168.40.199:16443 --token zwzcks.u4jd8lj56wpckcwv \
    --discovery-token-ca-cert-hash sha256:1ba1b274090feecfef58eddc2a6f45590299c1d0624618f1f429b18a064cb728 \
    --control-plane --ignore-preflight-errors=SystemVerification


Step 5: verify that xianchaomaster1 has rejoined the k8s cluster:
[root@xianchaomaster1 ~]# kubectl get nodes
NAME              STATUS   ROLES           AGE   VERSION
xianchaomaster1   Ready    control-plane   15m   v1.25.0
xianchaomaster2   Ready    control-plane   94m   v1.25.0
xianchaomaster3   Ready    control-plane   93m   v1.25.0
xianchaonode1     Ready    work            90m   v1.25.0

Test that pods can use coredns and calico normally:

kubectl run busybox --image docker.io/library/busybox:1.28 --image-pull-policy=IfNotPresent --restart=Never --rm -it busybox -- sh

/ # ping www.baidu.com

PING www.baidu.com (39.156.66.18): 56 data bytes

64 bytes from 39.156.66.18: seq=0 ttl=127 time=39.3 ms

#the ping works, so the pod has network access and the calico network plugin is installed correctly

/ # nslookup kubernetes.default.svc.cluster.local

Server: 10.96.0.10

Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

Name: kubernetes.default.svc.cluster.local

Address 1: 10.96.0.1 kubernetes.default.svc.cluster.local

/ # exit #leave the pod

10.96.0.10 is the clusterIP of coreDNS, so coreDNS is configured correctly.

Internal Service names are resolved through coreDNS.

#Note:

use busybox 1.28 specifically; with the latest busybox image, nslookup cannot resolve the DNS name and IP