Building a Highly Available Kubernetes Cluster with kubeadm

Reprints are welcome; please credit the original source: https://www.jianshu.com/p/3de558d8b57a
一、Environment
OS: CentOS 7.7
Docker: 18.09.9
Kubernetes: 1.17.0
kubeadm: 1.17.0
kubelet: 1.17.0
kubectl: 1.17.0
Note the compatibility between Kubernetes and Docker versions. The list of validated Docker versions changed as of Kubernetes 1.14: 1.11.1 and 1.12.1 were removed, and the current list is 1.13.1, 17.03, 17.06, 17.09, 18.06 and 18.09.
kubeadm provides a high-availability deployment scheme for the Master (control plane). kubeadm reached GA in Kubernetes 1.13, which means it can not only quickly deploy a conformant Kubernetes cluster, but is also flexible enough to cover a variety of real production requirements. Kubernetes 1.14 added the --experimental-upload-certs flag (later renamed to --upload-certs) for distributing certificates, eliminating most of the manual certificate copying during installation.
kubeadm offers two different high-availability topologies.
  • Stacked topology: etcd runs on the same nodes as the control plane. It requires less infrastructure, but is also less resilient to failures.

    [Figure: stacked etcd topology]
A minimum of three Masters (i.e. the control plane) is required: etcd elects its leader with the Raft algorithm, so the member count should be odd (2n+1), allowing a majority quorum to survive n failures.
  • External etcd topology: etcd is separated from the control plane. It needs more hardware, but gives better fault isolation.

    [Figure: external etcd topology]
二、Servers
OS          IP             Role    CPU  Memory(GB)  Hostname
CentOS 7.7  192.168.1.201  Master  2    4           k8s-01
CentOS 7.7  192.168.1.202  Master  2    4           k8s-02
CentOS 7.7  192.168.1.203  Master  2    4           k8s-03
CentOS 7.7  192.168.1.204  Node    4    6           k8s-04
CentOS 7.7  192.168.1.205  Node    4    6           k8s-05
CentOS 7.7  192.168.1.251  LVS     1    1           lvs-master
CentOS 7.7  192.168.1.252  LVS     1    1           lvs-backup
Additional VIP: 192.168.1.200
By default Kubernetes does not schedule Pod replicas onto Master nodes, so the Masters here are sized somewhat smaller than the worker nodes. The minimum node size is 2 CPU cores and 4 GB of memory; below that, some cluster components cannot run.
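If you do want the Masters to also run ordinary workloads (for example in a small lab cluster), the default master taint can be removed after the cluster is up. This is an optional step that is not part of this walkthrough; a minimal sketch:
# Allow ordinary Pods to be scheduled onto the control-plane nodes (optional, lab use only)
$ kubectl taint nodes --all node-role.kubernetes.io/master-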
LVS could of course be deployed on the Kubernetes node machines themselves, but to keep the cluster highly available it is recommended to run it on dedicated machines.
The stacked kubeadm topology is used below. This means that if 2 of the 3 Masters go down the cluster becomes unavailable (with 3 etcd members the quorum is 2, so losing two leaves no quorum), and you may see errors such as "Error from server: etcdserver: request timed out".
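To inspect the state of the stacked etcd cluster at any point, etcdctl can be run inside one of the etcd static Pods. This is not part of the original walkthrough, just a hedged sketch: the Pod name below comes from the output shown later, and the certificate paths are the kubeadm defaults.
# Check etcd health and membership from the etcd Pod on k8s-01 (paths are kubeadm defaults)
$ kubectl -n kube-system exec etcd-k8s-01 -- sh -c \
  "ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
   --cacert=/etc/kubernetes/pki/etcd/ca.crt \
   --cert=/etc/kubernetes/pki/etcd/server.crt \
   --key=/etc/kubernetes/pki/etcd/server.key \
   endpoint health && \
   ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
   --cacert=/etc/kubernetes/pki/etcd/ca.crt \
   --cert=/etc/kubernetes/pki/etcd/server.crt \
   --key=/etc/kubernetes/pki/etcd/server.key \
   member list"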
三、System setup (all hosts)
1. Static IP
Configure a static IP on every node.
# Check the NIC name
$ ip a
# Edit the NIC configuration
$ cd /etc/sysconfig/network-scripts/ && vi ifcfg-<NIC name>
BOOTPROTO="static"
NM_CONTROLLED="no"
IPADDR=192.168.1.201
NETMASK=255.255.255.0
GATEWAY=192.168.1.1
# Save, exit and restart the network
$ systemctl restart network
# Keep NetworkManager from overwriting the DNS settings
$ vi /etc/NetworkManager/NetworkManager.conf
dns=none
# Configure DNS servers
$ vi /etc/resolv.conf
nameserver 114.114.114.114
nameserver 8.8.8.8

2. Hostname
Every node must have a unique hostname, and every node must be reachable from the others by hostname.
# Show the current hostname
$ hostname
# Set the hostname
$ hostnamectl set-hostname <hostname>
# Add host entries so all nodes can reach each other by hostname
$ vi /etc/hosts
192.168.1.201 k8s-01
192.168.1.202 k8s-02
192.168.1.203 k8s-03
192.168.1.204 k8s-04
192.168.1.205 k8s-05
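A quick way to confirm that every node can resolve and reach the others by hostname (not in the original, just a sanity check to run on each machine):
# Ping each cluster node once by hostname
$ for h in k8s-01 k8s-02 k8s-03 k8s-04 k8s-05; do ping -c 1 -W 1 $h >/dev/null && echo "$h ok" || echo "$h unreachable"; done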

3. Install dependencies
# Update yum
$ yum update -y
# Install dependency packages
$ yum install -y conntrack ipvsadm ipset jq sysstat curl iptables libseccomp bind-utils

4. Disable the firewall, swap, SELinux and dnsmasq, and reset iptables
# Stop and disable the firewall
$ systemctl stop firewalld && systemctl disable firewalld
# Flush the firewall rules and set the default forward policy
$ iptables -F && iptables -X && iptables -F -t nat && iptables -X -t nat && iptables -P FORWARD ACCEPT
# Turn off swap
$ swapoff -a
# Comment out the swap mount in /etc/fstab
$ sed -i '/swap/s/^\(.*\)$/#\1/g' /etc/fstab
# Put SELinux in permissive mode so containers can read the host filesystem
$ setenforce 0
# Stop dnsmasq (otherwise Docker containers may fail to resolve domain names)
$ service dnsmasq stop && systemctl disable dnsmasq

If stopping dnsmasq produces the error "Failed to stop dnsmasq.service: Unit dnsmasq.service not loaded.", it can safely be ignored.
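Note that setenforce 0 only lasts until the next reboot. The original does not cover it, but a common companion step (an assumption; adjust to your own security policy) is to make the change persistent:
# Disable SELinux permanently (assumption: matches the runtime "setenforce 0" above; takes effect after reboot)
$ sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config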
5. Kernel and system parameters
# Enable the IPVS kernel modules
$ cat > /etc/sysconfig/modules/ipvs.modules <<E0F
...
E0F
# Kubernetes-related sysctl settings
$ cat > /etc/sysctl.d/kubernetes.conf <<E0F
...
E0F
# Limit journald resource usage
$ cat > /etc/systemd/journald.conf.d/99-prophet.conf <<E0F
...
E0F
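The bodies of the three here-documents above are not present in the source text. The following is a hedged reconstruction of the settings typically used for this kind of kubeadm/IPVS setup on CentOS 7; treat the exact module list and values as assumptions and adjust them to your environment:
# Typical IPVS module list (assumed; nf_conntrack_ipv4 applies to the CentOS 7 3.10 kernel)
$ cat > /etc/sysconfig/modules/ipvs.modules <<E0F
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
E0F
$ chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules && lsmod | grep -e ip_vs -e nf_conntrack_ipv4
# Typical sysctl settings for Kubernetes nodes (assumed)
$ cat > /etc/sysctl.d/kubernetes.conf <<E0F
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables=1
net.ipv4.ip_forward=1
vm.swappiness=0
vm.overcommit_memory=1
fs.inotify.max_user_watches=89100
E0F
$ modprobe br_netfilter && sysctl -p /etc/sysctl.d/kubernetes.conf
# Typical journald limits (assumed)
$ mkdir -p /etc/systemd/journald.conf.d
$ cat > /etc/systemd/journald.conf.d/99-prophet.conf <<E0F
[Journal]
Storage=persistent
SystemMaxUse=10G
SystemMaxFileSize=200M
MaxRetentionSec=2week
ForwardToSyslog=no
E0F
$ systemctl restart systemd-journald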

四、Install Docker (all nodes)
Docker is installed here from the rpm packages.
# Create a working directory
$ mkdir -p /opt/kubernetes/docker && cd /opt/kubernetes/docker
$ wget https://download.docker.com/linux/centos/7/x86_64/stable/Packages/docker-ce-18.09.9-3.el7.x86_64.rpm
$ wget https://download.docker.com/linux/centos/7/x86_64/stable/Packages/docker-ce-cli-18.09.9-3.el7.x86_64.rpm
$ wget https://download.docker.com/linux/centos/7/x86_64/stable/Packages/containerd.io-1.2.6-3.3.el7.x86_64.rpm
# Remove any old Docker version
$ yum remove -y docker* container-selinux
# Install the rpm packages
$ yum localinstall -y *.rpm
# Start Docker
$ systemctl start docker
# Enable Docker on boot
$ systemctl enable docker
# Check free disk space (optional)
$ df -h
Filesystem               Size  Used  Avail  Use%  Mounted on
devtmpfs                 2.0G     0   2.0G    0%  /dev
tmpfs                    2.0G     0   2.0G    0%  /dev/shm
tmpfs                    2.0G   12M   2.0G    1%  /run
tmpfs                    2.0G     0   2.0G    0%  /sys/fs/cgroup
/dev/mapper/centos-root   71G  1.7G    66G    3%  /
/dev/sda1                976M  136M   773M   15%  /boot
/dev/mapper/centos-home   20G   45M    19G    1%  /home
tmpfs                    394M     0   394M    0%  /run/user/0
# Set "graph" to move Docker's data directory (default /var/lib/docker) (optional)
# Set the cgroup driver (default cgroupfs) so that it matches the kubelet; alternatively, skip this and configure the kubelet to use cgroupfs later
$ mkdir /docker-data
$ cat <<E0F > /etc/docker/daemon.json
{
  "graph": "/docker-data",
  "exec-opts": ["native.cgroupdriver=systemd"]
}
E0F
# Restart Docker
$ systemctl restart docker
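To confirm that Docker picked up the new data directory and cgroup driver, a quick check (not in the original):
# Should report "Cgroup Driver: systemd" and "Docker Root Dir: /docker-data"
$ docker info | grep -iE 'cgroup driver|docker root dir'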

五、Install the required tools (per node role)
  • kubeadm: the command used to bootstrap the cluster (all nodes)
  • kubelet: the agent that runs on every machine in the cluster; it manages the lifecycle of Pods and containers and handles communication between the control plane and the worker nodes (all nodes)
  • kubectl: the cluster management CLI (optional; only needed on the nodes from which you administer the cluster)
# Install network tools (needed manually after a minimal OS install)
$ yum -y install net-tools
# Configure the yum repository (if you have unrestricted access, you can replace "mirrors.aliyun.com" with "packages.cloud.google.com")
$ cat <<E0F > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
E0F
# Install the tools
# Find the available versions
$ yum list kubeadm --showduplicates | sort -r
# Install a specific version (1.17.0-0 here)
$ yum install -y kubeadm-1.17.0-0 kubelet-1.17.0-0 kubectl-1.17.0-0 --disableexcludes=kubernetes
# Set the kubelet cgroup driver: only needed if Docker's exec-opts was NOT set to systemd above, in which case switch the kubelet to cgroupfs instead.
# Not needed here, since Docker was already configured with native.cgroupdriver=systemd.
$ sed -i "s/cgroup-driver=systemd/cgroup-driver=cgroupfs/g" /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
# Enable kubelet on boot and start it
$ systemctl enable kubelet && systemctl start kubelet
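A quick sanity check that the installed tool versions match on every node (not in the original):
# All three should report v1.17.0
$ kubeadm version -o short
$ kubelet --version
$ kubectl version --client --short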

六、Install the LVS load balancer and keepalived
As the architecture diagrams above show, every node talks to the API Server through the load balancer, so the load balancer is critical. For both performance and high availability we use LVS + keepalived.
lvs-master (192.168.1.251)
# Install dependencies
$ yum install -y ipvsadm wget curl gcc openssl-devel libnl3-devel net-snmp-devel libnfnetlink-devel
# Build keepalived from source; the version in the CentOS 7 yum repos is problematic and logs "TCP socket bind failed. Rescheduling" errors
$ wget http://www.keepalived.org/software/keepalived-1.4.5.tar.gz && tar -zxvf keepalived-1.4.5.tar.gz && cd keepalived-1.4.5 && ./configure && make && make install && cd .. && rm -f keepalived-1.4.5.tar.gz && rm -rf keepalived-1.4.5
################ keepalived load-balancing configuration ################
# Generate the keepalived configuration
$ cd /etc/keepalived && cat <<E0F > /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id keepalived-master
}
vrrp_instance vip_1 {
    state MASTER
    ! NIC name; check your LAN interface with "ip a"
    interface ens33
    ! virtual_router_id must be identical on master and backup
    virtual_router_id 88
    ! Priority; the master must be higher than the backup
    priority 100
    advert_int 3
    ! Virtual IP address
    virtual_ipaddress {
        192.168.1.200
    }
}
virtual_server 192.168.1.200 6443 {
    delay_loop 6
    lb_algo rr
    lb_kind DR
    persistence_timeout 0
    protocol TCP
    real_server 192.168.1.201 6443 {
        weight 1
        TCP_CHECK {
            connect_timeout 10
            nb_get_retry 3
            delay_before_retry 3
            connect_port 6443
        }
    }
    real_server 192.168.1.202 6443 {
        weight 1
        TCP_CHECK {
            connect_timeout 10
            nb_get_retry 3
            delay_before_retry 3
            connect_port 6443
        }
    }
    real_server 192.168.1.203 6443 {
        weight 1
        TCP_CHECK {
            connect_timeout 10
            nb_get_retry 3
            delay_before_retry 3
            connect_port 6443
        }
    }
}
E0F
# Start keepalived
$ systemctl enable keepalived && service keepalived start
# Check keepalived status
$ service keepalived status
# Follow the logs
$ journalctl -f -u keepalived
# Check the virtual IP
$ ip a
################ real_server configuration, i.e. every Master node ################
# Create the rs script
$ mkdir -p /opt/rs/ && cd /opt/rs && cat <<E0F > /opt/rs/rs.sh
#!/bin/bash
# Virtual IP
vip=192.168.1.200
# Bring down any previous lo:0
ifconfig lo:0 down
echo "1" > /proc/sys/net/ipv4/ip_forward
echo "0" > /proc/sys/net/ipv4/conf/all/arp_announce
# Bring up a loopback alias bound to the VIP
ifconfig lo:0 \$vip broadcast \$vip netmask 255.255.255.255 up
route add -host \$vip dev lo:0
echo "1" > /proc/sys/net/ipv4/conf/lo/arp_ignore
echo "2" > /proc/sys/net/ipv4/conf/lo/arp_announce
echo "1" > /proc/sys/net/ipv4/conf/all/arp_ignore
echo "2" > /proc/sys/net/ipv4/conf/all/arp_announce
# ens33 is the primary NIC name
echo "1" > /proc/sys/net/ipv4/conf/ens33/arp_ignore
echo "2" > /proc/sys/net/ipv4/conf/ens33/arp_announce
E0F
# Make it executable
$ chmod +x /opt/rs/rs.sh
# Run the rs script (if it errors, simply run it again)
$ ./rs.sh
# Run it on boot
$ echo '/opt/rs/rs.sh' >> /etc/rc.d/rc.local
# On CentOS 7, /etc/rc.d/rc.local is no longer executable by default, so restore the executable bit
$ chmod +x /etc/rc.d/rc.local

lvs-backup(192.168.1.252)
# Install dependencies
$ yum install -y ipvsadm wget curl gcc openssl-devel libnl3-devel net-snmp-devel libnfnetlink-devel
# Build keepalived from source; the version in the CentOS 7 yum repos is problematic and logs "TCP socket bind failed. Rescheduling" errors
$ wget http://www.keepalived.org/software/keepalived-1.4.5.tar.gz && tar -zxvf keepalived-1.4.5.tar.gz && cd keepalived-1.4.5 && ./configure && make && make install && cd .. && rm -f keepalived-1.4.5.tar.gz && rm -rf keepalived-1.4.5
################ keepalived load-balancing configuration ################
# Generate the keepalived configuration
$ cd /etc/keepalived && cat <<E0F > /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id keepalived-backup
}
vrrp_instance vip_1 {
    state BACKUP
    ! NIC name; check your LAN interface with "ip a"
    interface ens33
    ! virtual_router_id must be identical on master and backup
    virtual_router_id 88
    ! Priority; lower than the master
    priority 99
    advert_int 3
    ! Virtual IP address
    virtual_ipaddress {
        192.168.1.200
    }
}
virtual_server 192.168.1.200 6443 {
    delay_loop 6
    lb_algo rr
    lb_kind DR
    persistence_timeout 0
    protocol TCP
    real_server 192.168.1.201 6443 {
        weight 1
        TCP_CHECK {
            connect_timeout 10
            nb_get_retry 3
            delay_before_retry 3
            connect_port 6443
        }
    }
    real_server 192.168.1.202 6443 {
        weight 1
        TCP_CHECK {
            connect_timeout 10
            nb_get_retry 3
            delay_before_retry 3
            connect_port 6443
        }
    }
    real_server 192.168.1.203 6443 {
        weight 1
        TCP_CHECK {
            connect_timeout 10
            nb_get_retry 3
            delay_before_retry 3
            connect_port 6443
        }
    }
}
E0F
# Start keepalived
$ systemctl enable keepalived && service keepalived start
# Check keepalived status
$ service keepalived status
# Follow the logs
$ journalctl -f -u keepalived
# Check the virtual IP
$ ip a
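With both keepalived instances running, you can verify that failover works. This manual test is not part of the original article, just a hedged sketch: stop keepalived on lvs-master, confirm the VIP moves to lvs-backup, then restore it.
# On lvs-master: release the VIP
$ systemctl stop keepalived
# On lvs-backup: the VIP 192.168.1.200 should appear on ens33 within a few seconds
$ ip a show ens33 | grep 192.168.1.200
# On lvs-backup: the LVS virtual server table should list the three Masters
$ ipvsadm -Ln
# On lvs-master: restore the original state (the VIP fails back because of the higher priority)
$ systemctl start keepalived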

七、Bootstrap the cluster with kubeadm (per node role)
Master nodes:
k8s-01(192.168.1.201)
Setting up the first Master node takes more steps; the remaining nodes are simpler.
################## kubeadm configuration ##################
# Generate the kubeadm configuration file (kubeadm will create the cluster from it)
$ cd /opt/kubernetes/ && cat <<E0F > /opt/kubernetes/kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
# Kubernetes version; it must match the installed kubeadm version, otherwise startup fails
kubernetesVersion: v1.17.0
# Image registry; k8s.gcr.io is unreachable without a proxy, so use the mirror described at http://mirror.azure.cn/help/gcr-proxy-cache.html
imageRepository: gcr.azk8s.cn/google_containers
# Cluster name
clusterName: kubernetes
# Cluster endpoint for the API server: use the VIP address
controlPlaneEndpoint: "192.168.1.200:6443"
networking:
  # Pod network CIDR
  podSubnet: 10.10.0.0/16
  serviceSubnet: 10.96.0.0/12
  dnsDomain: cluster.local
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
# Run kube-proxy in ipvs mode; the ipvs dependencies and kernel modules must already be present on the nodes
mode: ipvs
E0F
################## Create the first node with kubeadm ##################
# Initialize the node
# --upload-certs uploads the certificates so the other Master nodes can fetch them, avoiding manual copying
$ kubeadm init --config=kubeadm-config.yaml --upload-certs
# The command ends with output like the following. The important parts are the two kubeadm join commands: the first joins additional Master nodes to this cluster, the second joins worker nodes. Note that the token is valid for 24h.
# If you lose the commands, "kubeadm token list" shows the generated tokens, and "kubeadm token create" generates a new one:
Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of the control-plane node running the following command on each as root:

  kubeadm join 192.168.1.200:6443 --token ibpbcn.lr1bfb6ctmq87nxd \
    --discovery-token-ca-cert-hash sha256:b5e5c1a000284781677336b00e7345838195ca78af21bddd9defad799243752b \
    --control-plane --certificate-key 1531224cc400b3cbe29fdff9f411b45452f25f229d31bf5e9d15771a0feae0c6

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use "kubeadm init phase upload-certs --upload-certs" to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.1.200:6443 --token ibpbcn.lr1bfb6ctmq87nxd \
    --discovery-token-ca-cert-hash sha256:b5e5c1a000284781677336b00e7345838195ca78af21bddd9defad799243752b

################## kubectl ##################
# Set up the kubeconfig so kubectl works
$ mkdir -p $HOME/.kube && sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config && sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Check the state of the cluster Pods
$ kubectl get pods --all-namespaces
NAMESPACE     NAME                             READY   STATUS    RESTARTS   AGE
kube-system   coredns-6955765f44-6wfmw         0/1     Pending   0          64m
kube-system   coredns-6955765f44-wxdz5         0/1     Pending   0          64m
kube-system   etcd-k8s-01                      1/1     Running   0          64m
kube-system   kube-apiserver-k8s-01            1/1     Running   0          64m
kube-system   kube-controller-manager-k8s-01   1/1     Running   0          64m
kube-system   kube-proxy-4t6qd                 1/1     Running   0          64m
kube-system   kube-scheduler-k8s-01            1/1     Running   0          64m
# The coredns Pods stay Pending because no network plugin has been installed yet
################## Network plugin ##################
# Available network plugins: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#pod-network
# Install the calico network plugin
$ wget -P /opt/kubernetes https://docs.projectcalico.org/v3.8/manifests/calico.yaml && cd /opt/kubernetes
# Edit calico.yaml and change the Pod CIDR from calico's default 192.168.0.0/16 to the cluster CIDR 10.10.0.0/16 chosen above (in vi, search for 192.168.0.0 to jump to the right place)
$ vi calico.yaml
> - name: CALICO_IPV4POOL_CIDR
>   value: "10.10.0.0/16"
# Apply the network plugin
$ kubectl apply -f calico.yaml
# Check the Pods again (coredns and calico take a while to initialize before they turn Running)
$ kubectl get pods --all-namespaces
kube-system   calico-kube-controllers-5c45f5bd9f-fx2xc   1/1   Running   0   110s
kube-system   calico-node-k7t8m                          1/1   Running   0   110s
kube-system   coredns-6955765f44-6wfmw                   1/1   Running   0   83m
kube-system   coredns-6955765f44-wxdz5                   1/1   Running   0   83m
kube-system   etcd-k8s-01                                1/1   Running   0   84m
kube-system   kube-apiserver-k8s-01                      1/1   Running   0   84m
kube-system   kube-controller-manager-k8s-01             1/1   Running   0   84m
kube-system   kube-proxy-4t6qd                           1/1   Running   0   83m
kube-system   kube-scheduler-k8s-01                      1/1   Running   0   84m

k8s-02(192.168.1.202)
################## Join the cluster with kubeadm ##################
# The first Master printed the join command for additional Master nodes; simply run it here
$ kubeadm join 192.168.1.200:6443 --token ibpbcn.lr1bfb6ctmq87nxd \
    --discovery-token-ca-cert-hash sha256:b5e5c1a000284781677336b00e7345838195ca78af21bddd9defad799243752b \
    --control-plane --certificate-key 1531224cc400b3cbe29fdff9f411b45452f25f229d31bf5e9d15771a0feae0c6
# The command ends with output like this:
This node has joined the cluster and a new control plane instance was created:

* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane (master) label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.

To start administering your cluster from this node, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Run 'kubectl get nodes' to see this node join the cluster.
################## kubectl ##################
# Set up the kubeconfig so kubectl works
$ mkdir -p $HOME/.kube && sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config && sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Check the state of the cluster Pods
$ kubectl get pods --all-namespaces

k8s-03(192.168.1.203)
################## Join the cluster with kubeadm ##################
# The first Master printed the join command for additional Master nodes; simply run it here
$ kubeadm join 192.168.1.200:6443 --token ibpbcn.lr1bfb6ctmq87nxd \
    --discovery-token-ca-cert-hash sha256:b5e5c1a000284781677336b00e7345838195ca78af21bddd9defad799243752b \
    --control-plane --certificate-key 1531224cc400b3cbe29fdff9f411b45452f25f229d31bf5e9d15771a0feae0c6
# The command ends with output like this:
This node has joined the cluster and a new control plane instance was created:

* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane (master) label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.

To start administering your cluster from this node, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Run 'kubectl get nodes' to see this node join the cluster.
################## kubectl ##################
# Set up the kubeconfig so kubectl works
$ mkdir -p $HOME/.kube && sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config && sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Check the state of the cluster Pods
$ kubectl get pods --all-namespaces

Worker nodes:
k8s-04(192.168.1.204)
################## Join the cluster with kubeadm ##################
# The first Master printed the join command for worker nodes; simply run it here
$ kubeadm join 192.168.1.200:6443 --token ibpbcn.lr1bfb6ctmq87nxd \
    --discovery-token-ca-cert-hash sha256:b5e5c1a000284781677336b00e7345838195ca78af21bddd9defad799243752b
################## kubectl ##################
# Set up the kubeconfig so kubectl works
$ mkdir -p $HOME/.kube && sudo cp -i /etc/kubernetes/kubelet.conf $HOME/.kube/config && sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Check the state of the cluster Pods
$ kubectl get pods --all-namespaces

k8s-05(192.168.1.205)
################## Join the cluster with kubeadm ##################
# The first Master printed the join command for worker nodes; simply run it here
$ kubeadm join 192.168.1.200:6443 --token ibpbcn.lr1bfb6ctmq87nxd \
    --discovery-token-ca-cert-hash sha256:b5e5c1a000284781677336b00e7345838195ca78af21bddd9defad799243752b
################## kubectl ##################
# Set up the kubeconfig so kubectl works
$ mkdir -p $HOME/.kube && sudo cp -i /etc/kubernetes/kubelet.conf $HOME/.kube/config && sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Check the state of the cluster Pods
$ kubectl get pods --all-namespaces

Verify the cluster
Run the commands below to verify that the cluster's Pods have all been created and are healthy:
# Verify that service names are resolved correctly by CoreDNS (10.96.0.10 is the CoreDNS address in the default cluster IP range; check it with kubectl get svc -n kube-system)
$ dig kubernetes.default.svc.cluster.local @10.96.0.10
# Check the status of the installed components
$ kubectl get pods --all-namespaces
NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
kube-system   calico-kube-controllers-5c45f5bd9f-6szxb   1/1     Running   0          4m17s
kube-system   calico-node-5c9hh                          1/1     Running   0          3m15s
kube-system   calico-node-8ffhc                          1/1     Running   0          58s
kube-system   calico-node-cxndf                          1/1     Running   0          53s
kube-system   calico-node-glqpl                          1/1     Running   0          2m22s
kube-system   calico-node-q5blx                          1/1     Running   0          4m17s
kube-system   coredns-6955765f44-8bcz5                   1/1     Running   0          5m14s
kube-system   coredns-6955765f44-w4p4l                   1/1     Running   0          5m14s
kube-system   etcd-k8s-01                                1/1     Running   0          5m26s
kube-system   etcd-k8s-02                                1/1     Running   0          3m7s
kube-system   etcd-k8s-03                                1/1     Running   0          2m21s
kube-system   kube-apiserver-k8s-01                      1/1     Running   0          5m26s
kube-system   kube-apiserver-k8s-02                      1/1     Running   0          3m15s
kube-system   kube-apiserver-k8s-03                      1/1     Running   1          86s
kube-system   kube-controller-manager-k8s-01             1/1     Running   1          5m26s
kube-system   kube-controller-manager-k8s-02             1/1     Running   0          3m15s
kube-system   kube-controller-manager-k8s-03             1/1     Running   0          97s
kube-system   kube-proxy-7sh6f                           1/1     Running   0          53s
kube-system   kube-proxy-dvf9j                           1/1     Running   0          58s
kube-system   kube-proxy-hbc5v                           1/1     Running   0          2m22s
kube-system   kube-proxy-pqhn6                           1/1     Running   0          5m14s
kube-system   kube-proxy-rltqg                           1/1     Running   0          3m15s
kube-system   kube-scheduler-k8s-01                      1/1     Running   1          5m26s
kube-system   kube-scheduler-k8s-02                      1/1     Running   0          3m14s
kube-system   kube-scheduler-k8s-03                      1/1     Running   0          97s
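Beyond checking the system Pods, a simple end-to-end smoke test is to run a throwaway workload and expose it. This is not in the original walkthrough; the deployment name and image below are arbitrary:
# All five nodes should be Ready
$ kubectl get nodes -o wide
# Create a test deployment and expose it via NodePort
$ kubectl create deployment nginx-test --image=nginx
$ kubectl expose deployment nginx-test --port=80 --type=NodePort
# Find the assigned NodePort, then curl it against any node IP
$ kubectl get svc nginx-test
$ curl http://192.168.1.204:<nodePort>
# Clean up
$ kubectl delete svc,deployment nginx-test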

If any Pod is in an error state, run kubectl -n kube-system describe pod <pod-name> to see the reason; the most common cause is that an image has not finished downloading.
At this point the Kubernetes cluster has been built quickly with the kubeadm tool. If an installation fails, run kubeadm reset to restore the host to its original state, then run kubeadm init again and retry.
八、Deploy the Dashboard to manage the cluster
Installation
# Download the manifest
$ wget -P /etc/kubernetes/addons https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-rc1/aio/deploy/recommended.yaml && cd /etc/kubernetes/addons
# Optionally remove the metrics service we do not use: delete the dashboard-metrics-scraper Service and Deployment from the manifest
$ vi recommended.yaml
# Deploy the service; the following is printed
$ kubectl apply -f recommended.yaml
namespace/kubernetes-dashboard created
serviceaccount/kubernetes-dashboard created
service/kubernetes-dashboard created
secret/kubernetes-dashboard-certs created
secret/kubernetes-dashboard-csrf created
secret/kubernetes-dashboard-key-holder created
configmap/kubernetes-dashboard-settings created
role.rbac.authorization.k8s.io/kubernetes-dashboard created
clusterrole.rbac.authorization.k8s.io/kubernetes-dashboard created
rolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created
clusterrolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created
deployment.apps/kubernetes-dashboard created
service/dashboard-metrics-scraper created
deployment.apps/dashboard-metrics-scraper created
# Check the running state
$ kubectl get namespaces
$ kubectl get deployments -n kubernetes-dashboard
$ kubectl get pods --namespace kubernetes-dashboard -o wide
$ kubectl get services -n kubernetes-dashboard
# Create a ServiceAccount with cluster-admin privileges
$ cat <<E0F > dashboard-adminuser.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kubernetes-dashboard
E0F
# Apply it
$ kubectl apply -f dashboard-adminuser.yaml
# Get the Bearer Token for authentication
$ kubectl -n kubernetes-dashboard describe secret $(kubectl -n kubernetes-dashboard get secret | grep admin-user | awk '{print $1}') | grep -E '^token' | awk '{print $2}'

Access
Install the cluster's root certificate in your browser. For the kubeadm-based installation above, the root certificate is located at /etc/kubernetes/pki/ca.crt on a Master node.
[Figure: installing the certificate]
Right-click the certificate file and install it for the current user under "Trusted Root Certification Authorities". Then open the login page in a browser. Here the Dashboard is accessed through the API Server; other access methods are described in the official documentation:
https://192.168.1.200:6443/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/#/login
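Besides the API Server URL above, another common access method (not from the original article) is kubectl proxy from any machine with a working kubeconfig, which avoids importing the root certificate:
# Proxy the API server to localhost:8001, then open the URL below in a local browser
$ kubectl proxy
# http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/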
[Figure: dashboard login page]
Select Token authentication, paste the token obtained above, and log in.
[Figure: dashboard home page]
九、Common problems
1. kubeadm certificates are only valid for one year
kubeadm is the cluster bootstrapping tool shipped with Kubernetes and is very convenient to use, but the apiserver, controller-manager and other certificates it creates are only valid for one year by default, as are the kubelet certificates; after a year the cluster stops working.
The options can be summarised as follows:
1. Officially recommended: upgrade the cluster with kubeadm upgrade at least once a year.
2. Unofficial workaround: build kubeadm from source so that the generated certificates have a longer lifetime.
3. Renew the certificates manually (kubeadm alpha certs renew).
4. Enable automatic kubelet certificate rotation.
Automatic certificate renewal. Automatic renewal means that all certificates are renewed automatically whenever the control plane is upgraded with kubeadm.
If you have no special certificate requirements and upgrade Kubernetes regularly, with less than a year between upgrades, the best practice is simply to upgrade the cluster often to keep it secure.
If you do not want certificates to be renewed during a cluster upgrade, pass --certificate-renewal=false to kubeadm upgrade apply or kubeadm upgrade node.
Manual certificate renewal. Run the following on every control-plane node. Apart from the shared CA root certificates, each Master's certificates are generated individually from the CA (each node has a different hostname, so the SAN values in the certificates differ):
# Check the cluster certificate status. The command shows the expiry/remaining time of all certificates, including the client certificates under /etc/kubernetes/pki and the client certificates kubeadm embeds in the KUBECONFIG files (admin.conf, controller-manager.conf and scheduler.conf).
# Note: kubelet.conf is not in the list because kubeadm configures the kubelet for automatic certificate renewal. kubeadm cannot manage certificates signed by an external CA.
$ kubeadm alpha certs check-expiration
CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
admin.conf                 Jan 09, 2021 06:55 UTC   360d                                    no
apiserver                  Jan 09, 2021 06:55 UTC   360d            ca                      no
apiserver-etcd-client      Jan 09, 2021 06:55 UTC   360d            etcd-ca                 no
apiserver-kubelet-client   Jan 09, 2021 06:55 UTC   360d            ca                      no
controller-manager.conf    Jan 09, 2021 06:55 UTC   360d                                    no
etcd-healthcheck-client    Jan 09, 2021 06:55 UTC   360d            etcd-ca                 no
etcd-peer                  Jan 09, 2021 06:55 UTC   360d            etcd-ca                 no
etcd-server                Jan 09, 2021 06:55 UTC   360d            etcd-ca                 no
front-proxy-client         Jan 09, 2021 06:55 UTC   360d            front-proxy-ca          no
scheduler.conf             Jan 09, 2021 06:55 UTC   360d                                    no

CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME   EXTERNALLY MANAGED
ca                      Jan 07, 2030 06:55 UTC   9y              no
etcd-ca                 Jan 07, 2030 06:55 UTC   9y              no
front-proxy-ca          Jan 07, 2030 06:55 UTC   9y              no
# Renew all certificates
$ kubeadm alpha certs renew all
# Restart the control plane; restarting the kubelet recreates the core static Pods automatically
$ systemctl restart kubelet
# Check the certificate status again
$ kubeadm alpha certs check-expiration

2. How to join the cluster after the token used by kubeadm join has expired
# Create a new token
$ kubeadm token create
ll3wpn.pct6tlq66lis3uhk
# List the tokens
$ kubeadm token list
TOKEN                     TTL   EXPIRES                     USAGES                   DESCRIPTION   EXTRA GROUPS
ll3wpn.pct6tlq66lis3uhk   23h   2020-01-17T14:42:50+08:00   authentication,signing                 system:bootstrappers:kubeadm:default-node-token
# Compute the sha256 hash of the CA certificate
$ openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
b5e5c1a000284781677336b00e7345838195ca78af21bddd9defad799243752b
# Join a new worker node to the cluster
$ kubeadm join 192.168.1.200:6443 --token ll3wpn.pct6tlq66lis3uhk \
    --discovery-token-ca-cert-hash sha256:b5e5c1a000284781677336b00e7345838195ca78af21bddd9defad799243752b

3. What if you forget the kubeadm join command
# The following prints a ready-to-use join command
$ kubeadm token create --print-join-command

4. How to switch kube-proxy to ipvs mode afterwards, if ipvs was not enabled at kubeadm init time
# Edit the kube-system/kube-proxy ConfigMap and set mode: "ipvs" in config.conf
$ kubectl edit cm kube-proxy -n kube-system
...
apiVersion: v1
data:
  config.conf: |-
    apiVersion: kubeproxy.config.k8s.io/v1alpha1
    bindAddress: 0.0.0.0
    clientConnection:
      acceptContentTypes: ""
      burst: 0
      contentType: ""
      kubeconfig: /var/lib/kube-proxy/kubeconfig.conf
      qps: 0
    clusterCIDR: 10.10.0.0/16
    configSyncPeriod: 0s
    conntrack:
      maxPerCore: null
      min: null
      tcpCloseWaitTimeout: null
      tcpEstablishedTimeout: null
    enableProfiling: false
    healthzBindAddress: ""
    hostnameOverride: ""
    iptables:
      masqueradeAll: false
      masqueradeBit: null
      minSyncPeriod: 0s
      syncPeriod: 0s
    ipvs:
      excludeCIDRs: null
      minSyncPeriod: 0s
      scheduler: ""
      strictARP: false
      syncPeriod: 0s
    kind: KubeProxyConfiguration
    metricsBindAddress: ""
    # change this line
    mode: "ipvs"
    nodePortAddresses: null
    oomScoreAdj: null
    portRange: ""
    udpIdleTimeout: 0s
    winkernel:
      enableDSR: false
      networkName: ""
      sourceVip: ""
...
# List the running kube-proxy Pods
$ kubectl get pods -n kube-system | grep kube-proxy
# Delete the existing kube-proxy Pods; the controller recreates them with the new configuration
$ kubectl get pod -n kube-system | grep kube-proxy | awk '{system("kubectl delete pod "$1" -n kube-system")}'
# Because the change was made in the ConfigMap, nodes added later will use ipvs mode directly.
# List the recreated kube-proxy Pods
$ kubectl get pods -n kube-system | grep kube-proxy
# Check a kube-proxy Pod's log to confirm ipvs mode; the log should contain 'Using ipvs Proxier'
$ kubectl logs kube-proxy-xxxxx -n kube-system
# Verify with ipvsadm that the existing Services are now programmed as LVS virtual servers
$ ipvsadm -Ln
