Setting Up a Kubernetes Cluster in a Tencent Cloud CCN (Cloud Connect Network) Environment

Good things never come easy; since ancient times the melon is sweet only after the bitterness. This article walks through setting up a Kubernetes cluster in a Tencent Cloud CCN environment; I hope it helps.
Background: the network follows the CCN trial setup, with two VPCs, one in Shanghai and one in Beijing. The servers are distributed as follows:

A word on why TencentOS Server 3.1 (TK4): simply because CentOS 8 no longer gets long-term maintenance, and it's also a chance to try out Tencent's open-source TencentOS. Details are on the Tencent Cloud site: https://cloud.tencent.com/document/product/213/38027. Since it's CentOS 8 compatible, I'll follow the usual CentOS 8 Kubernetes setup process and see whether a cross-region cluster is feasible!
Basic plan (note: spreading nodes across multiple zones also helps with high availability):
IP            hostname       zone
10.10.2.8     sh-master-01   Shanghai Zone 2
10.10.2.10    sh-master-02   Shanghai Zone 2
10.10.5.4     sh-master-03   Shanghai Zone 5
10.10.4.7     sh-work-01     Shanghai Zone 4
10.10.4.14    sh-work-02     Shanghai Zone 4
10.10.12.9    bj-work-01     Beijing Zone 5
Create an internal CLB to serve as the apiserver VIP. I always used the classic type before, but now only the application type is available...

System initialization. Note: steps 1-12 are executed on all nodes.
1. Change the hostname. Note: only needed on hosts whose hostname hasn't been set yet.
[root@VM-2-8-centos ~]# hostnamectl set-hostname sh-master-01
[root@VM-2-8-centos ~]# cat /etc/hostname
sh-master-01

Do the same on the other machines.
2. Disable the swap partition
swapoff -a
sed -i 's/.*swap.*/#&/' /etc/fstab
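
A quick check that swap is really off (the Swap row should be all zeros):

free -m | grep -i swap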

3. Disable SELinux
[root@sh-master-01 ~]# setenforce 0
setenforce: SELinux is disabled
[root@sh-master-01 ~]# sed -i "s/^SELINUX=enforcing/SELINUX=disabled/g" /etc/sysconfig/selinux
[root@sh-master-01 ~]# sed -i "s/^SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
[root@sh-master-01 ~]# sed -i "s/^SELINUX=permissive/SELINUX=disabled/g" /etc/sysconfig/selinux
[root@sh-master-01 ~]# sed -i "s/^SELINUX=permissive/SELINUX=disabled/g" /etc/selinux/config

4. Disable the firewall
systemctl disable --now firewalld
chkconfig firewalld off

Note: if neither firewalld nor iptables is installed, this can be skipped.
5. Raise the open-file limits and related settings
cat > /etc/security/limits.conf << EOF
* soft nproc 1000000
* hard nproc 1000000
* soft nofile 1000000
* hard nofile 1000000
* soft memlock unlimited
* hard memlock unlimited
EOF

That said, TencentOS seems to ship an 80-nofile.conf under limits.d, and these overrides can all go there instead, which avoids touching the main file.
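
For example, a drop-in along these lines (the file name 90-k8s.conf is my own choice, not something TencentOS ships):

cat > /etc/security/limits.d/90-k8s.conf << EOF
* soft nproc 1000000
* hard nproc 1000000
* soft nofile 1000000
* hard nofile 1000000
EOF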

6. yum update
yum update
yum -y install gcc bc gcc-c++ ncurses ncurses-devel cmake elfutils-libelf-devel openssl openssl-devel flex* bison* autoconf automake zlib* libxml* libmcrypt* libtool-ltdl-devel* make pcre pcre-devel jemalloc-devel tcl libtool vim unzip wget lrzsz bash-comp* ipvsadm ipset jq sysstat conntrack conntrack-tools libseccomp socat curl git psmisc nfs-utils tree bash-completion net-tools crontabs iftop nload strace bind-utils tcpdump htop telnet lsof

Admittedly I skipped this here; when initializing a CVM I usually run the oneinstack script anyway.
7. Load the IPVS modules. The TencentOS kernel is 5.4.119.
:> /etc/modules-load.d/ipvs.conf
module=(
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
br_netfilter
)
for kernel_module in ${module[@]}; do
    /sbin/modinfo -F filename $kernel_module |& grep -qv ERROR && echo $kernel_module >> /etc/modules-load.d/ipvs.conf || :
done

systemctl daemon-reload
systemctl enable --now systemd-modules-load.service

Verify that the IPVS modules loaded:
# lsmod | grep ip_vs
ip_vs_sh               16384  0
ip_vs_wrr              16384  0
ip_vs_rr               16384  5
ip_vs                 151552  11 ip_vs_rr,ip_vs_sh,ip_vs_wrr
nf_conntrack          114688  5 xt_conntrack,nf_nat,nf_conntrack_netlink,xt_MASQUERADE,ip_vs
nf_defrag_ipv6         20480  2 nf_conntrack,ip_vs

8. Tune the kernel parameters (not necessarily optimal; take what suits you). These are the defaults from the oneinstack install; I'll leave them alone for now and dig in if problems show up later.
cat /etc/sysctl.d/99-sysctl.conf
fs.file-max=1000000
net.ipv4.tcp_max_tw_buckets = 6000
net.ipv4.tcp_sack = 1
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_rmem = 4096 87380 4194304
net.ipv4.tcp_wmem = 4096 16384 4194304
net.ipv4.tcp_max_syn_backlog = 16384
net.core.netdev_max_backlog = 32768
net.core.somaxconn = 32768
net.core.wmem_default = 8388608
net.core.rmem_default = 8388608
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_fin_timeout = 20
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syn_retries = 2
net.ipv4.tcp_syncookies = 1
#net.ipv4.tcp_tw_len = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_mem = 94500000 915000000 927000000
net.ipv4.tcp_max_orphans = 3276800
net.ipv4.ip_local_port_range = 1024 65000
net.nf_conntrack_max = 6553500
net.netfilter.nf_conntrack_max = 6553500
net.netfilter.nf_conntrack_tcp_timeout_close_wait = 60
net.netfilter.nf_conntrack_tcp_timeout_fin_wait = 120
net.netfilter.nf_conntrack_tcp_timeout_time_wait = 120
net.netfilter.nf_conntrack_tcp_timeout_established = 3600
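
To load these right away without a reboot:

sysctl --system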

9. Install containerd. dnf vs. yum is one of the CentOS 8 changes; look it up yourself if you're curious, they behave much the same. Out of habit I add the Aliyun repo, like so:
dnf install dnf-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
sudo yum update -y && sudo yum install -y containerd.io
containerd config default > /etc/containerd/config.toml
# Replace containerd's default sandbox image by editing /etc/containerd/config.toml:
# sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.2"
# Restart containerd
systemctl daemon-reload
systemctl restart containerd

Looks like it still won't work: the package versions don't match the OS release, ha. What now?

Let's try Tencent's mirror instead; first remove the Aliyun repo:
rm -rf /etc/yum.repos.d/docker-ce.repo
yum clean all

https://mirrors.cloud.tencent.com/docker-ce/linux/centos/

dnf install dnf-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo http://mirrors.cloud.tencent.com/docker-ce/linux/centos/docker-ce.repo
sudo yum update -y && sudo yum install -y containerd.io
containerd config default > /etc/containerd/config.toml
# Replace containerd's default sandbox image by editing /etc/containerd/config.toml:
# sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.2"
# Restart containerd
systemctl daemon-reload
systemctl restart containerd

Same failure... the repo doesn't adapt itself to this OS. So what now, edit it by hand?

That worked. I'd also love to see TencentOS supported out of the box by the common yum repos, instead of making me convert them by hand.
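
For reference, the manual tweak I'd expect to work here (an assumption based on the error: the repo paths use yum's $releasever, which TencentOS doesn't report as 8) is to pin the release version in the repo file:

# Assumption: pin $releasever to 8 so the CentOS 8 paths resolve
sed -i 's/\$releasever/8/g' /etc/yum.repos.d/docker-ce.repo
yum clean all && yum makecache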

containerd config default > /etc/containerd/config.toml

# Restart containerd
systemctl daemon-reload
systemctl restart containerd
systemctl status containerd

10. Configure the CRI client crictl. Note: the crictl version apparently needs to roughly match the Kubernetes version.
VERSION="v1.22.0"
wget https://github.com/kubernetes-sigs/cri-tools/releases/download/$VERSION/crictl-$VERSION-linux-amd64.tar.gz
sudo tar zxvf crictl-$VERSION-linux-amd64.tar.gz -C /usr/local/bin
rm -f crictl-$VERSION-linux-amd64.tar.gz

The download may stall; if so, grab it from GitHub on your desktop and upload it manually...
cat << EOF > /etc/crictl.yaml
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 10
debug: false
EOF

# Verify it works (and, while you're at it, a private registry if you have one)
crictl pull nginx:alpine
crictl rmi nginx:alpine
crictl images

Next, edit /etc/containerd/config.toml: under [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"], set endpoint to the Aliyun accelerator address (any registry mirror works), and under [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options] add SystemdCgroup = true.
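
Put together, the relevant fragments of /etc/containerd/config.toml end up looking roughly like this (a sketch against the default config generated above; the table names follow the stock layout):

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  # use the systemd cgroup driver, matching the kubelet flag set later
  SystemdCgroup = true

[plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
  # registry mirror (accelerator) for docker.io pulls
  endpoint = ["https://2lefsjdg.mirror.aliyuncs.com"]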

The endpoint is replaced with the Aliyun accelerator address: https://2lefsjdg.mirror.aliyuncs.com

Restart the containerd service and pull an image again to verify:
systemctl restart containerd.service
crictl pull nginx:alpine

OK

11. Install kubeadm (CentOS 8 has no corresponding yum repo, so use Aliyun's CentOS 7 repo). Note: why version 1.21.3? Because my production cluster also runs 1.21.3, which conveniently gives me a chance to test an upgrade later.
cat << EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

# Remove old versions, if any were installed
yum remove kubeadm kubectl kubelet kubernetes-cni cri-tools socat
# List all installable versions (either of these works)
# yum list --showduplicates kubeadm --disableexcludes=kubernetes
# Install a specific version:
# yum -y install kubeadm-1.21.3 kubectl-1.21.3 kubelet-1.21.3
# Or install the latest stable version (currently 1.22.4):
# yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
# Enable on boot
systemctl enable kubelet.service

Of course, Tencent Cloud's mirror works here too; same idea.
12. Modify the kubelet configuration
vi /etc/sysconfig/kubelet
KUBELET_EXTRA_ARGS="--cgroup-driver=systemd --container-runtime=remote --container-runtime-endpoint=/run/containerd/containerd.sock"
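
Reload so the new flags are picked up on the next kubelet start (before kubeadm init the kubelet has no cluster config and will keep restarting; that's expected):

systemctl daemon-reload
systemctl restart kubelet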

Extra steps on the master nodes: 1. Install haproxy. Note: all three masters need haproxy and the configuration below...
yum install haproxy

cat << EOF > /etc/haproxy/haproxy.cfg
#---------------------------------------------------------------------
# Example configuration for a possible web application.  See the
# full configuration options online.
#
#   http://haproxy.1wt.eu/download/1.4/doc/configuration.txt
#
#---------------------------------------------------------------------

#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    # to have these messages end up in /var/log/haproxy.log you will
    # need to:
    #
    # 1) configure syslog to accept network log events.  This is done
    #    by adding the '-r' option to the SYSLOGD_OPTIONS in
    #    /etc/sysconfig/syslog
    #
    # 2) configure local2 events to go to the /var/log/haproxy.log
    #    file. A line like the following can be added to
    #    /etc/sysconfig/syslog
    #
    #    local2.*                       /var/log/haproxy.log
    #
    log         127.0.0.1 local2
    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon

    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats

#---------------------------------------------------------------------
# common defaults that all the listen and backend sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode                    tcp
    log                     global
    option                  tcplog
    option                  dontlognull
    option http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 3
    timeout http-request    10s
    timeout queue           1m
    timeout connect         10s
    timeout client          1m
    timeout server          1m
    timeout http-keep-alive 10s
    timeout check           10s
    maxconn                 3000

#---------------------------------------------------------------------
# main frontend which proxys to the backends
#---------------------------------------------------------------------
frontend kubernetes
    bind *:8443                # the frontend listens on port 8443
    mode tcp
    default_backend kubernetes

#---------------------------------------------------------------------
# backend: requests hitting the frontend are forwarded to the three
# master apiservers, which gives us the load balancing
#---------------------------------------------------------------------
backend kubernetes
    balance roundrobin
    server master1 10.10.2.8:6443  check maxconn 2000
    server master2 10.10.2.10:6443 check maxconn 2000
    server master3 10.10.5.4:6443  check maxconn 2000
EOF

systemctl enable haproxy && systemctl start haproxy && systemctl status haproxy
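
A quick sanity check that haproxy came up and is listening on 8443 on each master:

ss -lntp | grep 8443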

Log in to the Tencent Cloud load balancer console (https://console.cloud.tencent.com/clb), create a TCP listener named k8s on port 6443, bind the three master nodes on port 8443 as the backend, and leave the default weight of 10 unchanged.

2. Generate the config file on the sh-master-01 node. Note: sh-master-02 or sh-master-03 would work just as well.
kubeadm config print init-defaults > config.yaml

Edit the config file as follows:
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 10.10.2.8
  bindPort: 6443
nodeRegistration:
  criSocket: /run/containerd/containerd.sock
  name: sh-master-01
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
  certSANs:
  - sh-master-01
  - sh-master-02
  - sh-master-03
  - sh-master.k8s.io
  - localhost
  - 127.0.0.1
  - 10.10.2.8
  - 10.10.2.10
  - 10.10.5.4
  - 10.10.2.4
  - xx.xx.xx.xx
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: "10.10.2.4:6443"
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: 1.21.3
networking:
  dnsDomain: cluster.local
  serviceSubnet: 172.31.0.0/16
scheduler: {}
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
ipvs:
  excludeCIDRs: null
  minSyncPeriod: 0s
  scheduler: "rr"
  strictARP: false
  syncPeriod: 15s
iptables:
  masqueradeAll: true
  masqueradeBit: 14
  minSyncPeriod: 0s
  syncPeriod: 30s

Compared to the defaults, this adds the IPVS configuration, sets the service subnet, and points imageRepository at a domestic mirror. xx.xx.xx.xx is an IP I reserved (reserving one makes it easier to add control-plane nodes later, at the very least).
3. Initialize the sh-master-01 node with kubeadm
kubeadm init --config /root/config.yaml

Note: I originally tried to install Cilium at this step and it failed, ha, so I'm going with Calico first.

Hmm, I missed net.ipv4.ip_forward when tuning the kernel parameters in step 8. To emphasize: sysctl -w is only temporary.
sysctl -w net.ipv4.ip_forward=1

For a permanent setting, add it to the config file:
cat << EOF > /etc/sysctl.d/99-sysctl.conf
fs.file-max=1000000
net.ipv4.tcp_max_tw_buckets = 6000
net.ipv4.tcp_sack = 1
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_rmem = 4096 87380 4194304
net.ipv4.tcp_wmem = 4096 16384 4194304
net.ipv4.tcp_max_syn_backlog = 16384
net.core.netdev_max_backlog = 32768
net.core.somaxconn = 32768
net.core.wmem_default = 8388608
net.core.rmem_default = 8388608
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_fin_timeout = 20
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syn_retries = 2
net.ipv4.tcp_syncookies = 1
#net.ipv4.tcp_tw_len = 1
net.ipv4.ip_forward = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_mem = 94500000 915000000 927000000
net.ipv4.tcp_max_orphans = 3276800
net.ipv4.ip_local_port_range = 1024 65000
net.nf_conntrack_max = 6553500
net.netfilter.nf_conntrack_max = 6553500
net.netfilter.nf_conntrack_tcp_timeout_close_wait = 60
net.netfilter.nf_conntrack_tcp_timeout_fin_wait = 120
net.netfilter.nf_conntrack_tcp_timeout_time_wait = 120
net.netfilter.nf_conntrack_tcp_timeout_established = 3600
EOF

sysctl --system

Note: apply this on all nodes, then run the init again:
kubeadm init --config /root/config.yaml

4. Join the sh-master-02 and sh-master-03 control-plane nodes to the cluster
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Following the init output, join sh-master-02 and sh-master-03. Pack up ca.*, sa.*, front-proxy-ca.*, and etcd/ca.* from the /etc/kubernetes/pki directory on sh-master-01 and distribute them into /etc/kubernetes/pki on sh-master-02 and sh-master-03, then run:

kubeadm join 10.10.2.4:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:ccfd4e2b85a6a07fde8580422769c9e14113e8f05e95272e51cca2f13b0eb8c3 --control-plane

Then run the same commands as on sh-master-01:

mkdir -p $HOME/.kube
sudo \cp /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
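
A sketch of that cert distribution step (assuming direct root SSH between the masters; the paths are the standard kubeadm ones):

cd /etc/kubernetes/pki
tar czf /tmp/k8s-certs.tar.gz ca.* sa.* front-proxy-ca.* etcd/ca.*
for host in sh-master-02 sh-master-03; do
  scp /tmp/k8s-certs.tar.gz $host:/tmp/
  ssh $host 'mkdir -p /etc/kubernetes/pki && tar xzf /tmp/k8s-certs.tar.gz -C /etc/kubernetes/pki'
done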

kubectl get nodes

Since no CNI network plugin is installed yet, the nodes are all NotReady.

Join the worker nodes to the cluster
kubeadm join 10.10.2.4:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:ccfd4e2b85a6a07fde8580422769c9e14113e8f05e95272e51cca2f13b0eb8c3

First, I bought 1 Mbps of bandwidth in the CCN management console; this is only a test, after all:

Install the CNI network plugin. To start, I'll just run plain Calico (I couldn't get Flannel or Cilium going at first; getting one working is a win, the others can wait for later study and tuning).
curl https://docs.projectcalico.org/v3.11/manifests/calico.yaml -O

sed -i -e "s?192.168.0.0/16?172.31.0.0/16?g" calico.yaml

kubectl apply -f calico.yaml
kubectl get pods -n kube-system -o wide
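
While waiting for the Calico pods to come up, this is also a good moment to confirm kube-proxy really is in IPVS mode, since the kubeadm config requested it (ipvsadm was installed back in step 6):

ipvsadm -Ln | head    # lists the IPVS virtual servers kube-proxy has programmed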

Note: I also added a secondary CIDR in the Tencent Cloud VPC console. I'm wondering whether that would let the container network reach the other region's as well. Untested so far... it just occurred to me to add it:
https://console.cloud.tencent.com/vpc/vpc?rid=4

Run a simple ping test. 1. Deploy two pods in the Shanghai region
cat << EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx:alpine
        name: nginx
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
---
apiVersion: v1
kind: Pod
metadata:
  name: busybox
  namespace: default
spec:
  containers:
  - name: busybox
    image: busybox:1.28.4
    command:
    - sleep
    - "3600"
    imagePullPolicy: IfNotPresent
  restartPolicy: Always
EOF

Hmm, both landed in the Shanghai region:
[root@sh-master-01 ~]# kubectl get pods -o wide
NAME                     READY   STATUS    RESTARTS   AGE   IP              NODE         NOMINATED NODE   READINESS GATES
busybox                  1/1     Running   14         14h   172.31.45.132   sh-work-01   <none>           <none>
nginx-7fb7fd49b4-zrg77   1/1     Running   0          14h   172.31.45.131   sh-work-01   <none>           <none>

2. Schedule a pod into the Beijing region with a nodeSelector. I also want a pod running in Beijing; how? Take the lazy route: label the node and schedule with a nodeSelector!
kubectl label node bj-work-01 zone=beijing

cat nginx1.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: nginx1
  name: nginx1
spec:
  nodeSelector:          # schedule the pod onto nodes labeled zone=beijing
    zone: "beijing"
  containers:
  - image: nginx
    name: nginx1
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}

kubectl apply -f nginx1.yaml

[root@sh-master-01 ~]# kubectl get pods -o wide
NAME                     READY   STATUS    RESTARTS   AGE   IP              NODE         NOMINATED NODE   READINESS GATES
busybox                  1/1     Running   14         14h   172.31.45.132   sh-work-01   <none>           <none>
nginx-7fb7fd49b4-zrg77   1/1     Running   0          14h   172.31.45.131   sh-work-01   <none>           <none>
nginx1                   1/1     Running   0          14h   172.31.89.194   bj-work-01   <none>           <none>

3. Ping test. From the sh-master-02 node, ping both the Beijing pod and the Shanghai pod and compare latency.

Then exec into the Shanghai pod and ping the Shanghai and Beijing pods.
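
For example, using the pod IPs from the listings above. On sh-master-02 (node to pod):

ping -c 4 172.31.89.194    # to the Beijing pod (nginx1)
ping -c 4 172.31.45.131    # to the Shanghai pod (nginx)

And from inside the Shanghai pod:

kubectl exec -it busybox -- ping -c 4 172.31.89.194    # to the Beijing pod
kubectl exec -it busybox -- ping -c 4 172.31.45.131    # to the other Shanghai pod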

The latencies came out about the same in both cases. The main goal here was to validate that building a Kubernetes cluster across region VPCs is feasible at all. I haven't worked out how to properly measure network quality yet; this is just meant to start the conversation. The cloud does make things a lot easier: at least BGP and the like are largely taken care of. If you're building a cross-region Kubernetes cluster in the cloud, hopefully this serves as a useful reference.
