This post walks through building a typical K8s cluster (1 Master + 2 Workers) on virtual machines running CentOS 8.1, with K8s version 1.19.4-0.

Unless otherwise noted, every step below must be performed on all servers participating in the cluster.

1. Environment Preparation

This setup uses three VMware virtual machines, all running CentOS 8.1.

# uname -a
Linux nsk8s210 4.18.0-147.el8.x86_64 #1 SMP Wed Dec 4 21:51:45 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

# Edit /etc/hosts on all three machines
192.168.139.11 k8smaster11	# Master
192.168.139.12 k8snode12	# Node1
192.168.139.13 k8snode13	# Node2

The system ships with a lot of default repos; remove them all and add domestic (China) mirrors one by one as needed, which makes package downloads much faster. The first one to add back is the base repo.

cd /etc/yum.repos.d
rm -rf *

wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-8.repo
yum -y install epel-release	# many packages are only in EPEL, so enable it
yum makecache 				# rebuild the cache for all repos

Time synchronization with chrony

yum install -y chrony
sed -i -e '/^server/s/^/#/' -e '1a server ntp.aliyun.com iburst' /etc/chrony.conf
systemctl start chronyd.service
systemctl enable chronyd.service
timedatectl set-timezone Asia/Shanghai
timedatectl set-local-rtc 0 # keep the hardware clock in UTC
timedatectl
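
To confirm that time synchronization is actually working, chrony itself can be queried (a quick optional check):

chronyc tracking      # shows the current reference source and clock offset
chronyc sources -v    # lists the NTP sources; ntp.aliyun.com should be marked with '*'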

Disable swap

  • Temporary: run sudo swapoff -a; the change is lost after a reboot, and swap coming back will stop k8s from running properly.
  • Permanent: edit /etc/fstab with vi, comment out the swap line and save.
# If swap cannot be disabled the usual way, use the following
systemctl mask dev-sdXX.swap
# To undo the mask:
systemctl unmask dev-sdXX.swap

What is swap? It is the system's swap partition, the equivalent of what Windows calls virtual memory: when physical memory runs low, part of the disk is used as memory. Why does K8s want it turned off? Mainly for performance and predictability: swapped memory is much slower, and K8s expects every workload to stay within the CPU and memory limits of its node and cluster.
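
A quick way to confirm swap is really off (nothing k8s-specific, just a sanity check):

free -h         # the Swap line should read 0B total
swapon --show   # prints nothing when no swap device is active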

Next, disable the system firewall, SELinux and so on; too many restrictions just complicate troubleshooting. This is routine, so only a rough sketch follows.
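
The usual commands look roughly like this (adjust to your environment; on exposed machines you may prefer opening the required ports instead of disabling the firewall outright):

systemctl stop firewalld && systemctl disable firewalld
setenforce 0                                                          # SELinux off for the current boot
sed -i 's/^SELINUX=enforcing/SELINUX=permissive/' /etc/selinux/config # persist across reboots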

Install Docker

See the earlier installation notes: /2020/11/30113656-docker-cmd.html

Configure kernel parameters

# Configure sysctl kernel parameters
$ cat > /etc/sysctl.conf <<EOF
vm.max_map_count=262144
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
$ sysctl -p

# Configure iptables-related sysctl settings
vi /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
vm.swappiness=0
# After saving, run
sysctl --system

# Enable ipvs
yum install -y nfs-utils ipset ipvsadm # install ipvs support
# Configuration
cat > /etc/sysconfig/modules/ipvs.modules <<EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
EOF
# Run the following
chmod 755 /etc/sysconfig/modules/ipvs.modules
bash /etc/sysconfig/modules/ipvs.modules
# Check
lsmod | grep -e ip_vs -e nf_conntrack_ipv4

# Load the br_netfilter module
[root@nsk8s210 home]# modprobe br_netfilter

# Verify it is loaded
[root@nsk8s210 home]# lsmod | grep br_netfilter
br_netfilter           24576  0
bridge                192512  1 br_netfilter
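
Note that modprobe only loads a module for the current boot. On a systemd-based system such as CentOS 8, one way to have br_netfilter and the ipvs modules loaded automatically after a reboot is a modules-load.d drop-in (a sketch, assuming the default systemd-modules-load.service is enabled):

cat > /etc/modules-load.d/k8s.conf <<EOF
br_netfilter
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
EOF
# modules listed here are loaded by systemd-modules-load.service at boot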

SSH trust between servers (optional)

# With SSH trust in place, the nodes can reach each other without passwords, which makes automated deployment easier later on
ssh-keygen # run on every machine and just press Enter through all prompts
cd /root/.ssh/
mv id_rsa.pub id_rsa.pub.11
scp -P xxx id_rsa.pub.11 root@192.168.139.11:/root/.ssh/
scp -P xxx id_rsa.pub.11 root@192.168.139.12:/root/.ssh/

# Run on every node
cd /root/.ssh/
cat id_rsa.pub.11 >> authorized_keys
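
If ssh-copy-id is available, the same result can be had with less manual copying (an alternative sketch; substitute your real port and run it towards every peer):

ssh-copy-id -p xxx root@192.168.139.11
ssh-copy-id -p xxx root@192.168.139.12
ssh-copy-id -p xxx root@192.168.139.13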

Install kubeadm, kubelet and kubectl

All nodes need these three tools. The official k8s repo is https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64 , which is unreachable from inside China, so we use the Aliyun mirror instead.

# cd /etc/yum.repos.d
# vi k8s.repo
# Add the following lines, then save and exit
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg \
	https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg

First check whether the packages can be found; if so, install them directly.

yum list |grep kubeadm	# check that the package exists
yum install -y kubeadm	# this one command installs the latest versions of kubeadm, kubelet and kubectl
# but mismatched versions often cause trouble, so it is better to pin the versions explicitly
yum install -y kubelet-1.19.4 kubeadm-1.19.4 kubectl-1.19.4 --disableexcludes=kubernetes

# If the install fails with an untrusted-package style error, rebuild the repo cache.
yum clean all
yum makecache

After installing, my version turned out to be 1.19.4-0, which is fairly recent; the one command above also pulled in quite a few packages as dependencies.
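
A quick way to double-check what actually got installed:

kubeadm version -o short            # e.g. v1.19.4
kubectl version --client --short
rpm -q kubelet kubeadm kubectl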

# If the Docker configuration earlier included the line below, kubelet's configuration must be adjusted to match
# "exec-opts": ["native.cgroupdriver=systemd"],
[root@k8smaster11 k8s]# vi /etc/sysconfig/kubelet
KUBELET_EXTRA_ARGS="--cgroup-driver=systemd"

systemctl enable kubelet.service # start the service automatically at boot
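
For reference, the Docker-side setting mentioned above normally lives in /etc/docker/daemon.json; a minimal sketch (back up any existing file before overwriting it, then restart Docker):

cat > /etc/docker/daemon.json <<EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
systemctl daemon-reload && systemctl restart docker
docker info | grep -i cgroup   # should now report the systemd cgroup driver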

2. Master Configuration

Install the required K8s components on the Master; naturally, they are delivered as Docker images.

[root@k8smaster11 k8s]# kubeadm config images list
I1209 17:19:52.308789    8099 version.go:252] remote version is much newer: v1.20.0; \
	falling back to: stable-1.19
W1209 17:19:58.092894    8099 configset.go:348] WARNING: kubeadm cannot validate component \
	configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
k8s.gcr.io/kube-apiserver:v1.19.4
k8s.gcr.io/kube-controller-manager:v1.19.4
k8s.gcr.io/kube-scheduler:v1.19.4
k8s.gcr.io/kube-proxy:v1.19.4
k8s.gcr.io/pause:3.2
k8s.gcr.io/etcd:3.4.13-0
k8s.gcr.io/coredns:1.7.0

These again have to come from Aliyun's mirrors, so wrap everything in a batch script, docker_master.sh

mkdir -p /root/k8s/	# create a temporary directory for scratch files
vi docker_master.sh	# paste in the script below
#!/bin/bash

docker pull registry.aliyuncs.com/google_containers/kube-apiserver:v1.19.4
docker tag registry.aliyuncs.com/google_containers/kube-apiserver:v1.19.4 k8s.gcr.io/kube-apiserver:v1.19.4
docker rmi registry.aliyuncs.com/google_containers/kube-apiserver:v1.19.4

docker pull registry.aliyuncs.com/google_containers/kube-controller-manager:v1.19.4
docker tag registry.aliyuncs.com/google_containers/kube-controller-manager:v1.19.4 \
	k8s.gcr.io/kube-controller-manager:v1.19.4
docker rmi registry.aliyuncs.com/google_containers/kube-controller-manager:v1.19.4

docker pull registry.aliyuncs.com/google_containers/kube-scheduler:v1.19.4
docker tag registry.aliyuncs.com/google_containers/kube-scheduler:v1.19.4 k8s.gcr.io/kube-scheduler:v1.19.4
docker rmi registry.aliyuncs.com/google_containers/kube-scheduler:v1.19.4

docker pull registry.aliyuncs.com/google_containers/kube-proxy:v1.19.4
docker tag registry.aliyuncs.com/google_containers/kube-proxy:v1.19.4 k8s.gcr.io/kube-proxy:v1.19.4
docker rmi registry.aliyuncs.com/google_containers/kube-proxy:v1.19.4

docker pull registry.aliyuncs.com/google_containers/etcd:3.4.13-0
docker tag registry.aliyuncs.com/google_containers/etcd:3.4.13-0 k8s.gcr.io/etcd:3.4.13-0
docker rmi registry.aliyuncs.com/google_containers/etcd:3.4.13-0

docker pull registry.aliyuncs.com/google_containers/pause:3.2
docker tag registry.aliyuncs.com/google_containers/pause:3.2 k8s.gcr.io/pause:3.2
docker rmi registry.aliyuncs.com/google_containers/pause:3.2

docker pull coredns/coredns:1.7.0
docker tag coredns/coredns:1.7.0 k8s.gcr.io/coredns:1.7.0
docker rmi coredns/coredns:1.7.0

Run the script above; if individual pulls fail for network reasons, just run it again. Once it finishes you should see the following.

[root@k8smaster11 k8s]# docker images
REPOSITORY                           TAG        IMAGE ID       CREATED        SIZE
k8s.gcr.io/kube-proxy                v1.19.4    635b36f4d89f   3 weeks ago    118MB
k8s.gcr.io/kube-apiserver            v1.19.4    b15c6247777d   3 weeks ago    119MB
k8s.gcr.io/kube-controller-manager   v1.19.4    4830ab618586   3 weeks ago    111MB
k8s.gcr.io/kube-scheduler            v1.19.4    14cd22f7abe7   3 weeks ago    45.7MB
k8s.gcr.io/etcd                      3.4.13-0   0369cf4303ff   3 months ago   253MB
k8s.gcr.io/coredns                   1.7.0      bfe3a36ebd25   5 months ago   45.2MB
k8s.gcr.io/pause                     3.2        80d28bedfe5d   9 months ago   683kB
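
As an aside, kubeadm can also pull everything straight from a mirror repository, which avoids the pull/tag/rmi script entirely (an alternative to the script above; not used in this walkthrough). If you go this route, pass the same --image-repository flag to kubeadm init as well, so it looks for the mirror-named images instead of k8s.gcr.io:

kubeadm config images pull \
    --image-repository registry.aliyuncs.com/google_containers \
    --kubernetes-version v1.19.4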

Next, configure the K8s Master services.

# 1. Initialize the master
# Note: it is best not to change the 10.244.0.0 network here for now; it is needed again later
kubeadm init --kubernetes-version=1.19.4 \
--apiserver-advertise-address=192.168.139.11 \
--service-cidr=10.1.0.0/16 \
--pod-network-cidr=10.244.0.0/16

# 2. The output looks like this
......
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.139.11:6443 --token ql6gfh.8ls3339vfdzd8br0 \
    --discovery-token-ca-cert-hash sha256:2d72b3b680cbef4380a7dd04870a910c14f4bfa60f464cd427fa82d24ba98a25
    
# 3. Run the 3 commands from the output above
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

# 4. Be sure to keep these lines; nodes need them for verification when joining the cluster
kubeadm join ......
# If you lost them, run the following on the master to generate a new join command
kubeadm token create --print-join-command
# Also note that a token is only valid for 24 hours by default; after it expires nodes can no longer join and a new token must be generated
kubeadm token list # list the currently valid tokens
    
# 5. Configure kubectl credentials
echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> ~/.bash_profile
source ~/.bash_profile

# 6. Check the cluster pods and confirm each component is Running
# coredns cannot start properly yet because the master is still missing some configuration
[root@k8smaster11 ~]# kubectl get pod -n kube-system
NAME                                  READY   STATUS    RESTARTS   AGE
coredns-f9fd979d6-kcvlg               0/1     Pending   0          10m
coredns-f9fd979d6-nwj2h               0/1     Pending   0          10m
etcd-k8smaster11                      1/1     Running   0          10m
kube-apiserver-k8smaster11            1/1     Running   0          10m
kube-controller-manager-k8smaster11   1/1     Running   0          10m
kube-proxy-kr5t8                      1/1     Running   0          10m
kube-scheduler-k8smaster11            1/1     Running   0          10m
# For more detail: kubectl get pod --all-namespaces -o wide

# 7. The master node's status at this point
[root@k8smaster11 ~]# kubectl get nodes
NAME          STATUS     ROLES    AGE   VERSION
k8smaster11   NotReady   master   18m   v1.19.4

# 8. Configure the flannel network
mkdir -p /root/k8s/
cd /root/k8s
# 
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
# The "Network": "10.244.0.0/16" in this yml must match --pod-network-cidr from kubeadm init;
# otherwise cluster IPs may be unreachable between Nodes.

# Check which image needs to be downloaded
cat kube-flannel.yml |grep image|uniq
# image: quay.io/coreos/flannel:v0.13.1-rc1
# Pull the image
docker pull quay.io/coreos/flannel:v0.13.1-rc1 
# Deploy the plugin
kubectl apply -f kube-flannel.yml
# PS: to uninstall, use kubectl delete -f kube-flannel.yml. Don't do this lightly; it may not clean up completely.
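
# (Optional check) make sure the flannel Network really matches --pod-network-cidr from kubeadm init
grep '"Network"' kube-flannel.yml   # expect "Network": "10.244.0.0/16"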

# 9. The system now brings up the services on its own; wait a moment, then check the master node status again
[root@k8smaster11 k8s]# kubectl get nodes
NAME          STATUS   ROLES    AGE   VERSION
k8smaster11   Ready    master   35m   v1.19.4

# Things are only healthy once the output looks like this
[root@k8smaster11 ~]# kubectl get pod --all-namespaces 
NAMESPACE     NAME                                  READY   STATUS    RESTARTS   AGE
kube-system   coredns-f9fd979d6-kcvlg               1/1     Running   1          19h
kube-system   coredns-f9fd979d6-nwj2h               1/1     Running   1          19h
kube-system   etcd-k8smaster11                      1/1     Running   1          19h
kube-system   kube-apiserver-k8smaster11            1/1     Running   2          19h
kube-system   kube-controller-manager-k8smaster11   1/1     Running   0          91m
kube-system   kube-flannel-ds-crv75                 1/1     Running   1          18h
kube-system   kube-proxy-kr5t8                      1/1     Running   2          19h
kube-system   kube-scheduler-k8smaster11            1/1     Running   1          90m

# 10. Optionally set up command completion (also useful on the Worker machines)
yum install -y bash-completion
source <(kubectl completion bash) # effective now, lost after a new shell
echo "source <(kubectl completion bash)" >> ~/.bashrc # persists across restarts
source  ~/.bashrc

You may run into the warning below; here is the fix.

# Check the health of the cluster components
kubectl get cs

# scheduler            Unhealthy
# controller-manager   Unhealthy
# This happens because kube-controller-manager.yaml and kube-scheduler.yaml under /etc/kubernetes/manifests set the port to 0 by default
# - --port=0
# Comment that line out, then restart kubelet on all three machines
systemctl status kubelet.service
systemctl restart kubelet.service
# Reference: https://llovewxm1314.blog.csdn.net/article/details/108458197
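
If you prefer not to edit the manifests by hand, something like the following comments out the offending line on the master (a sketch; verify the files afterwards, kubelet picks up the change and re-creates the static pods automatically):

cd /etc/kubernetes/manifests
sed -i 's/- --port=0/# &/' kube-controller-manager.yaml kube-scheduler.yaml
systemctl restart kubelet.service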

3. Worker Node Configuration

3.1 Two base images are needed on the Workers as well

docker pull registry.aliyuncs.com/google_containers/pause:3.2
docker tag registry.aliyuncs.com/google_containers/pause:3.2 k8s.gcr.io/pause:3.2
docker rmi registry.aliyuncs.com/google_containers/pause:3.2

# The version on the worker servers may differ from the master's
docker pull registry.aliyuncs.com/google_containers/kube-proxy:v1.19.4
docker tag registry.aliyuncs.com/google_containers/kube-proxy:v1.19.4 k8s.gcr.io/kube-proxy:v1.19.4
docker rmi registry.aliyuncs.com/google_containers/kube-proxy:v1.19.4

3.2 Running kubectl on the Worker nodes

The same K8s commands used on the Master can also be run on the Worker nodes:

# 1. Copy admin.conf to the worker nodes
scp -P 22 /etc/kubernetes/admin.conf root@10.10.xx.xx:/etc/kubernetes

# 2. Set the environment variable
echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> ~/.bash_profile
source ~/.bash_profile

3.3 Registering a new node

# Join this Worker to the cluster
kubeadm join 192.168.139.11:6443 --token ql6gfh.8ls3339vfdzd8br0 \
    --discovery-token-ca-cert-hash sha256:2d72b3b680cbef4380a7dd04870a910c14f4bfa60f464cd427fa82d24ba98a25

# Check the cluster status from the Master:
kubectl cluster-info
kubectl get cs
kubectl get pod  --all-namespaces
kubectl get nodes
kubectl get service
kubectl get serviceaccounts

3.4 Removing a node

Run the following on the Master and then on the Worker, then go through the registration flow again starting from 3.1.

# [root@k8smaster11 ~]#
kubectl delete nodes k8snode13

# [root@k8snode13 ~]# 
docker ps -qa | xargs docker rm -f
rm -rf /etc/kubernetes/kubelet.conf 
systemctl restart docker.service kubelet.service
rm -rf /etc/kubernetes/pki/ca.crt

4. Troubleshooting

How to reset the Master

# Reset k8s on the Master and start over from init; this rarely cleans up completely and tends to cause problems later, so use it with caution
kubeadm reset
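
kubeadm reset does not clean up everything. On the master it is usually worth also removing the leftover kubeconfig, CNI state and iptables rules before running init again (a rough sketch, use with care):

rm -rf $HOME/.kube /etc/cni/net.d
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
systemctl restart docker kubelet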

Joining a Worker to a new Master

# A Worker node that was used before may fail to join a new Master with errors such as:
# [kubelet-check] Initial timeout of 40s passed.
# timed out waiting for the condition
# error uploading crisocket

# The fix is a reset. Resetting a Worker node is fairly easy and rarely causes problems.
kubeadm reset
systemctl daemon-reload
systemctl restart kubelet
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X # will reset iptables

To have the Master display a role name for the workers, mark them with a label:

kubectl label nodes <your_node> node-role.kubernetes.io/worker=worker
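
For example, to label both workers in this cluster and check the result:

kubectl label nodes k8snode12 node-role.kubernetes.io/worker=worker
kubectl label nodes k8snode13 node-role.kubernetes.io/worker=worker
kubectl get nodes   # the ROLES column should now show "worker" for both nodes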

5. A Simple Example

A Pod is the smallest and simplest unit you can create and deploy in K8s. A Pod wraps an application's container (or several containers), storage, a dedicated network IP, and the policy options that govern how the containers run.

5.1 Running a single container in a Pod

Now let's create an nginx container in the freshly built K8s cluster, using nginx-svc.yaml

apiVersion: v1
kind: Pod
metadata:
  name: nginx-test
  labels:
    app: web
spec:
  containers:
  - name: front-end
    image: nginx:1.7.9
    ports:
    - containerPort: 80

Run the commands

kubectl create -f ./nginx-svc.yaml # create the Pod
kubectl get po # list Pods
kubectl describe po nginx-test # show the Pod's details
kubectl exec -it nginx-test -- /bin/bash # open a shell inside the Pod (container)
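
To verify that nginx inside the Pod is actually serving, one option is a temporary port-forward from the master (assuming the Pod is already Running):

kubectl port-forward pod/nginx-test 8080:80 &   # forward local port 8080 to the Pod's port 80
curl -I http://127.0.0.1:8080                   # expect an HTTP 200 response from nginx
kill %1                                         # stop the background port-forward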

5.2 Running multiple containers in a Pod

The configuration file nginx-redis-svc.yaml

apiVersion: v1
kind: Pod
metadata:
  name: rss-site
  labels:
    app: rss-web
spec:
  containers:
    - name: front-nginx
      image: nginx:1.7.9
      ports:
        - containerPort: 80
    - name: rss-reader
      image: redis
      ports:
        - containerPort: 88

Run the commands

kubectl create -f ./nginx-redis-svc.yaml
kubectl get po
kubectl describe po rss-site 
kubectl exec -it rss-site -c front-nginx -- /bin/bash 
kubectl exec -it rss-site -c rss-reader -- /bin/bash

One point worth highlighting: this Pod has two containers, and they share the same root pause container.
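
A quick way to see the pause container for yourself (run on the worker node where rss-site was scheduled):

docker ps | grep rss-site
# Three containers show up for this one Pod: front-nginx, rss-reader, and the k8s.gcr.io/pause
# "sandbox" container that holds the shared network namespace.
kubectl get pod rss-site -o wide   # the Pod has a single IP, shared by both containers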


Reference:

https://www.cnblogs.com/wenyang321/p/14050893.html

(End)