Building a k8s cluster with Calico, part 1: deploying with kubeadm

    Technology  2022-07-10

    Deploying a k8s cluster with kubeadm + Calico

    I. Requirements

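    Minimum                        Recommended
    master: 2C4G; node: 4C16G      all nodes: 4C16G
    OS kernel 3.10 or later        CentOS 7.7
    docker-ce-18.03                docker-ce-18.06 or 18.09

    Note: beginners working in virtual machines can use less memory, but each machine should have at least 2 GB.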
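
    The kernel requirement above can be checked before starting. The following is only a minimal sketch: the helper name version_ge is made up for this example, and it assumes GNU sort with the -V option, which is available on CentOS 7.

```shell
#!/bin/sh
# version_ge A B: succeed if version A >= version B.
# Hypothetical helper; relies on GNU `sort -V` (version sort).
version_ge() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Strip the release suffix, e.g. 3.10.0-1062.el7.x86_64 -> 3.10.0
kernel=$(uname -r | cut -d- -f1)

if version_ge "$kernel" "3.10"; then
    echo "kernel $kernel: OK"
else
    echo "kernel $kernel: older than 3.10, upgrade before continuing"
fi
```

    Run it on every node. The same helper could be reused for other version checks, for example the Docker version.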

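    II. Installation and deployment

    Configuration for all nodes:

    1. Set the hostname and configure hosts

    On the master node, edit /etc/hostname (vim /etc/hostname) so that it contains:

    master01

    On the worker node, edit /etc/hostname so that it contains:

    node01

    On every node, add both hosts to /etc/hosts (vim /etc/hosts):

    192.168.1.112 master01
    192.168.1.113 node01

    2. Passwordless SSH

    ssh-keygen
    ssh-copy-id master01
    ssh-copy-id node01

    3. Install the required packages and update all software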

    yum -y install vim-enhanced wget curl net-tools conntrack-tools bind-utils socat ipvsadm ipset

    4. Disable the firewall and flush iptables

    systemctl disable firewalld && systemctl stop firewalld

    iptables -F && iptables -X && iptables -Z && iptables -L && systemctl stop iptables && systemctl status iptables

    5. Disable SELinux and swap

    setenforce 0
    sed -i "s/SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
    swapoff -a

    To keep swap disabled after a reboot, comment out the swap entry in /etc/fstab (vim /etc/fstab):

    #/dev/mapper/centos-swap swap swap defaults 0 0
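
    The manual fstab edit above can also be scripted. This is a sketch rather than a tested procedure: the function name comment_out_swap is made up for this example, it assumes GNU sed (the -r and -i options), and it writes a backup next to the file before changing it.

```shell
#!/bin/sh
# comment_out_swap FILE: prefix '#' to every active fstab entry whose
# filesystem type (third field) is swap. A backup is kept as FILE.bak.
# Hypothetical helper; assumes GNU sed.
comment_out_swap() {
    cp "$1" "$1.bak" &&
    sed -r -i 's/^([^#[:space:]][^[:space:]]*[[:space:]]+[^[:space:]]+[[:space:]]+swap[[:space:]].*)$/#\1/' "$1"
}

# Usage on a real node (as root, after `swapoff -a`):
#   comment_out_swap /etc/fstab
```

    Entries that are already commented out are left alone, so the function can be run more than once.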

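    6. Kernel parameter tuning

    Create /etc/sysctl.d/k8s.conf (vim /etc/sysctl.d/k8s.conf) with the following content:

    net.bridge.bridge-nf-call-ip6tables = 1
    net.bridge.bridge-nf-call-iptables = 1
    net.ipv4.ip_forward = 1
    vm.swappiness=0

    Load the br_netfilter module and apply the settings:

    modprobe br_netfilter
    sysctl -p /etc/sysctl.d/k8s.conf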
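
    modprobe only loads br_netfilter until the next reboot, and without the module the net.bridge.* keys do not exist. One possible way to make this step repeatable is sketched below. The function name write_k8s_kernel_config and its ROOT argument are made up for this example, and the modules-load.d file is an addition that the steps above do not include.

```shell
#!/bin/sh
# write_k8s_kernel_config ROOT: write the sysctl settings from this step and
# a modules-load entry for br_netfilter under ROOT (use "/" on a real node).
# Hypothetical helper; nothing is written until the function is called.
write_k8s_kernel_config() {
    root=${1%/}
    mkdir -p "$root/etc/sysctl.d" "$root/etc/modules-load.d" || return 1
    cat > "$root/etc/sysctl.d/k8s.conf" <<'EOF'
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
vm.swappiness = 0
EOF
    # Ask systemd to load br_netfilter at boot so the keys above stay valid.
    echo br_netfilter > "$root/etc/modules-load.d/br_netfilter.conf"
}

# Usage on a real node (as root):
#   write_k8s_kernel_config / && modprobe br_netfilter && sysctl -p /etc/sysctl.d/k8s.conf
```

    Passing a scratch directory instead of "/" lets you inspect the generated files before touching the system.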

    7. Install docker-ce

    Configure the Aliyun yum repository:

    wget -O /etc/yum.repos.d/CentOS7-Aliyun.repo http://mirrors.aliyun.com/repo/Centos-7.repo

    Add the Docker repository:

    cd /etc/yum.repos.d && wget https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo

    List the available docker-ce versions:

    yum list docker-ce.x86_64 --showduplicates | sort -r

    Install a specific Docker version:

    yum install docker-ce-18.06.3.ce-3.el7 -y

    Start Docker:

    systemctl enable docker && systemctl start docker

    Set Docker's cgroup driver. Create /etc/docker/daemon.json (vim /etc/docker/daemon.json) with the following content:

    {
      "registry-mirrors": ["https://5twf62k1.mirror.aliyuncs.com"],
      "exec-opts": ["native.cgroupdriver=systemd"]
    }

    Restart Docker:

    systemctl daemon-reload && systemctl restart docker

    Check the Docker information and confirm the cgroup driver:

    docker info
    docker info | grep Cgroup

    8. Pull the images (master node)

    docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.18.3
    docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.18.3
    docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.18.3
    docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.18.3
    docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.2
    docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.4.3-0
    docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.6.7

    Pull the Calico images:

    docker pull calico/node:v3.13.2
    docker pull calico/cni:v3.13.2
    docker pull calico/pod2daemon-flexvol:v3.13.2
    docker pull calico/kube-controllers:v3.13.2

    Re-tag the local images so that their names match the image names kubeadm expects:

    docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.18.3 k8s.gcr.io/kube-apiserver:v1.18.3
    docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.18.3 k8s.gcr.io/kube-controller-manager:v1.18.3
    docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.18.3 k8s.gcr.io/kube-scheduler:v1.18.3
    docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.18.3 k8s.gcr.io/kube-proxy:v1.18.3
    docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.2 k8s.gcr.io/pause:3.2
    docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.4.3-0 k8s.gcr.io/etcd:3.4.3-0
    docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.6.7 k8s.gcr.io/coredns:1.6.7

    Remove the images that still carry the mirror name:

    docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.18.3
    docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.18.3
    docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.18.3
    docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.18.3
    docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.6.7
    docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.2
    docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.4.3-0

    docker images
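
    The pull, tag and remove commands in this step follow the same pattern for every image, so they can be generated in a loop. The sketch below only prints the docker commands and executes nothing. The variable names and the function print_image_commands are made up for this example; the image list is copied from the commands in this step.

```shell
#!/bin/sh
# Mirror registry and the name kubeadm expects, as used in this step.
MIRROR=registry.cn-hangzhou.aliyuncs.com/google_containers
TARGET=k8s.gcr.io

# Control-plane images for Kubernetes v1.18.3 (one per line).
IMAGES="kube-apiserver:v1.18.3
kube-controller-manager:v1.18.3
kube-scheduler:v1.18.3
kube-proxy:v1.18.3
pause:3.2
etcd:3.4.3-0
coredns:1.6.7"

# Print the pull / tag / rmi commands for each image; nothing is executed.
print_image_commands() {
    for img in $IMAGES; do
        echo "docker pull $MIRROR/$img"
        echo "docker tag $MIRROR/$img $TARGET/$img"
        echo "docker rmi $MIRROR/$img"
    done
}

print_image_commands
```

    Saved under a name of your choice (for example mirror-images.sh, a hypothetical file name), the output can be reviewed first and then executed with: sh mirror-images.sh | sh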

    9. Install Kubernetes

    Install the Kubernetes bootstrap tools (all nodes):

    cat <<EOF > /etc/yum.repos.d/kubernetes.repo
    [kubernetes]
    name=Kubernetes
    baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
    enabled=1
    gpgcheck=1
    repo_gpgcheck=1
    gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
    EOF

    The new repository only takes effect after the yum metadata has been refreshed. Run yum update (you can stop it with Ctrl+C shortly after it starts, or let it run to completion; yum makecache also refreshes the metadata without upgrading any packages), then list the available versions and install:

    yum update
    yum list kubeadm --showduplicates | sort -r
    yum install kubeadm-1.18.3-0 kubelet-1.18.3-0 kubectl-1.18.3-0 -y

    Master node:

    Generate a kubeadm configuration file from the default template:

    kubeadm config print init-defaults > init_default.yaml

    Edit init_default.yaml (vim init_default.yaml) so that it matches the version being deployed. Find the following settings and change them as shown:

    localAPIEndpoint:
      advertiseAddress: 192.168.1.112   # host IP of master01
      bindPort: 6443
    …
    imageRepository: k8s.gcr.io         # image repository
    kind: ClusterConfiguration
    kubernetesVersion: v1.18.3          # k8s version being deployed
    networking:
      dnsDomain: cluster.local

    Run the following command to list the image versions that kubeadm will use with this configuration:

    kubeadm config images list --config init_default.yaml
    k8s.gcr.io/kube-apiserver:v1.18.3
    k8s.gcr.io/kube-controller-manager:v1.18.3
    k8s.gcr.io/kube-scheduler:v1.18.3
    k8s.gcr.io/kube-proxy:v1.18.3
    k8s.gcr.io/pause:3.2
    k8s.gcr.io/etcd:3.4.3-0
    k8s.gcr.io/coredns:1.6.7

    kubeadm initialization

    Run this command on all nodes:

    systemctl enable kubelet.service

    On the master node, run the following command to initialize the cluster with kubeadm:

    kubeadm init --config=init_default.yaml

    [root@master01 ~]# kubeadm init --config=init_default.yaml
    W1024 18:29:24.760362 26930 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
    [init] Using Kubernetes version: v1.18.3
    [preflight] Running pre-flight checks
    [preflight] Pulling images required for setting up a Kubernetes cluster
    [preflight] This might take a minute or two, depending on the speed of your internet connection
    [preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
    [kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
    [kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
    [kubelet-start] Starting the kubelet
    [certs] Using certificateDir folder "/etc/kubernetes/pki"
    [certs] Generating "ca" certificate and key
    [certs] Generating "apiserver" certificate and key
    [certs] apiserver serving cert is signed for DNS names [master01 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.1.112]
    [certs] Generating "apiserver-kubelet-client" certificate and key
    [certs] Generating "front-proxy-ca" certificate and key
    [certs] Generating "front-proxy-client" certificate and key
    [certs] Generating "etcd/ca" certificate and key
    [certs] Generating "etcd/server" certificate and key
    [certs] etcd/server serving cert is signed for DNS names [master01 localhost] and IPs [192.168.1.112 127.0.0.1 ::1]
    [certs] Generating "etcd/peer" certificate and key
    [certs] etcd/peer serving cert is signed for DNS names [master01 localhost] and IPs [192.168.1.112 127.0.0.1 ::1]
    [certs] Generating "etcd/healthcheck-client" certificate and key
    [certs] Generating "apiserver-etcd-client" certificate and key
    [certs] Generating "sa" key and public key
    [kubeconfig] Using kubeconfig folder "/etc/kubernetes"
    [kubeconfig] Writing "admin.conf" kubeconfig file
    [kubeconfig] Writing "kubelet.conf" kubeconfig file
    [kubeconfig] Writing "controller-manager.conf" kubeconfig file
    [kubeconfig] Writing "scheduler.conf" kubeconfig file
    [control-plane] Using manifest folder "/etc/kubernetes/manifests"
    [control-plane] Creating static Pod manifest for "kube-apiserver"
    [control-plane] Creating static Pod manifest for "kube-controller-manager"
    W1024 18:29:33.938015 26930 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
    [control-plane] Creating static Pod manifest for "kube-scheduler"
    W1024 18:29:33.938733 26930 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
    [etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
    [wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
    [apiclient] All control plane components are healthy after 25.511480 seconds
    [upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
    [kubelet] Creating a ConfigMap "kubelet-config-1.18" in namespace kube-system with the configuration for the kubelets in the cluster
    [upload-certs] Skipping phase. Please see --upload-certs
    [mark-control-plane] Marking the node master01 as control-plane by adding the label "node-role.kubernetes.io/master=''"
    [mark-control-plane] Marking the node master01 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
    [bootstrap-token] Using token: abcdef.0123456789abcdef
    [bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
    [bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
    [bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
    [bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
    [bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
    [bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
    [kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
    [addons] Applied essential addon: CoreDNS
    [addons] Applied essential addon: kube-proxy

    Your Kubernetes control-plane has initialized successfully!

    To start using your cluster, you need to run the following as a regular user:

      mkdir -p $HOME/.kube
      sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
      sudo chown $(id -u):$(id -g) $HOME/.kube/config

    You should now deploy a pod network to the cluster.
    Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
      https://kubernetes.io/docs/concepts/cluster-administration/addons/

    Then you can join any number of worker nodes by running the following on each as root:

    kubeadm join 192.168.1.112:6443 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:1c6bd285abde47e50f25e8f0ff94951355b19948955a7bbfb5e6ce77aa3bc1b9

    Save the kubeadm join command from the kubeadm init output; it is needed later to add nodes to the cluster.

    kubeadm join 192.168.1.112:6443 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:1c6bd285abde47e50f25e8f0ff94951355b19948955a7bbfb5e6ce77aa3bc1b9
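
    One way to avoid copying the join command by hand is to save the kubeadm init output to a file and extract the command from it. This is a sketch under assumptions: the function name extract_join_command and the log file name kubeadm-init.log are made up for this example, and it expects the join command in the same two-line, backslash-continued form shown above.

```shell
#!/bin/sh
# extract_join_command: read saved `kubeadm init` output on stdin and print
# the `kubeadm join ...` command, following trailing-backslash continuations.
# Hypothetical helper for this example.
extract_join_command() {
    awk '
        /^[[:space:]]*kubeadm join / { printing = 1 }
        printing {
            print
            # Stop at the first line that does not end with a backslash.
            if ($0 !~ /\\[[:space:]]*$/) exit
        }
    '
}

# Usage on the master (kubeadm-init.log is a hypothetical file name):
#   kubeadm init --config=init_default.yaml 2>&1 | tee kubeadm-init.log
#   extract_join_command < kubeadm-init.log > join.sh
```

    If the output has been lost, or the bootstrap token has expired (tokens are valid for 24 hours by default), running kubeadm token create --print-join-command on the master prints a fresh join command.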

    Run the following commands:

    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config

    Check the current status of the master:

    [root@master01 ~]# kubectl get nodes
    NAME       STATUS     ROLES    AGE     VERSION
    master01   NotReady   master   9m29s   v1.18.3

    Node deployment: this is where the saved kubeadm join command is used. Run the following on the worker node.

    Pull the images:

    docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.18.3
    docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.2
    docker pull calico/node:v3.13.2
    docker pull calico/pod2daemon-flexvol:v3.13.2

    Re-tag the images:

    docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.18.3 k8s.gcr.io/kube-proxy:v1.18.3
    docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.2 k8s.gcr.io/pause:3.2

    Remove the images that still carry the mirror name:

    docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.18.3
    docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.2

    Join the node to the cluster:

    kubeadm join 192.168.1.112:6443 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:1c6bd285abde47e50f25e8f0ff94951355b19948955a7bbfb5e6ce77aa3bc1b9

    Check from the master node:

    [root@master01 ~]# kubectl get nodes
    NAME       STATUS     ROLES    AGE   VERSION
    master01   NotReady   master   11m   v1.18.3
    node01     NotReady   <none>   14s   v1.18.3

    Note that STATUS is NotReady here. This is because the pod network has not been deployed yet; once the network is deployed, the status changes to Ready.

    10. Deploy the Calico network

    Download the official Calico manifest with the following command:

    curl https://docs.projectcalico.org/archive/v3.13/manifests/calico.yaml -O

    Edit the manifest (vim calico.yaml) and adjust the following entries so that the image names in the file match the images pulled locally:

    # It can be deleted if this is a fresh installation, or if you have already
    # upgraded to use calico-ipam.
    - name: upgrade-ipam
      image: calico/cni:v3.13.2                 # must match the pulled image name
    --
    # This container installs the CNI binaries
    # and CNI network config file on each node.
    - name: install-cni
      image: calico/cni:v3.13.2                 # must match the pulled image name
    --
    # Adds a Flex Volume Driver that creates a per-pod Unix Domain Socket to allow Dikastes
    # to communicate with Felix over the Policy Sync API.
    - name: flexvol-driver
      image: calico/pod2daemon-flexvol:v3.13.2  # must match the pulled image name
    --
    # container programs network policy and routes on each
    # host.
    - name: calico-node
      image: calico/node:v3.13.2                # must match the pulled image name
    --
    priorityClassName: system-cluster-critical
    containers:
      - name: calico-kube-controllers
        image: calico/kube-controllers:v3.13.2  # must match the pulled image name

    Deploy it:

    [root@master01 ~]# kubectl create -f calico.yaml
    configmap/calico-config created
    customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
    customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org created
    customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org created
    customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created
    customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
    customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created
    customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created
    customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created
    customresourcedefinition.apiextensions.k8s.io/ipamblocks.crd.projectcalico.org created
    customresourcedefinition.apiextensions.k8s.io/ipamconfigs.crd.projectcalico.org created
    customresourcedefinition.apiextensions.k8s.io/ipamhandles.crd.projectcalico.org created
    customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created
    customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created
    customresourcedefinition.apiextensions.k8s.io/networksets.crd.projectcalico.org created
    clusterrole.rbac.authorization.k8s.io/calico-kube-controllers created
    clusterrolebinding.rbac.authorization.k8s.io/calico-kube-controllers created
    clusterrole.rbac.authorization.k8s.io/calico-node created
    clusterrolebinding.rbac.authorization.k8s.io/calico-node created
    daemonset.apps/calico-node created
    serviceaccount/calico-node created
    deployment.apps/calico-kube-controllers created
    serviceaccount/calico-kube-controllers created

    Check the pods:

    [root@master01 ~]# kubectl get pods --all-namespaces -o wide
    NAMESPACE     NAME                                       READY   STATUS              RESTARTS   AGE   IP               NODE       NOMINATED NODE   READINESS GATES
    kube-system   calico-kube-controllers-555fc8cc5c-vp54b   1/1     Running             0          55s   192.168.241.67   master01   <none>           <none>
    kube-system   calico-node-pbwv8                          1/1     Running             0          55s   192.168.1.112    master01   <none>           <none>
    kube-system   calico-node-xllrv                          0/1     Init:0/3            0          55s   192.168.1.113    node01     <none>           <none>
    kube-system   coredns-66bff467f8-bfkd7                   1/1     Running             0          27m   192.168.241.65   master01   <none>           <none>
    kube-system   coredns-66bff467f8-qc6cw                   1/1     Running             0          27m   192.168.241.66   master01   <none>           <none>
    kube-system   etcd-master01                              1/1     Running             0          27m   192.168.1.112    master01   <none>           <none>
    kube-system   kube-apiserver-master01                    1/1     Running             0          27m   192.168.1.112    master01   <none>           <none>
    kube-system   kube-controller-manager-master01           1/1     Running             0          27m   192.168.1.112    master01   <none>           <none>
    kube-system   kube-proxy-dqtvt                           1/1     Running             0          27m   192.168.1.112    master01   <none>           <none>
    kube-system   kube-proxy-t986s                           0/1     ContainerCreating   0          16m   192.168.1.113    node01     <none>           <none>
    kube-system   kube-scheduler-master01                    1/1     Running             0          27m   192.168.1.112    master01   <none>           <none>

    At this point not every pod shows READY 1/1 and STATUS Running. Wait a few minutes and check again; once every pod shows 1/1 and Running, the deployment has succeeded.

    [root@master01 ~]# kubectl get pods --all-namespaces -o wide
    NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE     IP               NODE       NOMINATED NODE   READINESS GATES
    kube-system   calico-kube-controllers-555fc8cc5c-b6pvw   1/1     Running   0          8m38s   192.168.241.65   master01   <none>           <none>
    kube-system   calico-node-chnqf                          1/1     Running   0          8m38s   192.168.1.112    master01   <none>           <none>
    kube-system   calico-node-s7k9g                          1/1     Running   0          8m38s   192.168.1.113    node01     <none>           <none>
    kube-system   coredns-66bff467f8-bfkd7                   1/1     Running   2          20h     192.168.241.68   master01   <none>           <none>
    kube-system   coredns-66bff467f8-qc6cw                   1/1     Running   2          20h     192.168.241.70   master01   <none>           <none>
    kube-system   etcd-master01                              1/1     Running   4          20h     192.168.1.112    master01   <none>           <none>
    kube-system   kube-apiserver-master01                    1/1     Running   4          20h     192.168.1.112    master01   <none>           <none>
    kube-system   kube-controller-manager-master01           1/1     Running   1          20h     192.168.1.112    master01   <none>           <none>
    kube-system   kube-proxy-dqtvt                           1/1     Running   1          20h     192.168.1.112    master01   <none>           <none>
    kube-system   kube-proxy-t986s                           1/1     Running   0          20h     192.168.1.113    node01     <none>           <none>
    kube-system   kube-scheduler-master01                    1/1     Running   1          20h     192.168.1.112    master01   <none>           <none>
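
    Instead of scanning the table by eye, the pods that are not ready yet can be filtered out of the kubectl output. The following is a sketch; the function name not_ready_pods is made up for this example, and it assumes the column order shown above (NAMESPACE, NAME, READY, STATUS), as produced with --all-namespaces.

```shell
#!/bin/sh
# not_ready_pods: read `kubectl get pods --all-namespaces` output on stdin
# and print "namespace/name" for every pod whose READY counts differ or
# whose STATUS is not Running. Hypothetical helper for this example.
not_ready_pods() {
    awk 'NR > 1 {
        split($3, ready, "/")            # READY column, e.g. "0/1"
        if (ready[1] != ready[2] || $4 != "Running")
            print $1 "/" $2
    }'
}

# Usage on the master:
#   kubectl get pods --all-namespaces | not_ready_pods
```

    Empty output means every pod is ready. Pods of finished Jobs (STATUS Completed) would also be listed, which does not matter for the system pods shown here. kubectl also has a built-in alternative: kubectl wait --for=condition=Ready pods --all -n kube-system --timeout=300s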

    Now check the nodes again; all of them are Ready:

    [root@master01 ~]# kubectl get nodes
    NAME       STATUS   ROLES    AGE   VERSION
    master01   Ready    master   20h   v1.18.3
    node01     Ready    <none>   20h   v1.18.3

    The cluster deployment is now complete. In the next part we will create a small example.

    III. Problems encountered during deployment

    1. A pod stays in ContainerCreating

    kubectl get pods --all-namespaces -o wide shows:

    kube-system kube-proxy-t986s 0/1 ContainerCreating 0 20h 192.168.1.113 node01 <none> <none>

    The status never leaves ContainerCreating. Look at the pod's details:

    kubectl describe pod kube-proxy-t986s -n kube-system

    Warning FailedCreatePodSandBox 20h (x71 over 20h) kubelet, node01 Failed to create pod sandbox: rpc error: code = Unknown desc = failed pulling image "k8s.gcr.io/pause:3.2": Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

    The reference to k8s.gcr.io/pause:3.2 shows that the worker node cannot pull this image. k8s.gcr.io is generally not reachable from this network, so the image has to come from a mirror inside China instead. To fix it, run the following on the worker node:

    docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.18.3
    docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.2
    docker pull calico/node:v3.13.2
    docker pull calico/pod2daemon-flexvol:v3.13.2

    Re-tag the images:

    docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.18.3 k8s.gcr.io/kube-proxy:v1.18.3
    docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.2 k8s.gcr.io/pause:3.2

    Reset the worker node:

    kubeadm reset

    Join it again:

    kubeadm join 192.168.1.112:6443 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:1c6bd285abde47e50f25e8f0ff94951355b19948955a7bbfb5e6ce77aa3bc1b9

    Check again; the status is now Running.
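
    This problem can be caught before joining by checking that the node already has every image it needs under the expected name. The sketch below compares a list of local images with a list of required ones. The function name missing_images is made up for this example; the docker images --format template in the usage comment is standard Docker CLI syntax.

```shell
#!/bin/sh
# missing_images IMAGE...: read the locally available images on stdin
# (one "repository:tag" per line) and print each required IMAGE that is
# absent. Hypothetical helper for this example.
missing_images() {
    have=$(cat)
    for want in "$@"; do
        printf '%s\n' "$have" | grep -qxF -- "$want" || echo "$want"
    done
}

# Usage on the worker node:
#   docker images --format '{{.Repository}}:{{.Tag}}' | missing_images \
#       k8s.gcr.io/kube-proxy:v1.18.3 k8s.gcr.io/pause:3.2 \
#       calico/node:v3.13.2 calico/pod2daemon-flexvol:v3.13.2
```

    Empty output means all listed images are present; any printed name still needs to be pulled and re-tagged as described above.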

    2. Pods stay in Pending

    [root@master01 ~]# kubectl get pods --all-namespaces -o wide
    NAMESPACE     NAME                       READY   STATUS    RESTARTS   AGE   IP       NODE     NOMINATED NODE   READINESS GATES
    kube-system   coredns-66bff467f8-cm85k   0/1     Pending   0          13m   <none>   <none>   <none>           <none>
    kube-system   coredns-66bff467f8-rlzn2   0/1     Pending   0          13m   <none>   <none>   <none>           <none>

    Look at the details:

    kubectl describe pod coredns-66bff467f8-rlzn2 -n kube-system

    QoS Class:       Burstable
    Node-Selectors:  kubernetes.io/os=linux
    Tolerations:     CriticalAddonsOnly
                     node-role.kubernetes.io/master:NoSchedule
                     node.kubernetes.io/not-ready:NoExecute for 300s
                     node.kubernetes.io/unreachable:NoExecute for 300s
    Events:
      Type     Reason            Age                 From               Message
      Warning  FailedScheduling  62s (x13 over 13m)  default-scheduler  0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.

    Fix: allow pods to be scheduled on the master node by removing the taints:

    kubectl taint nodes --all node-role.kubernetes.io/master-
    kubectl taint nodes --all node.kubernetes.io/not-ready-

    Note that the not-ready taint is normally cleared automatically once the network plugin is running and the node becomes Ready, so if it comes back, check the Calico deployment first.

    Then check the pods again:

    kubectl get pods --all-namespaces -o wide
