KubeEdge Setup
Environment Configuration
The simulated network environment is built with mininet.
VM hardware configuration:

| Model | Disk | Memory | CPU |
| --- | --- | --- | --- |
| VirtualBox VM | 60 GB | 2 GB | Intel i5-11320H, 2 cores, 3.2 GHz |
| VirtualBox VM | 60 GB | 2 GB | Intel i5-11320H, 2 cores, 3.2 GHz |
| VirtualBox VM | 60 GB | 2 GB | Intel i5-11320H, 2 cores, 3.2 GHz |
| VirtualBox VM | 60 GB | 2 GB | Intel i5-11320H, 2 cores, 3.2 GHz |
Software configuration: due to compatibility issues, we use Kubernetes 1.23 with KubeEdge 1.12.2.
| Hostname | IP | OS |
| --- | --- | --- |
| master | 10.0.1.200 (192.168.80.134) | Ubuntu 20.04 |
| node1 | 192.168.0.201 (192.168.80.135) | Ubuntu 20.04 |
| node2 | 192.168.0.202 (192.168.80.136) | Ubuntu 20.04 |
| node3 | 192.168.0.203 | Ubuntu 20.04 |
Initialize System Configuration
Set the hostname on every host (adjust the name per node):

```
sudo hostnamectl set-hostname master
reboot
```
Disable the firewall on all hosts:

```
sudo systemctl stop ufw
sudo systemctl disable ufw
```
Disable swap on all hosts:

```
sudo vi /etc/fstab    # comment out the swap entry
sudo swapon -a
sudo swapoff -a
sudo swapon -s        # should print nothing once swap is off
```
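If you'd rather not edit /etc/fstab by hand, a minimal non-interactive sketch (assumes a standard single swap entry whose third field is "swap"):

```
# comment out any fstab line with a swap mount, then turn swap off
sudo sed -i '/\sswap\s/ s/^/#/' /etc/fstab
sudo swapoff -a
```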
Set up time synchronization on all hosts:

```
sudo apt install -y ntpdate
sudo ntpdate time.windows.com
sudo timedatectl set-timezone Asia/Shanghai
```
Add hosts entries on all nodes:

```
sudo vi /etc/hosts

10.0.1.200 master
192.168.0.201 node1
192.168.0.202 node2
192.168.0.203 node3
185.199.108.133 raw.githubusercontent.com
```
Enable IPv4 forwarding:

```
sudo vi /etc/sysctl.conf

# /etc/sysctl.conf:
net.ipv4.ip_forward = 1

sudo sysctl -p /etc/sysctl.conf
```
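The same change can be made without an editor; a small sketch (it appends the line, so skip it if the key is already present):

```
echo 'net.ipv4.ip_forward = 1' | sudo tee -a /etc/sysctl.conf
sudo sysctl -p /etc/sysctl.conf
sysctl net.ipv4.ip_forward    # confirm the live value is 1
```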
Install Docker
We install Ubuntu's docker.io package, which is the easiest route. Remember, Docker must be installed on all 4 nodes:

```
sudo apt install docker.io
```
The official Docker registry is slow to reach, so a domestic Docker Hub mirror can be used to speed up pulls:

```
sudo mkdir -p /etc/docker
sudo tee /etc/docker/daemon.json <<-'EOF'
{
  "registry-mirrors": ["https://knjsrl1b.mirror.aliyuncs.com", "https://docker.hub.com"]
}
EOF
sudo systemctl daemon-reload
sudo systemctl restart docker
```
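To confirm the mirror configuration took effect, one quick check (not part of the original steps) is:

```
sudo docker info | grep -A 2 'Registry Mirrors'
```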
Install Kubernetes on the Hosts
For compatibility we install Kubernetes 1.23.17. Following Alibaba Cloud's tutorial, install the kubelet, kubeadm, and kubectl components from the Aliyun mirror on the cloud host:

```
sudo apt-get update && sudo apt-get install -y apt-transport-https
curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | sudo apt-key add -
sudo vim /etc/apt/sources.list.d/kubernetes.list

# /etc/apt/sources.list.d/kubernetes.list:
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main

sudo apt update
sudo apt install -y kubelet=1.23.17-00 kubeadm=1.23.17-00 kubectl=1.23.17-00
```
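Because the cluster depends on these exact versions, a common precaution (not in the original steps) is to hold the packages so routine upgrades don't bump them:

```
sudo apt-mark hold kubelet kubeadm kubectl
```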
On the cloud master host, create the Kubernetes cluster with kubeadm. We use the Aliyun registry to speed up image pulls; kubeadm installs the Kubernetes version matching its own. If the host's IP is itself a public IP, initialize like this:

```
sudo kubeadm init \
  --apiserver-advertise-address=192.168.132.100 \
  --image-repository registry.aliyuncs.com/google_containers \
  --service-cidr=10.96.0.0/12 \
  --pod-network-cidr=10.244.0.0/16
```
If the cloud server's public IP is not visible on the host itself, let the apiserver listen on all interfaces and add the public IP as an extra SAN allowed by the certificate:

```
sudo kubeadm init \
  --apiserver-advertise-address=0.0.0.0 \
  --apiserver-cert-extra-sans=139.9.72.62 \
  --image-repository registry.aliyuncs.com/google_containers \
  --service-cidr=10.96.0.0/12 \
  --pod-network-cidr=10.244.0.0/16
```
When it finishes, kubeadm prints a set of follow-up instructions for us to run:

```
Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.132.100:6443 --token 20vasa.tus6j1y6edbm6e1i \
    --discovery-token-ca-cert-hash sha256:830a4e14fdecfb8c9eb7143fe44a1abb8bc68956d959b55447b2e7a1d6e61d85
```
Following the prompt, we run these once as the regular user and once as root, so kubectl can reach the local kube-apiserver:

```
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
```
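A quick sanity check that kubectl can now reach the apiserver:

```
kubectl cluster-info
```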
Next we install the CNI network plugin. If the download is too slow, use an IP-lookup site to find an address for raw.githubusercontent.com and add it to the hosts file.

```
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
vi kube-flannel.yml
......
      - key: node-role.kubernetes.io/edge
        operator: DoesNotExist
......
kubectl apply -f kube-flannel.yml
```
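For orientation, the added matchExpressions entry sits under the DaemonSet's existing nodeAffinity in kube-flannel.yml. A sketch of the surrounding context (the kubernetes.io/os term reflects what recent flannel manifests ship with; exact indentation may vary between releases):

```
# kube-flannel.yml, DaemonSet pod template (abridged)
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/os
                operator: In
                values: ["linux"]
              # added: keep flannel off KubeEdge edge nodes
              - key: node-role.kubernetes.io/edge
                operator: DoesNotExist
```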
Run `kubectl get pods -n kube-flannel`; output like the following means the network plugin installed successfully:

```
...
kube-flannel-ds-hgn9l   1/1   Running   0   44m
...
```
To let master also serve as a worker node and run user Pods:

```
kubectl taint node master node-role.kubernetes.io/master-
```
To keep master from running user Pods:

```
kubectl taint node master node-role.kubernetes.io/master=:NoSchedule
```
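Either way, the master's current taints can be checked with:

```
kubectl describe node master | grep -i taint
```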
After a moment, run `kubectl get nodes` on the master host; output like this means the node is up:

```
kubectl get nodes
NAME     STATUS   ROLES                  AGE   VERSION
master   Ready    control-plane,master   13m   v1.22.15
```
Create a pod in the Kubernetes cluster to verify everything runs properly:

```
kubectl create deployment nginx --image=nginx
kubectl expose deployment nginx --port=80 --type=NodePort
kubectl get pod,svc
NAME                         READY   STATUS    RESTARTS   AGE
pod/nginx-6799fc88d8-hf2m9   1/1     Running   0          22s
NAME                 TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
service/kubernetes   ClusterIP   10.96.0.1       <none>        443/TCP        14m
service/nginx        NodePort    10.104.74.138   <none>        80:31332/TCP   8s
```
Nginx is exposed on NodePort 31332, so visiting http://39.106.4.225:31332 (or, in the local VirtualBox setup, the node IP with the corresponding NodePort, e.g. http://192.168.80.128:31017) successfully brings up the nginx welcome page.
Install KubeEdge
Much like Kubernetes, KubeEdge provides a keadm tool for quickly standing up a KubeEdge cluster. Download keadm ahead of time from the KubeEdge GitHub releases page (the steps below show both v1.12.2 and v1.13.1):

```
wget https://github.com/kubeedge/kubeedge/releases/download/v1.13.1/keadm-v1.13.1-linux-amd64.tar.gz
```
Install keadm on every node:

```
tar -xvf keadm-v1.12.2-linux-amd64.tar.gz
sudo mv keadm-v1.12.2-linux-amd64/keadm/keadm /usr/bin/
```
Cloud-Side Installation
Use keadm to install KubeEdge's cloud component, cloudcore. If pulls are slow, the cloudcore image can be fetched in advance:

```
sudo docker pull kubeedge/cloudcore:v1.13.1
```
```
sudo keadm init --advertise-address=192.168.43.118 --profile version=v1.13.1

Kubernetes version verification passed, KubeEdge installation will start...
CLOUDCORE started
=========CHART DETAILS=======
NAME: cloudcore
LAST DEPLOYED: Thu Nov  3 11:05:24 2022
NAMESPACE: kubeedge
STATUS: deployed
REVISION: 1
```
In --advertise-address=xxx.xx.xx.xx, replace xxx.xx.xx.xx with the cloud host's public address. --profile version=v1.13.1 pins the KubeEdge version to install; if left unspecified, keadm automatically downloads the latest version. Note that this command pulls the cloudcore container image from the registry.
We can see the cloudcore Pod and Service are running; cloudcore listens on local ports 10000-10004:

```
kubectl get pod,svc -n kubeedge
NAME                             READY   STATUS    RESTARTS   AGE
pod/cloudcore-5768d46f8d-fqdcn   1/1     Running   0          78s
NAME                TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)                                             AGE
service/cloudcore   ClusterIP   10.99.61.17   <none>        10000/TCP,10001/TCP,10002/TCP,10003/TCP,10004/TCP   78s
```
Get the token that edge devices use to join:

```
sudo keadm gettoken
0825d1d733ec84877374418cc4ecd379501efe7fe1c778e91022367c834a22a6.eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJleHAiOjE2OTM0NDg5NzV9.ws0B17aZrvhSL0mu1FEVElnewTFGrh5MNn_4reBgbNA
```
Edge Node Installation
The image can be pulled ahead of time (in practice, pre-pulling did not help):

```
sudo docker pull kubeedge/installation-package:v1.13.0
```
Join the cluster. keadm installs edgecore along with mosquitto, an MQTT broker, which listens on localhost:1883.

For KubeEdge 1.12:

```
sudo keadm join --cloudcore-ipport=192.168.43.117:10000 --kubeedge-version=1.12.2 --token=f17ab9d16aa9b82249d2242101759e257e44970f58b347e61155ea0c34f836a4.eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJleHAiOjE2NzY4MTc5ODF9.60yCItWyPoNJrIjEBZxNzcQlqTQuiLYkF3Ky9zQ16Ps
```
Since version 1.13 defaults to containerd as the container runtime, the runtime type can be pinned to docker manually. For KubeEdge 1.13:

```
sudo keadm join --cloudcore-ipport=192.168.43.118:10000 --kubeedge-version=1.13.1 --runtimetype=docker --token=0825d1d733ec84877374418cc4ecd379501efe7fe1c778e91022367c834a22a6.eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJleHAiOjE2OTM0NDg5NzV9.ws0B17aZrvhSL0mu1FEVElnewTFGrh5MNn_4reBgbNA
```
--cloudcore-ipport is the cloud master's IP and port as reachable from the edge node; --token is the token generated on the cloud master above.
A successful installation ends like this:

```
......
W1024 13:10:08.370505    4423 validation.go:71] NodeIP is empty , use default ip which can connect to cloud.
I1024 13:10:08.371425    4423 join.go:100] 9. Run EdgeCore daemon
I1024 13:10:08.822777    4423 join.go:317]
I1024 13:10:08.822789    4423 join.go:318] KubeEdge edgecore is running, For logs visit: journalctl -u edgecore.service -xe
```
If `sudo systemctl status edgecore` shows the service failed, inspect the logs with `journalctl -u edgecore.service -xe`.

Check on the master:

```
root@master:~# kubectl get nodes
NAME        STATUS   ROLES                  AGE     VERSION
edgenode1   Ready    agent,edge             10m     v1.22.6-kubeedge-v1.12.0
edgenode2   Ready    agent,edge             5m8s    v1.22.6-kubeedge-v1.12.0
edgenode3   Ready    agent,edge             2s      v1.22.6-kubeedge-v1.12.0
master      Ready    control-plane,master   47m     v1.22.15
```
Configure kubectl logs for Edge Nodes
First deploy an Nginx at the edge:

```
kubectl create deployment nginx --image=nginx -oyaml --dry-run=client > nginx.yaml
vi nginx.yaml
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: node-role.kubernetes.io/edge
                operator: In
                values: [""]
kubectl apply -f nginx.yaml
# (if apply fails to connect, first fix the apiserver port in ~/.kube/config)
kubectl expose deployment nginx --port=80 --type=NodePort
kubectl get pod,svc
```
Nginx is exposed on NodePort 30865, so from any edge node `curl http://node2:30865` successfully returns the nginx welcome page.
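For reference, a sketch of what nginx.yaml looks like after inserting the affinity under spec.template.spec (field names come from the dry-run output; the empty string value matches the bare node-role.kubernetes.io/edge label KubeEdge places on edge nodes):

```
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: node-role.kubernetes.io/edge
                operator: In
                values: [""]
      containers:
      - image: nginx
        name: nginx
```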
A KubeEdge cluster built with defaults does not support viewing edge node logs from the master; you get an error like this:

```
kubectl logs nginx-597c67fd4d-kx44m
Error from server: Get "https://192.168.40.10:10350/containerLogs/default/nginx-597c67fd4d-kx44m/nginx": dial tcp 192.168.40.10:10350: i/o timeout
```
See the official documentation: "Deploying using Keadm | KubeEdge, an open platform for edge computing".
kube-proxy is incompatible with KubeEdge by default, so we remove kube-proxy from the edge side:

```
kubectl edit daemonsets.apps -n kube-system kube-proxy
```
We edit kube-proxy's node affinity so that it never lands on edge nodes:

```
apiVersion: apps/v1
kind: DaemonSet
metadata:
  annotations:
    deprecated.daemonset.template.generation: "1"
  creationTimestamp: "2022-06-20T12:43:20Z"
  generation: 1
  labels:
    k8s-app: kube-proxy
  name: kube-proxy
  namespace: kube-system
  resourceVersion: "92283"
  uid: 39dd85f5-8d7f-47ff-83b4-59df66de7803
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: kube-proxy
  template:
    metadata:
      creationTimestamp: null
      labels:
        k8s-app: kube-proxy
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: node-role.kubernetes.io/edge
                operator: DoesNotExist
      containers:
      - command:
        - /usr/local/bin/kube-proxy
        - --config=/var/lib/kube-proxy/config.conf
        - --hostname-override=$(NODE_NAME)
        env:
......
```
Now, listing the system pods on the cloud master shows the edge kube-proxy pods are gone:

```
kubectl get pod -n kube-system -owide
```
flannel does not support the edge environment and cannot run there (EdgeMesh will take over shortly; see issue #2287), so modify flannel's affinity the same way to keep it off the edge:

```
kubectl delete daemonset.apps/kube-flannel-ds -nkube-flannel
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
vi kube-flannel.yml
......
      - key: node-role.kubernetes.io/edge
        operator: DoesNotExist
......
kubectl apply -f kube-flannel.yml
kubectl get all -nkube-flannel -owide
```
Verify that Kubernetes' ca.crt and ca.key files both exist:

```
ls /etc/kubernetes/pki
apiserver.crt               apiserver.key                 ca.crt   front-proxy-ca.crt      front-proxy-client.key
apiserver-etcd-client.crt   apiserver-kubelet-client.crt  ca.key   front-proxy-ca.key      sa.key
apiserver-etcd-client.key   apiserver-kubelet-client.key  etcd     front-proxy-client.crt  sa.pub
```
On the cloud node, generate certificates for CloudStream; certgen.sh ships in the KubeEdge source tree:

```
wget https://github.com/kubeedge/kubeedge/archive/refs/tags/v1.13.1.tar.gz
tar -xf v1.13.1.tar.gz
mkdir /etc/kubeedge
cp kubeedge-1.13.1/build/tools/certgen.sh /etc/kubeedge/
cd /etc/kubeedge/
sudo su
export CLOUDCOREIPS="192.168.43.118"
bash /etc/kubeedge/certgen.sh stream
```
On the master, set an iptables rule that redirects every packet bound for the edge edgecore's port 10350 to cloudcore, so the traffic is relayed to edgecore over the stream tunnel:

```
sudo iptables -t nat -A OUTPUT -p tcp --dport 10350 -j DNAT --to 10.0.1.200:10003
```
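To inspect the rule later (or remove it by its line number):

```
sudo iptables -t nat -L OUTPUT -n --line-numbers
# sudo iptables -t nat -D OUTPUT <num>
```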
Edit the edge-side edgecore configuration file /etc/kubeedge/config/edgecore.yaml:

```
sudo gedit /etc/kubeedge/config/edgecore.yaml
```

```
edgeStream:
  enable: true
  handshakeTimeout: 30
  readDeadline: 15
  server: 101.201.181.239:10004
  tlsTunnelCAFile: /etc/kubeedge/ca/rootCA.crt
  tlsTunnelCertFile: /etc/kubeedge/certs/server.crt
  tlsTunnelPrivateKeyFile: /etc/kubeedge/certs/server.key
  writeDeadline: 15
```
Restart edgecore:

```
sudo systemctl restart edgecore
```
The edge edgecore then starts up normally:

```
sudo systemctl start edgecore
sudo systemctl status edgecore
● edgecore.service
     Loaded: loaded (/etc/systemd/system/edgecore.service; enabled; vendor preset: enabled)
     Active: active (running) since Thu 2022-07-07 16:53:51 CST; 4s ago
   Main PID: 58760 (edgecore)
      Tasks: 10 (limit: 992)
     Memory: 36.5M
        CPU: 315ms
     CGroup: /system.slice/edgecore.service
             └─58760 /usr/local/bin/edgecore
```
Edge logs can now be viewed from the cloud:

```
kubectl logs nginx-597c67fd4d-hwmdz
/docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
/docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
......
```
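Since kubectl exec is served by the same edgecore port (10350) that the DNAT rule redirects into the CloudStream tunnel, exec into edge pods should work as well; a quick check using the pod above:

```
kubectl exec -it nginx-597c67fd4d-hwmdz -- /bin/bash
```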
Install EdgeMesh
With kube-proxy gone from the edge, the edge needs a network proxy plugin; we use KubeEdge's official EdgeMesh. EdgeMesh is roughly kube-proxy + flannel + CoreDNS rolled into one. EdgeMesh has since been split out of KubeEdge and has its own documentation site (EdgeMesh). Per the docs, EdgeMesh is a standalone Kubernetes network proxy that does not depend on KubeEdge; installing it with Helm on the master node is all that's needed.
Remove the taint on the K8s master node (this step can be skipped if no applications that need proxying run on the master):

```
kubectl taint nodes --all node-role.kubernetes.io/master-
```
Normally you do not want EdgeMesh to proxy the Kubernetes API service, so add the filter label to it; see "Service Filtering" in the docs for details:

```
kubectl label services kubernetes service.edgemesh.kubeedge.io/service-proxy-name=""
```
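Verify the label landed on the Service:

```
kubectl get service kubernetes --show-labels
```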
Enable KubeEdge's Edge Kube-API Endpoint
On the cloud side, enable the dynamicController module; after the change, restart cloudcore (here, by deleting its pod):

```
kubectl edit cm cloudcore -n kubeedge

modules:
  ...
  dynamicController:
    enable: true
  ...

kubectl get all -nkubeedge
kubectl delete -nkubeedge pod/cloudcore-6687684d4d-92cvz
```
On the edge nodes, enable the metaServer module, then restart edgecore:

```
sudo gedit /etc/kubeedge/config/edgecore.yaml

modules:
  ...
  metaManager:
    metaServer:
      enable: true
  ...

sudo systemctl restart edgecore
```
On the edge nodes, configure clusterDNS and clusterDomain, then restart edgecore. (This is so edge applications can reach EdgeMesh's DNS service; it is unrelated to the edge Kube-API endpoint itself but is covered here to keep the configuration flow together. The clusterDNS value 169.254.96.16 comes from the default of bridgeDeviceIP in commonConfig; normally there is no need to change it, and if you must, keep the two consistent.)

```
sudo gedit /etc/kubeedge/config/edgecore.yaml

modules:
  ...
  edged:
    ...
    tailoredKubeletConfig:
      ...
      clusterDNS:
      - 169.254.96.16
      clusterDomain: cluster.local
  ...

sudo systemctl restart edgecore
```
Finally, on an edge node, test that the edge Kube-API endpoint works:

```
curl 127.0.0.1:10550/api/v1/services
{"apiVersion":"v1","items":[{"api......
```
Install Helm 3 on the host:

```
wget https://get.helm.sh/helm-v3.11.0-linux-amd64.tar.gz
tar -xf helm-v3.11.0-linux-amd64.tar.gz
sudo mv linux-amd64/helm /usr/local/bin/
```
Add a Helm repo; here we use the Microsoft Azure China mirror:

```
helm repo add stable http://mirror.azure.cn/kubernetes/charts
helm repo update
```
Generate a PSK:

```
openssl rand -base64 32
JDhvPrqj/mA/2zA4P9voxqQIR8ectRzY8pDKaD+vlHo=
```
Install EdgeMesh with Helm. With just the master as relay node, the advertised address is the cloud host's public IP:

```
helm install edgemesh --namespace kubeedge \
  --set agent.psk=JDhvPrqj/mA/2zA4P9voxqQIR8ectRzY8pDKaD+vlHo= \
  --set agent.relayNodes[0].nodeName=master,agent.relayNodes[0].advertiseAddress="{192.168.80.128}" \
  https://raw.githubusercontent.com/kubeedge/edgemesh/main/build/helm/edgemesh.tgz
```
With multiple relay nodes: for services to stay reachable inside an edge LAN, a relay node should be set up within that LAN (in practice this seems unnecessary):

```
helm install edgemesh --namespace kubeedge \
  --set agent.psk=udj41ZTdaQNb0gUaS64QuLgkFNTYy9dlXKg6bvQYuls= \
  --set agent.relayNodes[0].nodeName=master,agent.relayNodes[0].advertiseAddress="{101.201.181.239}" \
  --set agent.relayNodes[1].nodeName=edgenode2,agent.relayNodes[1].advertiseAddress="{192.168.56.11}" \
  --set agent.relayNodes[2].nodeName=edgenode3,agent.relayNodes[2].advertiseAddress="{192.168.56.12}" \
  https://raw.githubusercontent.com/kubeedge/edgemesh/main/build/helm/edgemesh.tgz
```
To uninstall EdgeMesh:

```
helm uninstall edgemesh -n kubeedge
```
Verify the deployment:

```
helm ls -A
NAME       NAMESPACE   REVISION   UPDATED                                   STATUS     CHART            APP VERSION
edgemesh   kubeedge    1          2022-10-08 22:36:18.261721438 +0800 CST   deployed   edgemesh-0.1.0   latest
```
Check again:

```
kubectl get all -n kubeedge -o wide
NAME                             READY   STATUS              RESTARTS      AGE    IP              NODE          NOMINATED NODE   READINESS GATES
pod/cloudcore-5768d46f8d-t8dnn   1/1     Running             1 (50m ago)   159m   172.28.40.134   master        <none>           <none>
pod/edgemesh-agent-4f5xt         0/1     ContainerCreating   0             4m8s   192.168.40.10   area1-node1   <none>           <none>
pod/edgemesh-agent-6czts         0/1     CrashLoopBackOff    5 (27s ago)   4m8s   172.28.40.134   master        <none>           <none>
pod/edgemesh-agent-krvsc         0/1     Pending             0             4m8s   <none>          area2-node1   <none>           <none>
pod/edgemesh-agent-p22b5         0/1     Pending             0             4m8s   <none>          area2-node2   <none>           <none>
pod/edgemesh-agent-tzq7g         0/1     Pending             0             4m8s   <none>          area1-node2   <none>           <none>
......
```
The edgemesh-agent on the cloud host crash-loops while the edge agents sit in image pulls, container creation, and Pending; it is worth checking whether a stale edgemesh image is the cause. After waiting overnight, the edgemesh image finally finished pulling and the agents came up at the edge.
Because a Helm-deployed EdgeMesh does not come back after a reboot, we deploy EdgeMesh manually instead:

```
git clone https://github.com/kubeedge/edgemesh.git
cd edgemesh
kubectl apply -f build/crds/istio/
kubectl apply -f build/agent/resources/
kubectl get all -n kubeedge -o wide
```
Performance Testing
Check the listening ports:

```
sudo lsof -i -P -n | grep LISTEN
sudo netstat -tulpn | grep LISTEN
```
Wireshark display filters used while analyzing the traffic:

```
ip.addr == 192.168.80.128

arp or mdns or ssdp
or (ip.src == 192.168.0.0/24 and ip.dst == 192.168.0.0/24)
```
Health Check

```
#!/bin/bash
# poll the scheduler endpoint until the reported power value is no longer 0.000
while true
do
    res=$(curl http://10.96.0.28:8091/scheduler | grep 0.000)
    if [ -z "$res" ]; then
        echo "got a valid power value"
        echo -e "\a"    # ring the terminal bell
        break
    fi
    sleep 1
done
```

```
curl http://10.96.0.28:8091/scheduler
time ./test.sh
```
Measuring Offline-Detection Accuracy
Deploy nginx applications so they are spread evenly across the nodes.

Cut the cloud-edge connection of a node running applications and watch whether the control plane migrates that node's Pods:

```
# append a rule dropping the node's cloud-bound traffic (severs the cloud-edge link)
sudo iptables -A FORWARD -i enp0s8 -s 192.168.56.10 -o enp0s3 -j DROP

# appending ACCEPT does not restore traffic while the DROP above still matches first;
# delete the DROP rule instead:
sudo iptables -A FORWARD -i enp0s8 -s 192.168.56.10 -o enp0s3 -j ACCEPT

sudo iptables -L -n --line-numbers
sudo iptables -D FORWARD 1
```
Power off a node running applications and watch whether the control plane migrates that node's Pods.

Reconnection Sync Performance
After the edge loses its network, wait for the node to go NotReady and be tainted, then record kube-apiserver's traffic with nethogs; restore the edge network, stop recording 10 seconds after the node turns Ready. The sum of the per-second traffic is the reconnection-sync data volume.

```
sudo apt install nethogs
sudo nethogs -b | grep kube-apiserver > mon.txt
```
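Each line nethogs emits in trace mode ends with the sent and received rates (KB/s) for that refresh interval (default 1 s), so a rough total for the reconnection burst can be computed from mon.txt; a sketch assuming that field layout:

```
# sum the last two fields (sent KB/s, received KB/s) across all samples
awk '{sent += $(NF-1); recv += $NF}
     END {printf "sent: %.1f KB, received: %.1f KB\n", sent, recv}' mon.txt
```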
Fix for Ubuntu 20.04 LTS hanging at boot on "A start job is running for Wait for Network to be Configured":

```
sudo vi /etc/systemd/system/network-online.target.wants/systemd-networkd-wait-online.service

[Service]
Type=oneshot
ExecStart=/lib/systemd/systemd-networkd-wait-online
RemainAfterExit=yes
TimeoutStartSec=2sec
```
Raise the Pod limit:

```
sudo vi /etc/kubeedge/config/edgecore.yaml
...
maxPods: 500

sudo vi /var/lib/kubelet/config.yaml
...
maxPods: 500

sudo systemctl restart kubelet
sudo systemctl restart edgecore

kubectl describe node node1 | grep -i "Capacity\|Allocatable" -A 6
```
Install Prometheus