I'm trying to set up a small cluster with 4 worker nodes. I just installed k3s on my Raspberry Pi 4s (8 GB) and I'm getting a NotReady status. I'm new to Kubernetes/k3s, but I believed that with a totally fresh install, things should 'just work'. Each Pi has a fresh wipe and install of Ubuntu 22.04 Server for 64-bit ARM. Since the terminal output is so long, I have a pastebin here. It looks like the pods on the master are failing to mount volumes and failing to create a pod sandbox. I'm also having apiserver issues, which I think are related to these mounting and sandbox errors, since after several tries the apiserver will eventually respond. So I guess my question is: what is going on? Can anyone help make sense of this? Why is my master node struggling to mount volumes? How do I even begin to fix this?
zeus@atlas00:~$ kubectl get nodes
NAME      STATUS     ROLES                  AGE     VERSION
atlas04   NotReady   <none>                 7h32m   v1.23.6+k3s1
atlas08   NotReady   <none>                 7h36m   v1.23.6+k3s1
atlas06   NotReady   <none>                 7h36m   v1.23.6+k3s1
atlas02   Ready      <none>                 7h32m   v1.23.6+k3s1
atlas00   NotReady   control-plane,master   8h      v1.23.6+k3s1
zeus@atlas00:~$ kubectl get pods -n kube-system -o wide
NAME                                      READY   STATUS              RESTARTS   AGE   IP       NODE      NOMINATED NODE   READINESS GATES
helm-install-traefik-qzxlm                0/1     ContainerCreating   0          8h    <none>   atlas00   <none>           <none>
local-path-provisioner-6c79684f77-bb9bn   0/1     Pending             0          8h    <none>   <none>    <none>           <none>
helm-install-traefik-crd-tg52k            0/1     ContainerCreating   0          8h    <none>   atlas00   <none>           <none>
metrics-server-7cd5fcb6b7-qz88k           0/1     Pending             0          8h    <none>   <none>    <none>           <none>
coredns-d76bd69b-9dzpc                    0/1     ContainerCreating   0          8h    <none>   atlas00   <none>           <none>
zeus@atlas00:~$ kubectl describe pod helm-install-traefik-qzxlm -n kube-system
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
zeus@atlas00:~$ kubectl describe pod helm-install-traefik-qzxlm -n kube-system
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
zeus@atlas00:~$ kubectl describe pod helm-install-traefik-qzxlm -n kube-system
Error from server (InternalError): an error on the server ("apiserver not ready") has prevented the request from succeeding (get pods helm-install-traefik-qzxlm)
zeus@atlas00:~$ kubectl describe pod helm-install-traefik-qzxlm -n kube-system
Name: helm-install-traefik-qzxlm
Namespace: kube-system
Priority: 0
Node: atlas00/192.168.1.50
Start Time: Tue, 24 May 2022 08:07:56 +0000
Labels: controller-uid=1f431fba-cb3a-45cc-880a-5be734db988e
helmcharts.helm.cattle.io/chart=traefik
job-name=helm-install-traefik
Annotations: helmcharts.helm.cattle.io/configHash: SHA256=8BE6F0CEB108C2A3A1EC5A8F7591596C00670380ACEA294775E4769C94AEE7A2
Status: Pending
IP:
IPs: <none>
Controlled By: Job/helm-install-traefik
Containers:
helm:
Container ID:
Image: rancher/klipper-helm:v0.7.1-build20220407
Image ID:
Port: <none>
Host Port: <none>
Args:
install
--set-string
global.systemDefaultRegistry=
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Environment:
NAME: traefik
VERSION:
REPO:
HELM_DRIVER: secret
CHART_NAMESPACE: kube-system
CHART: https://%{KUBERNETES_API}%/static/charts/traefik-10.19.300.tgz
HELM_VERSION:
TARGET_NAMESPACE: kube-system
NO_PROXY: .svc,.cluster.local,10.42.0.0/16,10.43.0.0/16
FAILURE_POLICY: reinstall
Mounts:
/chart from content (rw)
/config from values (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-f9qlx (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
values:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: chart-values-traefik
Optional: false
content:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: chart-content-traefik
Optional: false
kube-api-access-f9qlx:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: kubernetes.io/os=linux
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 8h default-scheduler Successfully assigned kube-system/helm-install-traefik-qzxlm to atlas00
Warning FailedMount 8h kubelet MountVolume.SetUp failed for volume "content" : failed to sync configmap cache: timed out waiting for the condition
Warning FailedMount 8h kubelet MountVolume.SetUp failed for volume "kube-api-access-f9qlx" : failed to fetch token: serviceaccounts "helm-traefik" is forbidden: User "system:node:atlas00" cannot create resource "serviceaccounts/token" in API group "" in the namespace "kube-system": no relationship found between node 'atlas00' and this object
Warning FailedMount 8h (x2 over 8h) kubelet MountVolume.SetUp failed for volume "values" : failed to sync configmap cache: timed out waiting for the condition
Warning FailedMount 7h53m kubelet MountVolume.SetUp failed for volume "values" : failed to sync configmap cache: timed out waiting for the condition
Warning FailedMount 7h53m kubelet MountVolume.SetUp failed for volume "kube-api-access-f9qlx" : failed to sync configmap cache: timed out waiting for the condition
Warning FailedMount 7h53m (x2 over 7h53m) kubelet MountVolume.SetUp failed for volume "content" : failed to sync configmap cache: timed out waiting for the condition
Warning FailedMount 7h52m kubelet MountVolume.SetUp failed for volume "values" : failed to sync configmap cache: timed out waiting for the condition
Warning FailedMount 7h52m (x2 over 7h52m) kubelet MountVolume.SetUp failed for volume "content" : failed to sync configmap cache: timed out waiting for the condition
Warning FailedMount 7h52m kubelet MountVolume.SetUp failed for volume "kube-api-access-f9qlx" : failed to fetch token: serviceaccounts "helm-traefik" is forbidden: User "system:node:atlas00" cannot create resource "serviceaccounts/token" in API group "" in the namespace "kube-system": no relationship found between node 'atlas00' and this object
Warning FailedMount 7h41m kubelet MountVolume.SetUp failed for volume "content" : failed to sync configmap cache: timed out waiting for the condition
Warning FailedMount 7h41m kubelet MountVolume.SetUp failed for volume "values" : failed to sync configmap cache: timed out waiting for the condition
Warning FailedMount 7h41m kubelet MountVolume.SetUp failed for volume "kube-api-access-f9qlx" : failed to fetch token: serviceaccounts "helm-traefik" is forbidden: User "system:node:atlas00" cannot create resource "serviceaccounts/token" in API group "" in the namespace "kube-system": no relationship found between node 'atlas00' and this object
Warning FailedCreatePodSandBox 7h40m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to get sandbox image "rancher/mirrored-pause:3.6": failed to pull image "rancher/mirrored-pause:3.6": failed to pull and unpack image "docker.io/rancher/mirrored-pause:3.6": failed to prepare extraction snapshot "extract-476722526-09RL sha256:c640e628658788773e4478ae837822c9bc7db5b512442f54286a98ad50f88fd4": failed to rename: rename /var/lib/rancher/k3s/agent/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/new-2732139020 /var/lib/rancher/k3s/agent/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/4: file exists
Warning FailedCreatePodSandBox 6h54m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "b9f0346aa924105c7c3498ecb6315c32e13d4237eaa062cea2926401ba1c0ab6": plugin type="flannel" failed (add): open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 6h42m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "41b66aa473ffaee3ae32567c0ff2fe233f35569ea15b3301cfab127e92efce69": plugin type="flannel" failed (add): open /run/flannel/subnet.env: no such file or directory
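
The last two events show flannel failing because /run/flannel/subnet.env does not exist. Below is a rough sketch of checks that may help narrow this down on each node. It rests on assumptions that are not confirmed for this cluster: that the memory cgroup flags (cgroup_memory=1 cgroup_enable=memory) the k3s documentation asks for on Raspberry Pi are the relevant ones, and that on Ubuntu for Raspberry Pi the vxlan kernel module used by flannel comes from the linux-modules-extra-raspi package. The function names has_memory_cgroup_flags and run_checks are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the given kernel command line
# contains both memory cgroup flags as separate words.
has_memory_cgroup_flags() {
    case " $1 " in
        *" cgroup_memory=1 "*) ;;
        *) return 1 ;;
    esac
    case " $1 " in
        *" cgroup_enable=memory "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: prints one line per check. Run it on each node.
run_checks() {
    # 1. Memory cgroup flags on the running kernel's command line.
    if has_memory_cgroup_flags "$(cat /proc/cmdline)"; then
        echo "cgroup flags: present"
    else
        echo "cgroup flags: missing (assumed fix: add them to /boot/firmware/cmdline.txt and reboot)"
    fi

    # 2. vxlan kernel module, which flannel's default backend relies on.
    if modinfo vxlan >/dev/null 2>&1; then
        echo "vxlan module: found"
    else
        echo "vxlan module: not found (assumed fix: sudo apt install linux-modules-extra-raspi)"
    fi

    # 3. The file the flannel CNI plugin complained about in the events.
    if [ -f /run/flannel/subnet.env ]; then
        echo "flannel subnet.env: present"
    else
        echo "flannel subnet.env: missing (server logs: sudo journalctl -u k3s)"
    fi
}
```

After sourcing the file, calling run_checks on a node prints three status lines. The block only defines functions, so nothing touches the system until run_checks is called explicitly.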