Split Brain: Decommissioning a control-plane node
Categories: tech
Tags: kubernetes administration
As part of moving to the next generation of my Kubernetes infrastructure, I was replacing my Raspberry Pi 4 named blueberry with an N100 box named Gagana. A while ago I completed my initial setup of Gagana as a control-plane node via an Ansible playbook. Nothing exploded running 4 control-plane nodes, so I had been deferring upgrades, especially since the documentation is not particularly clear on how to approach this. Turns out it was easier than I thought. However, it did expose hidden problems, misconfigurations, and other issues with my setup, as one might expect.
TLDR: Demoting a node from control-plane to worker
Caveat: I've done this once and the process will probably change. The cluster is built using kubeadm with stacked etcd nodes on Kubernetes 1.28.
1.) Update DNS entries for kube-api. On my setup I use my pfSense router as a backup DNS server alongside several unbound instances. I needed to ensure the A records pointing at the old device were replaced with entries for the new one. I confirmed with dig against each of these services to verify they were updated.
2.) Flush DNS caches on the nodes so the change is picked up across the cluster. nslookup is a great way to ensure the cache has been flushed.
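How to flush depends on what each node runs as a resolver; a sketch assuming either systemd-resolved or a local unbound instance, with the same placeholder hostname as above:
# systemd-resolved nodes
resolvectl flush-caches
# nodes running a local unbound instance (requires remote-control to be enabled)
unbound-control flush kube-api.example.internal
# verify the new address is returned
nslookup kube-api.example.internal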
3.) Restart kubelet on all nodes within the cluster. This might be optional, but it appeared some kubelet instances had cached kube-api values. For the fastest recovery, drain the node first, as kubelet performs recovery operations on start.
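A minimal sketch of the restart on each node, assuming kubelet runs under systemd:
# kubelet re-resolves kube-api on startup
sudo systemctl restart kubelet
sudo systemctl status kubelet --no-pager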
4.) Remove the control-plane label. This is a fairly straightforward operation, informing the rest of the cluster and removing any pods you have targeting the control-plane.
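A sketch of removing the label from the retiring node, using blueberry as the node being demoted; the taint line is an assumption about your setup and only applies if the node was tainted for control-plane-only workloads:
# the trailing "-" removes the label
kubectl label node blueberry node-role.kubernetes.io/control-plane-
# drop the matching taint if it was set
kubectl taint node blueberry node-role.kubernetes.io/control-plane:NoSchedule-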
5.) On the node run kubeadm reset phase remove-etcd-member. This informs the other etcd nodes that the stacked etcd instance is no longer part of the cluster. Andrei Kvapil has a great article on accessing the etcd cluster from a host running an instance. Effectively it boils down to:
CONTAINER_ID=$(crictl ps -a --label io.kubernetes.container.name=etcd --label io.kubernetes.pod.namespace=kube-system | awk 'NR>1{r=$1} $0~/Running/{exit} END{print r}')
alias etcdctl='crictl exec "$CONTAINER_ID" etcdctl --cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key --cacert /etc/kubernetes/pki/etcd/ca.crt'
etcdctl member list -w table
This will output something like the following; these are the expected remaining members.
+------------------+---------+--------+---------------------------+---------------------------+------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS | IS LEARNER |
+------------------+---------+--------+---------------------------+---------------------------+------------+
| a974226525b6568 | started | molly | https://192.168.0.33:2380 | https://192.168.0.33:2379 | false |
| 25189c8d4b4d81cb | started | kal | https://192.168.0.32:2380 | https://192.168.0.32:2379 | false |
| 3c23e3b6b13c8e8f | started | gagana | https://192.168.0.34:2380 | https://192.168.0.34:2379 | false |
+------------------+---------+--------+---------------------------+---------------------------+------------+
If your retiring node still appears in the member list, you may need to instruct etcd to remove it manually, as sketched below.
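Using the same etcdctl alias from above, the manual removal looks roughly like this; the ID is whatever member list reports for the retiring node:
# remove the stale member by its ID, then confirm it is gone
etcdctl member remove <member-id>
etcdctl member list -w table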
6.) Drain the node. Probably not required, but I felt more comfortable doing it in case I had to reset the node.
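A sketch of the drain, again using blueberry as the retiring node:
# evict workloads, skipping DaemonSets and local scratch volumes
kubectl drain blueberry --ignore-daemonsets --delete-emptydir-data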
7.) Under /etc/kubernetes/manifests remove the static pod manifests for the following components by deleting the files. kubelet watches this directory and will restart or remove the corresponding static pods when the files change:
etcd.yaml
kube-controller-manager.yaml
kube-apiserver.yaml
kube-scheduler.yaml
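On the node itself that is just deleting (or, if you are cautious, moving aside) those files:
# kubelet tears down the static pods once the manifests are gone
sudo rm /etc/kubernetes/manifests/etcd.yaml \
        /etc/kubernetes/manifests/kube-apiserver.yaml \
        /etc/kubernetes/manifests/kube-controller-manager.yaml \
        /etc/kubernetes/manifests/kube-scheduler.yaml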
8.) Restart kubelet on the decommissioned node. Everything should just work.
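A quick sketch of the restart plus a sanity check that the node now reports as a plain worker:
sudo systemctl restart kubelet
# from a machine with kubectl access; the ROLES column should no longer list control-plane
kubectl get nodes -o wide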
Revealing other problems
As one would expect, this change revealed other problems within my Kubernetes configuration. In addition to a number of core packages which need additional updates, I also realized I still use nfs for two different storage classes. This resulted in some pods being knocked offline, as several of my nodes did not have the NFS client tools installed. I've adopted the label dfs.meschbach.com/nfs, setting it to true when the tools are installed.
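Applying the label is straightforward; a sketch, where strawberry is a stand-in name for a node that actually has the NFS client tools installed:
kubectl label node strawberry dfs.meschbach.com/nfs=true
# list which nodes advertise NFS support
kubectl get nodes -l dfs.meschbach.com/nfs=true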