By default, a three-node MicroK8s cluster automatically becomes highly available (HA). In HA mode the default datastore (dqlite) implements a Raft-based protocol where an elected leader holds the definitive copy of the database. Under normal operation, copies of the database are maintained by two more nodes. If you permanently lose the majority of the cluster members that serve as database nodes (for example, if you lose two nodes of a three-node cluster), the cluster becomes unavailable. However, if at least one database node has survived, you will be able to recover the cluster with the following manual steps.
Note: the following recovery process applies only to clusters using the default (dqlite) datastore of MicroK8s. This process does not recover any data you have in PVs on the lost nodes.
Stop dqlite on all nodes
Stopping MicroK8s is done with:
microk8s stop
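Run this on every node that is still reachable. A minimal sketch for doing it over SSH, where the user and address are placeholders:
# repeat for each surviving node that formed the cluster
ssh ubuntu@10.211.205.253 'microk8s stop'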
You must also make sure that the lost nodes which used to form the cluster cannot come back online. Any lost nodes that are later reinstated will have to re-join the cluster through the usual microk8s add-node and microk8s join process (see the documentation on clusters).
Back up the database
Dqlite stores data and configuration files under /var/snap/microk8s/current/var/kubernetes/backend/. To make a safe copy of the current state, log in to a surviving node and create a tarball of the dqlite directory:
tar -cvf backup.tar /var/snap/microk8s/current/var/kubernetes/backend
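As an extra precaution you may also copy the tarball off the node entirely; a sketch assuming SSH access to some other machine (the user, host and path are placeholders):
scp backup.tar ubuntu@backup-host:/srv/backups/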
Set the new state of the database cluster
Under /var/snap/microk8s/current/var/kubernetes/backend the file cluster.yaml reflects the state of the cluster as dqlite sees it. Edit this file to remove the lost nodes, leaving only the ones still available. For example, let’s assume a three-node cluster with nodes 10.211.205.122, 10.211.205.253 and 10.211.205.221, where 10.211.205.122 and 10.211.205.221 are lost. In this case the cluster.yaml will look like this:
- Address: 10.211.205.122:19001
  ID: 3297041220608546238
  Role: 0
- Address: 10.211.205.253:19001
  ID: 9373968242441247628
  Role: 0
- Address: 10.211.205.221:19001
  ID: 3349965773726029294
  Role: 0
After removing the lost nodes 10.211.205.122 and 10.211.205.221, the cluster.yaml should be left with only the 10.211.205.253 entry. In this example we convert to a single-node cluster; you may choose to include more nodes in the new cluster.
- Address: 10.211.205.253:19001
  ID: 9373968242441247628
  Role: 0
Reconfigure dqlite
MicroK8s comes with a dqlite client utility for node reconfiguration.
The command to run is:
sudo /snap/microk8s/current/bin/dqlite \
-s 127.0.0.1:19001 \
-c /var/snap/microk8s/current/var/kubernetes/backend/cluster.crt \
-k /var/snap/microk8s/current/var/kubernetes/backend/cluster.key \
k8s ".reconfigure /var/snap/microk8s/current/var/kubernetes/backend/ /var/snap/microk8s/current/var/kubernetes/backend/cluster.yaml"
The /snap/microk8s/current/bin/dqlite utility needs to be called with sudo and takes the following arguments:
- the endpoint of the (now stopped) dqlite service; we used -s 127.0.0.1:19001 for this endpoint in the example above
- the certificate and private key needed to access the database, passed with the -c and -k arguments respectively; both are found in the directory where dqlite keeps the database
- the name of the database; for MicroK8s the database is k8s
- the operation to be performed, in this case ".reconfigure"
- the path to the database we want to reconfigure, i.e. the current database under /var/snap/microk8s/current/var/kubernetes/backend
- the end cluster configuration we want to recreate, as reflected in the cluster.yaml we edited in the previous step
Update the rest of the cluster nodes
Copy the cluster.yaml, snapshot-abc-abc-abc, snapshot-abc-abc-abc.meta and segment files (00000abcxx-00000abcxx) from the node where you ran the reconfigure command in the previous step to all other nodes mentioned in the cluster.yaml file.
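A minimal sketch of the copy, assuming SSH access as root and a second surviving node at 10.211.205.254 (the address is a placeholder, and the exact snapshot and segment file names vary per cluster, so check them against your own directory listing):
cd /var/snap/microk8s/current/var/kubernetes/backend
# cluster.yaml plus the snapshot and segment files produced by .reconfigure
scp cluster.yaml snapshot-* [0-9]*-[0-9]* root@10.211.205.254:/var/snap/microk8s/current/var/kubernetes/backend/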
Further, on each of these nodes create an info.yaml that matches that node’s own entry (Address, ID and Role) in the cluster.yaml.
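For the single-node example above, the info.yaml on 10.211.205.253 would contain that node’s own entry:
Address: 10.211.205.253:19001
ID: 9373968242441247628
Role: 0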
WARNING: Before copying, make sure to delete any leftover snapshot-abc-abc-abc, snapshot-abc-abc-abc.meta, segment (00000abcxx-00000abcxx, open-abc) and metadata{1,2} files that the backend directory on those nodes contains. This is important; otherwise the nodes will fail to cleanly rejoin the cluster.
Restart MicroK8s services
It should now be possible to bring the cluster back online with:
microk8s start
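Once the services are back, it is worth confirming that the node reports ready and that dqlite’s membership matches the edited cluster.yaml. A quick check, assuming the bundled dqlite client accepts the .cluster dot-command:
microk8s status --wait-ready
sudo /snap/microk8s/current/bin/dqlite \
  -s 127.0.0.1:19001 \
  -c /var/snap/microk8s/current/var/kubernetes/backend/cluster.crt \
  -k /var/snap/microk8s/current/var/kubernetes/backend/cluster.key \
  k8s ".cluster"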
Clean up the lost nodes from Kubernetes
The lost nodes are still registered in Kubernetes but should be reporting as NotReady in the output of:
microk8s kubectl get no
To remove the lost nodes use:
microk8s remove-node <node name>
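For example, to remove the two lost nodes from the earlier scenario:
microk8s remove-node 10.211.205.122
microk8s remove-node 10.211.205.221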
Restore HA
High availability will be re-established automatically when there are three or more nodes in the MicroK8s cluster. If the original failed nodes have been revived, or new nodes created, these can be joined to the cluster to restore high availability. See the documentation on clusters for instructions on adding nodes.
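For reference, the re-join flow looks like this; run add-node on an existing cluster member, then run the exact command it prints on the joining node (the address and token below are placeholders):
# on an existing cluster node: prints a join command containing a one-time token
microk8s add-node
# on the node being added or re-added, run the printed command, e.g.:
microk8s join 10.211.205.253:25000/<token>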