44 Lifecycle actions #
This section covers the lifecycle management actions for clusters deployed via SUSE Telco Cloud.
44.1 Load Balancer Exclusion #
There are many lifecycle actions that require nodes to be drained. During the draining process, all pods are evicted and rescheduled onto other nodes in the cluster. After the draining process is finished, the node no longer hosts any services and therefore should not have any traffic routed to it. Load balancers, such as MetalLB, can be made aware of this by applying a label to the node:
node.kubernetes.io/exclude-from-external-load-balancers: "true"
For more details, see the Kubernetes documentation.
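If you drain a node manually, the label can be applied (and later removed) with kubectl. A minimal sketch, assuming a node named worker-1 (hypothetical name):
# exclude the node from external load balancers before draining it
kubectl label node worker-1 node.kubernetes.io/exclude-from-external-load-balancers=true
# once the node is back in service, remove the label (the trailing dash removes it)
kubectl label node worker-1 node.kubernetes.io/exclude-from-external-load-balancers-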
To see the labels on all your nodes in a cluster, you can run:
kubectl get nodes -o json | jq -r '.items[].metadata | .name, .labels'
In the case of upgrades of downstream clusters, this can be automated by annotating the RKE2ControlPlane on the management cluster:
rke2.controlplane.cluster.x-k8s.io/load-balancer-exclusion="true"
This immediately creates an annotation on all machine objects on the management cluster for that RKE2ControlPlane:
pre-drain.delete.hook.machine.cluster.x-k8s.io/rke2-lb-exclusion: ""
With this annotation on the machine objects, any node on the downstream cluster that is scheduled for draining gets the above node label attached prior to the start of the draining process. The label is removed from the node once it is available and ready again.
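During a rollout, the nodes that currently carry the exclusion label can be listed with a standard kubectl label selector, for example:
kubectl get nodes -l node.kubernetes.io/exclude-from-external-load-balancers=true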
44.2 Management cluster upgrades #
The upgrade of the management cluster is described in the Day 2 management cluster (Chapter 36, Management Cluster) documentation.
44.3 Downstream cluster upgrades #
Upgrading downstream clusters involves updating several components. The following sections cover the upgrade process for each of the components.
Upgrading the operating system
For this process, check the following reference (Section 43.2, “Prepare downstream cluster image for connected scenarios”) to build a new image containing the new operating system version.
With this new image generated by EIB, the next provisioning phase uses the new operating system version provided.
In the following step, the new image is used to upgrade the nodes.
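After the nodes have been re-provisioned with the new image, the operating system version they are running can be confirmed from the cluster itself; kubectl get nodes -o wide reports, among other details, the OS image and kernel version of each node:
kubectl get nodes -o wide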
Upgrading the RKE2 cluster
The changes required to upgrade the RKE2 cluster using the automated workflow are the following:
Change the RKE2ControlPlane block in the capi-provisioning-example.yaml shown in the following section (Section 43.4, “Downstream cluster provisioning with Directed network provisioning (single-node)”):
Specify the desired rolloutStrategy.
Change the version of the RKE2 cluster to the new version, replacing ${RKE2_NEW_VERSION}.
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: RKE2ControlPlane
metadata:
  name: single-node-cluster
  namespace: default
spec:
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: Metal3MachineTemplate
    name: single-node-cluster-controlplane
  version: ${RKE2_NEW_VERSION}
  replicas: 1
  rolloutStrategy:
    type: "RollingUpdate"
    rollingUpdate:
      # maxSurge: 0 re-provisions the existing node instead of adding a surge machine
      maxSurge: 0
  serverConfig:
    cni: cilium
  registrationMethod: "control-plane-endpoint"
  agentConfig:
    format: ignition
    additionalUserData:
      config: |
        variant: fcos
        version: 1.4.0
        systemd:
          units:
            - name: rke2-preinstall.service
              enabled: true
              contents: |
                [Unit]
                Description=rke2-preinstall
                Wants=network-online.target
                Before=rke2-install.service
                ConditionPathExists=!/run/cluster-api/bootstrap-success.complete
                [Service]
                Type=oneshot
                User=root
                ExecStartPre=/bin/sh -c "mount -L config-2 /mnt"
                ExecStart=/bin/sh -c "sed -i \"s/BAREMETALHOST_UUID/$(jq -r .uuid /mnt/openstack/latest/meta_data.json)/\" /etc/rancher/rke2/config.yaml"
                ExecStart=/bin/sh -c "echo \"node-name: $(jq -r .name /mnt/openstack/latest/meta_data.json)\" >> /etc/rancher/rke2/config.yaml"
                ExecStartPost=/bin/sh -c "umount /mnt"
                [Install]
                WantedBy=multi-user.target
    kubelet:
      extraArgs:
        - provider-id=metal3://BAREMETALHOST_UUID
    nodeName: "localhost.localdomain"
Change the Metal3MachineTemplate block in the capi-provisioning-example.yaml shown in the following section (Section 43.4, “Downstream cluster provisioning with Directed network provisioning (single-node)”):
Change the image name and checksum to the new version generated in the previous step.
Set the nodeReuse directive to true to avoid creating a new node.
Set the automatedCleaningMode directive to metadata to enable automated cleaning for the node.
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: Metal3MachineTemplate
metadata:
  name: single-node-cluster-controlplane
  namespace: default
spec:
  # reuse the same bare-metal host instead of provisioning a new node
  nodeReuse: True
  template:
    spec:
      # metadata enables automated cleaning for the node
      automatedCleaningMode: metadata
      dataTemplate:
        name: single-node-cluster-controlplane-template
      hostSelector:
        matchLabels:
          cluster-role: control-plane
      image:
        checksum: http://imagecache.local:8080/${NEW_IMAGE_GENERATED}.sha256
        checksumType: sha256
        format: raw
        url: http://imagecache.local:8080/${NEW_IMAGE_GENERATED}.raw
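Before moving on, you may want to verify that the new image and its checksum are actually reachable at the URLs above. A quick sanity check, assuming curl is available on a host with access to the image cache:
# HEAD request against the image; exits non-zero if the file is missing
curl -fsI http://imagecache.local:8080/${NEW_IMAGE_GENERATED}.raw
# fetch the checksum file to confirm it matches the image name
curl -fs http://imagecache.local:8080/${NEW_IMAGE_GENERATED}.sha256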
Before applying the capi-provisioning-example.yaml file, it is good practice to inform external load balancers (for example, MetalLB) about nodes being drained so that they do not route traffic to nodes in this state. As described in Section 44.1, “Load Balancer Exclusion”, you can automate this by annotating the RKE2ControlPlane on the management cluster. In this example, an RKE2ControlPlane object called multinode-cluster is annotated:
kubectl annotate RKE2ControlPlane/multinode-cluster rke2.controlplane.cluster.x-k8s.io/load-balancer-exclusion="true"
Verify that the machine objects have been annotated with the following pre-drain hook:
pre-drain.delete.hook.machine.cluster.x-k8s.io/rke2-lb-exclusion: ""
Fetch the annotations for all your machine objects:
kubectl get machines -o json | jq -r '.items[].metadata | .name, .annotations'
Without these annotations, users might experience longer response times for services, as the load balancers are unaware of drained nodes.
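The node labels themselves are removed automatically once the nodes are ready again. If you no longer want the pre-drain behavior for future rollouts, the annotation can be removed from the RKE2ControlPlane using kubectl's trailing-dash syntax; whether to keep it in place for subsequent upgrades is a policy choice:
kubectl annotate RKE2ControlPlane/multinode-cluster rke2.controlplane.cluster.x-k8s.io/load-balancer-exclusion-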
After making these changes, the capi-provisioning-example.yaml file can be applied to the cluster using the following command:
kubectl apply -f capi-provisioning-example.yaml
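The rollout can then be followed from the management cluster, for example by watching the machine objects until the new version is running. A sketch, assuming the default namespace used in these examples and, for the second command, that clusterctl is installed on the management workstation:
# watch the machines being re-provisioned during the rolling update
kubectl get machines -n default -w
# or inspect the overall state of the cluster and its resources
clusterctl describe cluster single-node-cluster -n default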