
44 Lifecycle actions

This section covers the lifecycle management actions for clusters deployed via SUSE Telco Cloud.

44.1 Load Balancer Exclusion

There are many lifecycle actions that require nodes to be drained. During the draining process, all pods are evicted from the node and rescheduled onto other nodes in the cluster. Once draining is finished, the node no longer hosts any services and should therefore not have any traffic routed to it. Load balancers, such as MetalLB, can be made aware of this by applying the following label to the node:

node.kubernetes.io/exclude-from-external-load-balancers: "true"

For more details see: Kubernetes Documentation.
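
To apply the label manually, for example when draining a node yourself, you can use kubectl; the node name below is a placeholder for one of your nodes, and appending a trailing dash removes the label again:

kubectl label node <node-name> node.kubernetes.io/exclude-from-external-load-balancers=true
kubectl label node <node-name> node.kubernetes.io/exclude-from-external-load-balancers-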

To see the labels on all your nodes in a cluster, you can run:

kubectl get nodes -o json | jq -r '.items[].metadata | .name, .labels'

For downstream cluster upgrades, this can be automated by annotating the RKE2ControlPlane object on the management cluster:

rke2.controlplane.cluster.x-k8s.io/load-balancer-exclusion="true"

This immediately adds the following annotation to all Machine objects on the management cluster that belong to that RKE2ControlPlane:

pre-drain.delete.hook.machine.cluster.x-k8s.io/rke2-lb-exclusion: ""

With this annotation on the Machine objects, any node in the downstream cluster that is scheduled for draining gets the above node label attached before the draining process starts. The label is removed from the node once it is available and ready again.
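
To check which nodes currently carry the exclusion label, for example while a drain is in progress, you can filter the node list by that label:

kubectl get nodes -l node.kubernetes.io/exclude-from-external-load-balancers=true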

44.2 Management cluster upgrades

The upgrade of the management cluster is described in the Day 2 management cluster (Chapter 36, Management Cluster) documentation.

44.3 Downstream cluster upgrades

Upgrading downstream clusters involves updating several components. The following sections cover the upgrade process for each of the components.

Upgrading the operating system

To upgrade the operating system, build a new image containing the new operating system version as described in Section 43.2, “Prepare downstream cluster image for connected scenarios”. With this new image generated by EIB, the next provisioning phase uses the new operating system version. In the following step, the new image is used to upgrade the nodes.

Upgrading the RKE2 cluster

To upgrade the RKE2 cluster using the automated workflow, two changes are required in the capi-provisioning-example.yaml file: the version field of the RKE2ControlPlane is set to the new RKE2 version, and the Metal3MachineTemplate image references the new image generated in the previous step:

apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: RKE2ControlPlane
metadata:
  name: single-node-cluster
  namespace: default
spec:
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: Metal3MachineTemplate
    name: single-node-cluster-controlplane
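  # New RKE2 version to deploy on the downstream cluster nodes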
  version: ${RKE2_NEW_VERSION}
  replicas: 1
  rolloutStrategy:
    type: "RollingUpdate"
    rollingUpdate:
      maxSurge: 0
  serverConfig:
    cni: cilium
  registrationMethod: "control-plane-endpoint"
  agentConfig:
    format: ignition
    additionalUserData:
      config: |
        variant: fcos
        version: 1.4.0
        systemd:
          units:
            - name: rke2-preinstall.service
              enabled: true
              contents: |
                [Unit]
                Description=rke2-preinstall
                Wants=network-online.target
                Before=rke2-install.service
                ConditionPathExists=!/run/cluster-api/bootstrap-success.complete
                [Service]
                Type=oneshot
                User=root
                ExecStartPre=/bin/sh -c "mount -L config-2 /mnt"
                ExecStart=/bin/sh -c "sed -i \"s/BAREMETALHOST_UUID/$(jq -r .uuid /mnt/openstack/latest/meta_data.json)/\" /etc/rancher/rke2/config.yaml"
                ExecStart=/bin/sh -c "echo \"node-name: $(jq -r .name /mnt/openstack/latest/meta_data.json)\" >> /etc/rancher/rke2/config.yaml"
                ExecStartPost=/bin/sh -c "umount /mnt"
                [Install]
                WantedBy=multi-user.target
    kubelet:
      extraArgs:
        - provider-id=metal3://BAREMETALHOST_UUID
    nodeName: "localhost.localdomain"
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: Metal3MachineTemplate
metadata:
  name: single-node-cluster-controlplane
  namespace: default
spec:
  nodeReuse: True
  template:
    spec:
      automatedCleaningMode: metadata
      dataTemplate:
        name: single-node-cluster-controlplane-template
      hostSelector:
        matchLabels:
          cluster-role: control-plane
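      # New image generated by EIB that contains the upgraded operating system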
      image:
        checksum: http://imagecache.local:8080/${NEW_IMAGE_GENERATED}.sha256
        checksumType: sha256
        format: raw
        url: http://imagecache.local:8080/${NEW_IMAGE_GENERATED}.raw

Before applying the capi-provisioning-example.yaml file, it is good practice to inform external load balancers (for example, MetalLB) about nodes being drained so that they do not route traffic to nodes in this state. As described in Section 44.1, “Load Balancer Exclusion”, this can be automated by annotating the RKE2ControlPlane on the management cluster. In this example, an RKE2ControlPlane object called multinode-cluster is annotated:

kubectl annotate RKE2ControlPlane/multinode-cluster rke2.controlplane.cluster.x-k8s.io/load-balancer-exclusion="true"

Verify that the machine objects have been annotated:

pre-drain.delete.hook.machine.cluster.x-k8s.io/rke2-lb-exclusion: ""

Fetch the annotations for all your machine objects:

kubectl get machines -o json | jq -r '.items[].metadata | .name, .annotations'
Note

Without these annotations, users might experience longer response times for services because the load balancers are unaware of drained nodes.

After making these changes, the capi-provisioning-example.yaml file can be applied to the cluster using the following command:

kubectl apply -f capi-provisioning-example.yaml
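
After applying the file, you can follow the rollout from the management cluster. As a minimal sketch, assuming the objects live in the default namespace used in the example manifests:

kubectl get machines -n default -w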