This is a draft document that was built and uploaded automatically. It may document beta software and be incomplete or even incorrect. Use this document at your own risk.

59 Lifecycle actions

This section covers the lifecycle management actions for clusters deployed via SUSE Telco Cloud.

59.1 Load Balancer Exclusion

Many lifecycle actions require nodes to be drained. During the draining process, all pods are moved to other nodes in the cluster. Once draining has finished, the node no longer hosts any services and therefore should not have any traffic routed to it. Load balancers, such as MetalLB, can be made aware of this by applying a label to the node:

node.kubernetes.io/exclude-from-external-load-balancers: "true"

For more details see: Kubernetes Documentation.
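
For a node that is drained manually, the label can be set and removed with `kubectl label`. The following is a minimal sketch, not a prescribed procedure: the function names `exclude_node_from_lb` and `include_node_in_lb` are hypothetical helpers, and the node name is whatever `kubectl get nodes` reports in your cluster.

```shell
# Label from this section; nodes carrying it are skipped by external load balancers.
LB_EXCLUSION_LABEL="node.kubernetes.io/exclude-from-external-load-balancers"

# Hypothetical helper: set the label on node $1 before draining it.
exclude_node_from_lb() {
  kubectl label node "$1" "${LB_EXCLUSION_LABEL}=true" --overwrite
}

# Hypothetical helper: remove the label from node $1 once it is ready again.
# A trailing "-" after the key tells kubectl to delete the label.
include_node_in_lb() {
  kubectl label node "$1" "${LB_EXCLUSION_LABEL}-"
}
```

For example, `exclude_node_from_lb worker-1` would be run before draining a node named `worker-1` (an illustrative name), and `include_node_in_lb worker-1` after the node is back in service.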

To see the labels on all your nodes in a cluster, you can run:

kubectl get nodes -o json | jq -r '.items[].metadata | .name, .labels'
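
If only the exclusion label is of interest, the output can be narrowed with standard `kubectl get` options. A small sketch, assuming the helper names `show_lb_exclusion` and `list_excluded_nodes` (both hypothetical):

```shell
LB_EXCLUSION_LABEL="node.kubernetes.io/exclude-from-external-load-balancers"

# Show all nodes with the label value as an extra column (empty if the label is unset).
show_lb_exclusion() {
  kubectl get nodes -L "$LB_EXCLUSION_LABEL"
}

# List only the nodes that currently carry the label, whatever its value.
list_excluded_nodes() {
  kubectl get nodes -l "$LB_EXCLUSION_LABEL" -o name
}
```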

In the case of upgrades of downstream clusters, this can be automated by annotating the RKE2ControlPlane on the management cluster:

rke2.controlplane.cluster.x-k8s.io/load-balancer-exclusion="true"

This immediately adds the following annotation to all machine objects on the management cluster that belong to that RKE2ControlPlane:

pre-drain.delete.hook.machine.cluster.x-k8s.io/rke2-lb-exclusion: ""

With this annotation on the machine objects, any node on the downstream cluster that is scheduled for draining will get the above node label attached prior to the start of the draining process. The label will be removed from the node once it is available and ready again.

59.2 Management cluster upgrades

The upgrade of the management cluster is described in the Day 2 management cluster (Chapter 58, Management Cluster) documentation.

59.3 Downstream cluster upgrades

Upgrading downstream clusters involves updating several components. The following sections cover the upgrade process for each of the components.

Upgrading the operating system

For this process, see Chapter 49, Prepare downstream cluster image for connected scenarios, to build a new image that contains the new operating system version. The next provisioning phase then uses the operating system version provided by this EIB-generated image. In the following step, the new image is used to upgrade the nodes.

Upgrading the RKE2 cluster

The changes required to upgrade the RKE2 cluster using the automated workflow are the following:

  • Change the RKE2ControlPlane block in the capi-provisioning-example.yaml file shown in Chapter 51, Downstream cluster provisioning with Directed network provisioning (single-node):

    • Specify the desired rolloutStrategy.

    • Set the RKE2 cluster version to the new version by replacing ${RKE2_NEW_VERSION}.

    • Decide if an ingress controller is to be deployed in the downstream cluster:

      • [Option 0]: Do not deploy any ingress controller

      • [Option 1]: Deploy only Traefik

      • [Option 2]: Deploy both Ingress-NGINX and Traefik (to be used for complex ingress migration scenarios)

Note

Traefik, the ingress provider integrated into RKE2/K3s, is the only ingress controller supported in the SUSE Telco Cloud 3.6 release. It is still possible to run Ingress-NGINX alongside Traefik temporarily to support complex ingress migration scenarios, but only after the SUSE Telco Cloud management and/or downstream clusters have been upgraded to version 3.6, and only for the time required to perform that migration. Since Traefik is not yet the default ingress controller in RKE2 (it will be from RKE2 v1.36 onwards), it must be explicitly requested in the RKE2 server configuration file.

The RKE2 Ingress NGINX to Traefik Migration guide provides details on the ingress migration paths available once the Traefik ingress controller replaces the discontinued Ingress-NGINX.
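
In the manifest in this section, the explicit request is made by the rke2-ingress-deployment.service unit, which appends an `ingress-controller` entry to /etc/rancher/rke2/config.yaml. The sketch below reproduces those two variants against a scratch file so the resulting configuration lines can be inspected without touching a node. The function name `append_ingress_config` and its `traefik`/`both` arguments are hypothetical, and `printf` is used here instead of the unit's `echo -e` only for portability across shells.

```shell
# Hypothetical helper for illustration only: append the ingress-controller
# entry to file $2.
#   $1 = traefik -> [Option 1]: deploy only Traefik
#   $1 = both    -> [Option 2]: deploy both Ingress-NGINX and Traefik
append_ingress_config() {
  case "$1" in
    traefik) printf 'ingress-controller: traefik\n' >> "$2" ;;
    both)    printf 'ingress-controller:\n- ingress-nginx\n- traefik\n' >> "$2" ;;
    *)       echo "unknown option: $1" >&2; return 1 ;;
  esac
}
```

Running `append_ingress_config both /tmp/config.yaml` and then `cat /tmp/config.yaml` shows the list form that [Option 2] writes. For [Option 0], nothing is appended.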

apiVersion: controlplane.cluster.x-k8s.io/v1beta2
kind: RKE2ControlPlane
metadata:
  name: single-node-cluster
  namespace: default
spec:
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
    kind: Metal3MachineTemplate
    name: single-node-cluster-controlplane
  version: ${RKE2_NEW_VERSION}
  replicas: 1
  rolloutStrategy:
    type: "RollingUpdate"
    rollingUpdate:
      maxSurge: 0
  serverConfig:
    cni: cilium
  #===========================================================================
  # Uncomment the following lines if selecting [Option 0]: Do not deploy
  # any ingress controller
  #===========================================================================
    #disableComponents:
    #  pluginComponents:
    #  - "rke2-ingress-nginx"
  #---------------------------------------------------------------------------
  registrationMethod: "control-plane-endpoint"
  agentConfig:
    format: ignition
    additionalUserData:
      config: |
        variant: fcos
        version: 1.4.0
        systemd:
          units:
          - name: rke2-preinstall.service
            enabled: true
            contents: |
              [Unit]
              Description=rke2-preinstall
              Wants=network-online.target
              Before=rke2-install.service
              ConditionPathExists=!/run/cluster-api/bootstrap-success.complete
              [Service]
              Type=oneshot
              User=root
              ExecStartPre=/bin/sh -c "mount -L config-2 /mnt"
              ExecStart=/bin/sh -c "sed -i \"s/BAREMETALHOST_UUID/$(jq -r .uuid /mnt/openstack/latest/meta_data.json)/\" /etc/rancher/rke2/config.yaml"
              ExecStart=/bin/sh -c "echo \"node-name: $(jq -r .name /mnt/openstack/latest/meta_data.json)\" >> /etc/rancher/rke2/config.yaml"
              ExecStart=/bin/sh -c "echo \"node-label:\" >> /etc/rancher/rke2/config.yaml"
              ExecStart=/bin/sh -c "echo \"  - metal3.io/uuid=$(jq -r .uuid /mnt/openstack/latest/meta_data.json)\" >> /etc/rancher/rke2/config.yaml"
              ExecStartPost=/bin/sh -c "umount /mnt"
              [Install]
              WantedBy=multi-user.target
          # rke2-ingress-deployment.service unit
          - name: rke2-ingress-deployment.service
            enabled: true
            contents: |
              [Unit]
              Description=rke2-ingress-deployment
              Wants=rke2-preinstall.service
              Before=rke2-install.service
              ConditionPathExists=!/run/cluster-api/bootstrap-success.complete
              [Service]
              Type=oneshot
              User=root
              #===============================================================================================================================
              # Leave one (and only one) of the two following ExecStart lines uncommented, depending on the desired ingress-controller(s):
              #   [Option 1]: Deploy only "Traefik"
              #   [Option 2]: Deploy both "Ingress-NGINX" and "Traefik"
              #
              # Keep both commented ONLY when selecting [Option 0]: "Do not deploy any ingress controller"
              #===============================================================================================================================
              #ExecStart=/bin/sh -c "echo \"ingress-controller: traefik\" >> /etc/rancher/rke2/config.yaml"                       # [Option 1]
              ExecStart=/bin/sh -c "echo -e \"ingress-controller:\n- ingress-nginx\n- traefik\" >> /etc/rancher/rke2/config.yaml" # [Option 2]
              #-------------------------------------------------------------------------------------------------------------------------------
              [Install]
              WantedBy=multi-user.target
        storage:
          directories:
          - path: /var/lib/rancher/rke2/server/manifests
            overwrite: true
          files:
          #############################################################################
          # if [Option 2]: "Deploy both `Ingress-NGINX` and `Traefik`" is selected
          #############################################################################
          - path: /var/lib/rancher/rke2/server/manifests/rke2-ingress-nginx-config.yaml
            overwrite: true
            contents:
              inline: |
                apiVersion: helm.cattle.io/v1
                kind: HelmChartConfig
                metadata:
                  name: rke2-ingress-nginx
                  namespace: kube-system
                spec:
                  valuesContent: |-
                    controller:
                      hostPort:
                        enabled: false  # not needed when exposing through a type:LoadBalancer service
                      config:
                        use-forwarded-headers: "true"
                        enable-real-ip: "true"
                      publishService:
                        enabled: true
                      service:
                        enabled: true
                        type: LoadBalancer
                        externalTrafficPolicy: Local
            mode: 0644
            user:
              name: root
            group:
              name: root
          #############################################################################
          # if [Option 1]: "Deploy only `Traefik`" OR  [Option 2]: "Deploy both
          #`Ingress-NGINX` and `Traefik`" is selected
          #############################################################################
          - path: /var/lib/rancher/rke2/server/manifests/rke2-traefik-config.yaml
            overwrite: true
            contents:
              inline: |
                apiVersion: helm.cattle.io/v1
                kind: HelmChartConfig
                metadata:
                  name: rke2-traefik
                  namespace: kube-system
                spec:
                  valuesContent: |-
                    ingressClass:
                      isDefaultClass: false  # Assumes [Option 2]; set to true if [Option 1]: "only deploying `Traefik`"
                    ports:
                      web:
                        hostPort: null    # disallow hostPort
                        exposedPort: 80
                      websecure:
                        hostPort: null    # disallow hostPort
                        exposedPort: 443
                    service:
                      enabled: true
                      type: LoadBalancer
                      spec:
                        externalTrafficPolicy: Local
                        allocateLoadBalancerNodePorts: false  # k8s GA from 1.24; supported by MetalLB
                    providers:
                      kubernetesIngressNginx:  # this provider allows Traefik to "understand" most of the Ingress-NGINX annotations
                        enabled: true
                        ingressClass: "rke2-ingress-nginx-migration"
                        controllerClass: "rke2.cattle.io/ingress-nginx-migration"
            mode: 0644
            user:
              name: root
            group:
              name: root
    kubelet:
      extraArgs:
      - provider-id=metal3://BAREMETALHOST_UUID
    nodeName: "localhost.localdomain"
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
kind: Metal3MachineTemplate
metadata:
  name: single-node-cluster-controlplane
  namespace: default
spec:
  nodeReuse: True
  template:
    spec:
      automatedCleaningMode: metadata
      dataTemplate:
        name: single-node-cluster-controlplane-template
      hostSelector:
        matchLabels:
          cluster-role: control-plane
      image:
        checksum: http://imagecache.local:8080/${NEW_IMAGE_GENERATED}.sha256
        checksumType: sha256
        format: raw
        url: http://imagecache.local:8080/${NEW_IMAGE_GENERATED}.raw
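
The manifest above contains the placeholders ${RKE2_NEW_VERSION} and ${NEW_IMAGE_GENERATED}. One possible way to fill them in is a small `sed` wrapper such as the sketch below. The function name `render_manifest` is hypothetical, and the sketch assumes the substituted values contain neither `|` nor `&`, which `sed` would interpret specially.

```shell
# Hypothetical helper: print manifest template $1 with both placeholders replaced.
#   $2 = new RKE2 version, $3 = base name of the new image (without extension)
# The "$(jq ...)" command substitutions inside the manifest are left untouched,
# because only the two literal ${...} placeholders are matched.
render_manifest() {
  sed -e 's|\${RKE2_NEW_VERSION}|'"$2"'|g' \
      -e 's|\${NEW_IMAGE_GENERATED}|'"$3"'|g' \
      "$1"
}
```

For example, `render_manifest capi-provisioning-example.yaml "<new RKE2 version>" "<new image name>" > rendered.yaml` writes the filled-in manifest to a separate file (here `rendered.yaml`, an illustrative name) that can be reviewed before it is applied.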

Before applying the capi-provisioning-example.yaml file, it is good practice to inform external load balancers (for example, MetalLB) about nodes being drained so that they do not route traffic to nodes in this state. As described in Section 59.1, “Load Balancer Exclusion”, you can automate this by annotating the RKE2ControlPlane on the management cluster. In this example, an RKE2ControlPlane object called multinode-cluster is annotated:

kubectl annotate RKE2ControlPlane/multinode-cluster rke2.controlplane.cluster.x-k8s.io/load-balancer-exclusion="true"

Verify that the machine objects now carry the following annotation:

pre-drain.delete.hook.machine.cluster.x-k8s.io/rke2-lb-exclusion: ""

To do so, fetch the annotations of all your machine objects:

kubectl get machines -o json | jq -r '.items[].metadata | .name, .annotations'
Note

Without these annotations, users might experience longer response times for services because the load balancers are unaware of drained nodes.
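
With many machine objects, reading the full annotation dump can be tedious. The following sketch lists only the machines that are still missing the hook annotation. It assumes that `jq` is available, and the function name `machines_missing_hook` is hypothetical.

```shell
HOOK="pre-drain.delete.hook.machine.cluster.x-k8s.io/rke2-lb-exclusion"

# Print the names of machine objects that do NOT carry the pre-drain hook
# annotation. Empty output means every machine is annotated.
machines_missing_hook() {
  kubectl get machines -o json |
    jq -r --arg hook "$HOOK" \
      '.items[] | select((.metadata.annotations // {}) | has($hook) | not) | .metadata.name'
}
```

The check uses `has` rather than comparing values because the annotation value is an empty string.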

After making these changes, the capi-provisioning-example.yaml file can be applied to the cluster using the following command:

kubectl apply -f capi-provisioning-example.yaml
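
Before running the apply command, the pending changes can optionally be previewed with standard `kubectl` features. This is a sketch of one possible pre-check, not part of the documented workflow, and the function name `preview_upgrade` is hypothetical.

```shell
# Hypothetical helper: preview what applying manifest $1 would change.
preview_upgrade() {
  # Show the difference between the live objects and the manifest.
  # kubectl diff exits with status 1 when differences exist, so a non-zero
  # status here does not by itself indicate an error.
  kubectl diff -f "$1"
  # Ask the API server to validate the manifest without persisting anything.
  kubectl apply -f "$1" --dry-run=server
}
```

For example: `preview_upgrade capi-provisioning-example.yaml`.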