56 Downstream cluster provisioning in air-gapped scenarios
The directed network provisioning workflow lets you automate the provisioning of downstream clusters in air-gapped scenarios.
56.1 Requirements for air-gapped scenarios
The raw image generated using EIB must include the specific container images (helm-chart OCI and container images) required to run the downstream cluster in an air-gapped scenario. For more information, refer to this section (Chapter 50, Prepare downstream cluster image for air-gap scenarios).
If you use SR-IOV or any other custom workload, the images required to run the workloads must be preloaded in your private registry following the preload private registry section (Section 50.2.7, “Preparing the air-gap artifacts”).
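As a rough illustration of the first requirement, the fragment below sketches how container images might be listed in an EIB definition file so that they are embedded in the raw image. It is a sketch under assumptions, not the definition used by this guide: the apiVersion value, the base image and output file names, and the image references are all hypothetical placeholders. The chapter referenced above remains the authoritative source for the exact fields and image list.

```yaml
# Hypothetical EIB definition excerpt; every value here is a placeholder.
apiVersion: 1.0                      # assumed schema version, check your EIB release
image:
  imageType: raw                     # the raw image type required above
  arch: x86_64
  baseImage: base-image.raw          # hypothetical base image file name
  outputImageName: airgap-image.raw  # hypothetical output file name
embeddedArtifactRegistry:
  images:
    # Hypothetical image references; list the images your cluster needs.
    - name: registry.example.com/example/image-a:1.0.0
    - name: registry.example.com/example/image-b:2.0.0
```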
56.2 Enroll the bare-metal hosts in air-gap scenarios
The process to enroll the bare-metal hosts in the management cluster is the same as described in the previous section (Chapter 51, Downstream cluster provisioning with Directed network provisioning (single-node)).
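For orientation only, a minimal sketch of what an enrollment manifest typically looks like with the Metal3 BareMetalHost API is given here. The resource names, the label, the root device and the ${BMC_...} placeholders are assumptions made for illustration; follow the referenced chapter for the values that apply to your environment.

```yaml
# Hypothetical enrollment manifest; names and placeholders are illustrative.
apiVersion: v1
kind: Secret
metadata:
  name: example-demo-credentials
type: Opaque
data:
  username: ${BMC_USERNAME}   # base64-encoded BMC user
  password: ${BMC_PASSWORD}   # base64-encoded BMC password
---
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: example-demo
  labels:
    cluster-role: control-plane   # assumed label, used later for host selection
spec:
  online: true
  bootMACAddress: ${BMC_MAC}
  rootDeviceHints:
    deviceName: /dev/nvme0n1      # assumed root device
  bmc:
    address: ${BMC_ADDRESS}
    disableCertificateVerification: true
    credentialsName: example-demo-credentials
```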
56.3 Provision the downstream cluster in air-gap scenarios
There are some important changes required to provision the downstream cluster in air-gapped scenarios:
The RKE2ControlPlane block in the capi-provisioning-example.yaml file must include the spec.agentConfig.airGapped: true directive.
The private registry configuration must be included in the RKE2ControlPlane block in the capi-provisioning-airgap-example.yaml file following the private registry section (Chapter 55, Private registry).
If you are using SR-IOV or any other AdditionalUserData configuration (combustion script) which requires the helm-chart installation, you must modify the content to reference the private registry instead of using the public registry.
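Condensed to the fields that differ from a connected deployment, the three changes can be sketched as follows. This is an excerpt for orientation, not a complete manifest: it reuses the placeholders and names from this chapter's example, omits every unrelated field, and shows only one mirror entry.

```yaml
# Excerpt only; unrelated fields of the RKE2ControlPlane are omitted.
kind: RKE2ControlPlane
spec:
  privateRegistriesConfig:          # change 2: private registry mirror and credentials
    mirrors:
      docker.io:
        endpoint:
          - "${PRIVATE_REGISTRY_URL}"
    configs:
      "192.168.100.22:5000":
        authSecret:
          kind: Secret
          namespace: default
          name: private-registry-auth
  agentConfig:
    airGapped: true                 # change 1: enable air-gap mode
---
# Change 3: a HelmChart placed through AdditionalUserData pulls from the private registry.
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: sriov-crd
  namespace: kube-system
spec:
  chart: oci://${PRIVATE_REGISTRY_URL}/mirror/sriov-crd
  dockerRegistrySecret:
    name: privregauth
```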
The following example shows the SR-IOV configuration in the AdditionalUserData block in the capi-provisioning-airgap-example.yaml file, with the modifications required to reference the private registry:
Private registry secret references.
Helm-Chart definition using the private registry instead of the public OCI images.
# secret to include the private registry certificates
apiVersion: v1
kind: Secret
metadata:
  name: private-registry-cert
  namespace: default
data:
  tls.crt: ${TLS_BASE64_CERT}
  tls.key: ${TLS_BASE64_KEY}
  ca.crt: ${CA_BASE64_CERT}
type: kubernetes.io/tls
---
# secret to include the private registry auth credentials
apiVersion: v1
kind: Secret
metadata:
  name: private-registry-auth
  namespace: default
data:
  username: ${REGISTRY_USERNAME}
  password: ${REGISTRY_PASSWORD}
---
apiVersion: controlplane.cluster.x-k8s.io/v1beta2
kind: RKE2ControlPlane
metadata:
  name: single-node-cluster
  namespace: default
spec:
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
    kind: Metal3MachineTemplate
    name: single-node-cluster-controlplane
  replicas: 1
  version: ${RKE2_VERSION}
  rolloutStrategy:
    type: "RollingUpdate"
    rollingUpdate:
      maxSurge: 0
  privateRegistriesConfig: # Private registry configuration to add your own mirror and credentials
    mirrors:
      docker.io:
        endpoint:
          - "${PRIVATE_REGISTRY_URL}"
        rewrite:
          "^(.*)$": "mirror/$1"
      registry.suse.com:
        endpoint:
          - "${PRIVATE_REGISTRY_URL}"
        rewrite:
          "^(.*)$": "mirror/$1"
      registry.suse.de:
        endpoint:
          - "${PRIVATE_REGISTRY_URL}"
        rewrite:
          "^(.*)$": "mirror/$1"
      registry.opensuse.org:
        endpoint:
          - "${PRIVATE_REGISTRY_URL}"
        rewrite:
          "^(.*)$": "mirror/$1"
      registry.rancher.com:
        endpoint:
          - "${PRIVATE_REGISTRY_URL}"
        rewrite:
          "^(.*)$": "mirror/$1"
    configs:
      "192.168.100.22:5000":
        authSecret:
          apiVersion: v1
          kind: Secret
          namespace: default
          name: private-registry-auth
        tls:
          tlsConfigSecret:
            apiVersion: v1
            kind: Secret
            namespace: default
            name: private-registry-cert
          insecureSkipVerify: false
  serverConfig:
    cni: calico
    cniMultusEnable: true
  agentConfig:
    airGapped: true # Airgap true to enable airgap mode
    format: ignition
    additionalUserData:
      config: |
        variant: fcos
        version: 1.4.0
        storage:
          files:
            - path: /var/lib/rancher/rke2/server/manifests/configmap-sriov-custom-auto.yaml
              overwrite: true
              contents:
                inline: |
                  apiVersion: v1
                  kind: ConfigMap
                  metadata:
                    name: sriov-custom-auto-config
                    namespace: sriov-network-operator
                  data:
                    config.json: |
                      [
                        {
                          "resourceName": "${RESOURCE_NAME1}",
                          "interface": "${SRIOV-NIC-NAME1}",
                          "pfname": "${PF_NAME1}",
                          "driver": "${DRIVER_NAME1}",
                          "numVFsToCreate": ${NUM_VFS1}
                        },
                        {
                          "resourceName": "${RESOURCE_NAME2}",
                          "interface": "${SRIOV-NIC-NAME2}",
                          "pfname": "${PF_NAME2}",
                          "driver": "${DRIVER_NAME2}",
                          "numVFsToCreate": ${NUM_VFS2}
                        }
                      ]
              mode: 0644
              user:
                name: root
              group:
                name: root
            - path: /var/lib/rancher/rke2/server/manifests/sriov.yaml
              overwrite: true
              contents:
                inline: |
                  apiVersion: v1
                  data:
                    .dockerconfigjson: ${REGISTRY_AUTH_DOCKERCONFIGJSON}
                  kind: Secret
                  metadata:
                    name: privregauth
                    namespace: kube-system
                  type: kubernetes.io/dockerconfigjson
                  ---
                  apiVersion: v1
                  kind: ConfigMap
                  metadata:
                    namespace: kube-system
                    name: example-repo-ca
                  data:
                    ca.crt: |-
                      -----BEGIN CERTIFICATE-----
                      ${CA_BASE64_CERT}
                      -----END CERTIFICATE-----
                  ---
                  apiVersion: helm.cattle.io/v1
                  kind: HelmChart
                  metadata:
                    name: sriov-crd
                    namespace: kube-system
                  spec:
                    chart: oci://${PRIVATE_REGISTRY_URL}/mirror/sriov-crd
                    dockerRegistrySecret:
                      name: privregauth
                    repoCAConfigMap:
                      name: example-repo-ca
                    createNamespace: true
                    set:
                      global.clusterCIDR: 192.168.0.0/18
                      global.clusterCIDRv4: 192.168.0.0/18
                      global.clusterDNS: 10.96.0.10
                      global.clusterDomain: cluster.local
                      global.rke2DataDir: /var/lib/rancher/rke2
                      global.serviceCIDR: 10.96.0.0/12
                    targetNamespace: sriov-network-operator
                    version: 306.0.4+up1.6.0
                  ---
                  apiVersion: helm.cattle.io/v1
                  kind: HelmChart
                  metadata:
                    name: sriov-network-operator
                    namespace: kube-system
                  spec:
                    chart: oci://${PRIVATE_REGISTRY_URL}/mirror/sriov-network-operator
                    dockerRegistrySecret:
                      name: privregauth
                    repoCAConfigMap:
                      name: example-repo-ca
                    createNamespace: true
                    set:
                      global.clusterCIDR: 192.168.0.0/18
                      global.clusterCIDRv4: 192.168.0.0/18
                      global.clusterDNS: 10.96.0.10
                      global.clusterDomain: cluster.local
                      global.rke2DataDir: /var/lib/rancher/rke2
                      global.serviceCIDR: 10.96.0.0/12
                    targetNamespace: sriov-network-operator
                    version: 306.0.4+up1.6.0
              mode: 0644
              user:
                name: root
              group:
                name: root
        kernel_arguments:
          should_exist:
            - intel_iommu=on
            - iommu=pt
            - idle=poll
            - mce=off
            - hugepagesz=1G hugepages=40
            - hugepagesz=2M hugepages=0
            - default_hugepagesz=1G
            - irqaffinity=${NON-ISOLATED_CPU_CORES}
            - isolcpus=domain,nohz,managed_irq,${ISOLATED_CPU_CORES}
            - nohz_full=${ISOLATED_CPU_CORES}
            - rcu_nocbs=${ISOLATED_CPU_CORES}
            - rcu_nocb_poll
            - nosoftlockup
            - nowatchdog
            - nohz=on
            - nmi_watchdog=0
            - skew_tick=1
            - quiet
        systemd:
          units:
            - name: rke2-preinstall.service
              enabled: true
              contents: |
                [Unit]
                Description=rke2-preinstall
                Wants=network-online.target
                Before=rke2-install.service
                ConditionPathExists=!/run/cluster-api/bootstrap-success.complete
                [Service]
                Type=oneshot
                User=root
                ExecStartPre=/bin/sh -c "mount -L config-2 /mnt"
                ExecStart=/bin/sh -c "sed -i \"s/BAREMETALHOST_UUID/$(jq -r .uuid /mnt/openstack/latest/meta_data.json)/\" /etc/rancher/rke2/config.yaml"
                ExecStart=/bin/sh -c "echo \"node-name: $(jq -r .name /mnt/openstack/latest/meta_data.json)\" >> /etc/rancher/rke2/config.yaml"
                ExecStart=/bin/sh -c "echo \"node-label:\" >> /etc/rancher/rke2/config.yaml"
                ExecStart=/bin/sh -c "echo \" - metal3.io/uuid=$(jq -r .uuid /mnt/openstack/latest/meta_data.json)\" >> /etc/rancher/rke2/config.yaml"
                ExecStartPost=/bin/sh -c "umount /mnt"
                [Install]
                WantedBy=multi-user.target
            # rke2-traefik-deployment.service unit to be removed once "traefik" being the default ingress controller (starting with RKE2 v1.36)
            - name: rke2-traefik-deployment.service
              enabled: true
              contents: |
                [Unit]
                Description=rke2-traefik-deployment
                Wants=rke2-preinstall.service
                Before=rke2-install.service
                ConditionPathExists=!/run/cluster-api/bootstrap-success.complete
                [Service]
                Type=oneshot
                User=root
                ExecStart=/bin/sh -c "echo \"ingress-controller: traefik\" >> /etc/rancher/rke2/config.yaml"
                [Install]
                WantedBy=multi-user.target
            - name: cpu-partitioning.service
              enabled: true
              contents: |
                [Unit]
                Description=cpu-partitioning
                Wants=network-online.target
                After=network.target network-online.target
                [Service]
                Type=oneshot
                User=root
                ExecStart=/bin/sh -c "echo isolated_cores=${ISOLATED_CPU_CORES} > /etc/tuned/cpu-partitioning-variables.conf"
                ExecStartPost=/bin/sh -c "tuned-adm profile cpu-partitioning"
                ExecStartPost=/bin/sh -c "systemctl enable tuned.service"
                [Install]
                WantedBy=multi-user.target
            - name: performance-settings.service
              enabled: true
              contents: |
                [Unit]
                Description=performance-settings
                Wants=network-online.target
                After=network.target network-online.target cpu-partitioning.service
                [Service]
                Type=oneshot
                User=root
                ExecStart=/bin/sh -c "/opt/performance-settings/performance-settings.sh"
                [Install]
                WantedBy=multi-user.target
            - name: sriov-custom-auto-vfs.service
              enabled: true
              contents: |
                [Unit]
                Description=SRIOV Custom Auto VF Creation
                Wants=network-online.target rke2-server.target
                After=network.target network-online.target rke2-server.target
                [Service]
                User=root
                Type=forking
                TimeoutStartSec=1800
                ExecStart=/bin/sh -c "while ! /var/lib/rancher/rke2/bin/kubectl --kubeconfig=/etc/rancher/rke2/rke2.yaml wait --for condition=ready nodes --timeout=30m --all ; do sleep 10 ; done"
                ExecStartPost=/bin/sh -c "/opt/sriov/sriov-auto-filler.sh"
                RemainAfterExit=yes
                KillMode=process
                [Install]
                WantedBy=multi-user.target
    kubelet:
      extraArgs:
        - provider-id=metal3://BAREMETALHOST_UUID
    nodeName: "localhost.localdomain"
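To make the effect of the mirrors and rewrite entries above more concrete, the fragment below sketches the kind of registries.yaml that RKE2 reads on a node. It is an illustration under assumptions rather than the exact file generated from privateRegistriesConfig: the registry address registry.example.com:5000, the credentials and the certificate paths are hypothetical placeholders. With the rewrite rule "^(.*)$": "mirror/$1", a pull of an image such as docker.io/rancher/example-image:1.0 (a made-up name) would be served from registry.example.com:5000/mirror/rancher/example-image:1.0.

```yaml
# Hypothetical /etc/rancher/rke2/registries.yaml; all values are placeholders.
mirrors:
  docker.io:
    endpoint:
      - "https://registry.example.com:5000"
    rewrite:
      "^(.*)$": "mirror/$1"            # prefix every repository path with mirror/
configs:
  "registry.example.com:5000":          # host:port of the mirror endpoint
    auth:
      username: example-user            # hypothetical credentials
      password: example-password
    tls:
      ca_file: /path/to/ca.crt          # hypothetical certificate locations
      cert_file: /path/to/tls.crt
      key_file: /path/to/tls.key
      insecure_skip_verify: false
```

Note that registry credentials and TLS settings are looked up by registry host, so the key under configs ("192.168.100.22:5000" in the example above) presumably needs to match the host and port used in ${PRIVATE_REGISTRY_URL}.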