39 Requirements & Assumptions #
39.1 Hardware #
The hardware requirements for SUSE Telco Cloud are as follows:
Management cluster: The management cluster contains components like SUSE Linux Micro, RKE2, SUSE Rancher Prime and Metal3, and it is used to manage several downstream clusters. Depending on the number of downstream clusters to be managed, the hardware requirements for the server could vary.
Minimum requirements for the server (VM or bare-metal) are:
RAM: 8 GB minimum (we recommend at least 16 GB)
CPU: 2 cores minimum (we recommend at least 4 CPU cores)
Downstream clusters: The downstream clusters are the clusters deployed to run Telco workloads. Specific requirements are needed to enable certain Telco capabilities like SR-IOV, CPU Performance Optimization, etc.
SR-IOV: To attach VFs (Virtual Functions) in pass-through mode to CNFs/VNFs, the NIC must support SR-IOV and VT-d/AMD-Vi must be enabled in the BIOS (see the verification sketch after the firmware table below).
CPU Processors: To run specific Telco workloads, the CPU processor model should be adapted to enable most of the features available in this reference table (Chapter 41, Telco features configuration).
Firmware requirements for installing with virtual media:
Server Hardware | Generation / firmware version | Management |
Dell hardware | 15th Generation | iDRAC9 |
Supermicro hardware | 01.00.25 | Supermicro SMC - redfish |
HPE hardware | 1.50 | iLO6 |
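As an illustration, whether the IOMMU (VT-d/AMD-Vi) is active and whether a NIC exposes SR-IOV Virtual Functions can be checked from a running node. This is a minimal sketch; the interface name is a placeholder:
# Confirm the IOMMU (VT-d/AMD-Vi) was enabled in the BIOS and picked up by the kernel.
dmesg | grep -i -e "DMAR: IOMMU enabled" -e "AMD-Vi"
# Show how many SR-IOV Virtual Functions the NIC supports (placeholder interface name).
cat /sys/class/net/<interface>/device/sriov_totalvfs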
39.2 Network #
As a reference, the following diagram shows a typical network architecture for a Telco environment:
The network architecture is based on the following components:
Management network: This network is used for the management of downstream cluster nodes. It is used for the out-of-band management. Usually, this network is also connected to a separate management switch, but it can be connected to the same service switch using VLANs to isolate the traffic.
Control-plane network: This network is used for the communication between the downstream cluster nodes and the services that are running on them. This network is also used for the communication between the nodes and the external services, like the DHCP or DNS servers. In some cases, for connected environments, the switch/router can handle traffic through the Internet.
Other networks: In some cases, nodes could be connected to other networks for specific purposes.
To use the directed network provisioning workflow, the management cluster must have network connectivity to the downstream cluster server Baseboard Management Controller (BMC) so that host preparation and provisioning can be automated.
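For example, basic Redfish reachability from a management cluster node to a downstream server BMC can be verified with a simple query. This is a minimal sketch; the BMC address is a placeholder:
# Query the Redfish service root on the downstream server BMC (placeholder address).
# Any JSON answer confirms the out-of-band network path from the management cluster.
curl -sk https://<BMC_IP>/redfish/v1/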
39.3 Port requirements #
To operate properly, a SUSE Telco Cloud deployment requires a number of ports to be reachable on the management and the downstream Kubernetes cluster nodes.
The exact list depends on the deployed optional components and the selected deployment options (e.g., CNI plug-in).
39.3.1 Management Nodes #
The following table lists the ports opened on nodes running the management cluster:
For CNI plug-in related ports, see CNI specific port requirements (Section 39.3.3, “CNI specific port requirements”).
Protocol | Port | Source | Description |
TCP | 22 | Any source that requires SSH access | SSH access to management cluster nodes |
TCP | 80 | Load balancer/proxy that does external TLS termination | Rancher UI/API when external TLS termination is used |
TCP | 443 | Any source that requires TLS access to Rancher UI/API | Rancher agent, Rancher UI/API |
TCP | 2379 | RKE2 (management cluster) server nodes | etcd client port |
TCP | 2380 | RKE2 (management cluster) server nodes | etcd peer port |
TCP | 6180 | Any BMC(1) previously instructed by Ironic | Metal3 httpd server providing IPA(2) and OS images over HTTP (virtual-media provisioning) |
TCP | 6185 | Any BMC(1) previously instructed by Ironic | Metal3 httpd server providing IPA(2) and OS images over HTTPS (virtual-media provisioning with TLS) |
TCP | 6385 | Any | Ironic API |
TCP | 6443 | Any management cluster node; any external (to the management cluster) Kubernetes client | Kubernetes API |
TCP | 6545 | Any management cluster node | Pull artifacts from OCI-compliant registry (Hauler) |
TCP | 9345 | RKE2 server and agent nodes (management cluster) | RKE2 supervisor API for Node registration (opened port in all RKE2 server nodes) |
TCP | 10250 | Any management cluster node | kubelet metrics |
TCP/UDP/SCTP | 30000-32767 | Any external (to the management cluster) source accessing a service exposed on the primary network through a NodePort Service | Kubernetes NodePort Service port range |
(1) BMC: Baseboard Management Controller
(2) IPA: Ironic Python Agent
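A quick way to confirm that the required management cluster ports are reachable is a TCP probe from the relevant source host. This is a minimal sketch; the node address is a placeholder and the nc utility is assumed to be available:
# Probe some key management cluster ports (Rancher, Kubernetes API, Ironic API, RKE2 supervisor).
for port in 443 6443 6385 9345; do
  nc -zv -w 3 <MGMT_NODE_IP> "$port"
done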
39.3.2 Downstream Nodes #
In SUSE Telco Cloud, before any (downstream) server becomes part of a running downstream Kubernetes cluster (or itself runs a single-node downstream Kubernetes cluster), it must go through some of the BaremetalHost Provisioning states.
The Baseboard Management Controller (BMC) of a newly declared downstream server must be accessible through the out-of-band network. The BMC is instructed (by the ironic service running on the management cluster) on the initial steps to take:
Pull and load the indicated IPA ramdisk image in the virtual media offered by the BMC.
Power on the server.
The following ports are expected to be exposed by the BMC (they could differ depending on the exact hardware):
Protocol | Port | Source | Description |
TCP | 80 | Ironic conductor (from management cluster) | Redfish API access (HTTP) |
TCP | 443 | Ironic conductor (from management cluster) | Redfish API access (HTTPS) |
Once the IPA ramdisk image loaded through the BMC virtual media is used to boot up the downstream server, the hardware inspection phase begins. The following table lists the ports exposed by a running IPA ramdisk image:
Protocol | Port | Source | Description |
TCP | 22 | Any source that requires SSH access to IPA ramdisk image | SSH access to a downstream cluster node while it is being inspected |
TCP | 9999 | Ironic conductor (from management cluster) | Ironic commands towards the running ramdisk image |
Once the baremetal host is properly provisioned and has joined a downstream Kubernetes cluster, it exposes the following ports:
For CNI plug-in related ports, see CNI specific port requirements (Section 39.3.3, “CNI specific port requirements”).
Protocol | Port | Source | Description |
TCP | 22 | Any source that requires SSH access | SSH access to downstream cluster nodes |
TCP | 80 | Load balancer/proxy that does external TLS termination | Rancher UI/API when external TLS termination is used |
TCP | 443 | Any source that requires TLS access to Rancher UI/API | Rancher agent, Rancher UI/API |
TCP | 2379 | RKE2 (downstream cluster) server nodes | etcd client port |
TCP | 2380 | RKE2 (downstream cluster) server nodes | etcd peer port |
TCP | 6443 | Any downstream cluster node; any external (to the downstream cluster) Kubernetes client. | Kubernetes API |
TCP | 9345 | RKE2 server and agent nodes (downstream cluster) | RKE2 supervisor API for Node registration (opened port in all RKE2 server nodes) |
TCP | 10250 | Any downstream cluster node | kubelet metrics |
TCP | 10255 | Any downstream cluster node | kubelet read-only port |
TCP/UDP/SCTP | 30000-32767 | Any external (to the downstream cluster) source accessing a service exposed on the primary network through a NodePort Service | Kubernetes NodePort Service port range |
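Once a downstream node has joined its cluster, its Kubernetes API port can be probed from outside; the API answers anonymous requests with an authentication error, which is enough to confirm that the port and its TLS endpoint are up. This is a minimal sketch; the node address is a placeholder:
# An HTTP 401/403 JSON body from the Kubernetes API confirms port 6443 is reachable.
curl -ks https://<DOWNSTREAM_NODE_IP>:6443/version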
39.3.3 CNI specific port requirements #
Each supported CNI variant comes with its own set of port requirements. For more details, refer to the CNI Specific Inbound Network Rules in the RKE2 documentation.
When cilium is set as the default/primary CNI plug-in, the following TCP port is additionally exposed when the cilium-operator workload is configured to expose metrics outside the Kubernetes cluster on which it is deployed. This ensures that an external Prometheus server instance running outside that Kubernetes cluster can still collect these metrics. This is the default option when deploying cilium via the rke2-cilium Helm chart.
Protocol | Port | Source | Description |
TCP | 9963 | External (to the Kubernetes cluster) metrics collector | cilium-operator metrics exposure |
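An external collector (or a quick manual check) can then scrape the cilium-operator metrics endpoint on that port. This is a minimal sketch; the node address is a placeholder:
# Fetch cilium-operator Prometheus metrics from outside the cluster (placeholder address).
curl -s http://<NODE_IP>:9963/metrics | head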
39.4 Services (DHCP, DNS, etc.) #
Some external services, like DHCP, DNS, etc., could be required depending on the kind of environment where the clusters are deployed:
Connected environment: In this case, the nodes will be connected to the Internet (via L3 routing protocols) and the external services will be provided by the customer.
Disconnected / air-gap environment: In this case, the nodes will not have Internet IP connectivity and additional services will be required to locally mirror content required by the directed network provisioning workflow.
File server: A file server is used to store the OS images to be provisioned on the downstream cluster nodes during the directed network provisioning workflow. The Metal3 Helm chart can deploy a media server to store the OS images (check the following note), but it is also possible to use an existing local webserver.
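As an illustration, availability of an OS image on the file server can be checked from the management network with an HTTP HEAD request. This is a minimal sketch; the server address and image name are placeholders:
# The HTTP status and Content-Length confirm the image used for provisioning is downloadable.
curl -sI http://<FILESERVER_IP>/<image-name>.raw | grep -E "HTTP|Content-Length"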
39.5 Disabling systemd services #
For Telco workloads, it is important to disable or properly configure some of the services running on the nodes, to avoid any impact on the performance (latency) of the workloads running on them.
rebootmgr is a service which allows configuring a reboot strategy for when the system has pending updates. For Telco workloads, it is really important to disable, or properly configure, the rebootmgr service to prevent node reboots for system-scheduled updates, and so avoid any impact on the services running on the nodes.
Verify the strategy being used by running:
cat /etc/rebootmgr.conf
[rebootmgr]
window-start=03:30
window-duration=1h30m
strategy=best-effort
lock-group=default
You can disable it by running:
sed -i 's/strategy=best-effort/strategy=off/g' /etc/rebootmgr.conf
or using the rebootmgrctl command:
rebootmgrctl strategy off
This configuration to set the rebootmgr strategy can be automated using the directed network provisioning workflow. For more information, check the Automated Provisioning documentation (Chapter 42, Fully automated directed network provisioning).
transactional-update is a service that allows automatic updates controlled by the system. For Telco workloads, it is important to disable the automatic updates to avoid any impact on the services running on the nodes.
To disable the automatic updates, you can run:
systemctl --now disable transactional-update.timer
systemctl --now disable transactional-update-cleanup.timer
fstrim is a service that trims the filesystems automatically every week. For Telco workloads, it is important to disable the automatic trim to avoid any impact on the services running on the nodes.
To disable the automatic trim, you can run:
systemctl --now disable fstrim.timer
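To confirm that these timers are no longer scheduled, the node's timers can be listed. This is a minimal sketch:
# Disabled timers show no upcoming activation in the NEXT column.
systemctl list-timers --all | grep -E "transactional-update|fstrim"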