
39 Requirements & Assumptions

39.1 Hardware

The hardware requirements for SUSE Telco Cloud are as follows:

  • Management cluster: The management cluster contains components like SUSE Linux Micro, RKE2, SUSE Rancher Prime, and Metal3, and it is used to manage several downstream clusters. Depending on the number of downstream clusters to be managed, the hardware requirements for the server can vary.

    • Minimum requirements for the server (VM or bare-metal) are:

      • RAM: 8 GB minimum (we recommend at least 16 GB)

      • CPU: 2 cores minimum (we recommend at least 4 cores)

  • Downstream clusters: The downstream clusters are the clusters deployed to run Telco workloads. Specific hardware requirements apply to enable certain Telco capabilities like SR-IOV, CPU performance optimization, etc.

    • SR-IOV: To attach VFs (Virtual Functions) in pass-through mode to CNFs/VNFs, the NIC must support SR-IOV and VT-d/AMD-Vi must be enabled in the BIOS. A quick verification sketch is shown after the firmware table below.

    • CPU processors: To run specific Telco workloads, the CPU processor model should be chosen so that it supports most of the features listed in the reference table (Chapter 41, Telco features configuration).

    • Firmware requirements for installing with virtual media:

Server Hardware     | BMC Model       | Management
Dell hardware       | 15th Generation | iDRAC9
Supermicro hardware | 01.00.25        | Supermicro SMC - redfish
HPE hardware        | 1.50            | iLO6
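
The SR-IOV and IOMMU prerequisites mentioned above can be quickly verified on a downstream node before enrollment. The following is a minimal sketch; <nic-name> is a placeholder for the physical interface on your hardware.

# Number of Virtual Functions the NIC can expose (0 or a missing file means no SR-IOV support)
cat /sys/class/net/<nic-name>/device/sriov_totalvfs

# Kernel messages confirming that the IOMMU (VT-d/AMD-Vi) was enabled at boot
dmesg | grep -i -e DMAR -e IOMMU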

39.2 Network

As a reference, the following diagram shows a typical network architecture for a Telco environment:

[Diagram: typical network architecture for a Telco environment]

The network architecture is based on the following components:

  • Management network: This network is used for out-of-band management of the downstream cluster nodes. Usually, this network is connected to a separate management switch, but it can be connected to the same service switch, using VLANs to isolate the traffic.

  • Control-plane network: This network is used for communication between the downstream cluster nodes and the services running on them. It is also used for communication between the nodes and external services, like DHCP or DNS servers. In connected environments, the switch/router can also route traffic to the Internet.

  • Other networks: In some cases, nodes could be connected to other networks for specific purposes.

Note

To use the directed network provisioning workflow, the management cluster must have network connectivity to the downstream cluster server Baseboard Management Controller (BMC) so that host preparation and provisioning can be automated.
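
As a quick connectivity check, the Redfish service root of a downstream server BMC can be queried from a management cluster node. This is a sketch only; <bmc-address> is a placeholder for the BMC IP address or host name.

# The Redfish service root should answer if the out-of-band network path is in place
# (-k skips certificate validation, which is common for BMCs with self-signed certificates)
curl -k https://<bmc-address>/redfish/v1/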

39.3 Port requirements

To operate properly, a SUSE Telco Cloud deployment requires a number of ports to be reachable on the management and the downstream Kubernetes cluster nodes.

Note

The exact list depends on the deployed optional components and the selected deployment options (e.g., CNI plug-in).

39.3.1 Management Nodes

The following table lists the ports opened on nodes running the management cluster:

Note

For CNI plug-in related ports, see CNI specific port requirements (Section 39.3.3, “CNI specific port requirements”).

Protocol | Port | Source | Description
TCP | 22 | Any source that requires SSH access | SSH access to management cluster nodes
TCP | 80 | Load balancer/proxy that does external TLS termination | Rancher UI/API when external TLS termination is used
TCP | 443 | Any source that requires TLS access to Rancher UI/API | Rancher agent, Rancher UI/API
TCP | 2379 | RKE2 (management cluster) server nodes | etcd client port
TCP | 2380 | RKE2 (management cluster) server nodes | etcd peer port
TCP | 6180 | Any BMC(1) previously instructed by Metal3/ironic to pull an IPA(2) ramdisk image from this exposed port (non-TLS) | Ironic httpd non-TLS web server serving IPA(2) ISO images for virtual media based boot. If this port is enabled, the functionally equivalent TLS-enabled one (see below) is not opened.
TCP | 6185 | Any BMC(1) previously instructed by Metal3/ironic to pull an IPA(2) ramdisk image from this exposed port (TLS) | Ironic httpd TLS-enabled web server serving IPA(2) ISO images for virtual media based boot. If this port is enabled, the functionally equivalent non-TLS one (see above) is not opened.
TCP | 6385 | Any Metal3/ironic IPA(2) ramdisk image deployed & running in an "enrolled" BareMetalHost instance | Ironic API
TCP | 6443 | Any management cluster node; any external (to the management cluster) Kubernetes client | Kubernetes API
TCP | 6545 | Any management cluster node | Pull artifacts from OCI-compliant registry (Hauler)
TCP | 9345 | RKE2 server and agent nodes (management cluster) | RKE2 supervisor API for Node registration (opened port in all RKE2 server nodes)
TCP | 10250 | Any management cluster node | kubelet metrics
TCP/UDP/SCTP | 30000-32767 | Any external (to the management cluster) source accessing a service exposed on the primary network through a spec.type: NodePort or spec.type: LoadBalancer Service API object | Available NodePort port range

(1) BMC: Baseboard Management Controller
(2) IPA: Ironic Python Agent
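
To check on a management cluster node that the expected services are listening, a quick probe like the following can be used. This is only a sketch; the exact set of ports depends on the components actually deployed, as noted above.

# List listening TCP sockets for the ports from the table above
ss -tln | grep -E ':(22|443|2379|2380|6385|6443|6545|9345|10250)\b'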

39.3.2 Downstream Nodes

In SUSE Telco Cloud, before any (downstream) server becomes part of a running downstream Kubernetes cluster (or runs a single-node downstream Kubernetes cluster itself), it must go through some of the BareMetalHost provisioning states.

  • The Baseboard Management Controller (BMC) of a newly declared downstream server must be accessible through the out-of-band network. The BMC is instructed (by the Ironic service running on the management cluster) on the initial steps to take:

    1. Pull the indicated IPA ramdisk image and load it into the virtual media offered by the BMC.

    2. Power on the server.

The following ports are expected to be exposed by the BMC (they could differ depending on the exact hardware):

Protocol | Port | Source | Description
TCP | 80 | Ironic conductor (from management cluster) | Redfish API access (HTTP)
TCP | 443 | Ironic conductor (from management cluster) | Redfish API access (HTTPS)

  • Once the IPA ramdisk image loaded into the BMC virtual media is used to boot the downstream server, the hardware inspection phase begins. The following table lists the ports exposed by a running IPA ramdisk image:

Protocol | Port | Source | Description
TCP | 22 | Any source that requires SSH access to the IPA ramdisk image | SSH access to a downstream cluster node while it is being inspected
TCP | 9999 | Ironic conductor (from management cluster) | Ironic commands towards the running ramdisk image

  • Once the bare-metal host is properly provisioned and has joined a downstream Kubernetes cluster, it exposes the following ports:

Note

For CNI plug-in related ports, see CNI specific port requirements (Section 39.3.3, “CNI specific port requirements”).

Protocol | Port | Source | Description
TCP | 22 | Any source that requires SSH access | SSH access to downstream cluster nodes
TCP | 80 | Load balancer/proxy that does external TLS termination | Rancher UI/API when external TLS termination is used
TCP | 443 | Any source that requires TLS access to Rancher UI/API | Rancher agent, Rancher UI/API
TCP | 2379 | RKE2 (downstream cluster) server nodes | etcd client port
TCP | 2380 | RKE2 (downstream cluster) server nodes | etcd peer port
TCP | 6443 | Any downstream cluster node; any external (to the downstream cluster) Kubernetes client | Kubernetes API
TCP | 9345 | RKE2 server and agent nodes (downstream cluster) | RKE2 supervisor API for Node registration (opened port in all RKE2 server nodes)
TCP | 10250 | Any downstream cluster node | kubelet metrics
TCP | 10255 | Any downstream cluster node | kubelet read-only access
TCP/UDP/SCTP | 30000-32767 | Any external (to the downstream cluster) source accessing a service exposed on the primary network through a spec.type: NodePort or spec.type: LoadBalancer Service API object | Available NodePort port range
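
As an example, reachability of the Kubernetes API and the RKE2 supervisor port of a provisioned downstream node can be probed from outside the cluster. This is a sketch; <downstream-node> is a placeholder for the node address.

# Probe the Kubernetes API and RKE2 supervisor ports
nc -zv <downstream-node> 6443
nc -zv <downstream-node> 9345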

39.3.3 CNI specific port requirements

Each supported CNI variant comes with its own set of port requirements. For more details, refer to the CNI Specific Inbound Network Rules in the RKE2 documentation.

When Cilium is set as the default/primary CNI plug-in, the following TCP port is additionally exposed if the cilium-operator workload is configured to expose metrics outside the Kubernetes cluster on which it is deployed. This allows an external Prometheus server instance running outside that Kubernetes cluster to collect these metrics.

Note

This is the default option when deploying cilium via the rke2-cilium Helm chart.

Protocol | Port | Source | Description
TCP | 9963 | External (to the Kubernetes cluster) metrics collector | cilium-operator metrics exposure
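
To confirm that the cilium-operator metrics endpoint is reachable from outside the cluster (for example, from the host running the external Prometheus instance), a simple probe can be used. This is a sketch; <node-ip> is a placeholder for a node running cilium-operator.

# The endpoint should return Prometheus-formatted metrics over plain HTTP
curl -s http://<node-ip>:9963/metrics | head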

39.4 Services (DHCP, DNS, etc.)

Some external services like DHCP, DNS, etc. may be required depending on the kind of environment being deployed:

  • Connected environment: In this case, the nodes are connected to the Internet (via L3 routing protocols) and the external services are provided by the customer.

  • Disconnected / air-gap environment: In this case, the nodes have no Internet IP connectivity and additional services are required to locally mirror the content needed by the directed network provisioning workflow.

  • File server: A file server is used to store the OS images to be provisioned on the downstream cluster nodes during the directed network provisioning workflow. The Metal3 Helm chart can deploy a media server to store the OS images (see the note in the following section), but it is also possible to use an existing local web server, as shown in the sketch below.
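
The following is a minimal sketch of such a local web server; it assumes the OS images are stored under /srv/os-images (a hypothetical path), and any HTTP server reachable from the provisioning network would work equally well.

# Serve the OS images over HTTP on port 8080 from an existing host
cd /srv/os-images
python3 -m http.server 8080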

39.5 Disabling systemd services

For Telco workloads, it is important to disable or properly configure some of the services running on the nodes to avoid impacting the performance (latency) of the workloads.

  • rebootmgr is a service that allows you to configure a reboot strategy for when the system has pending updates. For Telco workloads, it is important to disable the rebootmgr service, or configure it appropriately, so that reboots scheduled by the system do not impact the services running on the nodes.

Note

For more information about rebootmgr, see rebootmgr GitHub repository.

Verify the strategy being used by running:

cat /etc/rebootmgr.conf
[rebootmgr]
window-start=03:30
window-duration=1h30m
strategy=best-effort
lock-group=default

You can disable it by running:

sed -i 's/strategy=best-effort/strategy=off/g' /etc/rebootmgr.conf

or using the rebootmgrctl command:

rebootmgrctl strategy off
Note

This configuration to set the rebootmgr strategy can be automated using the directed network provisioning workflow. For more information, check the Automated Provisioning documentation (Chapter 42, Fully automated directed network provisioning).

  • transactional-update is a service that performs automatic updates controlled by the system. For Telco workloads, it is important to disable these automatic updates to avoid any impact on the services running on the nodes.

To disable the automatic updates, you can run:

systemctl --now disable transactional-update.timer
systemctl --now disable transactional-update-cleanup.timer
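
To confirm that the timers are no longer scheduled, you can check, for example:

# Both timers should report "disabled" once they have been turned off
systemctl is-enabled transactional-update.timer transactional-update-cleanup.timer
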
  • fstrim is a service that trims the file systems automatically every week. For Telco workloads, it is important to disable the automatic trim to avoid any impact on the services running on the nodes.

To disable the automatic trim, you can run:

systemctl --now disable fstrim.timer
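
To confirm the change, you can check the timer state, for example:

# Should report "disabled" once the timer has been turned off
systemctl is-enabled fstrim.timer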