Docs: add mtu size configuration for SpiderMultusConfig #4646

Open: 1 commit to be merged into base branch `main`.
37 changes: 37 additions & 0 deletions docs/usage/install/ai/get-started-macvlan-zh_CN.md
@@ -12,6 +12,18 @@

- Macvlan interfaces cannot be created on an InfiniBand IPoIB network card, so this solution applies only to RoCE networks and cannot be used on InfiniBand networks.

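## Comparison with the SR-IOV CNI RDMA Solution

| Comparison Dimension | Macvlan Shared RDMA Solution | SR-IOV CNI Isolated RDMA Solution |
| -------------------- | ---------------------------------- | ---------------------------------- |
| Network Isolation | All containers share the RDMA device; weaker isolation | Each container has a dedicated RDMA device; stronger isolation |
| Performance | High | Highest, through hardware passthrough |
| Resource Utilization | High | Lower; limited by the number of VFs the hardware supports |
| Configuration Complexity | Relatively simple | More complex; requires hardware support and configuration |
| Compatibility | Good; suitable for most environments | Lower; depends on hardware support |
| Applicable Scenarios | Most scenarios, including bare metal and virtual machines | Bare metal only; not suitable for virtual machines |
| Cost | Lower; no additional hardware support required | Higher; requires SR-IOV-capable hardware |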

## Solution

This article will introduce how to set up Spiderpool using the following typical AI cluster topology as an example.
@@ -246,6 +258,31 @@
EOF
```

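Some communication scenarios require a custom MTU for Pods to suit different packet sizes. You can set the Pod MTU by chaining the `tuning` plugin through `chainCNIJsonData`, as in the following configuration: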

```yaml
apiVersion: spiderpool.spidernet.io/v2beta1
kind: SpiderMultusConfig
metadata:
  name: gpu1-macvlan
  namespace: spiderpool
spec:
  cniType: macvlan
  rdmaResourceName: spidernet.io/shared_cx5_gpu1
  macvlan:
    master: ["enp11s0f0np0"]
    ippools:
      ipv4: ["gpu1-net11"]
  chainCNIJsonData:
    - |
      {
        "type": "tuning",
        "mtu": 1480
      }
```

Note: The MTU value must not exceed the MTU of the macvlan master interface; otherwise, the Pod cannot be created.

## Create a Test Application

1. Create a DaemonSet application on the specified nodes.
37 changes: 37 additions & 0 deletions docs/usage/install/ai/get-started-macvlan.md
@@ -12,6 +12,18 @@ By using [RDMA shared device plugin](https://github.com/Mellanox/k8s-rdma-shared

- Macvlan interfaces cannot be created on an InfiniBand IPoIB network card, so this solution applies only to RoCE networks and cannot be used on InfiniBand networks.

## Comparison with the SR-IOV CNI RDMA Solution

| Comparison Dimension | Macvlan Shared RDMA Solution | SR-IOV CNI Isolated RDMA Solution |
| -------------------- | ---------------------------------- | ---------------------------------- |
| Network Isolation | All containers share the RDMA device; weaker isolation | Each container has a dedicated RDMA device; stronger isolation |
| Performance | High | Highest, through hardware passthrough |
| Resource Utilization | High | Lower; limited by the number of VFs the hardware supports |
| Configuration Complexity | Relatively simple | More complex; requires hardware support and configuration |
| Compatibility | Good; suitable for most environments | Lower; depends on hardware support |
| Applicable Scenarios | Most scenarios, including bare metal and virtual machines | Bare metal only; not suitable for virtual machines |
| Cost | Lower; no additional hardware support required | Higher; requires SR-IOV-capable hardware |

## Solution

This article will introduce how to set up Spiderpool using the following typical AI cluster topology as an example.
@@ -246,6 +258,31 @@ The network planning for the cluster is as follows:
EOF
```

Some communication scenarios require a custom MTU for Pods to suit different packet sizes. You can set the Pod MTU by chaining the `tuning` plugin through `chainCNIJsonData`, as in the following configuration:

```yaml
apiVersion: spiderpool.spidernet.io/v2beta1
kind: SpiderMultusConfig
metadata:
  name: gpu1-macvlan
  namespace: spiderpool
spec:
  cniType: macvlan
  rdmaResourceName: spidernet.io/shared_cx5_gpu1
  macvlan:
    master: ["enp11s0f0np0"]
    ippools:
      ipv4: ["gpu1-net11"]
  chainCNIJsonData:
    - |
      {
        "type": "tuning",
        "mtu": 1480
      }
```

Note: The MTU value must not exceed the MTU of the macvlan master interface; otherwise, the Pod cannot be created.
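
Before applying the configuration, it can help to compare the intended Pod MTU with the MTU of the macvlan master interface on the node. The following is a minimal sketch, assuming a Linux node where the interface MTU is readable from `/sys/class/net/<interface>/mtu`. The function names `mtu_fits` and `check_master_mtu` are hypothetical helpers for illustration and are not part of Spiderpool.

```shell
# mtu_fits POD_MTU MASTER_MTU
# Succeeds (exit 0) when the Pod MTU does not exceed the master MTU.
mtu_fits() {
  local pod_mtu="$1" master_mtu="$2"
  [ "$pod_mtu" -le "$master_mtu" ]
}

# check_master_mtu INTERFACE POD_MTU
# Reads the interface MTU from sysfs (assumption: Linux node) and reports
# whether the requested Pod MTU fits. Returns 2 if the MTU cannot be read.
check_master_mtu() {
  local iface="$1" pod_mtu="$2" master_mtu
  master_mtu="$(cat "/sys/class/net/${iface}/mtu")" || return 2
  if mtu_fits "$pod_mtu" "$master_mtu"; then
    echo "OK: Pod MTU ${pod_mtu} <= ${iface} MTU ${master_mtu}"
  else
    echo "Too large: Pod MTU ${pod_mtu} > ${iface} MTU ${master_mtu}"
    return 1
  fi
}
```

On a node, this could be run as `check_master_mtu enp11s0f0np0 1480`, using the master interface and MTU from the example configuration.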

## Create a Test Application

1. Create a DaemonSet application on specified nodes.
59 changes: 51 additions & 8 deletions docs/usage/install/ai/get-started-sriov-zh_CN.md
@@ -20,6 +20,22 @@ Spiderpool uses [sriov-network-operator](https://github.com/k8snetworkplumb

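2. In RoCE network scenarios, the [SR-IOV CNI](https://github.com/k8snetworkplumbingwg/sriov-cni) is used to expose the RDMA network interface on the host to the Pod, thereby exposing RDMA resources. Additionally, the [RDMA CNI](https://github.com/k8snetworkplumbingwg/rdma-cni) can be used to achieve RDMA device isolation.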

Note:

- Providing RDMA communication to containers through SR-IOV is only applicable to bare metal environments, not to virtual machine environments.

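## Comparison with the Macvlan CNI RDMA Solution

| Comparison Dimension | Macvlan Shared RDMA Solution | SR-IOV CNI Isolated RDMA Solution |
| -------------------- | ---------------------------------- | ---------------------------------- |
| Network Isolation | All containers share the RDMA device; weaker isolation | Each container has a dedicated RDMA device; stronger isolation |
| Performance | High | Highest, through hardware passthrough |
| Resource Utilization | High | Lower; limited by the number of VFs the hardware supports |
| Configuration Complexity | Relatively simple | More complex; requires hardware support and configuration |
| Compatibility | Good; suitable for most environments | Lower; depends on hardware support |
| Applicable Scenarios | Most scenarios, including bare metal and virtual machines | Bare metal only; not suitable for virtual machines |
| Cost | Lower; no additional hardware support required | Higher; requires SR-IOV-capable hardware |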

## Solution

This article will introduce how to set up Spiderpool using the following typical AI cluster topology as an example.
@@ -218,10 +234,10 @@ Spiderpool uses [sriov-network-operator](https://github.com/k8snetworkplumb
  priority: 99
  numVfs: 12
  nicSelector:
    deviceID: "1017"
    vendor: "15b3"
    rootDevices:
      - 0000:86:00.0
  linkType: ${LINK_TYPE}
  deviceType: netdevice
  isRdma: true
@@ -238,10 +254,10 @@ Spiderpool uses [sriov-network-operator](https://github.com/k8snetworkplumb
  priority: 99
  numVfs: 12
  nicSelector:
    deviceID: "1017"
    vendor: "15b3"
    rootDevices:
      - 0000:86:00.0
  linkType: ${LINK_TYPE}
  deviceType: netdevice
  isRdma: true
@@ -366,6 +382,33 @@ Spiderpool uses [sriov-network-operator](https://github.com/k8snetworkplumb
EOF
```

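4. (Optional) Customize the MTU of the SR-IOV VF

Some communication scenarios require a custom MTU for Pods to suit different packet sizes. You can set the Pod MTU by chaining the `tuning` plugin through `chainCNIJsonData`, as in the following configuration (using Ethernet as an example):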

```yaml
apiVersion: spiderpool.spidernet.io/v2beta1
kind: SpiderMultusConfig
metadata:
  name: gpu1-sriov
  namespace: spiderpool
spec:
  cniType: sriov
  sriov:
    resourceName: spidernet.io/gpu1sriov
    enableRdma: true
    ippools:
      ipv4: ["gpu1-net11"]
  chainCNIJsonData:
    - |
      {
        "type": "tuning",
        "mtu": 1480
      }
```

Note: The MTU value must not exceed the MTU of the SR-IOV PF.

## Create a Test Application

1. Create a DaemonSet application on the specified nodes to test the availability of the SR-IOV devices on those nodes.
95 changes: 69 additions & 26 deletions docs/usage/install/ai/get-started-sriov.md
@@ -19,6 +19,22 @@ Different CNIs are used for different network scenarios:

2. In RoCE network scenarios, the [SR-IOV CNI](https://github.com/k8snetworkplumbingwg/sriov-cni) is used to expose the RDMA network interface on the host to the Pod, thereby exposing RDMA resources. Additionally, the [RDMA CNI](https://github.com/k8snetworkplumbingwg/rdma-cni) can be used to achieve RDMA device isolation.

Note:

- Providing RDMA communication to containers through SR-IOV is only applicable to bare metal environments, not to virtual machine environments.

## Comparison with the Macvlan CNI RDMA Solution

| Comparison Dimension | Macvlan Shared RDMA Solution | SR-IOV CNI Isolated RDMA Solution |
| -------------------- | ---------------------------------- | ---------------------------------- |
| Network Isolation | All containers share the RDMA device; weaker isolation | Each container has a dedicated RDMA device; stronger isolation |
| Performance | High | Highest, through hardware passthrough |
| Resource Utilization | High | Lower; limited by the number of VFs the hardware supports |
| Configuration Complexity | Relatively simple | More complex; requires hardware support and configuration |
| Compatibility | Good; suitable for most environments | Lower; depends on hardware support |
| Applicable Scenarios | Most scenarios, including bare metal and virtual machines | Bare metal only; not suitable for virtual machines |
| Cost | Lower; no additional hardware support required | Higher; requires SR-IOV-capable hardware |

## Solution

This article will introduce how to set up Spiderpool using the following typical AI cluster topology as an example.
@@ -213,39 +229,39 @@ The network planning for the cluster is as follows:
  name: gpu1-nic-policy
  namespace: spiderpool
spec:
  nodeSelector:
    kubernetes.io/os: "linux"
  resourceName: gpu1sriov
  priority: 99
  numVfs: 12
  nicSelector:
    deviceID: "1017"
    vendor: "15b3"
    rootDevices:
      - 0000:86:00.0
  linkType: ${LINK_TYPE}
  deviceType: netdevice
  isRdma: true
---
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: gpu2-nic-policy
  namespace: spiderpool
spec:
  nodeSelector:
    kubernetes.io/os: "linux"
  resourceName: gpu2sriov
  priority: 99
  numVfs: 12
  nicSelector:
    deviceID: "1017"
    vendor: "15b3"
    rootDevices:
      - 0000:86:00.0
  linkType: ${LINK_TYPE}
  deviceType: netdevice
  isRdma: true
EOF
```

@@ -367,6 +383,33 @@ The network planning for the cluster is as follows:
EOF
```

4. (Optional) Customize the MTU of the SR-IOV VF

Some communication scenarios require a custom MTU for Pods to suit different packet sizes. You can set the Pod MTU by chaining the `tuning` plugin through `chainCNIJsonData`, as in the following configuration (using Ethernet as an example):

```yaml
apiVersion: spiderpool.spidernet.io/v2beta1
kind: SpiderMultusConfig
metadata:
  name: gpu1-sriov
  namespace: spiderpool
spec:
  cniType: sriov
  sriov:
    resourceName: spidernet.io/gpu1sriov
    enableRdma: true
    ippools:
      ipv4: ["gpu1-net11"]
  chainCNIJsonData:
    - |
      {
        "type": "tuning",
        "mtu": 1480
      }
```

Note: The MTU value must not exceed the MTU of the SR-IOV PF.
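
When the same configuration is generated for several MTU values, the `tuning` entry can be produced by a small helper rather than edited by hand. The sketch below is an illustration only: `tuning_mtu_json` is a hypothetical function name, and it assumes the entry needs only the `type` and `mtu` fields shown in the example above. It rejects empty, non-numeric, and zero-prefixed input so that the output is always a valid JSON number.

```shell
# tuning_mtu_json MTU
# Prints a tuning-plugin entry suitable for pasting under chainCNIJsonData.
# Fails (exit 1) unless MTU is a positive decimal integer without leading zeros.
tuning_mtu_json() {
  local mtu="$1"
  case "$mtu" in
    ''|*[!0-9]*|0*)
      echo "invalid MTU: ${mtu}" >&2
      return 1
      ;;
  esac
  printf '{\n  "type": "tuning",\n  "mtu": %s\n}\n' "$mtu"
}
```

Calling `tuning_mtu_json 1480` prints the same JSON object used in the configuration above. The helper does not check the value against the PF MTU; that comparison still has to be made on the node.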

## Create a Test Application

1. Create a DaemonSet application on a specified node to test the availability of SR-IOV devices on that node.