Commit

Merge pull request #110 from sophongo/node
[en] update kpanda workload pages
windsonsea authored Nov 27, 2024
2 parents 6a66191 + 382a044 commit d0295fc
Showing 22 changed files with 171 additions and 191 deletions.
2 changes: 1 addition & 1 deletion docs/en/end-user/baize/jobs/mpi.md
@@ -20,7 +20,7 @@ Here we will use the `baize-notebook` base image and the **associated environmen

### Steps to Create an MPI Job

-1. **Log in to the Platform** : Log in to the AI Lab platform and click on **Job Center** in the left navigation bar to enter the **Training Jobs** page.
+1. **Log in to the Platform** : Log in to the AI Lab platform and click **Job Center** in the left navigation bar to enter the **Training Jobs** page.
2. **Create Job** : Click the **Create** button in the upper right corner to enter the job creation page.
3. **Select Job Type** : In the pop-up window, select the job type as `MPI`, then click **Next**.
4. **Fill in Job Information** : Fill in the job name and description, for example, “benchmarks-mpi”, then click **Next**.
2 changes: 1 addition & 1 deletion docs/en/end-user/baize/jobs/mxnet.md
@@ -25,7 +25,7 @@ We will use the `release-ci.daocloud.io/baize/kubeflow/mxnet-gpu:latest` image a

#### Steps to Create

-1. **Log in to the Platform** : Log in to the AI Lab platform and click on **Job Center** in the left navigation bar to enter the **Training Jobs** page.
+1. **Log in to the Platform** : Log in to the AI Lab platform and click **Job Center** in the left navigation bar to enter the **Training Jobs** page.
2. **Create Job** : Click the **Create** button in the upper right corner to enter the job creation page.
3. **Select Job Type** : In the pop-up window, select the job type as `MXNet`, then click **Next**.
4. **Fill in Job Information** : Enter the job name and description, for example, “MXNet Single-node Training Job”, then click **Confirm**.
2 changes: 1 addition & 1 deletion docs/en/end-user/baize/jobs/paddle.md
@@ -21,7 +21,7 @@ We use the `registry.baidubce.com/paddlepaddle/paddle:2.4.0rc0-cpu` image as the

#### Steps to Create

-1. **Log in to the Platform** : Log in to the AI Lab platform and click on **Job Center** in the left navigation bar to enter the **Training Jobs** page.
+1. **Log in to the Platform** : Log in to the AI Lab platform and click **Job Center** in the left navigation bar to enter the **Training Jobs** page.
2. **Create Job** : Click the **Create** button in the upper right corner to enter the job creation page.
3. **Select Job Type** : In the pop-up window, select the job type as `PaddlePaddle`, then click **Next**.
4. **Fill in Job Information** : Enter the job name and description, for example, “PaddlePaddle Single-node Training Job”, then click **Confirm**.
6 changes: 3 additions & 3 deletions docs/en/end-user/baize/quick-start.md
@@ -77,7 +77,7 @@ Prepare your development environment by clicking on **Notebooks** in the navigat

## Creating a Training Job

-1. Click on **Job Center** -> **Training Jobs** in the navigation bar to create a standalone `TensorFlow` job.
+1. Click **Job Center** -> **Training Jobs** in the navigation bar to create a standalone `TensorFlow` job.
2. Fill in the basic parameters and click **Next**.
3. In the job resource configuration, correctly set up the job resources and click **Next**.

@@ -96,8 +96,8 @@ Prepare your development environment by clicking on **Notebooks** in the navigat
Logs will be saved in the output dataset at `/home/jovyan/model/train/logs/`.


-5. Return to the training job list and wait for the status to change to **Success**. Click on the **┇** icon on the right side of the list to view details, clone jobs, update priority, view logs, and delete jobs, among other options.
+5. Return to the training job list and wait for the status to change to **Success**. Click the **┇** icon on the right side of the list to view details, clone jobs, update priority, view logs, and delete jobs, among other options.

-6. Once the job is successfully created, click on **Job Analysis** in the left navigation bar to check the job status and fine-tune your training.
+6. Once the job is successfully created, click **Job Analysis** in the left navigation bar to check the job status and fine-tune your training.

![View Job](./images/baize-07.png)
2 changes: 2 additions & 0 deletions docs/en/end-user/kpanda/nodes/add-node.md
@@ -1,4 +1,6 @@
---
+hide:
+- toc
MTPE: FanLin
Date: 2024-02-27
---
2 changes: 0 additions & 2 deletions docs/en/end-user/kpanda/nodes/delete-node.md
@@ -12,9 +12,7 @@ When the peak business period is over, in order to save resource costs, you can
## Precautions

1. When cluster nodes scales down, they can only be uninstalled one by one, not in batches.
-
2. If you need to uninstall cluster controller nodes, you need to ensure that the final number of controller nodes is an **odd number**.
-
3. The **first controller** node cannot be offline when the cluster node scales down. If it is necessary to perform this operation, please contact the after-sales engineer.

## Steps
7 changes: 5 additions & 2 deletions docs/en/end-user/kpanda/nodes/node-authentication.md
@@ -1,6 +1,9 @@
-# Node Authentication
+---
+hide:
+- toc
+---

-## Authenticate Nodes Using SSH Keys
+# Node Authentication

If you choose to authenticate the nodes of the cluster-to-be-created using SSH keys, you need to configure the public and private keys according to the following instructions.

4 changes: 2 additions & 2 deletions docs/en/end-user/kpanda/workloads/create-cronjob.md
@@ -13,13 +13,13 @@ CronJobs are suitable for performing periodic operations, such as backup and rep

Before creating a CronJob, the following prerequisites need to be met:

-- In the Container Management module [Integrate Kubernetes Cluster](../clusters/integrate-cluster.md) or [Create Kubernetes Cluster](../clusters/create-cluster.md), and can access the cluster UI interface.
+- You have integrated a Kubernetes Cluster in the Container Management module as described in [Integrate Kubernetes Cluster](../clusters/integrate-cluster.md) or [Create Kubernetes Cluster](../clusters/create-cluster.md), and you can access the cluster's UI interface.

- Create a [namespace](../namespaces/createns.md) and a user.

- The current operating user should have [NS Editor](../permissions/permission-brief.md#ns-editor) or higher permissions, for details, refer to [Namespace Authorization](../namespaces/createns.md).

-- When there are multiple containers in a single instance, please make sure that the ports used by the containers do not conflict, otherwise the deployment will fail.
+- When there are multiple containers in a single instance, please make sure that the ports used by the containers do not conflict, otherwise the CronJob will fail.
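
The port-conflict prerequisite above can be checked before submitting a workload. Containers in the same Pod share one network namespace, so two containers cannot listen on the same port. The following is a minimal sketch, not part of this commit: the function name `find_port_conflicts` and the sample container list are hypothetical, and it assumes each container declares its ports in the `ports[].containerPort` field of the Pod template (undeclared ports are not detected).

```python
from collections import defaultdict


def find_port_conflicts(containers):
    """Return {(port, protocol): [container names]} for ports declared more than once.

    `containers` is assumed to mirror the `containers` list of a Pod template.
    """
    owners = defaultdict(list)
    for container in containers:
        for port in container.get("ports", []):
            # Kubernetes defaults the protocol to TCP when it is omitted.
            key = (port["containerPort"], port.get("protocol", "TCP"))
            owners[key].append(container["name"])
    return {key: names for key, names in owners.items() if len(names) > 1}


# Hypothetical two-container instance; both containers declare port 8080.
containers = [
    {"name": "backup", "ports": [{"containerPort": 8080}]},
    {"name": "report", "ports": [{"containerPort": 8080}, {"containerPort": 9090}]},
]

conflicts = find_port_conflicts(containers)
print(conflicts)
```

An empty result means no declared port is shared; any entry names the containers that would need a different port.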

## Create by image

4 changes: 2 additions & 2 deletions docs/en/end-user/kpanda/workloads/create-daemonset.md
@@ -21,13 +21,13 @@ For simplicity, a DaemonSet can be started on each node for each type of daemon.

Before creating a DaemonSet, the following prerequisites need to be met:

-- In the Container Management module [Integrate Kubernetes Cluster](../clusters/integrate-cluster.md) or [Create Kubernetes Cluster](../clusters/create-cluster.md), and can access the cluster UI interface.
+- You have integrated a Kubernetes Cluster in the Container Management module as described in [Integrate Kubernetes Cluster](../clusters/integrate-cluster.md) or [Create Kubernetes Cluster](../clusters/create-cluster.md), and you can access the cluster's UI interface.

- Create a [namespace](../namespaces/createns.md) and a user.

- The current operating user should have [NS Editor](../permissions/permission-brief.md#ns-editor) or higher permissions, for details, refer to [Namespace Authorization](../namespaces/createns.md).

-- When there are multiple containers in a single instance, please make sure that the ports used by the containers do not conflict, otherwise the deployment will fail.
+- When there are multiple containers in a single instance, please make sure that the ports used by the containers do not conflict, otherwise the DaemonSet will fail.

## Create by image
