From c37d7d37295484364eeadd63e0bf81df9441599b Mon Sep 17 00:00:00 2001 From: Anurag Guda Date: Tue, 6 Aug 2024 13:58:21 -0500 Subject: [PATCH 01/12] Update install.md --- docs/install.md | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/docs/install.md b/docs/install.md index 1bb9af66..f6a06a9f 100644 --- a/docs/install.md +++ b/docs/install.md @@ -25,8 +25,15 @@ cd k8s-nim-operator ```sh kubectl create ns nim-operator ``` +### 3. Export NGC CLI API KEY -### 3. Create an Image Pull Secret +Please refer to get [NGC API Key](https://docs.nvidia.com/ngc/gpu-cloud/ngc-private-registry-user-guide/index.html#ngc-api-keys) + +```sh +export NGC_CLI_API_KEY= +``` + +### 4. Create an Image Pull Secret Replace with your NGC CLI API key. @@ -34,17 +41,17 @@ Replace with your NGC CLI API key. kubectl create secret -n nim-operator docker-registry ngc-secret \ --docker-server=nvcr.io \ --docker-username='$oauthtoken' \ - --docker-password= + --docker-password=$NGC_CLI_API_KEY ``` -### 4. Install the NIM Operator +### 5. Install the NIM Operator Install the NIM Operator using the Helm chart located in the helm/k8s-nim-operator directory. ```sh helm install nim-operator helm/k8s-nim-operator -n nim-operator ``` -### 5. Verify Installation +### 6. Verify Installation Verify that the NIM Operator has been installed successfully by listing the Helm releases and checking the pods in the nim-operator namespace. ```sh From e8d8158c1533878e2d4a2323809d62c39c08a3a3 Mon Sep 17 00:00:00 2001 From: Anurag Guda Date: Tue, 6 Aug 2024 14:00:26 -0500 Subject: [PATCH 02/12] Update nimcache.md --- docs/nimcache.md | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-) diff --git a/docs/nimcache.md b/docs/nimcache.md index aa4a9f61..0eb36dcb 100644 --- a/docs/nimcache.md +++ b/docs/nimcache.md @@ -24,8 +24,16 @@ Follow these steps to cache NIM models in a persistent volume. 
```sh kubectl create ns nim-service ``` +### 2.Export NGC CLI API KEY +`NOTE:` Ignore this step if you already export the NGC_CLI_API_KEY -### 2. Create an Image Pull Secret for the NIM Container +Please refer to get [NGC API Key](https://docs.nvidia.com/ngc/gpu-cloud/ngc-private-registry-user-guide/index.html#ngc-api-keys) + +```sh +export NGC_CLI_API_KEY= +``` + +### 3. Create an Image Pull Secret for the NIM Container Replace with your NGC CLI API key. @@ -33,10 +41,10 @@ Replace with your NGC CLI API key. kubectl create secret -n nim-service docker-registry ngc-secret \ --docker-server=nvcr.io \ --docker-username='$oauthtoken' \ - --docker-password= + --docker-password=$NGC_CLI_API_KEY ``` -## 3. Create the NIM Cache Instance and Enable Model Auto-Detection +## 4. Create the NIM Cache Instance and Enable Model Auto-Detection Update the `NIMCache` custom resource (CR) with appropriate values for model selection. These include `model.precision`, `model.engine`, `model.qosProfile`, `model.gpu.product` and `model.gpu.ids`. @@ -77,13 +85,13 @@ spec: volumeAccessMode: ReadWriteOnce ``` -### 4. Create the CR +### 5. Create the CR ```sh kubectl create -f nimcache.yaml -n nim-service ``` -### 5. Verify the Progress of NIM Model Caching +### 6. Verify the Progress of NIM Model Caching Verify that the NIM Operator has initiated the caching job and track status via the CR. From 8aac41455b1c25038bc827d5a47d2a613417f0a8 Mon Sep 17 00:00:00 2001 From: Anurag Guda Date: Tue, 6 Aug 2024 14:07:20 -0500 Subject: [PATCH 03/12] Update nimservice.md --- docs/nimservice.md | 53 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 53 insertions(+) diff --git a/docs/nimservice.md b/docs/nimservice.md index 8cd46b17..cf0e1e85 100644 --- a/docs/nimservice.md +++ b/docs/nimservice.md @@ -72,6 +72,8 @@ meta-llama3-8b-instruct-job-xktnk 0/1 Completed 0 4m38s ### 3. 
Verify the Microservice is Running +#### Example 1: + Create a file, `verify-pod.yaml`, with contents like the following example: ```yaml @@ -129,6 +131,47 @@ Apply the manifest: kubectl create -f test-pod.yaml -n nim-service ``` +#### Example 2: + +Create a file, `verify-pod-2.yaml`, with contents like the following example: + +```yaml +--- +apiVersion: v1 +kind: Pod +metadata: + name: verify-streaming-chat-2 +spec: + containers: + - name: curl + image: curlimages/curl:8.6.0 + command: ['curl'] + args: + - -s + - -X + - "POST" + - 'http://meta-llama3-8b-instruct:8000/v1/completions' + - -H + - 'accept: application/json' + - -H + - 'Content-Type: application/json' + - --fail-with-body + - -d + - | + { + "model": "meta/llama3-8b-instruct", + "prompt": "Once upon a time", + "max_tokens": 64 + } + restartPolicy: Never +``` + +Apply the manifest: + +```sh +kubectl create -f verify-pod-2.yaml -n nim-service +``` + Confirm the verification pod ran to completion: ```sh @@ -140,4 +183,14 @@ NAME READY STATUS RESTARTS meta-llama3-8b-instruct-latest-db9d899fd-mfmq2 1/1 Running 0 112m meta-llama3-8b-instruct-latest-job-xktnk 0/1 Completed 0 8m8s verify-streaming-chat 0/1 Completed 0 99m +verify-streaming-chat-2 0/1 Completed 0 97m +``` +Verify the logs + +```sh +kubectl logs verify-streaming-chat-2 +``` + +```sh +kubectl logs verify-streaming-chat ``` From 81c1a4a3fdaeb8deaee59e0c142c1fb618e70a73 Mon Sep 17 00:00:00 2001 From: Anurag Guda Date: Tue, 6 Aug 2024 14:08:49 -0500 Subject: [PATCH 04/12] Update install.md --- docs/install.md | 2 -- 1 file changed, 2 deletions(-) diff --git a/docs/install.md b/docs/install.md index f6a06a9f..ff3da736 100644 --- a/docs/install.md +++ b/docs/install.md @@ -35,8 +35,6 @@ export NGC_CLI_API_KEY= ### 4. Create an Image Pull Secret -Replace with your NGC CLI API key. 
- ```sh kubectl create secret -n nim-operator docker-registry ngc-secret \ --docker-server=nvcr.io \ From 9fe2d7d2bbc3e7e4cc26ece031cdba9a70ac08f1 Mon Sep 17 00:00:00 2001 From: Anurag Guda Date: Tue, 6 Aug 2024 14:09:05 -0500 Subject: [PATCH 05/12] Update nimcache.md --- docs/nimcache.md | 2 -- 1 file changed, 2 deletions(-) diff --git a/docs/nimcache.md b/docs/nimcache.md index 0eb36dcb..5ba1dff6 100644 --- a/docs/nimcache.md +++ b/docs/nimcache.md @@ -35,8 +35,6 @@ export NGC_CLI_API_KEY= ### 3. Create an Image Pull Secret for the NIM Container -Replace with your NGC CLI API key. - ```sh kubectl create secret -n nim-service docker-registry ngc-secret \ --docker-server=nvcr.io \ From 44025f9f248e3b5ee0b8aa685f91b97d49dea47b Mon Sep 17 00:00:00 2001 From: Anurag Guda Date: Tue, 6 Aug 2024 14:50:02 -0500 Subject: [PATCH 06/12] Update Examples --- docs/nimservice.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/docs/nimservice.md b/docs/nimservice.md index cf0e1e85..f5fe9951 100644 --- a/docs/nimservice.md +++ b/docs/nimservice.md @@ -72,7 +72,7 @@ meta-llama3-8b-instruct-job-xktnk 0/1 Completed 0 4m38s ### 3. 
Verify the Microservice is Running -#### Example 1: +#### Example 1: Verify Streaming Chat Create a file, `verify-pod.yaml`, with contents like the following example: @@ -131,16 +131,16 @@ Apply the manifest: kubectl create -f test-pod.yaml -n nim-service ``` -#### Example 2: +#### Example 2: Verify Chat Completion -Create a file, `verify-pod-2.yaml`, with contents like the following example: +Create a file, `verify-chat-completions`, with contents like the following example: ```yaml --- apiVersion: v1 kind: Pod metadata: - name: verify-streaming-chat-2 + name: verify-chat-completions spec: containers: - name: curl @@ -169,7 +169,7 @@ spec: Apply the manifest: ```sh -kubectl create -f verify-pod-2.yaml -n nim-service +kubectl create -f verify-chat-completions.yaml -n nim-service ``` Confirm the verification pod ran to completion: @@ -183,14 +183,14 @@ NAME READY STATUS RESTARTS meta-llama3-8b-instruct-latest-db9d899fd-mfmq2 1/1 Running 0 112m meta-llama3-8b-instruct-latest-job-xktnk 0/1 Completed 0 8m8s verify-streaming-chat 0/1 Completed 0 99m -verify-streaming-chat-2 0/1 Completed 0 97m +verify-chat-completions 0/1 Completed 0 97m ``` Verify the logs ```sh -kubectl logs verify-streaming-chat-2 +kubectl logs verify-streaming-chat ``` ```sh -kubectl logs verify-streaming-chat +kubectl logs verify-chat-completions ``` From acff6fc12eda1bc50c41370bce88d03558398b96 Mon Sep 17 00:00:00 2001 From: Anurag Guda Date: Tue, 6 Aug 2024 14:52:23 -0500 Subject: [PATCH 07/12] Update NGC CLI API Key steps --- docs/nimcache.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/nimcache.md b/docs/nimcache.md index 5ba1dff6..cbb79cb4 100644 --- a/docs/nimcache.md +++ b/docs/nimcache.md @@ -27,7 +27,7 @@ kubectl create ns nim-service ### 2.Export NGC CLI API KEY `NOTE:` Ignore this step if you already export the NGC_CLI_API_KEY -Please refer to get [NGC API Key](https://docs.nvidia.com/ngc/gpu-cloud/ngc-private-registry-user-guide/index.html#ngc-api-keys) +Refer 
to get a [NGC CLI API Key](https://docs.nvidia.com/ngc/gpu-cloud/ngc-private-registry-user-guide/index.html#ngc-api-keys) ```sh export NGC_CLI_API_KEY= From 81d1ba58170f12d7b22e1ffc98028b2bb9182ad1 Mon Sep 17 00:00:00 2001 From: Anurag Guda Date: Tue, 6 Aug 2024 15:11:08 -0500 Subject: [PATCH 08/12] Example Name Updates --- docs/nimservice.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/docs/nimservice.md b/docs/nimservice.md index f5fe9951..41e9c76c 100644 --- a/docs/nimservice.md +++ b/docs/nimservice.md @@ -72,7 +72,7 @@ meta-llama3-8b-instruct-job-xktnk 0/1 Completed 0 4m38s ### 3. Verify the Microservice is Running -#### Example 1: Verify Streaming Chat +#### Example 1: OpenAI Chat Completion Request Create a file, `verify-pod.yaml`, with contents like the following example: @@ -131,7 +131,7 @@ Apply the manifest: kubectl create -f test-pod.yaml -n nim-service ``` -#### Example 2: Verify Chat Completion +#### Example 2: OpenAI Completion Request Create a file, `verify-chat-completions`, with contents like the following example: @@ -194,3 +194,4 @@ kubectl logs verify-streaming-chat ```sh kubectl logs verify-chat-completions ``` +For more information, refer to the [Examples](https://docs.nvidia.com/nim/large-language-models/latest/getting-started.html#openai-completion-request). From 310a27c51fb67d2df7b28a605d175bad0ab57508 Mon Sep 17 00:00:00 2001 From: Anurag Guda Date: Wed, 7 Aug 2024 10:06:46 -0500 Subject: [PATCH 09/12] Update gpus and NGC API Key --- docs/nimcache.md | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/docs/nimcache.md b/docs/nimcache.md index cbb79cb4..d63d6a3a 100644 --- a/docs/nimcache.md +++ b/docs/nimcache.md @@ -30,7 +30,7 @@ kubectl create ns nim-service Refer to get a [NGC CLI API Key](https://docs.nvidia.com/ngc/gpu-cloud/ngc-private-registry-user-guide/index.html#ngc-api-keys) ```sh -export NGC_CLI_API_KEY= +export NGC_API_KEY= ``` ### 3. 
Create an Image Pull Secret for the NIM Container @@ -39,10 +39,16 @@ export NGC_CLI_API_KEY= kubectl create secret -n nim-service docker-registry ngc-secret \ --docker-server=nvcr.io \ --docker-username='$oauthtoken' \ - --docker-password=$NGC_CLI_API_KEY + --docker-password=$NGC_API_KEY ``` -## 4. Create the NIM Cache Instance and Enable Model Auto-Detection +### 4. Create an NGC API Secret to Pull the Models + +```sh +kubectl create secret -n nim-service generic ngc-api-secret --from-literal=NGC_API_KEY=$NGC_API_KEY +``` + +## 5. Create the NIM Cache Instance and Enable Model Auto-Detection Update the `NIMCache` custom resource (CR) with appropriate values for model selection. These include `model.precision`, `model.engine`, `model.qosProfile`, `model.gpu.product` and `model.gpu.ids`. @@ -70,7 +76,7 @@ spec: precision: "fp8" engine: "tensorrt_llm" qosProfile: "throughput" - gpu: + gpus: product: "l40s" ids: - "26b5" From 0b1512bba5bee05c2b7b81585159bf4968e441e5 Mon Sep 17 00:00:00 2001 From: Anurag Guda Date: Wed, 7 Aug 2024 10:07:07 -0500 Subject: [PATCH 10/12] Update NGC API Key --- docs/nimcache.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/nimcache.md b/docs/nimcache.md index d63d6a3a..9b027a83 100644 --- a/docs/nimcache.md +++ b/docs/nimcache.md @@ -25,7 +25,7 @@ Follow these steps to cache NIM models in a persistent volume. 
kubectl create ns nim-service ``` ### 2.Export NGC CLI API KEY -`NOTE:` Ignore this step if you already export the NGC_CLI_API_KEY +`NOTE:` Skip this step if you have already exported `NGC_API_KEY`. Refer to get a [NGC CLI API Key](https://docs.nvidia.com/ngc/gpu-cloud/ngc-private-registry-user-guide/index.html#ngc-api-keys) From d40adbb95207eda67c1f328ecff10ed141c891ff Mon Sep 17 00:00:00 2001 From: Anurag Guda Date: Wed, 7 Aug 2024 10:07:53 -0500 Subject: [PATCH 11/12] Update NGC CLI API Key --- docs/install.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/install.md b/docs/install.md index ff3da736..6f4a1c64 100644 --- a/docs/install.md +++ b/docs/install.md @@ -27,10 +27,10 @@ kubectl create ns nim-operator ``` ### 3. Export NGC CLI API KEY -Please refer to get [NGC API Key](https://docs.nvidia.com/ngc/gpu-cloud/ngc-private-registry-user-guide/index.html#ngc-api-keys) +To get a key, refer to [NGC CLI API Key](https://docs.nvidia.com/ngc/gpu-cloud/ngc-private-registry-user-guide/index.html#ngc-api-keys). ```sh -export NGC_CLI_API_KEY= +export NGC_API_KEY= ``` ### 4. Create an Image Pull Secret @@ -39,7 +39,7 @@ export NGC_CLI_API_KEY= kubectl create secret -n nim-operator docker-registry ngc-secret \ --docker-server=nvcr.io \ --docker-username='$oauthtoken' \ - --docker-password=$NGC_CLI_API_KEY + --docker-password=$NGC_API_KEY ``` ### 5. Install the NIM Operator From 0fdd5044fc6db343d9b92599eb63814725fe3ab3 Mon Sep 17 00:00:00 2001 From: Anurag Guda Date: Wed, 7 Aug 2024 18:00:04 -0500 Subject: [PATCH 12/12] NIMService yaml update --- docs/nimservice.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/docs/nimservice.md b/docs/nimservice.md index 41e9c76c..8cf2aad5 100644 --- a/docs/nimservice.md +++ b/docs/nimservice.md @@ -32,8 +32,7 @@ spec: nimCache: name: meta-llama3-8b-instruct profile: '' - scale: - minReplicas: 1 + replicas: 1 resources: limits: nvidia.com/gpu: 1
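
Reviewer note: the series switches both secret-creation commands to read the key from `$NGC_API_KEY`. If that variable is unset or empty, the commands would pass an empty value instead of a key. A minimal sketch of a guard follows, assuming a POSIX shell; `require_ngc_api_key` is a hypothetical helper name and is not part of these patches. The `kubectl` invocation shown in the comment is the one from the updated docs, with quoting added around the variable.

```sh
#!/bin/sh
# Hypothetical helper (not part of the patch series): fail early when
# NGC_API_KEY is unset or empty, before any secret is created.
require_ngc_api_key() {
  if [ -z "${NGC_API_KEY:-}" ]; then
    echo "NGC_API_KEY is not set; export it before creating secrets" >&2
    return 1
  fi
}

# Intended usage with the command from the updated docs:
#   require_ngc_api_key && kubectl create secret -n nim-service \
#     docker-registry ngc-secret \
#     --docker-server=nvcr.io \
#     --docker-username='$oauthtoken' \
#     --docker-password="$NGC_API_KEY"
```

The same guard could precede the `generic ngc-api-secret` command added in patch 09, since it reads the same variable.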