@@ -26,6 +26,144 @@ The `standalone.py` script is designed to run within a Kubernetes environment. T
> [!TIP]
> Check the `show` subcommand to display an example of a Kubernetes Job that runs the script. Run `./standalone.py show`.

+ ### RBAC Requirements when running in a Kubernetes Job
+
+ The script manipulates a number of Kubernetes resources, so the
+ [ServiceAccount](https://kubernetes.io/docs/concepts/security/service-accounts/) running the script
+ requires the following RBAC permissions:
+
+ ```yaml
+ # logs
+ - verbs:
+     - get
+     - list
+   apiGroups:
+     - ""
+   resources:
+     - pods/log
+ # Jobs
+ - verbs:
+     - create
+     - get
+     - list
+     - watch
+   apiGroups:
+     - batch
+   resources:
+     - jobs
+ # Pods
+ - verbs:
+     - create
+     - get
+     - list
+     - watch
+   apiGroups:
+     - ""
+   resources:
+     - pods
+ # Secrets
+ - verbs:
+     - create
+     - get
+   apiGroups:
+     - ""
+   resources:
+     - secrets
+ # ConfigMaps
+ - verbs:
+     - create
+     - get
+   apiGroups:
+     - ""
+   resources:
+     - configmaps
+ # PVCs
+ - verbs:
+     - create
+   apiGroups:
+     - ""
+   resources:
+     - persistentvolumeclaims
+ # PyTorchJob
+ - verbs:
+     - create
+     - get
+     - list
+     - watch
+   apiGroups:
+     - kubeflow.org
+   resources:
+     - pytorchjobs
+ # Watchers
+ - verbs:
+     - get
+     - list
+     - watch
+   apiGroups:
+     - ""
+   resources:
+     - events
+ ```
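The rules above are only the `rules` section of an RBAC object; they still need to be wrapped in a Role and bound to the ServiceAccount. The following is a minimal sketch of that wrapping, not part of `standalone.py`. It assumes a namespaced Role is sufficient (the script may need more if it works across namespaces), and the object name `standalone-script` and the helper `build_rbac` are hypothetical.

```python
import json

# (apiGroup, resource, verbs) triples copied from the rules listed above.
RULES = [
    ("", "pods/log", ["get", "list"]),
    ("batch", "jobs", ["create", "get", "list", "watch"]),
    ("", "pods", ["create", "get", "list", "watch"]),
    ("", "secrets", ["create", "get"]),
    ("", "configmaps", ["create", "get"]),
    ("", "persistentvolumeclaims", ["create"]),
    ("kubeflow.org", "pytorchjobs", ["create", "get", "list", "watch"]),
    ("", "events", ["get", "list", "watch"]),
]


def build_rbac(namespace, service_account, name="standalone-script"):
    """Return a Role and a RoleBinding (as dicts) granting RULES to a ServiceAccount.

    `name` is a hypothetical object name chosen for this example.
    """
    role = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": name, "namespace": namespace},
        "rules": [
            {"apiGroups": [group], "resources": [resource], "verbs": verbs}
            for group, resource, verbs in RULES
        ],
    }
    binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": name, "namespace": namespace},
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "Role",
            "name": name,
        },
        "subjects": [
            {"kind": "ServiceAccount", "name": service_account, "namespace": namespace}
        ],
    }
    return role, binding


if __name__ == "__main__":
    role, binding = build_rbac(namespace="leseb", service_account="default")
    # A v1 List carries both objects in a single JSON document.
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": [role, binding]}, indent=2))
```

Saved under a name of your choice (for example `rbac.py`), the output could be piped to `kubectl apply -f -`, since `kubectl` accepts JSON as well as YAML.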
+
+ ### Run in a Kubernetes Job
+
+ To run the script in a Kubernetes Job, create a Job resource that executes it. The
+ `show` subcommand displays an example of such a Job:
+
+ ```bash
+ ./standalone/standalone.py show \
+   --image quay.io/opendatahub/workbench-images:jupyter-datascience-ubi9-python-3.11-20241004-609ffb8 \
+   --script-configmap standalone \
+   --script-name script \
+   --namespace leseb \
+   --args "--storage-class=nfs-csi" \
+   --args "--namespace=leseb" \
+   --args "--sdg-object-store-secret=sdg-object-store-credentials" \
+   --args "--judge-serving-model-secret=judge-serving-details"
+
+ apiVersion: batch/v1
+ kind: Job
+ metadata:
+   name: distributed-ilab
+   namespace: leseb
+ spec:
+   template:
+     spec:
+       containers:
+       - args:
+         - --storage-class=nfs-csi
+         - --namespace=leseb
+         - --sdg-object-store-secret=sdg-object-store-credentials
+         - --judge-serving-model-secret=judge-serving-details
+         command:
+         - python3
+         - /config/script
+         - run
+         image: quay.io/opendatahub/workbench-images:jupyter-datascience-ubi9-python-3.11-20241004-609ffb8
+         name: distributed-ilab
+         volumeMounts:
+         - mountPath: /config
+           name: script-config
+       restartPolicy: Never
+       serviceAccountName: default
+       volumes:
+       - configMap:
+           name: standalone
+         name: script-config
+ ```
+
+ Optional arguments can be added to the `args` list to customize the script's behavior. They
+ are the same options that would be passed to the script when run from the command line.
+
+ Available options of the `show` subcommand:
+
+ * `--namespace`: Kubernetes namespace to run the Job in
+ * `--name`: Name of the Job
+ * `--image`: The image to use for the Job
+ * `--script-configmap`: The name of the ConfigMap that holds the script
+ * `--script-name`: The name of the script in the ConfigMap
+ * `--args`: Additional arguments to pass to the script; can be passed multiple times
+
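To make the relationship between these options and the generated manifest concrete, here is a rough sketch of how they could map onto the Job fields. It is inferred only from the example output shown earlier and is not the script's actual implementation: the helper `build_job` is hypothetical, and the `/config` mount path, the `script-config` volume name and the `default` ServiceAccount are simply copied from that example.

```python
def build_job(namespace, name, image, script_configmap, script_name, args=()):
    """Assemble a batch/v1 Job dict shaped like the example `show` output.

    Each parameter mirrors one `show` option; `args` collects every
    repeated `--args` value in the order it was given.
    """
    container = {
        "name": name,
        "image": image,
        # The ConfigMap is mounted at /config, so the script is /config/<script_name>.
        "command": ["python3", f"/config/{script_name}", "run"],
        "args": list(args),
        "volumeMounts": [{"name": "script-config", "mountPath": "/config"}],
    }
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "template": {
                "spec": {
                    "containers": [container],
                    "restartPolicy": "Never",
                    "serviceAccountName": "default",
                    "volumes": [
                        {"name": "script-config", "configMap": {"name": script_configmap}}
                    ],
                }
            }
        },
    }


if __name__ == "__main__":
    job = build_job(
        namespace="leseb",
        name="distributed-ilab",
        image="quay.io/opendatahub/workbench-images:jupyter-datascience-ubi9-python-3.11-20241004-609ffb8",
        script_configmap="standalone",
        script_name="script",
        args=["--storage-class=nfs-csi", "--namespace=leseb"],
    )
    print(job["spec"]["template"]["spec"]["containers"][0]["command"])
```

Keeping `args` as a plain list means each `--args` value reaches the container unchanged, which matches how the example output lists them one per entry.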
## Features

* Run any part of the InstructLab workflow in a standalone environment, either independently or as a full end-to-end workflow:
@@ -36,7 +174,9 @@ The `standalone.py` script is designed to run within a Kubernetes environment. T
* Evaluate model by running MT_Bench with `evaluation` subcommand along with `--eval-type mt-bench` option.
* Final model evaluation with `evaluation` subcommand along with `--eval-type final` option.
* Final evaluation runs both MT Bench_Branch and MMLU_Branch
- * Push the final model back to the object store - same location as the SDG data with `upload-trained-model` subcommand.
+ * Push the final model back to the object store - same location as the SDG data with
+ `upload-trained-model` subcommand.
+ * Dry-run mode to print the generated Kubernetes resources without creating them - `--dry-run` option.

> [!NOTE]
> Read about InstructLab model evaluation in the [instructlab/eval repository](https://github.com/instructlab/eval/blob/main/README.md).
@@ -124,7 +264,9 @@ evaluation
* `--training-1-epoch-num`: The number of epochs to train the model for phase 1. **Optional** - Default: 7.
* `--training-2-epoch-num`: The number of epochs to train the model for phase 2. **Optional** -
Default: 10.
- * `--eval-type`: The evaluation type to use. **Optional** - Default: `mt-bench`. Available options: `mt-bench`, `final`.
+ * `--eval-type`: The evaluation type to use. **Optional** - Default: `mt-bench`. Available options:
+ `mt-bench`, `final`.
+ * `--dry-run`: Print the generated Kubernetes resources without creating them. **Optional** - Default: false.
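The `--dry-run` behavior described above follows a common pattern: generate the resources as usual, then either print them or hand them to the cluster. The sketch below illustrates that pattern in general terms and does not reproduce the script's own code. The function `apply_resources` and its `create` callback are hypothetical; `create` stands in for whatever Kubernetes client call actually submits a resource.

```python
import json


def apply_resources(resources, create, dry_run=False):
    """Create each resource via `create`, or only print it when dry_run is set.

    Returns the list of resources that were actually created.
    """
    created = []
    for resource in resources:
        if dry_run:
            # Dry run: show what would be sent, but do not touch the cluster.
            print(json.dumps(resource, indent=2))
            continue
        create(resource)
        created.append(resource)
    return created


if __name__ == "__main__":
    pvc = {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": "example-pvc"},  # hypothetical name
    }
    submitted = []  # stands in for the cluster in this example
    apply_resources([pvc], create=submitted.append, dry_run=True)
    print(len(submitted))
```

Passing the creation step in as a callback keeps the dry-run check in one place, so every resource type takes the same path whether or not it is actually submitted.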

## Example Workflow with Synthetic Data Generation (SDG)