Commit

Add guide to overwrite dataset path
ZhiyuLi-goog committed Jan 10, 2025
1 parent 70fed93 commit 5fa2abb
Showing 1 changed file with 15 additions and 0 deletions.
15 changes: 15 additions & 0 deletions mixture_of_experts_pretraining/README.md
@@ -210,6 +210,13 @@ python ~/xpk/xpk.py workload create \
--num-slices=<num_slices> \
--command="bash script.sh"
```
Note that the dataset paths default to the following values in [`dataset/c4_mlperf.yaml`](config/dataset/c4_mlperf.yaml):
```
train_dataset_path: gs://mlperf-llm-public2/c4/en_json/3.0.1
eval_dataset_path: gs://mlperf-llm-public2/c4/en_val_subset_json
```
You can override these defaults by appending
`dataset.train_dataset_path=/path/to/train/dir dataset.eval_dataset_path=/path/to/eval/dir` to the workload command. Each path should work with both local directories and GCS buckets.
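The dotted `key=value` form above reads like a nested-config override. As a rough illustration only, the sketch below mimics that behavior with a plain Python dictionary. It is not the project's actual config loader: `DEFAULTS` and `apply_overrides` are hypothetical names, and nesting the two keys under `dataset` is an assumption inferred from the override syntax.

```python
from copy import deepcopy

# Defaults copied from config/dataset/c4_mlperf.yaml. Nesting them under
# "dataset" is an assumption based on the dotted override keys.
DEFAULTS = {
    "dataset": {
        "train_dataset_path": "gs://mlperf-llm-public2/c4/en_json/3.0.1",
        "eval_dataset_path": "gs://mlperf-llm-public2/c4/en_val_subset_json",
    }
}


def apply_overrides(config, overrides):
    """Return a copy of config with each 'a.b=value' override applied."""
    merged = deepcopy(config)
    for item in overrides:
        dotted_key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {item!r}")
        # Walk down to the parent of the final key, creating levels as needed.
        *parents, leaf = dotted_key.split(".")
        node = merged
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = value
    return merged


# Override only the training path; the eval path keeps its default.
config = apply_overrides(
    DEFAULTS,
    ["dataset.train_dataset_path=/path/to/train/dir"],
)
print(config["dataset"]["train_dataset_path"])
print(config["dataset"]["eval_dataset_path"])
```

Keys that are not overridden keep their defaults, so either path can be changed independently of the other.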
## Run Experiments in GCE
@@ -326,6 +333,14 @@ EOF
"
```
Note that the dataset paths default to the following values in [`dataset/c4_mlperf.yaml`](config/dataset/c4_mlperf.yaml):
```
train_dataset_path: gs://mlperf-llm-public2/c4/en_json/3.0.1
eval_dataset_path: gs://mlperf-llm-public2/c4/en_val_subset_json
```
You can override these defaults by appending
`dataset.train_dataset_path=/path/to/train/dir dataset.eval_dataset_path=/path/to/eval/dir` to the workload command. Each path should work with both local directories and GCS buckets.
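Since a path may point at either a local directory or a GCS bucket, it can help to check which kind a value is before launching a long-running job. The helper below is a minimal sketch using only the Python standard library. `describe_dataset_path` is a hypothetical name and is not part of this repository, and the sketch only inspects the string without touching the filesystem or GCS.

```python
from urllib.parse import urlparse


def describe_dataset_path(path):
    """Classify a dataset path as a GCS location or a local directory."""
    parsed = urlparse(path)
    if parsed.scheme == "gs":
        # gs://<bucket>/<prefix>: netloc is the bucket, path is the prefix.
        return {
            "kind": "gcs",
            "bucket": parsed.netloc,
            "prefix": parsed.path.lstrip("/"),
        }
    if parsed.scheme == "":
        # No URL scheme: treat the value as a local filesystem path.
        return {"kind": "local", "path": path}
    raise ValueError(f"unsupported scheme {parsed.scheme!r} in {path!r}")


print(describe_dataset_path("gs://mlperf-llm-public2/c4/en_json/3.0.1"))
print(describe_dataset_path("/path/to/train/dir"))
```

Rejecting unknown schemes early turns a mistyped prefix into an immediate error.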
#### Logging
The workload starts only after all worker SSH connections are established. Once it has started, it is safe and recommended to exit manually.
If you do not exit manually, the provided scripts may exceed the SSH connection timeout and trigger unexpected command retries. These retries may produce an error message stating that the command failed because the TPU devices are currently in use. However, this should not disrupt your existing workload.