Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[HuggingFace][Neuronx] Training - Optimum Neuron 0.0.25 - Neuron sdk 2.20.0 - Transformers to 4.43.2 (#4365)

* feat(neuronx): add 0.0.25 training DLC
* fix(neuronx): add mlflow vulnerabilities to allow-list
  These vulnerabilities were already added for the pytorch training DLCs.
* REVERTME: activate neuronx train CI build
* fix(neuronx): apparmor and gevent vulnerabilities
* fix: add werkzeug exception (Windows vuln)
* fix: add another werkzeug exception
* fix: pin sagemaker version to stop importing errors
* fix: try to remove tensorboard 2.6 error
* fix: add mlflow and gunicorn exceptions
* fix: yet another mlflow vuln
* Revert "REVERTME: activate neuronx train CI build"
  This reverts commit 7cff21f.

---------

Co-authored-by: Malav Shastri <[email protected]>
1 parent f3f70fa, commit 36a6aab
Showing 4 changed files with 1,991 additions and 2 deletions.
57 changes: 57 additions & 0 deletions
huggingface/pytorch/training/docker/2.1/py3/sdk2.20.0/Dockerfile.neuronx
@@ -0,0 +1,57 @@
# https://github.com/aws/deep-learning-containers/blob/master/available_images.md
# refer to the above page to pull latest PyTorch Neuronx image

# docker image region us-west-2
FROM 763104351884.dkr.ecr.us-west-2.amazonaws.com/pytorch-training-neuronx:2.1.2-neuronx-py310-sdk2.20.0-ubuntu20.04


LABEL maintainer="Amazon AI"
LABEL dlc_major_version="1"

# version args
ARG OPTIMUM_NEURON_VERSION=0.0.25
ARG TRANSFORMERS_VERSION
ARG DATASETS_VERSION
ARG GEVENT_VERSION=24.10.3
ARG GAUTH_VERSION=1.35.0
ARG PYTHON=python3

# install Hugging Face libraries and its dependencies
RUN pip install --no-cache-dir \
    "sagemaker==2.232.2" \
    evaluate \
    transformers[sklearn,sentencepiece,audio,vision]==${TRANSFORMERS_VERSION} \
    datasets==${DATASETS_VERSION} \
    optimum-neuron==${OPTIMUM_NEURON_VERSION} \
    peft \
    google-auth==${GAUTH_VERSION} \
    gevent==${GEVENT_VERSION}

# Pin numpy to version required by neuronx-cc
# Update Pillow and urllib version to fix high and critical vulnerabilities
RUN pip install -U \
    "numpy>=1.24.3,<=1.25.2" \
    "numba==0.58.1" \
    "Pillow==10.3.0" \
    "requests<2.32.0" \
    "urllib3>=1.26.17,<1.27"

RUN apt-get update \
    && apt install -y --no-install-recommends \
    git-lfs \
    libgssapi-krb5-2 \
    libexpat1 \
    expat \
    libarchive13 \
    && apt-get upgrade -y apparmor \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

RUN HOME_DIR=/root \
    && curl -o ${HOME_DIR}/oss_compliance.zip https://aws-dlinfra-utilities.s3.amazonaws.com/oss_compliance.zip \
    && unzip ${HOME_DIR}/oss_compliance.zip -d ${HOME_DIR}/ \
    && cp ${HOME_DIR}/oss_compliance/test/testOSSCompliance /usr/local/bin/testOSSCompliance \
    && chmod +x /usr/local/bin/testOSSCompliance \
    && chmod +x ${HOME_DIR}/oss_compliance/generate_oss_compliance.sh \
    && ${HOME_DIR}/oss_compliance/generate_oss_compliance.sh ${HOME_DIR} ${PYTHON} \
    && rm -rf ${HOME_DIR}/oss_compliance*
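
Note for reviewers: `TRANSFORMERS_VERSION` and `DATASETS_VERSION` are declared with `ARG` but have no default, so they expand to an empty string unless the build supplies them, and the `pip install` step would then most likely reject the empty `==` pins. The sketch below shows one way a local build could pass them. It only prints the `docker build` command instead of running it. The Transformers value comes from the commit title. The datasets value is a placeholder, because the diff does not state it. The helper name `build_cmd` and the image tag are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: compose the `docker build` command for Dockerfile.neuronx.
# Assumptions: build_cmd and the tag are illustrative names; the datasets
# version is a placeholder because this diff does not specify it.
set -eu

REGISTRY="763104351884.dkr.ecr.us-west-2.amazonaws.com"   # host from the FROM line
DOCKERFILE="huggingface/pytorch/training/docker/2.1/py3/sdk2.20.0/Dockerfile.neuronx"

# Pulling the base image usually needs an authenticated ECR session first:
#   aws ecr get-login-password --region us-west-2 \
#     | docker login --username AWS --password-stdin "$REGISTRY"

# build_cmd TRANSFORMERS_VERSION DATASETS_VERSION TAG
# Prints the build command so it can be reviewed before it is run.
build_cmd() {
    printf 'docker build -f %s --build-arg TRANSFORMERS_VERSION=%s --build-arg DATASETS_VERSION=%s -t %s .\n' \
        "$DOCKERFILE" "$1" "$2" "$3"
}

# 4.43.2 is taken from the commit title; replace the placeholder before use.
build_cmd "4.43.2" "<datasets-version>" "hf-neuronx-training:local"
```

Printing rather than executing keeps the sketch safe to run anywhere. Once the placeholder is replaced, the printed line can be run from the repository root.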
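
The Dockerfile pins numpy to `>=1.24.3,<=1.25.2` for neuronx-cc, and a later `pip install` in a derived image could silently move it out of that window. The following sketch shows one way to check an installed version against such a range from a shell inside the container. The helper names `version_le` and `in_range` are hypothetical. The comparison relies on GNU `sort -V`, which orders plain dotted versions correctly but does not implement the full Python versioning rules (pre-release tags, for example).

```shell
#!/usr/bin/env bash
# Sketch only: range check for plain dotted versions, assuming GNU sort -V.
# version_le and in_range are illustrative helper names.

# version_le A B: succeeds when A <= B in `sort -V` ordering.
version_le() {
    [ "$1" = "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" ]
}

# in_range VERSION MIN MAX: succeeds when MIN <= VERSION <= MAX (inclusive).
in_range() {
    version_le "$2" "$1" && version_le "$1" "$3"
}

# Example with literal values matching the numpy pin in the Dockerfile.
if in_range "1.24.4" "1.24.3" "1.25.2"; then
    echo "numpy 1.24.4 is inside the pinned range"
fi

# Inside the built image, the installed version could be checked with:
#   in_range "$(python3 -c 'import numpy; print(numpy.__version__)')" 1.24.3 1.25.2
```

Both bounds are inclusive here, matching the numpy pin. A pin with an exclusive upper bound, such as `urllib3>=1.26.17,<1.27`, would need a strict comparison instead.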