Adding AxoNN's 3D tensor parallelism [WIP] #1086
Draft
Steps to run -

1. Install AxoNN (dependencies: PyTorch and mpi4py).
2. Prepare a config file to use AxoNN:
   - First, set `"use_axonn_model_parallelism": true`.
   - Then set `"depth_model_parallel_size"`, `"row_model_parallel_size"`, and `"column_model_parallel_size"` as required by your model. The product of these three values must equal `"model_parallel_size"`.
   - You can also set `"optimize_axonn_communication": true` to enable communication optimizations. These additionally require setting the environment variable `export CUDA_DEVICE_MAX_CONNECTIONS=1`.

At a high level, the matrix multiplications in your model will be sharded over `"model_parallel_size"` GPUs.

ToDos -
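The product constraint on the three parallelism dimensions can be sanity-checked before launching a run. A minimal sketch, assuming a plain dict mirroring the config keys named in this PR (`check_axonn_config` is a hypothetical helper, and the numeric values are illustrative, not defaults):

```python
# Illustrative sketch: validate the AxoNN parallelism settings described
# in this PR. check_axonn_config is a hypothetical helper, not part of
# AxoNN or GPT-NeoX; the numeric values below are examples only.

def check_axonn_config(cfg):
    """Ensure depth * row * column model-parallel sizes equal model_parallel_size."""
    product = (
        cfg["depth_model_parallel_size"]
        * cfg["row_model_parallel_size"]
        * cfg["column_model_parallel_size"]
    )
    if cfg.get("use_axonn_model_parallelism") and product != cfg["model_parallel_size"]:
        raise ValueError(
            f"depth*row*column = {product}, expected {cfg['model_parallel_size']}"
        )
    return product

config = {
    "use_axonn_model_parallelism": True,
    "depth_model_parallel_size": 2,   # example value
    "row_model_parallel_size": 2,     # example value
    "column_model_parallel_size": 2,  # example value
    "model_parallel_size": 8,         # must equal 2 * 2 * 2
    "optimize_axonn_communication": True,
}

check_axonn_config(config)  # passes: 2 * 2 * 2 == 8
```

If `"optimize_axonn_communication"` is enabled, remember that `CUDA_DEVICE_MAX_CONNECTIONS=1` must also be exported in the launch environment, as noted above.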