
inference_demo script loads after some delay #1076

Open
EmilyWebber opened this issue Jan 2, 2025 · 5 comments

EmilyWebber commented Jan 2, 2025

Please add a note that the suggested command to check the install of NxD inference, inference_demo --help, can take an extended period of time to load the first time it's invoked.

https://github.com/aws-neuron/aws-neuron-sdk/blob/master/libraries/nxd-inference/nxdi-setup.rst#verify-nxd-inference-installation
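A minimal, self-contained sketch of the cold- vs warm-start effect being reported. Since `inference_demo` only exists on a Neuron instance, a stand-in command is timed here; substitute `["inference_demo", "--help"]` to reproduce the actual measurement.

```python
# Sketch of measuring cold vs warm startup time for a command.
# The stand-in command below is hypothetical filler so the example runs
# anywhere; on a Neuron instance use ["inference_demo", "--help"].
import subprocess
import sys
import time

def time_command(cmd):
    """Return wall-clock seconds to run cmd to completion."""
    start = time.monotonic()
    subprocess.run(cmd, check=True, capture_output=True)
    return time.monotonic() - start

cmd = [sys.executable, "-c", "import json"]  # stand-in for the CLI
cold = time_command(cmd)  # first invocation: includes one-time costs
warm = time_command(cmd)  # second invocation: typically faster
print(f"cold: {cold:.3f}s, warm: {warm:.3f}s")
```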

@EmilyWebber EmilyWebber changed the title inference_demo script loads after multiple minutes inference_demo script loads after some delay Jan 2, 2025

AWSNB commented Jan 2, 2025 via email

jluntamazon (Contributor) commented

I attempted to reproduce this with a fresh environment (on an existing instance):

python3 -m venv venv
source venv/bin/activate
python -m pip install --extra-index-url https://pip.repos.neuron.amazonaws.com neuronx-cc==2.* neuronx-distributed-inference
time inference_demo --help

This resulted in:

real	0m4.559s

As @AWSNB mentioned, this may be due to instance setup effects.

@EmilyWebber Can you check whether running my example above produces similar timing on your instance?


EmilyWebber commented Jan 3, 2025

I picked a PyTorch environment, then followed the manual-install steps for the packages listed here.

source /opt/aws_neuronx_venv_pytorch_2_5/bin/activate
pip install --upgrade neuronx-cc==2.* neuronx-distributed-inference --extra-index-url https://pip.repos.neuron.amazonaws.com
time inference_demo --help

The result:

real 2m34.913s

If I use the aws_neuronx_venv_pytorch_2_5_nxd_inference venv, with the packages preinstalled, here's what I get.

source /opt/aws_neuronx_venv_pytorch_2_5_nxd_inference/bin/activate
time inference_demo --help

This results in:

real	3m13.345s

Oddly enough, when I create a new Python venv, do a fresh install, and invoke the helper script as suggested above, I do see the same ~4 s load time!


AWSNB commented Jan 4, 2025 via email

EmilyWebber commented

Just a trn1.2xlarge - I used the default settings in the launcher, maybe 360 GB of storage.
