How to pretrain on GPU? #13
Hi again. Thanks for opening this issue. Yes, there were several issues with GPU training back then as well. Perhaps you need to change the value of CUDA_VISIBLE_DEVICES. Back then we tried different numbers for this value and different things worked for different people. Do try 1, however. Here is also a potentially useful Stack Overflow thread on selecting which GPU to run a job on: https://stackoverflow.com/questions/39649102/how-do-i-select-which-gpu-to-run-a-job-on
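In case it is useful, here is a minimal sketch of the approach from that Stack Overflow thread: pinning the job to one GPU by setting CUDA_VISIBLE_DEVICES from Python before TensorFlow is imported (the index 0 is just an example; use whatever nvidia-smi reports for your card). Exporting the variable in 4_pretrain_adapter.sh before the Python process starts should have the same effect.

```python
# Sketch: select a single GPU via CUDA_VISIBLE_DEVICES. The variable must be
# set before TensorFlow is imported, otherwise it has no effect.
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"  # GPU index as shown by nvidia-smi

import tensorflow as tf  # tensorflow-gpu 1.15.x

# Sanity check that the selected GPU is actually visible to TensorFlow.
print("GPU available:", tf.test.is_gpu_available())
```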
We won't be able to try running it on GPUs for perhaps another 1 or 2 weeks. If the issue still persists by then, we will try it again on our machines and keep you posted.
Also, did I understand the question correctly? Is it that you did not manage to run it on the GPU locally, or rather that you want to speed up the run on the GPU locally? If it is the latter, then I can recommend the following:
Hello again Nikolai! You did understand my question, thank you for making sure. I tried setting CUDA_VISIBLE_DEVICES to 1 as you suggested. Pretraining does use the GPU when I set it, but the usage pattern is the same. Is this behavior (low GPU memory usage, high CPU usage) expected during pretraining?
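In case it helps with diagnosing this, here is the generic check I would run to see where TensorFlow 1.15 places its ops (a stand-alone diagnostic sketch, not part of the repo's scripts):

```python
import tensorflow as tf
from tensorflow.python.client import device_lib

# List every device TensorFlow can see; a healthy setup should show a
# /device:GPU:0 entry for the RTX 2080 SUPER next to the CPU.
print([d.name for d in device_lib.list_local_devices()])

# log_device_placement=True prints, for every op, which device it runs on,
# so the console shows whether the matmul below lands on the GPU.
with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess:
    a = tf.random.normal([1024, 1024])
    b = tf.random.normal([1024, 1024])
    print(sess.run(tf.reduce_sum(tf.matmul(a, b))))
```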
Hello!
I am trying to pretrain an adapter using the 4_pretrain_adapter.sh script. I have a GeForce RTX 2080 SUPER installed (~8 GB VRAM), with NVIDIA Driver Version 440.33.01, CUDA Version 10.2 and tensorflow-gpu 1.15.5.
I set CUDA_VISIBLE_DEVICES to 0 in the 4_pretrain_adapter.sh script since I only have a single GPU. Pretraining has been running for 12-16 hrs now and is just about completing the warmup phase (~10000 steps).
I noticed that the pretraining only uses ~115 MB of VRAM but spins up several CPU threads, driving CPU usage up to ~100%.
I started perusing the code for GPU usage options/parameters, but so far have only found a switch for TPU usage and a comment stipulating that if a TPU is not available, then the Estimator (tf.contrib.tpu.TPUEstimator) will fall back on CPU or GPU. I then looked at the official TensorFlow documentation for TPUEstimator, but no luck there either.
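For reference, my understanding of that fallback is roughly the following (a minimal sketch with a toy model_fn and made-up batch size, not the repo's actual configuration): with use_tpu=False, TPUEstimator simply runs on whatever local CPU/GPU TensorFlow finds.

```python
import tensorflow as tf

def model_fn(features, labels, mode, params):
    # Toy model, only to illustrate the estimator wiring.
    logits = tf.layers.dense(features["x"], 2)
    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
    train_op = tf.train.AdamOptimizer(1e-3).minimize(
        loss, global_step=tf.train.get_global_step())
    return tf.contrib.tpu.TPUEstimatorSpec(mode=mode, loss=loss, train_op=train_op)

run_config = tf.contrib.tpu.RunConfig(model_dir="/tmp/adapter_test")

# With use_tpu=False the TPUEstimator falls back to the local CPU/GPU,
# which is the code path taken when no TPU is available.
estimator = tf.contrib.tpu.TPUEstimator(
    model_fn=model_fn,
    config=run_config,
    use_tpu=False,
    train_batch_size=32)
```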
As I continue to look into this, I was wondering if you had some tips or advice about running the code locally on a single GPU.