Replies: 3 comments 28 replies
-
I'm running the large model with a beam size of 5 on a GTX 1070 and don't get any freezing after numerous requests and TTS generation as well. I think as far as RAM/processing the two cards are similar, if I'm not mistaken? Have you taken a look at the output from nvidia-smi to see if the card seems ok?
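If it helps, here is a quick way to sanity-check the card with nvidia-smi (a sketch; it assumes the NVIDIA driver is installed and uses standard `--query-gpu` fields):

```shell
# Quick health check: power state (P0/P8), temperature, power draw, utilization.
# Assumes the NVIDIA driver and nvidia-smi are installed; prints a note if not.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=name,pstate,temperature.gpu,power.draw,utilization.gpu \
             --format=csv
else
  echo "nvidia-smi not found - install the NVIDIA driver first"
fi
```

A card that is healthy but stuck in P0 will show it in the `pstate` column even while idle.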
-
IIRC I also saw freezes when sharing the GPU with X, and had to disallow that.
-
Sorry, I was out for a long weekend. This is a great thread! We have several devs using P4s with WIS and we haven't observed this "stuck in P0" behavior. However, as @nikito says, running Xorg on a Tesla P4 (it doesn't even have video out) is very strange and I have no idea what happens in that scenario. Generally, on the Tesla P4 we've observed behavior identical to what @nikito has observed with the GTX 1070: idle low with models loaded, a brief spike (1 sec or less) for requests, idle staying elevated for a short period, then idle dropping back down to the starting WIS + models level. This is default behavior that we can modify via nvidia-smi, etc. for WIS, but the defaults work fairly well across GPUs in our testing. FYI, the higher idle for a period after the initial spike is there for higher performance and lower latency on subsequent follow-up requests.
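For anyone who wants to experiment with those defaults, the usual nvidia-smi power-management knobs look roughly like this (a sketch, not a recommendation: both settings need root, and the 60 W cap is just an example value under the P4's 75 W TDP):

```shell
# Sketch of common nvidia-smi power-management tweaks (run as root).
# Guarded so it degrades gracefully on machines without an NVIDIA GPU.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi -pm 1        # enable persistence mode (keeps the driver loaded)
  nvidia-smi -pl 60       # example only: cap power draw at 60 W (P4 TDP is 75 W)
  nvidia-smi -q -d POWER  # inspect current power state and limits
else
  echo "nvidia-smi not found - nothing to tune"
fi
```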
We've had some skepticism about suggesting the Tesla P4 with its passive cooling, but in our testing, even under very heavy Willow use, we haven't observed anything close to thermal throttling - model execution time is so fast the card never gets close to its thermal limits. On GTX 1070s all the way up to RTX 4090s we don't even see the fans spin up (which is pretty neat).
-
Hello,
As I'm experimenting with the newly arrived Tesla P4 GPU board, I find that my system is stable as long as I don't use the card with Willow. But if I use `run.sh` with its default parameters, I can do one or two WebRTC recordings, and then either the Docker image stops responding to anything, or the entire system freezes without anything in the system log. Here is what WIS is giving me before the crash/freeze:
Could it come from the use of the "large" model? If yes, would using a `custom_settings.py` file be good enough to select a different model? Or could it come from a faulty GPU card?
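For reference, my current idea of such an override is something like the fragment below. I want to stress that the setting names are guesses on my part for illustration, not WIS's actual API - the real names would have to be checked against WIS's settings.py:

```python
# Hypothetical custom_settings.py for WIS.
# The variable names below (whisper_model_default, beam_size) are assumptions
# for illustration; check WIS's settings.py for the actual setting names.
whisper_model_default = "medium"  # try "medium" instead of "large" to cut VRAM use
beam_size = 1                     # a smaller beam should also reduce load
```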
Thanks for any help.