
[BUG] Failure to correctly determine GPU-utilizing process details - "No such process" #152

@eyalroz

Description

Required prerequisites

  • I have read the documentation https://nvitop.readthedocs.io.
  • I have searched the Issue Tracker to confirm this hasn't already been reported. (Comment there if it has.)
  • I have tried the latest version of nvitop in a new isolated virtual environment.

What version of nvitop are you using?

1.0.0

Operating system and version

SUSE GNU/Linux 15 SP1 x86_64

NVIDIA driver version

535.54.03

NVIDIA-SMI

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.54.03              Driver Version: 535.54.03    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Quadro RTX 6000                Off | 00000000:15:00.0 Off |                  Off |
| 33%   32C    P8              25W / 260W |    164MiB / 24576MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  Quadro RTX 6000                Off | 00000000:2D:00.0 Off |                  Off |
| 33%   38C    P8              32W / 260W |      4MiB / 24576MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   2  Quadro P620                    Off | 00000000:99:00.0 Off |                  N/A |
| 34%   28C    P8              N/A /  N/A |     98MiB /  2048MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A     15851      C   ...ake-build-debug/foo/bar                  160MiB |
|    2   N/A  N/A     18243      G   /usr/bin/X                                   64MiB |
|    2   N/A  N/A     22720      G   /usr/bin/gnome-shell                         28MiB |
+---------------------------------------------------------------------------------------+

Python environment

Installed using:

pip3 install --upgrade nvitop

(Couldn't install with git+https://github.com/XuehaiPan/nvitop.git#egg=nvitop; that triggers a different error.)

After installation,

$ python3 -m pip freeze | python3 -c 'import sys; print(sys.version, sys.platform); print("".join(filter(lambda s: any(word in s.lower() for word in ("nvi", "cuda", "nvml", "gpu")), sys.stdin)))'
3.6.5 (default, Apr 05 2018, 13:30:06) [GCC] linux
gpustat==1.1.1
nvidia-ml-py==11.525.150
nvitop==1.0.0
PyJSONViewer==1.6.0

Problem description

nvidia-smi identifies three processes using GPUs, the third being gnome-shell. nvitop identifies and lists the first two appropriately, but for the third process I get:

... snip ...
╒═════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╕
│ Processes:                                                                                                                                           joeuser1@mymachine │
│ GPU     PID      USER  GPU-MEM %SM  %CPU  %MEM       TIME  COMMAND                                                                                                      │
... snip...
│   2       0 G     N/A    22KiB   0   N/A   N/A        N/A  No Such Process                                                                                              │
╘═════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╛
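For reference, here is a minimal diagnostic sketch (my own, not part of nvitop; the device index 2 is an assumption matching my Quadro P620) that queries NVML directly through pynvml to see which PIDs and memory figures the driver itself reports for compute and graphics processes:

```python
# Diagnostic sketch (not part of nvitop): ask NVML directly which compute/graphics
# processes it reports for GPU 2, to compare against nvitop's process table.
import pynvml  # provided by the nvidia-ml-py package

pynvml.nvmlInit()
try:
    handle = pynvml.nvmlDeviceGetHandleByIndex(2)  # assumed index of the Quadro P620
    for kind, query in (
        ("C", pynvml.nvmlDeviceGetComputeRunningProcesses),
        ("G", pynvml.nvmlDeviceGetGraphicsRunningProcesses),
    ):
        for proc in query(handle):
            # usedGpuMemory may be None/N/A when the driver cannot report it
            print(kind, proc.pid, proc.usedGpuMemory)
finally:
    pynvml.nvmlShutdown()
```

If this already prints PID 0 for the gnome-shell entry, the bogus PID presumably comes from NVML / nvidia-ml-py rather than from nvitop itself.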

Steps to Reproduce

Just run nvitop; no special flags or options needed.

Traceback

No error reported.

Logs


Expected behavior

I should see the process information that nvidia-smi reports, including the path and the PID, for the third process as well.

Additional context

No response

Labels

bug (Something isn't working), pynvml (Something related to the `nvidia-ml-py` package), upstream (Something upstream related)
