Is MIG on a single GPU supported? #1340

Open
nic648 opened this issue Dec 20, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

nic648 commented Dec 20, 2024

I have a single A100 80GB GPU.
MIG is currently enabled on it:

Fri Dec 20 18:39:26 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.127.08 Driver Version: 550.127.08 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA A100 80GB PCIe Off | 00000000:08:00.0 Off | On |
| N/A 56C P0 68W / 300W | 63MiB / 81920MiB | N/A Default |
| | | Enabled |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| MIG devices: |
+------------------+----------------------------------+-----------+-----------------------+
| GPU GI CI MIG | Memory-Usage | Vol| Shared |
| ID ID Dev | BAR1-Usage | SM Unc| CE ENC DEC OFA JPG |
| | | ECC| |
|==================+==================================+===========+=======================|
| 0 2 0 0 | 38MiB / 40192MiB | 14 0 | 3 0 2 0 0 |
| | 0MiB / 65535MiB | | |
+------------------+ +-----------+-----------------------+
| 0 2 1 1 | | 14 0 | 3 0 2 0 0 |
| | | | |
+------------------+ +-----------+-----------------------+
| 0 2 2 2 | | 14 0 | 3 0 2 0 0 |
| | | | |
+------------------+----------------------------------+-----------+-----------------------+
| 0 3 0 3 | 13MiB / 19968MiB | 14 0 | 1 0 1 0 0 |
| | 0MiB / 32767MiB | | |
+------------------+----------------------------------+-----------+-----------------------+
| 0 4 0 4 | 13MiB / 19968MiB | 14 0 | 1 0 1 0 0 |
| | 0MiB / 32767MiB | | |
+------------------+----------------------------------+-----------+-----------------------+

+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+

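Question 1:
Does the magic-pdf.config configuration file support MIG?
# Setting device-mode to a MIG ID or to "cuda:2" reports that the device was not found.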
{
"bucket_info": {
"bucket-name-1": [
"ak",
"sk",
"endpoint"
],
"bucket-name-2": [
"ak",
"sk",
"endpoint"
]
},
"models-dir": "/root/.cache/modelscope/hub/opendatalab/PDF-Extract-Kit-1___0/models",
"layoutreader-model-dir": "/root/.cache/modelscope/hub/ppaanngggg/layoutreader",
"device-mode": "cuda", # setting this to "cuda:2" reports that the device was not found
"layout-config": {
"model": "layoutlmv3"
},
"formula-config": {
"mfd_model": "yolo_v8_mfd",
"mfr_model": "unimernet_small",
"enable": true
},
"table-config": {
"model": "rapid_table",
"enable": true,
"max_time": 400
},
"config_version": "1.0.0"
}
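
For context, CUDA applications usually target one MIG slice by putting that slice's UUID (as printed by `nvidia-smi -L`) in `CUDA_VISIBLE_DEVICES` and leaving the device string as plain "cuda". A CUDA process normally enumerates only one MIG instance, which may be why "cuda:2" is not found. The sketch below assumes that approach. Whether magic-pdf honors it has not been verified here, and `parse_mig_uuids` and `select_mig_device` are hypothetical helper names.

```python
import os
import re
import subprocess

# Matches the UUID printed on each MIG device line of `nvidia-smi -L`.
_MIG_UUID = re.compile(r"\(UUID:\s*(MIG-[^)\s]+)\)")


def parse_mig_uuids(listing: str) -> list[str]:
    """Return MIG device UUIDs in the order they appear in the listing."""
    return _MIG_UUID.findall(listing)


def select_mig_device(index: int) -> str:
    """Expose only the index-th MIG device to this process (assumed workflow)."""
    listing = subprocess.run(
        ["nvidia-smi", "-L"], capture_output=True, text=True, check=True
    ).stdout
    uuid = parse_mig_uuids(listing)[index]
    # Must be set before any CUDA library (for example torch) initializes.
    os.environ["CUDA_VISIBLE_DEVICES"] = uuid
    return uuid
```

Calling `select_mig_device(2)` at the very top of the entry script, before magic-pdf or torch is imported, would leave the chosen slice visible as the only CUDA device, so "device-mode" could stay "cuda".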

Question 2:
Multi-GPU mode
server = ls.LitServer(MinerUAPI(), accelerator='gpu', devices=[0,1,2,3,4], timeout=False)
server.run(port=8080)
Model initialization complete!
Process SpawnProcess-2:
Traceback (most recent call last):
File "/root/anaconda3/envs/MinerU/lib/python3.10/multiprocessing/managers.py", line 810, in _callmethod
conn = self._tls.connection
AttributeError: 'ForkAwareLocal' object has no attribute 'connection'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/root/anaconda3/envs/MinerU/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/root/anaconda3/envs/MinerU/lib/python3.10/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/root/anaconda3/envs/MinerU/lib/python3.10/site-packages/litserve/loops.py", line 417, in inference_worker
if workers_setup_status:
File "<string>", line 2, in __len__
File "/root/anaconda3/envs/MinerU/lib/python3.10/multiprocessing/managers.py", line 814, in _callmethod
self._connect()
File "/root/anaconda3/envs/MinerU/lib/python3.10/multiprocessing/managers.py", line 801, in _connect
conn = self._Client(self._token.address, authkey=self._authkey)
File "/root/anaconda3/envs/MinerU/lib/python3.10/multiprocessing/connection.py", line 502, in Client
c = SocketClient(address)
File "/root/anaconda3/envs/MinerU/lib/python3.10/multiprocessing/connection.py", line 630, in SocketClient
s.connect(address)
FileNotFoundError: [Errno 2] No such file or directory
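
One possible way to avoid indexing several devices on a MIG-partitioned card is to start one server process per MIG slice, each with its own `CUDA_VISIBLE_DEVICES` and port. This is only a sketch under that assumption and does not diagnose the traceback above. The `server.py` entry point and the `PORT` environment variable are hypothetical, and the MIG UUIDs would come from `nvidia-smi -L`.

```python
import os
import subprocess
import sys


def build_launch_plan(mig_uuids: list[str], base_port: int = 8080) -> list[dict]:
    """Pair each MIG UUID with its own port and environment overrides."""
    plan = []
    for offset, uuid in enumerate(mig_uuids):
        plan.append({
            "port": base_port + offset,
            "env": {"CUDA_VISIBLE_DEVICES": uuid},
        })
    return plan


def launch(plan: list[dict], script: str = "server.py") -> list[subprocess.Popen]:
    """Start one process per plan entry; `script` and PORT are hypothetical."""
    procs = []
    for entry in plan:
        # Each child sees exactly one MIG slice and learns its port via PORT.
        env = {**os.environ, **entry["env"], "PORT": str(entry["port"])}
        procs.append(subprocess.Popen([sys.executable, script], env=env))
    return procs
```

Each child process would then construct its server with a single device (for example `devices=[0]`) and read its port from `PORT`, so that no process refers to a device index beyond the one slice it can see.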

nic648 added the enhancement label Dec 20, 2024