Newer magic-pdf versions do not work on H20 #1293

wzl0329 · 2024-12-14T08:48:37Z

Description of the bug

On an H20, magic-pdf 0.9.x and 0.10.x fail with the error below. In the same environment, magic-pdf==0.8.1 works without problems.

2024-12-11 14:34:18.129 | INFO | magic_pdf.model.pdf_extract_kit:call:184 - layout detection time: 0.33
2024-12-11 14:34:18.175 | INFO | magic_pdf.model.pdf_extract_kit:call:192 - mfd time: 0.04
2024-12-11 14:34:18.176 | INFO | magic_pdf.model.pdf_extract_kit:call:199 - formula nums: 0, mfr time: 0.0
2024-12-11 14:34:19.285 | INFO | magic_pdf.model.pdf_extract_kit:call:230 - ocr time: 1.11
2024-12-11 14:34:19.286 | INFO | magic_pdf.model.doc_analyze_by_custom_model:doc_analyze:168 - -----page_id : 2, page total time: 1.48-----
2024-12-11 14:34:19.672 | INFO | magic_pdf.model.doc_analyze_by_custom_model:doc_analyze:178 - gc time: 0.39
2024-12-11 14:34:19.673 | INFO | magic_pdf.model.doc_analyze_by_custom_model:doc_analyze:182 - doc analyze time: 12.32, speed: 0.24 pages/second

C++ Traceback (most recent call last):

0 at::_ops::linear::call(at::Tensor const&, at::Tensor const&, std::optional<at::Tensor> const&)
1 at::native::linear(at::Tensor const&, at::Tensor const&, std::optional<at::Tensor> const&)
2 at::_ops::addmm::call(at::Tensor const&, at::Tensor const&, at::Tensor const&, c10::Scalar const&, c10::Scalar const&)
3 at::_ops::addmm::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, at::Tensor const&, c10::Scalar const&, c10::Scalar const&)

Error Message Summary:

FatalError: Erroneous arithmetic operation is detected by the operating system.
[TimeInfo: *** Aborted at 1733898861 (unix time) try "date -d @1733898861" if you are using GNU date ***]
[SignalInfo: *** SIGFPE (@0x7522f18fd914) received by PID 41 (TID 0x752433404480) from PID 18446744073467320596 ***]

Floating point exception (core dumped)

How to reproduce the bug

The situation is similar to #908, but I tested several paddle versions, both CPU and GPU builds, and all of them failed.
Test results:

| GPU | Host nvcc -V | Docker image nvidia/cuda | magic-pdf | paddlepaddle | paddlepaddle-gpu | Result |
| --- | --- | --- | --- | --- | --- | --- |
| H20 | V12.2.91 | 12.2.2-devel-ubuntu22.04 | 0.8.1 | 3.0.0b1 | | Works |
| H20 | V12.2.91 | 12.2.2-devel-ubuntu22.04 | 0.9.0 | 2.6.2 | | Floating point exception (core dumped) |
| H20 | V12.2.91 | 12.2.2-devel-ubuntu22.04 | 0.9.0 | 3.0.0b1 | | Floating point exception (core dumped) |
| H20 | V12.2.91 | 12.2.2-devel-ubuntu22.04 | 0.10.5 | 3.0.0b1 | | Floating point exception (core dumped) |
| H20 | V12.2.91 | 12.3.2-cudnn9-devel-ubuntu22.04 | 0.10.5 | 2.6.2 | | Floating point exception (core dumped) |
| H20 | V12.2.91 | 12.3.2-cudnn9-devel-ubuntu22.04 | 0.10.5 | 3.0.0b1 | | Floating point exception (core dumped) |
| H20 | V12.2.91 | 12.3.2-cudnn9-devel-ubuntu22.04 | 0.10.5 | | 2.6.2 | Floating point exception (core dumped) |
| H20 | V12.2.91 | 12.3.2-cudnn9-devel-ubuntu22.04 | 0.10.5 | | 3.0.0b1 | Incompatible with torch 2.3.1 |

Operating system

Linux

Python version

3.10

Software version (magic-pdf --version)

0.10.x

Device mode

cuda

wzl0329 · 2024-12-18T01:51:42Z

@myhloli Hello, is there any conclusion on this issue? Could you provide a set of environment versions that works on H20?

myhloli · 2024-12-18T02:16:07Z

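Sorry, we do not have any H-series GPUs to test on. For now, the only suggestion is to follow the case in #558 and try a paddlegpu build for a higher CUDA version. If compatibility problems remain, please uninstall paddlepaddle and paddlepaddle-gpu, then reinstall paddlepaddle and run inference on the CPU.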
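
Before uninstalling and reinstalling, it may help to confirm which Paddle distributions are actually present in the environment. The following is a minimal sketch using only the standard library; the helper names `dist_version` and `paddle_install_state` are hypothetical and are not part of magic-pdf or Paddle.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def dist_version(name: str) -> Optional[str]:
    """Return the installed version of a pip distribution, or None if absent."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def paddle_install_state(cpu: Optional[str], gpu: Optional[str]) -> str:
    """Classify the environment from the two Paddle distribution versions."""
    if cpu and gpu:
        return "conflict"  # both builds installed side by side
    if gpu:
        return "gpu"
    if cpu:
        return "cpu"
    return "missing"


if __name__ == "__main__":
    cpu = dist_version("paddlepaddle")
    gpu = dist_version("paddlepaddle-gpu")
    print(f"paddlepaddle={cpu} paddlepaddle-gpu={gpu} "
          f"state={paddle_install_state(cpu, gpu)}")
```

A result of `conflict` would suggest removing both packages before reinstalling only the CPU build, as described above.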

wzl0329 · 2024-12-18T03:14:33Z

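We used the paddle CPU version from the start; we only tried paddlegpu after the CPU version failed. The paddlegpu build for a higher CUDA version (CUDA 12.3) is incompatible with torch 2.3.1, and other libraries depend on 2.3.1, so the high-CUDA paddlegpu cannot be installed. In short, we tried the CPU and CUDA builds of paddle and none of them worked.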

wzl0329 · 2024-12-18T03:16:19Z

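We used the paddle CPU version from the start; we only tried paddlegpu after the CPU version failed. The paddlegpu build for a higher CUDA version (CUDA 12.3) is incompatible with torch 2.3.1, and other libraries depend on 2.3.1, so the high-CUDA paddlegpu cannot be installed. In short, we tried the CPU and CUDA builds of paddle and none of them worked.

#558 (comment) Someone in the comments of #558 ran into this problem as well.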

myhloli · 2024-12-18T03:20:50Z

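The CPU version should not be incompatible. According to user feedback, CPU incompatibility usually means the CPU does not support the AVX/AVX2 instruction sets; you could check that as well.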
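
One way to check for AVX/AVX2 on Linux is to read the `flags` line of `/proc/cpuinfo`. The sketch below assumes an x86 Linux host (inside a Docker container the file reflects the host CPU); the function names `cpu_flags` and `missing_simd` are hypothetical helpers written for this illustration.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set:
    """Return the feature flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def missing_simd(cpuinfo_text: str, required=("avx", "avx2")) -> list:
    """List the required instruction-set flags that the CPU does not report."""
    flags = cpu_flags(cpuinfo_text)
    return [name for name in required if name not in flags]


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        missing = missing_simd(path.read_text())
        print("missing instruction sets:", ", ".join(missing) or "none")
    else:
        print("/proc/cpuinfo not found; this check only applies to Linux")
```

If the output lists `avx` or `avx2` as missing, the CPU explanation above is worth pursuing; if nothing is missing, the cause likely lies elsewhere.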

myhloli · 2024-12-18T03:23:26Z

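> We used the paddle CPU version from the start; we only tried paddlegpu after the CPU version failed. The paddlegpu build for a higher CUDA version (CUDA 12.3) is incompatible with torch 2.3.1, and other libraries depend on 2.3.1, so the high-CUDA paddlegpu cannot be installed. In short, we tried the CPU and CUDA builds of paddle and none of them worked.

You can update unimernet to 0.2.2, which removed the dependency on torchtext, so torch can then be updated beyond 2.3.1. If other packages complain about their torch version constraints, you can ignore that for now and force-update torch manually (torchvision must also be updated to a matching version).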
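
After upgrading, the installed versions can be verified with a short script. This is a sketch under stated assumptions: it only reads package metadata through the standard library, the helpers `release_tuple` and `unimernet_ok` are hypothetical names, and the 0.2.2 threshold is taken from the comment above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple


def release_tuple(ver: str) -> Tuple[int, ...]:
    """Leading numeric parts of a version: '2.3.1+cu121' gives (2, 3, 1)."""
    parts = []
    for piece in ver.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def unimernet_ok(ver: Optional[str]) -> bool:
    """True if unimernet is installed at 0.2.2 or later."""
    return ver is not None and release_tuple(ver) >= (0, 2, 2)


def installed(name: str) -> Optional[str]:
    """Installed version of a distribution, or None if it is not installed."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in ("unimernet", "torch", "torchvision"):
        print(f"{name}: {installed(name)}")
    print("unimernet >= 0.2.2:", unimernet_ok(installed("unimernet")))
```

The script does not check that torch and torchvision match each other; that pairing still has to be confirmed against the official compatibility table.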

wzl0329 · 2024-12-18T03:24:42Z

OK, thanks for the suggestion~ We'll give it a try.

wzl0329 added the bug label Dec 14, 2024

wzl0329 closed this as completed Dec 18, 2024

wzl0329 reopened this Dec 18, 2024
