Can you upload the sample file?
Description of the bug
When processing a specific file I get this error:
2024-12-11 09:56:54.076 | INFO | magic_pdf.model.pdf_extract_kit:__call__:226 - -----page total time: 1.12-----
2024-12-11 09:56:54.573 | INFO | magic_pdf.model.pdf_extract_kit:__call__:153 - layout detection time: 0.5
2024-12-11 09:56:54.690 | INFO | magic_pdf.model.pdf_extract_kit:__call__:161 - mfd time: 0.11
2024-12-11 09:56:54.691 | INFO | magic_pdf.model.pdf_extract_kit:__call__:168 - formula nums: 0, mfr time: 0.0
2024-12-11 09:56:54.691 | INFO | magic_pdf.model.pdf_extract_kit:__call__:194 - ocr time: 0.0
2024-12-11 09:56:55.762 | ERROR | __main__:pdf_parse_main:83 - zip() argument after * must be an iterable, not NoneType
Traceback (most recent call last):
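File "C:\IA\MinerU\processing.py", line 88, in <module>
pdf_parse_main(
└ <function pdf_parse_main at 0x00000193ADFB3E20>
File "C:\IA\MinerU\processing.py", line 57, in pdf_parse_main
pipe.pipe_analyze() # Document analysis
│ └ <function UNIPipe.pipe_analyze at 0x00000193E2BA1870>
└ <magic_pdf.pipe.UNIPipe.UNIPipe object at 0x00000193E2C70070>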
File "C:\IA\MinerU\env_2\lib\site-packages\magic_pdf\pipe\UNIPipe.py", line 37, in pipe_analyze
self.model_list = doc_analyze(self.pdf_bytes, ocr=True,
│ │ │ │ └ b'%PDF-1.7\n%\xbf\xf7\xa2\xfe\n1 0 obj\n<< /Metadata 30 0 R /Pages 31 0 R /Type /Catalog >>\nendobj\n2 0 obj\n<< /Type /ObjSt...
│ │ │ └ <magic_pdf.pipe.UNIPipe.UNIPipe object at 0x00000193E2C70070>
│ │ └ <function doc_analyze at 0x00000193E241C670>
│ └ []
└ <magic_pdf.pipe.UNIPipe.UNIPipe object at 0x00000193E2C70070>
File "C:\IA\MinerU\env_2\lib\site-packages\magic_pdf\model\doc_analyze_by_custom_model.py", line 166, in doc_analyze
result = custom_model(img)
│ └ array([[[255, 255, 255],
│ [255, 255, 255],
│ [255, 255, 255],
│ ...,
│ [255, 255, 255],
│ [255...
└ <magic_pdf.model.pdf_extract_kit.CustomPEKModel object at 0x00000193E2C384C0>
File "C:\IA\MinerU\env_2\lib\site-packages\magic_pdf\model\pdf_extract_kit.py", line 211, in __call__
html_code, table_cell_bboxes, elapse = self.table_model.predict(new_image)
│ │ │ │ └ <PIL.Image.Image image mode=RGB size=1398x2008 at 0x19395916920>
│ │ │ └ <function RapidTableModel.predict at 0x0000019395762DD0>
│ │ └ <magic_pdf.model.sub_modules.table.rapidtable.rapid_table.RapidTableModel object at 0x000001939627FA60>
│ └ <magic_pdf.model.pdf_extract_kit.CustomPEKModel object at 0x00000193E2C384C0>
└ None
File "C:\IA\MinerU\env_2\lib\site-packages\magic_pdf\model\sub_modules\table\rapidtable\rapid_table.py", line 13, in predict
html_code, table_cell_bboxes, elapse = self.table_model(np.asarray(image), ocr_result)
│ │ │ │ │ └ None
│ │ │ │ └ <PIL.Image.Image image mode=RGB size=1398x2008 at 0x19395916920>
│ │ │ └
│ │ └ <module 'numpy' from 'C:\IA\MinerU\env_2\lib\site-packages\numpy\__init__.py'>
│ └ <rapid_table.main.RapidTable object at 0x00000193AD23EA10>
└ <magic_pdf.model.sub_modules.table.rapidtable.rapid_table.RapidTableModel object at 0x000001939627FA60>
File "C:\IA\MinerU\env_2\lib\site-packages\rapid_table\main.py", line 55, in __call__
dt_boxes, rec_res = self.get_boxes_recs(ocr_result, h, w)
│ │ │ │ └ 1398
│ │ │ └ 2008
│ │ └ None
│ └ <function RapidTable.get_boxes_recs at 0x0000019395746DD0>
└ <rapid_table.main.RapidTable object at 0x00000193AD23EA10>
File "C:\IA\MinerU\env_2\lib\site-packages\rapid_table\main.py", line 69, in get_boxes_recs
dt_boxes, rec_res, scores = list(zip(*ocr_result))
└ None
TypeError: zip() argument after * must be an iterable, not NoneType
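The last frame shows the immediate cause: `ocr_result` is `None` when it reaches `list(zip(*ocr_result))`, and unpacking `None` with `*` raises the `TypeError`. The sketch below reproduces that in isolation and shows one possible guard. It is not the `rapid_table` source: `unpack_none` and `split_ocr_result` are hypothetical names, and the `(box, text, score)` row layout is an assumption inferred from the three names on the left-hand side of the failing line.

```python
def unpack_none() -> str:
    """Reproduce the failing pattern and return the error message."""
    ocr_result = None  # the value the traceback shows being passed in
    try:
        list(zip(*ocr_result))
    except TypeError as exc:
        return str(exc)
    return ""


def split_ocr_result(ocr_result):
    """Hypothetical guard: return three empty lists when OCR yields nothing.

    Assumes each row of ocr_result is a (box, text, score) triple.
    """
    if not ocr_result:  # covers both None and an empty list
        return [], [], []
    dt_boxes, rec_res, scores = (list(col) for col in zip(*ocr_result))
    return dt_boxes, rec_res, scores


if __name__ == "__main__":
    print(unpack_none())
    print(split_ocr_result(None))
    print(split_ocr_result([([0, 0, 10, 10], "cell", 0.98)]))
```

Whether returning empty lists is the right behaviour for a table region with no recognised text is a decision for the maintainers; the guard only shows where the `None` could be caught.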
How to reproduce the bug
Process the affected file. The run fails with the same log output and traceback shown in the description above.
Operating system
Windows
Python version
3.10
Software version (magic-pdf --version)
0.9.x
Device mode
cuda