📊RapidTableDetection


Recent Updates

  • 2024.10.15
    • Completed the initial version of the code, including three modules: object detection, semantic segmentation, and corner direction recognition.
  • 2024.11.2
    • Added new YOLOv11 object detection and edge detection models.
    • Added automatic model downloading and reduced the package size.
    • Added ONNX-GPU inference support and provided benchmark results.
    • Added an online demo.

Introduction

💡✨ RapidTableDetection is a powerful and efficient table detection system that supports various types of tables, including those in papers, journals, magazines, invoices, receipts, and sign-in sheets.

🚀 It offers models derived from both PaddlePaddle and YOLO. The default model combination needs only 1.2 s for single-image inference on CPU; the smallest combination needs 0.4 s on ONNX-GPU (V100), and the PaddlePaddle-GPU version needs 0.2 s.

🛠️ The three modules can be freely combined and independently trained and optimized; ONNX conversion scripts and fine-tuning training solutions are provided.

🌟 The whl package is easy to integrate and use, providing strong support for downstream OCR, table recognition, and data collection.

The approach is based on the 2nd-place solution of the Baidu Table Detection Competition, retrained on a large amount of real-world data. Thanks to the providers of the training datasets. The author works on this open-source project in their spare time; please show your support by giving it a star.

Usage Recommendations

  • Document scenarios: no perspective distortion or rotation; use object detection only.
  • Photo scenarios with small rotation (-90° to 90°): the top-left corner is assumed by default, so corner direction recognition is not needed.
  • Use the online demo to find the model combination that suits your scenario; the sketch after this list shows how to toggle the modules.
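For example, a minimal sketch of the scenario-specific combinations above, assuming the ONNX TableDetector constructor accepts the same use_obj_det / use_edge_det / use_cls_det switches that the PaddlePaddle example below passes to its constructor:

from rapid_table_det.inference import TableDetector

# Document scenario: flat scans with no perspective -- object detection alone is enough.
doc_detector = TableDetector(use_edge_det=False, use_cls_det=False)

# Photo scenario with small rotation (-90 to 90): keep edge detection for the
# perspective quad, but skip corner direction recognition (top-left is the default).
photo_detector = TableDetector(use_cls_det=False)

result, elapse = doc_detector("tests/test_files/chip.jpg")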

Online Experience

Try it online on ModelScope or Hugging Face.

Effect Demonstration

(See res_show.jpg and res_show2.jpg in the repository.)

Installation

Models are downloaded automatically; alternatively, you can download them manually from the ModelScope model repository.

pip install rapid-table-det

Parameter Explanation

Default values:

  • use_cuda=False: whether to enable GPU acceleration for inference.
  • obj_model_type="yolo_obj_det": object detection model type.
  • edge_model_type="yolo_edge_det": edge detection (semantic segmentation) model type.
  • cls_model_type="paddle_cls_det": corner direction classification model type.

Since ONNX gains only limited acceleration on GPU, for the fastest execution it is still recommended to use YOLOX directly or to install PaddlePaddle and run the native models (the full process can be provided on request). The quantized PaddlePaddle "s" models are actually slower and less accurate, but significantly smaller.

| model_type | Task Type | Training Source | Size | Single-Table Inference Time (V100-16G, CUDA 12, cuDNN 9, Ubuntu) |
| --- | --- | --- | --- | --- |
| yolo_obj_det | Table object detection | yolo11-l | 100 MB | CPU: 570 ms, GPU: 400 ms |
| paddle_obj_det | Table object detection | paddle yoloe-plus-x | 380 MB | CPU: 1000 ms, GPU: 300 ms |
| paddle_obj_det_s | Table object detection | paddle yoloe-plus-x + quantization | 95 MB | CPU: 1200 ms, GPU: 1000 ms |
| yolo_edge_det | Semantic segmentation | yolo11-l-segment | 108 MB | CPU: 570 ms, GPU: 200 ms |
| yolo_edge_det_s | Semantic segmentation | yolo11-s-segment | 11 MB | CPU: 260 ms, GPU: 200 ms |
| paddle_edge_det | Semantic segmentation | paddle-dbnet | 99 MB | CPU: 1200 ms, GPU: 120 ms |
| paddle_edge_det_s | Semantic segmentation | paddle-dbnet + quantization | 25 MB | CPU: 860 ms, GPU: 760 ms |
| paddle_cls_det | Direction classification | paddle pplcnet | 6.5 MB | CPU: 70 ms, GPU: 60 ms |
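As a concrete illustration, here is a sketch of selecting a specific combination from the table via the constructor parameters described above (model names come from the model_type column; use_cuda=True assumes a CUDA-enabled onnxruntime build is installed):

from rapid_table_det.inference import TableDetector

# Smallest-footprint combination from the table:
# yolo_obj_det (100 MB) + yolo_edge_det_s (11 MB) + paddle_cls_det (6.5 MB).
table_det = TableDetector(
    obj_model_type="yolo_obj_det",
    edge_model_type="yolo_edge_det_s",
    cls_model_type="paddle_cls_det",
    use_cuda=True,  # assumes a CUDA-enabled onnxruntime is installed
)
result, elapse = table_det("tests/test_files/chip.jpg")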

Execution parameters:

  • det_accuracy=0.7
  • use_obj_det=True
  • use_edge_det=True
  • use_cls_det=True
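A short sketch of overriding these defaults, assuming (as in the PaddlePaddle example below) they are constructor arguments and that det_accuracy is the minimum confidence for a detection to be kept:

from rapid_table_det.inference import TableDetector

# Keep only detections with confidence >= 0.9 instead of the default 0.7.
table_det = TableDetector(det_accuracy=0.9)
result, elapse = table_det("tests/test_files/chip.jpg")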

Quick Start

from rapid_table_det.inference import TableDetector

img_path = f"tests/test_files/chip.jpg"
table_det = TableDetector()

result, elapse = table_det(img_path)
obj_det_elapse, edge_elapse, rotate_det_elapse = elapse
print(
    f"obj_det_elapse: {obj_det_elapse}, edge_elapse: {edge_elapse}, rotate_det_elapse: {rotate_det_elapse}"
)
# Output visualization
# import os
# import cv2
# from rapid_table_det.utils.visuallize import img_loader, visuallize, extract_table_img
# 
# img = img_loader(img_path)
# img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
# file_name_with_ext = os.path.basename(img_path)
# file_name, file_ext = os.path.splitext(file_name_with_ext)
# out_dir = "rapid_table_det/outputs"
# if not os.path.exists(out_dir):
#     os.makedirs(out_dir)
# extract_img = img.copy()
# for i, res in enumerate(result):
#     box = res["box"]
#     lt, rt, rb, lb = res["lt"], res["rt"], res["rb"], res["lb"]
#     # With detection box and top-left corner position
#     img = visuallize(img, box, lt, rt, rb, lb)
#     # Perspective transformation to extract table image
#     wrapped_img = extract_table_img(extract_img.copy(), lt, rt, rb, lb)
#     cv2.imwrite(f"{out_dir}/{file_name}-extract-{i}.jpg", wrapped_img)
# cv2.imwrite(f"{out_dir}/{file_name}-visualize.jpg", img)

Using PaddlePaddle Version

You must download the models and specify their locations!

# (The default installation uses the GPU version of PaddlePaddle; you can override it with the CPU version.)
pip install rapid-table-det-paddle
from rapid_table_det_paddle.inference import TableDetector

img_path = f"tests/test_files/chip.jpg"

table_det = TableDetector(
    obj_model_path="models/obj_det_paddle",
    edge_model_path="models/edge_det_paddle",
    cls_model_path="models/cls_det_paddle",
    use_obj_det=True,
    use_edge_det=True,
    use_cls_det=True,
)
result, elapse = table_det(img_path)
obj_det_elapse, edge_elapse, rotate_det_elapse = elapse
print(
    f"obj_det_elapse: {obj_det_elapse}, edge_elapse: {edge_elapse}, rotate_det_elapse: {rotate_det_elapse}"
)
# Visualization (there may be more than one table in an image).
# Same imports as in the Quick Start example above (os, cv2, img_loader, visuallize, extract_table_img).
# img = img_loader(img_path)
# file_name_with_ext = os.path.basename(img_path)
# file_name, file_ext = os.path.splitext(file_name_with_ext)
# out_dir = "rapid_table_det_paddle/outputs"
# if not os.path.exists(out_dir):
#     os.makedirs(out_dir)
# extract_img = img.copy()
# for i, res in enumerate(result):
#     box = res["box"]
#     lt, rt, rb, lb = res["lt"], res["rt"], res["rb"], res["lb"]
#     # With detection box and top-left corner position
#     img = visuallize(img, box, lt, rt, rb, lb)
#     # Perspective transformation to extract table image
#     wrapped_img = extract_table_img(extract_img.copy(), lt, rt, rb, lb)
#     cv2.imwrite(f"{out_dir}/{file_name}-extract-{i}.jpg", wrapped_img)
# cv2.imwrite(f"{out_dir}/{file_name}-visualize.jpg", img)

FAQ (Frequently Asked Questions)

  1. Q: How can I fine-tune the model for a specific scenario?
    • A: Refer to this project, which provides detailed visualization steps and datasets. You can obtain the PaddlePaddle inference models from the Baidu Table Detection Competition. For YOLOv11, use the official scripts, which are straightforward; convert your data to COCO format and train following the official guidelines.
  2. Q: How do I export to ONNX?
    • A: For PaddlePaddle models, use the onnx_transform.ipynb notebook in the tools directory of this project. For YOLOv11, follow the official method, which takes a single line (see the sketch after this list).
  3. Q: Can distorted images be corrected?
    • A: This project only handles rotation and perspective when extracting tables. For distorted (warped) images, correct the distortion first.
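For reference, the YOLOv11 one-line ONNX export via the official ultralytics package looks like the following sketch (the checkpoint path is a placeholder for your own fine-tuned weights):

from ultralytics import YOLO

# Load a trained YOLOv11 checkpoint (placeholder path) and export it to ONNX.
model = YOLO("path/to/your_yolo11_checkpoint.pt")
model.export(format="onnx")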

Acknowledgments

Contribution Guidelines

Pull requests are welcome. For major changes, please open an issue to discuss what you would like to change.

If you have other suggestions or integration scenarios, the author will actively respond and provide support.

Open Source License

This project is licensed under the Apache 2.0 open source license.