- 2024.11.15
  - Completed the initial version of the code, converted the UVDoc model to ONNX, and improved pre- and post-processing.
- 2024.12.15
  - Added deblurring, shadow removal, and binarization functions and models; upgraded to RapidUndistort.
This repository provides document distortion correction, deblurring, shadow removal, and binarization. It offers multiple models and flexible task combinations, and supports automatic model downloading. The original PyTorch model sources are listed in the Acknowledgments section.
pip install rapid-undistorted
import cv2
from rapid_undistorted.inference import InferenceEngine
img_path = "img/demo.jpg"
engine = InferenceEngine()
# Distortion correction -> shadow removal -> deblurring (deblurring model given explicitly)
output_img, elapse = engine(img_path, ["unwrap", "unshadow", ("unblur", "OpenCvBilateral")])
# Shadow removal -> deblurring (deblurring model given explicitly)
# output_img, elapse = engine(img_path, ["unshadow", ("unblur", "OpenCvBilateral")])
# With no model given, the first unblur model in the YAML configuration file is used
# output_img, elapse = engine(img_path, ["unshadow", "unblur"])
# Binarization used in place of shadow removal
# output_img, elapse = engine(img_path, ["unwrap", "binarize", "unblur"])
print(f"doc unwrap elapse:{elapse}")
cv2.imwrite("result.png", output_img)
- For deblurring English text and digits, the NAFDPM model gives better results; for Chinese text, the OpenCV method (OpenCvBilateral) is more suitable.
- The shadow removal model is more capable and produces better results than binarization, so using binarization directly is not recommended.
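The suggestions above can be turned into a small helper that assembles the task list. This is only a sketch: `build_task_list` and the language codes `"en"` and `"zh"` are hypothetical names chosen for illustration and are not part of `rapid_undistorted`. For any other language it falls back to plain `"unblur"`, which selects the first unblur model in the configuration file.

```python
from typing import List, Tuple, Union

# A task is either a task name or a (task name, model name) pair.
Task = Union[str, Tuple[str, str]]

# Assumed language codes; the model names come from the suggestions above.
UNBLUR_MODEL_BY_LANGUAGE = {
    "en": "NAFDPM",           # English text and digits
    "zh": "OpenCvBilateral",  # Chinese text
}


def build_task_list(language: str, unwrap: bool = True, unshadow: bool = True) -> List[Task]:
    """Hypothetical helper: build a task list that follows the usage suggestions."""
    tasks: List[Task] = []
    if unwrap:
        tasks.append("unwrap")
    if unshadow:
        # Shadow removal is preferred over "binarize", so binarize is never added here.
        tasks.append("unshadow")
    model = UNBLUR_MODEL_BY_LANGUAGE.get(language)
    if model is None:
        # Unknown language: leave the choice to the configuration default.
        tasks.append("unblur")
    else:
        tasks.append(("unblur", model))
    return tasks
```

The returned list can be passed directly as the second argument of the engine call, for example `engine(img_path, build_task_list("zh"))`.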
You can pass a configuration file that declares the required task types, the models for each task, and the model paths.
config_path = "configs/config.yaml"
engine = InferenceEngine(config_path)
tasks:
  unwrap:
    models:
      - type: "UVDoc"
        path:
        use_cuda: false
  unshadow:
    models:
      - type: "GCDnet"
        sub_models:
          - type: "GCDnet"
            path:
            use_cuda: false
            use_dml: false
          - type: "DRnet"
            path:
            use_cuda: false
  binarize:
    models:
      - type: "UnetCnn"
        path:
        use_cuda: false
  unblur:
    models:
      - type: "OpenCvBilateral"
        path:
      - type: "NAFDPM"
        path:
        use_cuda: false
engine(img_path, task_list)
engine(img_path, ["unwrap", "unshadow", ("unblur", "OpenCvBilateral")])
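To make the two accepted forms of a task entry concrete, the sketch below resolves a task list into explicit (task, model) pairs: a bare string takes the first model listed for that task, and a tuple names the model directly. It is not the library's implementation. `CONFIG_MODELS` is a hand-written dictionary mirroring the sample configuration, `resolve_tasks` is a hypothetical name, and raising `ValueError` for unknown names is an assumption about reasonable behavior rather than documented behavior.

```python
from typing import Dict, List, Tuple, Union

Task = Union[str, Tuple[str, str]]

# Model types per task, in the order they appear in the sample configuration.
CONFIG_MODELS: Dict[str, List[str]] = {
    "unwrap": ["UVDoc"],
    "unshadow": ["GCDnet"],
    "binarize": ["UnetCnn"],
    "unblur": ["OpenCvBilateral", "NAFDPM"],
}


def resolve_tasks(
    task_list: List[Task],
    config_models: Dict[str, List[str]] = CONFIG_MODELS,
) -> List[Tuple[str, str]]:
    """Hypothetical helper: expand each entry to an explicit (task, model) pair."""
    resolved = []
    for item in task_list:
        # A bare string means "use the default model for this task".
        task, model = (item, None) if isinstance(item, str) else item
        if task not in config_models:
            raise ValueError(f"unknown task: {task!r}")
        available = config_models[task]
        if model is None:
            model = available[0]  # first model declared for the task
        elif model not in available:
            raise ValueError(f"model {model!r} is not declared for task {task!r}")
        resolved.append((task, model))
    return resolved
```

With the sample configuration, `["unshadow", "unblur"]` resolves to GCDnet followed by OpenCvBilateral, because OpenCvBilateral is listed first under `unblur`.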
- unwrap: UVDoc
- unshadow: GCDnet
- unblur: NAFDPM
- binarize: UnetCnn
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
Other suggestions and integration scenarios are also welcome; the author will respond and offer support.
This project is licensed under the Apache 2.0 license.