This repository is a curated collection of papers on document image processing methods, covering appearance enhancement, deshadowing, dewarping, deblurring, and binarization.
Appearance enhancement (also known as illumination correction) is not limited to a specific degradation type; it aims to restore a clean appearance similar to that of a scanned document or a born-digital PDF file.
Dataset | Num. (train/test) | Type | Example | Download |
---|---|---|---|---|
Doc3DShade | 90K | Synth | Example | Link |
DocProj | 2450 | Synth | Example | Link |
DocUNet from DocAligner | 130 | Real | Example | Link |
RealDAE | 600 (450/150) | Real | Example | Link |
Inv3D | 25K | Synth | Example | Link |
Venue | Methods | Training data | DocUNet from DocAligner (130) SSIM | DocUNet from DocAligner (130) PSNR | RealDAE (150) SSIM | RealDAE (150) PSNR |
---|---|---|---|---|---|---|
- | - | - | 0.7195 | 13.09 | 0.8264 | 12.26 |
TOG'19 | DocProj | DocProj | 0.7098 | 14.71 | 0.8684 | 19.35 |
BMVC'20 | Das et al. | Doc3DShade | 0.7276 | 16.42 | 0.8633 | 19.87 |
MM'21 | DocTr | DocProj | 0.7067 | 15.78 | 0.7925 | 18.62 |
MM'22 | UDoc-GAN | DocProj | 0.6833 | 14.29 | 0.7558 | 16.43 |
TAI'23 | GCDRNet | RealDAE | 0.7658 | 17.09 | 0.9423 | 24.42 |
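The PSNR column in the table above can be illustrated with a minimal sketch of the standard definition, PSNR = 10 · log10(MAX² / MSE). This is not the benchmark's evaluation script: the function name `psnr`, the nested-list grayscale representation, and the 8-bit peak value of 255 are assumptions made for illustration, and the actual protocol (color space, resizing, alignment) may differ.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images.

    Images are row-major nested lists of pixel intensities (an assumption for
    this sketch; real pipelines typically use array libraries).
    """
    if len(reference) != len(restored) or any(
        len(ref_row) != len(out_row) for ref_row, out_row in zip(reference, restored)
    ):
        raise ValueError("images must have the same shape")

    squared_error = 0.0
    count = 0
    for ref_row, out_row in zip(reference, restored):
        for ref_px, out_px in zip(ref_row, out_row):
            squared_error += (ref_px - out_px) ** 2
            count += 1
    if count == 0:
        raise ValueError("images must not be empty")

    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values indicate that the restored image is closer to the reference; identical images yield infinity under this definition.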
Deshadowing aims to remove shadows, which are mainly caused by occlusion, and recover shadow-free document images.
* indicates that the implementation is unofficial.
Dataset | Num. (train/test) | Type | Example | Download |
---|---|---|---|---|
RDD | 4916 (4371/545) | Real | Example | Link |
Kligler et al. | 300 | Real | Example | Link |
FSDSRD | 14200 | Synth | Example | Link |
Jung et al. | 87 | Real | Example | Link |
OSR | 237 | Real | Example | Link |
WEZUT OCR | 176 | Real | Example | Link |
SD7K | 7620 (6479/760) | Real | Example | Link |
SynDocDS | 50K (40K/5K) | Synth | - | Link |
Coming soon ...
Dewarping, also referred to as geometric rectification, aims to rectify document images that suffer from curves, folds, crumples, perspective/affine deformation and other geometric distortions.
Dataset | Num. | Type | Example | Download/Codes |
---|---|---|---|---|
DocUNet | 130 | Real | Example | Link |
Doc3D | 100K | Synth | - | Link |
DIW | 5K | Real | Example | Link |
WarpDoc | 1020 | Real | Example | Link |
DIR300 | 300 | Real | Example | Link |
Inv3D | 25K | Synth | Example | Link |
DICP | - | Synth | - | Link |
DIF | - | Synth | - | Link |
Simulated Paper | 90K | Synth | - | Link |
DocReal | 200 | Real | Example | Link |
UVDoc | 20K | Synth | - | Link |
Venue | Method | DocUNet (130) MS-SSIM↑ | DocUNet (130) LD↓ | DocUNet (130) AD↓ | DIR300 (300) MS-SSIM↑ | DIR300 (300) LD↓ | DIR300 (300) AD↓ |
---|---|---|---|---|---|---|---|
ICCV'19 | DewarpNet | 0.474 | 8.39 | 0.426 | 0.492 | 13.94 | 0.331 |
DAS'20 | FCN-based | 0.448 | 7.84 | 0.434 | 0.503 | 9.75 | 0.331 |
ICCV'21 | Piece-Wise | 0.492 | 8.64 | 0.468 | |||
ICDAR'21 | DDCP | 0.473 | 8.99 | 0.453 | 0.552 | 10.95 | 0.357 |
MM'21 | DocTr | 0.511 | 7.76 | 0.396 | 0.616 | 7.21 | 0.254 |
CVPR'22 | RDGR | 0.497 | 8.51 | 0.461 | |||
MM'22 | Marior | 0.478 | 7.27 | 0.403 | |||
ECCV'22 | DocGeoNet | 0.504 | 7.71 | 0.380 | 0.638 | 6.40 | 0.242 |
SIGGRAPH'22 | PaperEdge | 0.473 | 7.81 | 0.392 | 0.583 | 8.00 | 0.255 |
Arxiv'22 | DocScanner-L | 0.518 | 7.45 | 0.334 | |||
ICCV'23 | Li et al. | 0.497 | 8.43 | 0.376 | 0.607 | 7.68 | 0.244 |
WACV'23 | DocReal | 0.50 | 7.03 | ||||
TCSVT'23 | DRNet | 0.51 | 7.42 | ||||
TMM'23 | DocTr++ | 0.51 | 7.54 | ||||
Arxiv'23 | Polar-Doc | 0.605 | 7.17 | 0.206 | |||
Arxiv'23 | MetaDoc | 0.502 | 7.42 | 0.315 | 0.638 | 5.75 | 0.178 |
SIGGRAPH'23 | UVDoc | 0.544 | 6.83 | 0.315 | |||
ACM TOG'23 | LA-DocFlatten | 0.526 | 6.72 | 0.300 | 0.651 | 5.70 | 0.195 |
Note that the 127th and 128th distorted images in the DocUNet benchmark are rotated by 180 degrees and therefore do not match their ground-truth documents. The results reported here are computed on the corrected data.
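The correction described in the note can be sketched as follows. This is only an illustration under stated assumptions: images are represented as row-major nested lists, the helper names `rotate_180` and `correct_benchmark` are hypothetical, and the mapping from "127th" and "128th" to 1-based integer keys is assumed rather than taken from the benchmark's file naming.

```python
# Assumed 1-based positions of the rotated distorted images, taken from the note above.
ROTATED_INDICES = {127, 128}


def rotate_180(image):
    """Return a copy of a row-major image rotated by 180 degrees.

    Rotating by 180 degrees reverses the row order and the pixel order
    within every row.
    """
    return [list(reversed(row)) for row in reversed(image)]


def correct_benchmark(images):
    """Rotate only the affected images; leave all others unchanged.

    `images` maps a 1-based index to an image (hypothetical layout).
    """
    return {
        index: rotate_180(image) if index in ROTATED_INDICES else image
        for index, image in images.items()
    }
```

Applying the rotation twice returns the original image, so it is worth checking that a local copy of the benchmark has not already been corrected before running such a step.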
Dataset | Num. (train/test) | Type | Example | Download |
---|---|---|---|---|
TDD (text deblur dataset) | 67.6K (66K/1.6K) | Synth | Example | Link |
Coming soon ...
Dataset | Num. | Type | Example | Download |
---|---|---|---|---|
DocEng 2019 | 15 | Real | Example | Link |
DocEng 2020 | 32 | Real | Example | Link |
DocEng 2021 | 222 | Real | Example | Link |
DocEng 2022 | 80 | Real | Example | Link |
DIBCO 2009 | 10 | Real | Example | Link |
H-DIBCO 2010 | 10 | Real | Example | Link |
DIBCO 2011 | 16 | Real | Example | Link |
H-DIBCO 2012 | 14 | Real | Example | Link |
DIBCO 2013 | 16 | Real | Example | Link |
H-DIBCO 2014 | 10 | Real | Example | Link |
H-DIBCO 2016 | 10 | Real | Example | Link |
DIBCO 2017 | 20 | Real | Example | Link |
DIBCO 2018 | 10 | Real | Example | Link |
DIBCO 2019 | 10 | Real | Example | Link |
Bickley Diary | 7 | Real | Example | Link |
Synchromedia Multispectral (MSI) | 240 | Real | Example | Link |
Persian Heritage Image Binarization (PHIBD) | 15 | Real | Example | Link |
Palm Leaf | 50 | Real | Example | Link |
NoisyOffice | 216 | Synth | Example | Link |
LRDE Document Binarization Dataset | 125 | Real | - | Link |
Shipping label dataset | 1082 | Real | Example | Link |
Coming Soon ...