Please read 'requirements.txt' for details.
A demo of the CV module can be found in ./notebooks/. To process files in batch, follow these steps:
- First, preprocess the data by running `code/cv/convert_pdf_to_jpg.py`.
- To test the public models:
  - Run `code/cv/download_models.jpy`.
  - Run `code/cv/parse_layout.jpy`.
  - Call the functions in `code/cv/evaluate.py`. See `notebooks/evaluate.ipynb` for recommended usage.
- To finetune the model, see the code in `code/cv/layout5_detectron2.ipynb`.
- Finally, run `code/cv/OCR.py` to extract characters. This step requires `API_KEY` and `API_SECRET`.
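
The batch steps above could be chained with a small driver script. The following is only a sketch under several assumptions that the repository does not confirm: that each script runs without command-line arguments from the repository root, and that `API_KEY` and `API_SECRET` are supplied as environment variables. The helper names (`build_commands`, `missing_credentials`, `run_pipeline`) are hypothetical and not part of the repository. Evaluation and finetuning are notebook-driven, so they are left out.

```python
import os
import subprocess
import sys

# Script paths copied verbatim from the steps above.
STEPS = [
    "code/cv/convert_pdf_to_jpg.py",
    "code/cv/download_models.jpy",
    "code/cv/parse_layout.jpy",
    "code/cv/OCR.py",
]

# Assumption: the OCR credentials are read from environment variables.
REQUIRED_FOR_OCR = ("API_KEY", "API_SECRET")


def missing_credentials(env=None):
    """Return the names of OCR credentials that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_FOR_OCR if not env.get(name)]


def build_commands(steps=STEPS):
    """Build one `python <script>` command per step (assumes no arguments)."""
    return [[sys.executable, step] for step in steps]


def run_pipeline(repo_root=".", dry_run=True):
    """Run the steps in order from repo_root, stopping at the first failure."""
    missing = missing_credentials()
    if missing and not dry_run:
        raise RuntimeError("Missing OCR credentials: " + ", ".join(missing))
    for cmd in build_commands():
        if dry_run:
            # Only show what would be executed.
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True, cwd=repo_root)


if __name__ == "__main__":
    run_pipeline(dry_run=True)
```

With `dry_run=True` the script only prints the commands it would run. Checking the credentials before the first step means a long preprocessing run does not fail only when it reaches the OCR step.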