OCR-on-PDF

This notebook searches large, unsearchable (scanned) PDF files for a keyword.

Supported languages: English and Persian

This notebook was developed on Google Colab on 08 Dec 2024.

Installation:

!apt-get install -y tesseract-ocr

!apt-get install -y tesseract-ocr-fas

!pip install pytesseract pdf2image Pillow

!apt-get install -y poppler-utils
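
Once the packages above are installed, a keyword search could look roughly like the sketch below. It is not taken from the notebook itself: the function names (`find_keyword_pages`, `ocr_pdf_pages`, `search_pdf`), the 200 DPI setting, and the combined `eng+fas` language string are assumptions made for illustration. The idea is to render one page at a time with pdf2image, run Tesseract on it through pytesseract, and record the pages whose text contains the keyword.

```python
def find_keyword_pages(page_texts, keyword):
    """Return the 1-based page numbers whose text contains the keyword.

    Matching is case-insensitive; page_texts can be any iterable of strings.
    """
    needle = keyword.casefold()
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if needle in text.casefold()
    ]


def ocr_pdf_pages(pdf_path, lang="eng+fas", dpi=200):
    """Yield the OCR text of each page of a scanned PDF, one page at a time.

    lang and dpi are illustrative defaults, not values from the notebook.
    """
    # Imported here so find_keyword_pages can be used without the OCR stack.
    from pdf2image import convert_from_path, pdfinfo_from_path
    import pytesseract

    page_count = pdfinfo_from_path(pdf_path)["Pages"]
    for page in range(1, page_count + 1):
        # Render a single page per call to keep memory bounded on large PDFs.
        image = convert_from_path(
            pdf_path, dpi=dpi, first_page=page, last_page=page
        )[0]
        yield pytesseract.image_to_string(image, lang=lang)


def search_pdf(pdf_path, keyword, lang="eng+fas"):
    """OCR every page of the PDF and return the pages containing the keyword."""
    return find_keyword_pages(ocr_pdf_pages(pdf_path, lang=lang), keyword)
```

For example, `search_pdf("scan.pdf", "invoice")` would return a list of page numbers, where `scan.pdf` is a placeholder path. OCR output on scanned pages is often imperfect, so an exact substring match can miss pages where the keyword was misread.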