This project provides an API that scans a KTP (Indonesian identity card) image and converts its key fields (NIK, name, place and date of birth) into text using PyTesseract Optical Character Recognition (OCR). It also includes a deep learning (YOLO) based KTP detector that automatically crops the KTP out of the image so that OCR can read the text more reliably; this feature is still very imperfect and needs further development. Thanks to the developers who previously built most of this system.
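As a rough illustration of the OCR step described above, the sketch below reads text from an image with `pytesseract.image_to_string` (using the Indonesian language pack) and then pulls a NIK and a date out of the raw text with regular expressions. The function names `read_text` and `extract_fields` are hypothetical, and the regex-based parsing is an assumption made for this example, not necessarily how this project extracts its fields.

```python
import re

# A NIK is a 16-digit number; dates on a KTP are printed as DD-MM-YYYY.
NIK_PATTERN = re.compile(r"(?<!\d)\d{16}(?!\d)")
DATE_PATTERN = re.compile(r"(?<!\d)\d{2}-\d{2}-\d{4}(?!\d)")


def read_text(image_path):
    """Run Tesseract on an image file and return the raw text."""
    # Imported lazily so the parsing helper below works without these packages.
    from PIL import Image
    import pytesseract

    return pytesseract.image_to_string(Image.open(image_path), lang="ind")


def extract_fields(ocr_text):
    """Return the first NIK and date found in the OCR text (None if absent)."""
    nik = NIK_PATTERN.search(ocr_text)
    date = DATE_PATTERN.search(ocr_text)
    return {
        "nik": nik.group(0) if nik else None,
        "tgl_lahir": date.group(0) if date else None,
    }
```

In practice OCR output is noisy (for example, `O` read in place of `0`), so a real pipeline would likely need cleanup or fuzzy matching before a strict pattern like this succeeds.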
- Flask
pip install flask
- Numpy
pip install numpy
- OpenCV
pip install opencv-python
- Pandas
pip install pandas
- PIL (Pillow)
pip install pillow
- PyTesseract
sudo apt-get install tesseract-ocr
sudo apt-get install tesseract-ocr-ind
pip install pytesseract
- TextDistance
pip install textdistance
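After installing, a quick way to confirm that everything is in place is to check that each package can be found by Python and that the `tesseract` binary is on the `PATH`. This is a minimal sketch; the helper names `missing_modules` and `tesseract_available` are hypothetical, and the module list simply maps the packages above to their import names (`opencv-python` imports as `cv2`, `pillow` as `PIL`).

```python
import importlib.util
import shutil

# Import names for the packages listed above.
REQUIRED_MODULES = [
    "flask", "numpy", "cv2", "pandas", "PIL", "pytesseract", "textdistance",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def tesseract_available():
    """Return True if the tesseract executable is on the PATH."""
    return shutil.which("tesseract") is not None
```

Calling `missing_modules()` returns an empty list when every package is installed; any names it does return still need a `pip install`.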
Create a virtual environment:
python -m venv env
Activate the virtual environment (Windows):
env\Scripts\activate.bat
Activate the virtual environment (Unix):
source ./env/bin/activate
To run the program, use the commands below:
export FLASK_APP=app
flask run
Alternatively, run:
python app.py
Parameter | Data Type | Mandatory | Notes |
---|---|---|---|
image | Image file | M | Photo of the KTP |
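A request that satisfies the table above can be sketched with the standard library alone. The code below builds a `multipart/form-data` body with the file in the `image` field and posts it. The helper names `build_multipart` and `post_ktp` are hypothetical, and the endpoint URL is not stated in this document, so any route passed to `post_ktp` is an assumption.

```python
import json
import urllib.request
import uuid


def build_multipart(field_name, filename, content, mime_type="image/jpeg"):
    """Encode one file as a multipart/form-data body.

    Returns the body bytes and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: {mime_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def post_ktp(url, image_path):
    """POST a KTP photo in the `image` field and return the decoded JSON."""
    with open(image_path, "rb") as fh:
        body, content_type = build_multipart("image", "ktp.jpg", fh.read())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running locally on Flask's default port, a call might look like `post_ktp("http://127.0.0.1:5000/ocr", "ktp.jpg")`, where `/ocr` is a placeholder route; check `app.py` for the actual path.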
Parameter | Description |
---|---|
nik | NIK read by OCR |
nama | Name read by OCR |
tempat_lahir | Place of birth read by OCR |
tgl_lahir | Date of birth read by OCR (DD-MM-YYYY) |
time_elapsed | Processing time required (seconds) |
{
  "error": false,
  "message": "Proses OCR Berhasil",
  "result": {
    "nik": "1234567890123456",
    "nama": "DENNY SEPTIAN",
    "tempat_lahir": "JAKARTA",
    "tgl_lahir": "10-10-1999",
    "jenis_kelamin": "LAKI-LAKI",
    "agama": "ISLAM",
    "status_perkawinan": "BELUM KAWIN",
    "pekerjaan": "PELAJAR/MAHASISWA",
    "kewarganegaraan": "WNI",
    "alamat": {
      "name": "DUSUN 1 OGAN 5",
      "rt_rw": "001/002",
      "kel_desa": "SUNGAI ARE",
      "kecamatan": "ALANG-ALANG LEBAR",
      "kabupaten": "OGAN ILIR",
      "provinsi": "SUMATERA SELATAN"
    },
    "time_elapsed": "6.306"
  }
}
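On the client side, a response shaped like the example above can be parsed with the standard library. The sketch below checks the `error` flag, converts `tgl_lahir` from its DD-MM-YYYY string to a `date`, and converts `time_elapsed` to a float. The function name `summarize_response` is hypothetical, and the assumption that a failed request also carries a `message` field is not confirmed by this document.

```python
import json
from datetime import datetime


def summarize_response(raw_json):
    """Parse an OCR response and return a few typed fields."""
    payload = json.loads(raw_json)
    if payload.get("error"):
        # Assumption: error responses include a human-readable message.
        raise ValueError(payload.get("message", "OCR failed"))

    result = payload["result"]
    return {
        "nik": result["nik"],
        "nama": result["nama"],
        # tgl_lahir is documented as DD-MM-YYYY.
        "birth_date": datetime.strptime(result["tgl_lahir"], "%d-%m-%Y").date(),
        # time_elapsed arrives as a string of seconds.
        "seconds": float(result["time_elapsed"]),
    }
```

Parsing the date eagerly also acts as a light sanity check: a misread such as `10-1O-1999` raises a `ValueError` instead of passing through silently.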
- Create a new folder, data/cnn
- Run the program
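The folder from the first step can also be created from Python, which behaves the same on Windows and Unix. This is a small sketch; the helper name `ensure_cnn_dir` is hypothetical, and it assumes `data/cnn` is resolved relative to the project root.

```python
from pathlib import Path


def ensure_cnn_dir(project_root="."):
    """Create <project_root>/data/cnn if it does not exist and return its path."""
    target = Path(project_root) / "data" / "cnn"
    # parents=True also creates data/; exist_ok=True makes repeat calls safe.
    target.mkdir(parents=True, exist_ok=True)
    return target
```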