The ISL-CSLRT (Indian Sign Language) dataset consists of sign language video and sentence text pairs. It is split into training (781 videos), validation (377 videos), and testing (468 videos) sets.
The project is built on a hybrid 3D-CNN and LSTM model. The 3D CNN extracts spatial and temporal features from the video frames, and the LSTM converts those features into text tokens.
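Below is a minimal sketch of how such a hybrid model could be wired together in Keras. The clip shape, layer sizes, and number of output classes are illustrative assumptions, not the exact configuration used in gesture_model_training.py.

```python
# Illustrative 3D-CNN + LSTM sketch (assumed shapes and sizes, not the project's exact config).
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_FRAMES, HEIGHT, WIDTH, CHANNELS = 30, 112, 112, 3  # assumed clip shape
NUM_CLASSES = 100                                       # assumed output vocabulary size

model = models.Sequential([
    # 3D convolutions learn spatial patterns and short-range motion across frames.
    layers.Conv3D(32, (3, 3, 3), activation="relu", padding="same",
                  input_shape=(NUM_FRAMES, HEIGHT, WIDTH, CHANNELS)),
    layers.MaxPooling3D(pool_size=(1, 2, 2)),
    layers.Conv3D(64, (3, 3, 3), activation="relu", padding="same"),
    layers.MaxPooling3D(pool_size=(2, 2, 2)),
    # Flatten each remaining time step so the LSTM receives a sequence of feature vectors.
    layers.TimeDistributed(layers.Flatten()),
    # The LSTM models longer-range temporal structure across the clip.
    layers.LSTM(256),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

The sketch treats each clip as a single classification target for brevity; the actual training script may decode longer token sequences.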
Clone Repository:
git clone https://github.com/Aishbs/Signtext.git
Create Virtual Environment:
python -m venv myven
Activate Environment:
- On Windows:
myven\Scripts\activate
- On macOS/Linux:
source myven/bin/activate
Install Required Packages:
pip install -r requirements.txt
Preprocess Dataset:
- Extract frames from videos:
python frame.py train_videos train_frames
python frame.py val_videos val_frames
python frame.py test_videos test_frames
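The frame.py commands above take an input video directory and an output frame directory. A minimal OpenCV sketch of that kind of extraction is shown below; the sampling rate, resize dimensions, and output naming are assumptions and may differ from what frame.py actually does.

```python
# Hypothetical stand-in for frame.py: python extract_frames.py <video_dir> <frame_dir>
import os
import sys
import cv2

def extract_frames(video_dir, frame_dir, every_nth=5):
    """Save every Nth frame of each video in video_dir under frame_dir/<video_name>/."""
    for name in os.listdir(video_dir):
        cap = cv2.VideoCapture(os.path.join(video_dir, name))
        out_dir = os.path.join(frame_dir, os.path.splitext(name)[0])
        os.makedirs(out_dir, exist_ok=True)
        index = saved = 0
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            if index % every_nth == 0:
                frame = cv2.resize(frame, (112, 112))  # assumed model input size
                cv2.imwrite(os.path.join(out_dir, f"frame_{saved:04d}.jpg"), frame)
                saved += 1
            index += 1
        cap.release()

if __name__ == "__main__":
    extract_frames(sys.argv[1], sys.argv[2])
```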
Train the Model:
python gesture_model_training.py -d train_frames -e val_frames -b batch_size -l learning_rate -ep epochs
Replace `batch_size`, `learning_rate`, and `epochs` with appropriate values for your dataset.
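The flags in the command above (-d, -e, -b, -l, -ep) suggest an argparse-style interface. The sketch below shows how such arguments are typically parsed; the flag meanings follow the command, but the defaults and exact behavior inside gesture_model_training.py are assumptions.

```python
# Hypothetical sketch of the training script's command-line interface.
import argparse

parser = argparse.ArgumentParser(description="Train the 3D-CNN + LSTM gesture model.")
parser.add_argument("-d", "--train_dir", required=True, help="directory of training frames")
parser.add_argument("-e", "--val_dir", required=True, help="directory of validation frames")
parser.add_argument("-b", "--batch_size", type=int, default=8, help="clips per training batch")
parser.add_argument("-l", "--learning_rate", type=float, default=1e-4, help="optimizer step size")
parser.add_argument("-ep", "--epochs", type=int, default=30, help="passes over the training data")
args = parser.parse_args()

# A typical training loop would then build the model and call something like:
#   model.fit(train_data, validation_data=val_data, epochs=args.epochs)
print(args)
```

For example, a plausible invocation (values are illustrative) is: python gesture_model_training.py -d train_frames -e val_frames -b 8 -l 0.0001 -ep 30.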
Evaluate the Model:
python gesture_model_predictions.py -m model_dir -d test_frames
Provide the path to your trained model (`model_dir`) and the testing frames dataset (`test_frames`).
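A hedged sketch of what such an evaluation script typically does, loading the saved model and predicting a class for each clip of test frames, is shown below. The directory layout (one sub-folder of frames per clip) and the preprocessing are assumptions; gesture_model_predictions.py may differ.

```python
# Hypothetical prediction sketch: python predict_sketch.py <model_dir> <test_frames>
import os
import sys
import cv2
import numpy as np
import tensorflow as tf

def load_clip(clip_dir, num_frames=30, size=112):
    """Read, resize, and stack a fixed number of frames from one clip directory."""
    names = sorted(os.listdir(clip_dir))[:num_frames]
    clip = [cv2.resize(cv2.imread(os.path.join(clip_dir, n)), (size, size)) for n in names]
    while len(clip) < num_frames:              # pad short clips by repeating the last frame
        clip.append(clip[-1])
    return np.asarray(clip, dtype=np.float32) / 255.0

model_dir, test_dir = sys.argv[1], sys.argv[2]
model = tf.keras.models.load_model(model_dir)

for clip_name in sorted(os.listdir(test_dir)):
    clip = load_clip(os.path.join(test_dir, clip_name))
    probs = model.predict(clip[np.newaxis, ...], verbose=0)[0]
    print(f"{clip_name}: predicted class {int(np.argmax(probs))}")
```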
Run the Application:
python gestures_live_predictions.py -m model_dir
Provide the path to your best-trained model (`model_dir`).
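A minimal webcam loop of the kind a live-prediction script usually implements is sketched below: buffer the most recent frames, run the model on the buffered clip, and overlay the predicted label. The buffer length, input size, and key binding are assumptions; gestures_live_predictions.py may work differently.

```python
# Hypothetical live-prediction loop: python live_sketch.py <model_dir>
import sys
from collections import deque

import cv2
import numpy as np
import tensorflow as tf

NUM_FRAMES, SIZE = 30, 112                      # assumed clip length and input size

model = tf.keras.models.load_model(sys.argv[1])
buffer = deque(maxlen=NUM_FRAMES)               # rolling window of the most recent frames
cap = cv2.VideoCapture(0)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    buffer.append(cv2.resize(frame, (SIZE, SIZE)).astype(np.float32) / 255.0)
    label = "collecting frames..."
    if len(buffer) == NUM_FRAMES:
        clip = np.asarray(buffer)[np.newaxis, ...]          # shape (1, T, H, W, 3)
        probs = model.predict(clip, verbose=0)[0]
        label = f"class {int(np.argmax(probs))}"
    cv2.putText(frame, label, (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
    cv2.imshow("Sign to text", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):        # press q to quit
        break

cap.release()
cv2.destroyAllWindows()
```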
Demo Screenshot