This repository hosts a Cyberbullying Detection App tailored for the Arabic language. The app is designed to identify instances of cyberbullying in Arabic text using various machine learning and deep learning algorithms. The models are fine-tuned for natural language processing and classification tasks, aiming to differentiate between bullying and non-bullying content. The project involves preprocessing, feature representation, model training, and evaluation.
- Random Forest (RF)
- Support Vector Machine (SVM)
- Multinomial Naive Bayes (NB)
- Decision Tree (DT)
- Preprocessing: Prepare the data for training and evaluation by tokenizing the text, applying stemming or lemmatization, and removing stop words and unnecessary characters.
- Feature Representation: Convert the text into numerical features, using representations suited to the classical algorithms (RF, SVM, DT, NB) and to the deep learning models (BERT, RNN, ANN, CNN, BiLSTM) respectively.
- Models:
  - BERT: Bidirectional Encoder Representations from Transformers.
  - RF: Random Forest.
  - SVM: Support Vector Machine.
  - DT: Decision Tree.
  - NB: Naive Bayes.
- Evaluation: Assess each algorithm's performance using accuracy, precision, recall, and F1-score metrics.
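
The preprocessing step can be illustrated with a minimal sketch. It is not the repository's actual code: the function name `preprocess`, the tiny `STOP_WORDS` set, and the exact cleaning rules are assumptions chosen for illustration, and stemming or lemmatization is left out because it would need an external library.

```python
import re

# Hypothetical, deliberately tiny stop-word list (an assumption, not the project's list).
STOP_WORDS = {"في", "من", "على", "و", "يا"}

DIACRITICS = re.compile(r"[\u064B-\u0652]")      # Arabic short-vowel marks (fathatan .. sukun)
TATWEEL = "\u0640"                               # elongation character
NON_ARABIC = re.compile(r"[^\u0621-\u064A\s]")   # anything that is not an Arabic letter or whitespace
REPEATS = re.compile(r"(.)\1{2,}")               # the same character three or more times in a row


def preprocess(text: str) -> list[str]:
    """Clean an Arabic string and return its tokens without stop words."""
    text = DIACRITICS.sub("", text)      # drop diacritics first so words are not split
    text = text.replace(TATWEEL, "")     # drop elongation marks
    text = NON_ARABIC.sub(" ", text)     # replace emoji, digits, punctuation, Latin text with spaces
    text = REPEATS.sub(r"\1", text)      # collapse exaggerated repeats to a single character
    tokens = text.split()                # whitespace tokenization
    return [tok for tok in tokens if tok not in STOP_WORDS]


print(preprocess("عنوان خرا 😠😠👎"))
```

Only runs of three or more identical characters are collapsed, so legitimately doubled letters are left intact.
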
The Arabic dataset categorizes online material into two classes: "Bullying" and "Not Bullying." This binary classification provides a clear differentiation between harmful and non-harmful online interactions in the Arabic language, supporting the creation of models to address and mitigate cyberbullying problems.
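
To make the two-class setup concrete, here is a small sketch of how a multinomial Naive Bayes classifier could assign one of the two labels from word counts. It is a pure-Python illustration, not the project's implementation: the class name `TinyMultinomialNB` is hypothetical, the bag-of-words representation is an assumption, and the two training rows are toy data built from the sample sentences in this README. In practice a library such as scikit-learn would normally be used instead.

```python
import math
from collections import Counter


class TinyMultinomialNB:
    """Multinomial Naive Bayes over token counts with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.vocab = {tok for doc in docs for tok in doc}
        self.word_counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc)
        class_counts = Counter(labels)
        self.log_prior = {c: math.log(class_counts[c] / len(labels)) for c in self.classes}
        self.total = {c: sum(self.word_counts[c].values()) for c in self.classes}
        return self

    def predict(self, doc):
        vocab_size = len(self.vocab)
        scores = {}
        for c in self.classes:
            score = self.log_prior[c]
            for tok in doc:
                if tok in self.vocab:  # tokens never seen in training are ignored
                    likelihood = (self.word_counts[c][tok] + 1) / (self.total[c] + vocab_size)
                    score += math.log(likelihood)
            scores[c] = score
        return max(scores, key=scores.get)


# Toy training data (illustrative only, far too small for a real model).
train_docs = [["عنوان", "خرا"], ["شغلك", "جميل", "حلو"]]
train_labels = ["Bullying", "Not Bullying"]

model = TinyMultinomialNB().fit(train_docs, train_labels)
print(model.predict(["شغلك", "حلو"]))
```
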
Algorithm | Accuracy | Precision | Recall | F1-Score |
---|---|---|---|---|
RF | 0.94 | 0.93 | 0.94 | 0.93 |
SVM | 0.94 | 0.94 | 0.94 | 0.93 |
NB | 0.93 | 0.93 | 0.93 | 0.92 |
DT | 0.93 | 0.92 | 0.93 | 0.92 |
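
For reference, the four metrics in the table can be computed from true and predicted labels as sketched below. This README does not say whether the reported scores are per-class or averaged over both classes, so the sketch assumes scores for a single positive class ("Bullying"); the function name `binary_metrics` is hypothetical.

```python
def binary_metrics(y_true, y_pred, positive="Bullying"):
    """Return accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)   # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)   # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)   # false negatives
    tn = sum(t != positive and p != positive for t, p in pairs)   # true negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels for illustration only; these are not results from the project.
y_true = ["Bullying", "Bullying", "Bullying", "Not Bullying", "Not Bullying"]
y_pred = ["Bullying", "Bullying", "Not Bullying", "Not Bullying", "Bullying"]
print(binary_metrics(y_true, y_pred))
```
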
To interact with the Cyberbullying Detection App, visit the live site and enter text for cyberbullying detection.
- "عنوان خرا 😠😠😠😠😠😠😠😠👎👎👎👎👎"
  - Prediction: Bullying
- "وامبارح اشتركت بقناتك . شغلك جميل وحلو . واكثر شي بعجبني ب فيديوهاتك انك بتحسسني اني معك ."
  - Prediction: Not Bullying