The Text Emotion Classifier is a machine learning project that aims to detect and classify emotions in text data. The model uses deep learning techniques to process textual input and identify the underlying emotion, such as happiness, sadness, anger, or surprise. This project can be used for sentiment analysis, social media monitoring, customer feedback processing, and more.
The dataset used in this project is a labeled text dataset where each entry consists of a text sample and its corresponding emotion label. The dataset is stored in a train.txt file with the following structure:
- Text: The input text string.
- Emotions: The associated emotion label (e.g., "happy", "sad", "angry").
- Dataset link: Kaggle link
An example of the data structure:
I am feeling great today!;happy
This is so frustrating and annoying.;angry
The data file train.txt should be placed in the root directory of the project.
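The `text;label` layout shown above can be parsed with a few lines of standard-library Python. The following is a minimal sketch, not the project's own loader: the function names `parse_emotion_lines` and `load_emotion_file` are hypothetical, and splitting on the last semicolon is an assumption made so that semicolons inside the text itself do not break the parse.

```python
from pathlib import Path


def parse_emotion_lines(lines):
    """Turn 'text;label' lines into (text, label) pairs, skipping blank or malformed lines."""
    pairs = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        # Split on the LAST semicolon so semicolons inside the text are preserved.
        text, sep, label = line.rpartition(";")
        if not sep or not text or not label:
            continue  # no separator, or an empty text/label field
        pairs.append((text.strip(), label.strip()))
    return pairs


def load_emotion_file(path="train.txt"):
    """Read the dataset file (path assumed to be relative to the project root)."""
    with Path(path).open(encoding="utf-8") as handle:
        return parse_emotion_lines(handle)


sample = [
    "I am feeling great today!;happy",
    "This is so frustrating and annoying.;angry",
]
print(parse_emotion_lines(sample))
```

Calling `load_emotion_file()` from the project root would return the same kind of list for the whole of train.txt.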
To run this project locally, you'll need to have Python installed. Follow these steps to set up the environment:
- Clone the Repository:
git clone https://github.com/Vaibhav-kesarwani/Text_Emotion_Classifier.git
cd Text_Emotion_Classifier
- Create a Virtual Environment (optional but recommended):
python -m venv venv
source venv/bin/activate # On Windows use `venv\Scripts\activate`
- Install Required Packages: Install the dependencies by running:
pip install -r requirements.txt
To run the Text Emotion Classifier, follow these steps:
- Prepare the Dataset: Ensure that your train.txt file is in the root directory. This file should contain the text data and corresponding emotion labels, separated by a semicolon (;).
- Run the Script: Execute the main script to load the data and perform emotion classification:
jupyter notebook main.ipynb # Assumes Jupyter is installed; a .ipynb file cannot be run with the python command
- Output: The script will print the first few rows of the dataset to the console, showing the text samples and their associated emotion labels.
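The preview described in the Output step could look roughly like the sketch below. It assumes the data has already been loaded as a list of `(text, label)` tuples. The helper name `summarize` and the per-label count are illustrative additions, not something the original script is known to print.

```python
from collections import Counter


def summarize(pairs, preview=5):
    """Return the first `preview` rows and a count of samples per emotion label."""
    head = pairs[:preview]
    counts = Counter(label for _, label in pairs)
    return head, counts


rows = [
    ("I am feeling great today!", "happy"),
    ("This is so frustrating and annoying.", "angry"),
]
head, counts = summarize(rows)
for text, label in head:
    print(f"{label:>8} | {text}")
print(dict(counts))
```

Checking the label counts before training is a cheap way to spot a heavily imbalanced dataset.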
The model training is performed within the main.ipynb notebook, which processes the text data, tokenizes it, and trains a Sequential model using Keras. You can modify the model architecture, training parameters, or the data processing steps within this notebook.
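# Imports (assuming the Keras API bundled with TensorFlow)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Flatten, Dense
# Define the model
model = Sequential()
# Embedding layer: +1 because index 0 is reserved for padding
model.add(Embedding(input_dim=len(tokenizer.word_index) + 1, output_dim=128, input_length=max_length))
model.add(Flatten())
model.add(Dense(units=128, activation="relu"))
# Output layer: one unit per emotion class
model.add(Dense(units=len(one_hot_labels[0]), activation="softmax"))
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
# Train for 10 epochs, evaluating on the held-out split after each epoch
model.fit(xtrain, ytrain, epochs=10, batch_size=32, validation_data=(xtest, ytest))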
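The model code above expects `tokenizer.word_index`, `max_length`, and `one_hot_labels` to exist already. The sketch below illustrates, in plain Python, what those preprocessing steps conceptually produce: an integer id per word, fixed-length padded sequences, and one-hot label vectors. It is not the notebook's actual code and not the Keras implementation. The helper names are hypothetical, and it pads on the right for readability, whereas Keras's `pad_sequences` pads on the left by default.

```python
def build_word_index(texts):
    """Assign each distinct lowercase word an integer id starting at 1 (0 is kept for padding)."""
    index = {}
    for text in texts:
        for word in text.lower().split():
            index.setdefault(word, len(index) + 1)
    return index


def encode_and_pad(texts, word_index, max_length):
    """Map words to ids, truncate to max_length, and right-pad with zeros."""
    padded = []
    for text in texts:
        ids = [word_index[w] for w in text.lower().split() if w in word_index]
        ids = ids[:max_length]
        padded.append(ids + [0] * (max_length - len(ids)))
    return padded


def one_hot(labels):
    """Return the sorted class list and one one-hot vector per label."""
    classes = sorted(set(labels))
    position = {name: i for i, name in enumerate(classes)}
    vectors = [
        [1 if position[label] == i else 0 for i in range(len(classes))]
        for label in labels
    ]
    return classes, vectors


texts = ["I am feeling great today!", "This is so frustrating and annoying."]
labels = ["happy", "angry"]
word_index = build_word_index(texts)
max_length = max(len(t.split()) for t in texts)
sequences = encode_and_pad(texts, word_index, max_length)
classes, one_hot_labels = one_hot(labels)
print(sequences)
print(classes, one_hot_labels)
```

Because id 0 is reserved for padding, the vocabulary size passed to the embedding layer is one more than the number of distinct words, which is why the model uses `len(tokenizer.word_index) + 1`.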
After training the model, you can use it to predict emotions from new text inputs. Implement the prediction logic in a separate script or extend main.ipynb to include a prediction function.
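One piece of such a prediction function is turning the softmax output back into an emotion label. The sketch below shows only that decoding step under stated assumptions: `decode_prediction` is a hypothetical helper, and the class order and probabilities are made-up illustration values, not results from this project.

```python
def decode_prediction(probabilities, classes):
    """Pick the class with the highest softmax probability."""
    if len(probabilities) != len(classes):
        raise ValueError("one probability per class is required")
    best = max(range(len(classes)), key=lambda i: probabilities[i])
    return classes[best], probabilities[best]


# Hypothetical softmax output for one input text, with a hypothetical class order.
classes = ["angry", "happy", "sad", "surprise"]
probabilities = [0.05, 0.80, 0.10, 0.05]
label, confidence = decode_prediction(probabilities, classes)
print(label, confidence)
```

In the notebook, the probabilities would come from calling `model.predict` on new text that has been tokenized and padded exactly like the training data, and `classes` would need to follow the same order used when the labels were one-hot encoded.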
Here is an overview of the project directory structure:
Text_Emotion_Classifier/
│
├── val.txt # Previous version of the test dataset
├── test.txt # Test dataset that accompanies train.txt
├── train.txt # The dataset file containing text and emotion labels
├── main.py # Main script to run the emotion classifier
├── requirements.txt # List of dependencies
├── README.md # Project documentation
└── LICENSE # Project license
Contributions are welcome! If you'd like to contribute to this project, please follow these steps:
- Fork and star the repository
- Create a new branch (git checkout -b feature)
- Make your changes
- Commit your changes (git commit -am 'Add new feature')
- Push to the branch (git push origin feature)
- Create a new Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
If you have any questions or suggestions, feel free to reach out to me at :