This system trains a Naive Bayes classifier on a training dataset of labelled (Positive / Negative) Rotten Tomatoes reviews, and then attempts to classify the sentiment of the reviews in three test datasets:
- The dataset of Rotten Tomatoes reviews used for training;
- A test dataset of Rotten Tomatoes reviews;
- A test dataset of Nokia reviews (hence, a different domain).
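As a rough illustration of the Naive Bayes approach described above, the following is a minimal sketch and not the code shipped in this repository. It assumes whitespace tokenisation, add-one (Laplace) smoothing, and a tiny made-up training set; the function names `train_naive_bayes` and `classify` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labelled_reviews):
    """Count words and documents per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> word -> count
    doc_counts = Counter()              # label -> number of reviews
    for text, label in labelled_reviews:
        doc_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, doc_counts, vocabulary


def classify(text, word_counts, doc_counts, vocabulary):
    """Return the label with the highest log-probability for the text."""
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        # Log prior: share of training reviews carrying this label.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # assumption: words never seen in training are ignored
            # Add-one smoothing avoids a zero probability for words
            # that were seen only under the other label.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            )
        scores[label] = score
    return max(scores, key=scores.get)


# Toy data invented for illustration only (not taken from the real datasets).
toy_training_set = [
    ("a wonderful moving film", "Positive"),
    ("great acting and a great story", "Positive"),
    ("a dull boring mess", "Negative"),
    ("terrible plot and boring acting", "Negative"),
]
model = train_naive_bayes(toy_training_set)
print(classify("great film", *model))
print(classify("boring plot", *model))
```

Working in log space keeps the product of many small probabilities from underflowing on longer reviews.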
For comparison, there is also a Rule-based approach.
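A rule-based classifier can be sketched in a few lines. This is an assumed, simplified variant that counts words from a small hand-made lexicon; the word lists, the function name `rule_based_sentiment`, and the tie-breaking choice are all hypothetical and are not taken from the repository.

```python
# Hypothetical mini-lexicon; a real system would load a much larger one.
POSITIVE_WORDS = {"good", "great", "wonderful", "excellent"}
NEGATIVE_WORDS = {"bad", "boring", "terrible", "dull"}


def rule_based_sentiment(text):
    """Label a review by counting lexicon hits (no training involved)."""
    score = 0
    for word in text.lower().split():
        if word in POSITIVE_WORDS:
            score += 1
        elif word in NEGATIVE_WORDS:
            score -= 1
    # Assumption: ties (score == 0) are resolved as Positive.
    return "Positive" if score >= 0 else "Negative"


print(rule_based_sentiment("a great and wonderful phone"))
print(rule_based_sentiment("boring and terrible"))
```

Unlike Naive Bayes, this approach needs no labelled training data, so its behaviour does not depend on the domain of the training reviews, only on how well the lexicon covers the test domain.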
These two approaches were originally implemented by Dr. Chenghua Lin for the Data Mining & Visualization course at the University of Aberdeen. The authors analysed both approaches thoroughly in a report, and implemented an improved version of the Rule-based approach based on sentence polarity.
To start, clone this repository to your local machine. If you are unsure how to do this, please consult the GitHub documentation on cloning repositories.
git clone git@github.com:thyriki/NLPSentimentAnalysis.git
Ensure that you have Python 3.6 installed and properly set up.
Navigate to the project folder and run the following command:
python SentimentPract2.py
For any inquiries, feel free to open an issue.