A prototype that I developed in March 2017, meant for automatic content filtering. Requires some refactoring and tests.

AlexandruBurlacu/hate-speech-detection-service

Hate Speech Detection Service

Using a hybrid algorithm (Word2Vec + Logistic Regression, plus n-gram string matching), I developed a prototype of a service that, when queried over HTTP with a piece of text, predicts whether that text contains hate speech.
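To make the idea of the hybrid concrete, here is a minimal sketch of how the two branches could be combined. It is not the code from this repository: the toy 2-d vectors, the weights, the blocklist, the placeholder tokens, and the rule "flag if either branch fires" are all assumptions chosen for illustration.

```python
import math
from typing import List, Tuple

# Toy 2-d embeddings standing in for the 300-d GoogleNews vectors (hypothetical values).
TOY_VECTORS = {
    "badword": (0.9, 0.1),
    "people": (0.2, 0.3),
    "nice": (-0.8, 0.2),
    "day": (0.0, 0.1),
}
# Hypothetical logistic regression parameters; a real model would learn these from data.
WEIGHTS = (3.0, 0.0)
BIAS = -1.0
# Hypothetical blocklist of word n-grams for the string-matching branch.
BLOCKED_NGRAMS = {("badword", "people")}


def tokenize(text: str) -> List[str]:
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def mean_vector(tokens: List[str]) -> Tuple[float, ...]:
    """Average the vectors of known tokens; unknown tokens are skipped."""
    known = [TOY_VECTORS[t] for t in tokens if t in TOY_VECTORS]
    if not known:
        return (0.0, 0.0)
    return tuple(sum(component) / len(known) for component in zip(*known))


def logistic_score(tokens: List[str]) -> float:
    """Logistic regression over the averaged word vector."""
    vec = mean_vector(tokens)
    z = sum(w * x for w, x in zip(WEIGHTS, vec)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))


def has_blocked_ngram(tokens: List[str], n: int = 2) -> bool:
    """Exact match of any n consecutive tokens against the blocklist."""
    return any(
        tuple(tokens[i:i + n]) in BLOCKED_NGRAMS
        for i in range(len(tokens) - n + 1)
    )


def is_flagged(text: str, threshold: float = 0.5) -> bool:
    """Flag the text if either branch fires (an assumed combination rule)."""
    tokens = tokenize(text)
    return has_blocked_ngram(tokens) or logistic_score(tokens) >= threshold
```

With these made-up numbers, `is_flagged("badword people")` is caught by the n-gram branch, while `is_flagged("nice day")` passes both branches.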

Setup

virtualenv .venv -p python3 # project requires python 3.5+
source .venv/bin/activate
pip install -r requirements
cd estimator/resources/google_word2vec_model
wget -c "https://s3.amazonaws.com/dl4j-distribution/GoogleNews-vectors-negative300.bin.gz"
cd ../../
./run
# if something doesn't work, check the config_estimator.json file and make sure the paths are set up correctly

You might need to slim down the pre-trained word2vec model, because otherwise it will eat up a huge amount of RAM. One way to do this is to limit the number of its parameters, which can easily be done via the config_estimator.json file.
