
pedrov718/Bianary-Hate-Speech-Classification


Hate Speech/Profanity Detection Project

The Problem:


Social media has become a vital part of modern society. From political elections to fashion and education, platforms like Twitter, Reddit, Instagram, and Meta have become all-encompassing. However, for all the benefits that social media provides, there are some terrible consequences: cyberbullying, racism and hate speech, and profanity run rampant. Social media has become like the Wild West, with lawless users posting content without concern or care for others.

The Data:

Using a dataset collected by UC Berkeley researchers, I obtained a human-labeled dataset with 135,553 text samples, each labeled as hate speech or not hate speech. More information about the dataset, including how to download it, can be found here: https://huggingface.co/datasets/ucberkeley-dlab/measuring-hate-speech

Figure: 10 most frequent words (stop words included)

The Solution:

Keep safe spaces safe!

As much as we would like to prevent George Orwell's Big Brother from ever becoming reality, we must accept that freedom of speech and expression can lead to dark places. Technologies must therefore be created to limit hate speech, cyberbullying, and harassment in spaces where they do not belong. Leveraging Natural Language Processing (NLP), I trained a machine learning model to detect the presence of hate speech and profanity in the text of tweets, reaching an accuracy of 87.65%. Accuracy was not the metric I was most concerned with, however. I wanted to limit hate speech in places where it doesn't belong, so I wanted to catch as much hate speech as possible, even at the cost of some false positives. By tuning the model's hyperparameters with recall in mind, I reached a recall score of 88.59%.
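The recall-focused tuning described above can be sketched with scikit-learn as follows. This is a minimal illustration, not the project's actual code: the slide title points to a linear SVM, but the TF-IDF features, the parameter grid, the helper name `build_recall_tuned_search`, and the toy data with its placeholder token `badword` are all assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC


def build_recall_tuned_search(cv=3):
    """Return a grid search that selects hyperparameters by recall.

    Hypothetical helper: the features and grid values are assumptions.
    """
    pipeline = Pipeline([
        ("tfidf", TfidfVectorizer(lowercase=True)),  # assumed text features
        ("svm", LinearSVC()),                        # linear SVM classifier
    ])
    param_grid = {
        "svm__C": [0.1, 1.0, 10.0],                 # regularization strength
        "svm__class_weight": [None, "balanced"],    # up-weight the rare class
    }
    # scoring="recall" ranks candidates by recall on the positive class (1)
    # instead of the default accuracy.
    return GridSearchCV(pipeline, param_grid, scoring="recall", cv=cv)


# Toy data for illustration only: 1 = hate speech, 0 = not hate speech.
texts = [
    "you are a badword", "what a badword person", "badword go away",
    "total badword behaviour", "have a nice day", "thanks for the help",
    "see you at lunch", "great game last night",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

search = build_recall_tuned_search(cv=2)
search.fit(texts, labels)
print(search.best_params_, search.best_score_)
```

Selecting on recall favors settings that miss fewer positive examples, which usually means accepting more false positives. Whether that trade-off is acceptable depends on where the model is deployed.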

Hate speech classification LinearSVM Slide

Some of my most important features for my model were the following words:

Figure: top 20 feature importances (bar graph)
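How the importances in the figure were computed is not stated here. For a linear model, one common approach is to rank words by the magnitude of their learned coefficients, sketched below in plain Python. The function name `top_features`, the vocabulary, and the weights are invented for illustration. In scikit-learn, the two input lists would typically come from the vectorizer's `get_feature_names_out()` and the classifier's `coef_`.

```python
def top_features(feature_names, coefficients, k=20):
    """Return the k (word, weight) pairs with the largest absolute weight.

    Hypothetical helper: assumes one coefficient per vocabulary entry,
    as a linear classifier provides.
    """
    if len(feature_names) != len(coefficients):
        raise ValueError("feature_names and coefficients must align")
    ranked = sorted(
        zip(feature_names, coefficients),
        key=lambda pair: abs(pair[1]),  # magnitude, ignoring the sign
        reverse=True,
    )
    return ranked[:k]


# Invented vocabulary and weights, for illustration only.
names = ["hello", "badword", "thanks", "insult_placeholder"]
weights = [-0.2, 1.5, -0.9, 2.1]

print(top_features(names, weights, k=2))
# → [('insult_placeholder', 2.1), ('badword', 1.5)]
```

A positive weight pushes a sample toward the hate speech class and a negative weight pushes it away, so the sign is worth reporting alongside the magnitude.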

For more information you can view my presentation here: https://docs.google.com/presentation/d/16TsQ3jTNRKx3UbBn3JQ7t2YNRoeXbVfK08hsS6EBVuQ/edit?usp=sharing

The implementation:

Monitor social media and online forums and censor hate speech and profanity in real time. Adjust custom thresholds depending on your use case: for example, allow profanity in adult-oriented content and restrict it entirely in spaces primarily used by children.
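Per-audience thresholds like these could be expressed as a small lookup, sketched below. The audience names, the threshold values, and the function `should_censor` are hypothetical. The sketch also assumes the model produces a score between 0 and 1, and a linear SVM's raw decision values would first need calibrating to fit that range.

```python
# Hypothetical thresholds: a lower value means stricter moderation.
THRESHOLDS = {
    "children": 0.2,  # censor even mildly suspicious content
    "general": 0.5,
    "adult": 0.8,     # only censor content the model is very sure about
}


def should_censor(score, audience, thresholds=THRESHOLDS):
    """Decide whether to censor a message for a given audience.

    `score` is assumed to be a model score in [0, 1], where higher
    means more likely hate speech or profanity.
    """
    if audience not in thresholds:
        raise ValueError(f"unknown audience: {audience!r}")
    return score >= thresholds[audience]


# The same message is handled differently depending on the space.
print(should_censor(0.3, "children"))  # → True
print(should_censor(0.3, "adult"))     # → False
```

Keeping the thresholds in data rather than in code lets each community adjust its own strictness without retraining the model.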

Future Work

Create a Reddit/Twitch bot that can censor hate speech in real time. Replace hate speech and profanity with an auto-generated summary of the text (to convey the meaning without the negative sentiment).
