Social media has become a vital part of modern society. From political elections to fashion and education, platforms like Twitter, Reddit, Instagram, and Meta have become all-encompassing. However, for all the benefits that social media provides, there are some terrible consequences: cyberbullying, racism and hate speech, and profanity run rampant. Social media has become like the Wild West, with lawless entities wantonly posting content without concern or care for others.
Using a dataset collected by UC Berkeley researchers, I obtained a human-labeled dataset of 135,553 text samples, each labeled as hate speech or not hate speech. More information about the dataset, including how to download it, can be found here: https://huggingface.co/datasets/ucberkeley-dlab/measuring-hate-speech
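As a rough illustration of getting the data into (text, label) pairs, here is a minimal sketch. It assumes the Hugging Face `datasets` package is installed and that the rows expose `text` and `hate_speech_score` columns, as described on the dataset card. The 0.5 cutoff for turning the continuous score into a binary label is an assumption taken from the dataset card's guidance, not necessarily the labeling used in this project, and the helper names are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

# Dataset identifier from the link above.
DATASET_NAME = "ucberkeley-dlab/measuring-hate-speech"


def binarize(score: float, cutoff: float = 0.5) -> int:
    """Map a continuous hate speech score to 1 (hate speech) or 0 (not).

    The 0.5 cutoff is an assumption, not a confirmed project setting.
    """
    return 1 if score > cutoff else 0


def to_labeled_pairs(rows: Iterable[Dict], cutoff: float = 0.5) -> List[Tuple[str, int]]:
    """Turn dataset rows into (text, label) pairs.

    Assumes each row has "text" and "hate_speech_score" keys.
    """
    return [(row["text"], binarize(row["hate_speech_score"], cutoff)) for row in rows]


def load_pairs() -> List[Tuple[str, int]]:
    """Download the dataset and convert it (requires `pip install datasets`)."""
    from datasets import load_dataset  # imported lazily; needs network access

    dataset = load_dataset(DATASET_NAME, split="train")
    return to_labeled_pairs(dataset)
```

Calling `load_pairs()` downloads the data on first use; the dataset card shows the exact loading call the maintainers recommend.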
As much as we would like to prevent George Orwell's Big Brother from ever becoming reality, we must accept that freedom of speech and expression can lead to dark places. Thus, technologies must be created to limit hate speech, cyberbullying, and harassment in spaces where they do not belong. By training a machine learning algorithm, I was able to detect the presence of hate speech and profanity in tweets containing text. Leveraging Natural Language Processing (NLP), I trained a model that detects hate speech with an accuracy of 87.65%. But accuracy was not the metric I was most concerned with. Because I wanted to limit hate speech in places where it didn't belong, I wanted to catch as much hate speech as possible, even at the cost of some false alarms. So, by tuning my model's hyperparameters with recall in mind, I achieved a recall score of 88.59%.
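To make the difference between accuracy and recall concrete, here is a small sketch in plain Python. It is not the tuning procedure used in this project: it simply computes both metrics and picks, from a hypothetical list of candidate decision thresholds, the one with the highest recall that still meets a minimum accuracy. The function names, the threshold candidates, and the accuracy floor are all assumptions for illustration.

```python
from typing import Optional, Sequence, Tuple


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of all samples that were classified correctly."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of actual hate speech samples (label 1) that were caught."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    total_pos = true_pos + false_neg
    return true_pos / total_pos if total_pos else 0.0


def pick_threshold(
    y_true: Sequence[int],
    scores: Sequence[float],
    candidates: Sequence[float],
    min_accuracy: float,
) -> Optional[Tuple[float, float, float]]:
    """Return (threshold, recall, accuracy) with the best recall.

    Candidates whose accuracy falls below min_accuracy are skipped.
    Returns None if no candidate qualifies.
    """
    best = None
    for threshold in candidates:
        y_pred = [1 if s >= threshold else 0 for s in scores]
        acc = accuracy(y_true, y_pred)
        rec = recall(y_true, y_pred)
        if acc < min_accuracy:
            continue
        if best is None or rec > best[1]:
            best = (threshold, rec, acc)
    return best
```

Lowering the threshold flags more text as hate speech, which raises recall but also produces more false positives; the accuracy floor keeps that trade-off from going too far.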
Some of the most important features for my model were the following words:
For more information you can view my presentation here: https://docs.google.com/presentation/d/16TsQ3jTNRKx3UbBn3JQ7t2YNRoeXbVfK08hsS6EBVuQ/edit?usp=sharing
Monitor social media platforms and online forums, and censor hate speech and profanity in real time. Adjust custom thresholds depending on your use case: for example, allow profanity in adult-oriented content and block it altogether in spaces primarily used by children.
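One way the per-use-case thresholds could look is sketched below. It assumes a classifier that returns a probability-like score per category; the category names, audience names, and threshold values are hypothetical placeholders rather than settings from this project.

```python
from typing import Dict, Optional

# Hypothetical policies: a threshold of None means the category is never censored.
POLICIES: Dict[str, Dict[str, Optional[float]]] = {
    "adult": {"hate_speech": 0.5, "profanity": None},
    "children": {"hate_speech": 0.3, "profanity": 0.3},
}


def should_censor(scores: Dict[str, float], audience: str) -> bool:
    """Return True if any category score meets the audience's threshold.

    `scores` maps a category name to a model score in [0, 1];
    a missing category is treated as a score of 0.0.
    """
    policy = POLICIES[audience]
    for category, threshold in policy.items():
        if threshold is None:
            continue  # this category is allowed for this audience
        if scores.get(category, 0.0) >= threshold:
            return True
    return False
```

With these placeholder values, a post scoring high on profanity but low on hate speech would pass in an adult-oriented space and be censored in a children's space.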
Create a Reddit/Twitch bot that can censor hate speech in real time. Replace hate speech and profanity with an auto-generated summary of the text (to convey the meaning without the negative sentiment).