Learning Goals
Naive Bayes algorithm, text feature extraction, scikit-learn Pipeline object.
Exercise Statement
Create an e-mail spam detection model.
Prerequisites
Basic understanding of the Naive Bayes algorithm; Python and scikit-learn basics.
Data source/summary:
The dataset is obtained from Kaggle. It consists of email messages already labeled as spam or ham.
Link to dataset: https://www.kaggle.com/code/mfaisalqureshi/email-spam-detection-98-accuracy/data
(Optional) Suggest/Propose Solutions
I have a solution ready, implemented with the Multinomial Naive Bayes algorithm, and will be happy to create a pull request to include it as the exercise solution.
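
For reference, the general shape of such a solution could look roughly like the sketch below. It is not the solution mentioned above, only a minimal illustration of how the three learning goals fit together: CountVectorizer for text feature extraction, MultinomialNB as the classifier, and a Pipeline to chain them. The messages and labels are invented for illustration and do not come from the Kaggle dataset, and the step names "vectorizer" and "classifier" are arbitrary.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Toy messages invented for illustration; NOT taken from the Kaggle dataset.
messages = [
    "Win a free prize now, click here",
    "Free cash offer, claim your prize today",
    "Urgent: you won a free lottery prize",
    "Are we still meeting for lunch tomorrow?",
    "Please review the attached project report",
    "Can you call me after the meeting today?",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

# Chain word-count feature extraction and the classifier into one estimator,
# so raw text can be passed directly to fit() and predict().
spam_model = Pipeline([
    ("vectorizer", CountVectorizer()),   # text -> sparse word-count matrix
    ("classifier", MultinomialNB()),     # Multinomial Naive Bayes on counts
])
spam_model.fit(messages, labels)

# Classify two unseen (also invented) messages.
predictions = spam_model.predict([
    "Claim your free prize now",
    "See you at the meeting tomorrow",
])
print(predictions)
```

On this toy data the first message should be classified as spam and the second as ham. A real solution would instead load the Kaggle file, split it into training and test sets, and report accuracy on the held-out part.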