Gutenberg-Digital-Books

This project focuses on text classification: it identifies the writing styles of different authors in Gutenberg digital books and predicts which author or genre a piece of writing belongs to.

image

The data is prepared and preprocessed in three steps: cleaning, feature extraction, and transformation into Bag of Words (BOW) and TF-IDF representations, each with and without n-grams.
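
The four representations can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes scikit-learn (the README does not name a library), and the `docs` list and the `vectorize` helper are hypothetical stand-ins for the cleaned book excerpts and the real preprocessing step.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Hypothetical toy excerpts standing in for cleaned Gutenberg text.
docs = [
    "the whale rose from the sea",
    "the sea was calm and the ship sailed",
    "she walked through the garden at dawn",
]


def vectorize(texts, use_tfidf, ngram_range):
    """Fit a BOW or TF-IDF vectorizer and return it with the document-term matrix."""
    vectorizer_cls = TfidfVectorizer if use_tfidf else CountVectorizer
    vectorizer = vectorizer_cls(ngram_range=ngram_range)
    matrix = vectorizer.fit_transform(texts)
    return vectorizer, matrix


# (1, 1) keeps single words only; (1, 2) adds two-word n-grams as extra features.
bow_vec, bow = vectorize(docs, use_tfidf=False, ngram_range=(1, 1))
bow_ng_vec, bow_ng = vectorize(docs, use_tfidf=False, ngram_range=(1, 2))
tfidf_vec, tfidf = vectorize(docs, use_tfidf=True, ngram_range=(1, 1))
tfidf_ng_vec, tfidf_ng = vectorize(docs, use_tfidf=True, ngram_range=(1, 2))

print("BOW features:", bow.shape[1], "| BOW + bigrams:", bow_ng.shape[1])
print("TF-IDF features:", tfidf.shape[1], "| TF-IDF + bigrams:", tfidf_ng.shape[1])
```

Adding n-grams enlarges the feature space, which lets the classifier pick up short phrases that may be characteristic of an author, at the cost of a wider and sparser matrix.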

For classification, Decision Tree, SVM, and KNN models were tested. The SVM model with the "linear" kernel gave the best accuracy, 98.7%. The linear kernel also provides faster performance.
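
A comparison of this kind could look like the sketch below. It assumes scikit-learn, a TF-IDF front end, and default hyperparameters apart from the linear kernel; the toy `texts` and `labels` and the `compare_models` helper are hypothetical, so the accuracies it prints say nothing about the 98.7% reported above.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy corpus: two "authors" with distinct vocabularies.
sea = ["the ship sailed the stormy sea", "the whale rose near the ship",
       "the captain watched the sea", "waves struck the old ship"]
garden = ["she walked through the quiet garden", "roses bloomed in the garden",
          "the lady read letters at dawn", "tea was served in the parlour"]
texts = sea * 2 + garden * 2
labels = ["author_a"] * len(sea) * 2 + ["author_b"] * len(garden) * 2


def compare_models(texts, labels):
    """Train each classifier on TF-IDF features and return its held-out accuracy."""
    x_train, x_test, y_train, y_test = train_test_split(
        texts, labels, test_size=0.25, stratify=labels, random_state=0
    )
    models = {
        "decision_tree": DecisionTreeClassifier(random_state=0),
        "svm_linear": SVC(kernel="linear"),
        "knn": KNeighborsClassifier(n_neighbors=3),
    }
    scores = {}
    for name, model in models.items():
        # The pipeline fits the vectorizer on the training split only.
        pipeline = make_pipeline(TfidfVectorizer(), model)
        pipeline.fit(x_train, y_train)
        scores[name] = pipeline.score(x_test, y_test)
    return scores


scores = compare_models(texts, labels)
for name, accuracy in scores.items():
    print(f"{name}: {accuracy:.3f}")
```

Wrapping the vectorizer and the classifier in one pipeline keeps the test split out of the fitted vocabulary, so the reported accuracy is not inflated by leakage.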

The accuracies obtained from each model are collected below:

image