Gutenberg-Digital-Books

This project focuses on text classification: it identifies the writing styles of different authors in Gutenberg digital books and predicts which author or genre a piece of writing belongs to.

The data is prepared and preprocessed in three steps: cleaning the text, extracting features, and transforming the data with Bag of Words (BOW) and a TF-IDF vectorizer, each with and without n-grams.
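The transformations above can be sketched in plain Python to show what each step computes. This is a minimal illustration, not the project's actual code: the function names, the two sample sentences, and the unsmoothed `tf * log(N / df)` weighting are all assumptions, and library vectorizers typically apply smoothing and normalization on top of this.

```python
import math
import re
from collections import Counter


def clean(text):
    """Lowercase the text and keep only alphabetic tokens (hypothetical cleaning rule)."""
    return re.findall(r"[a-z]+", text.lower())


def ngrams(tokens, n_max=1):
    """Return every 1..n_max-gram as a space-joined string."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def bag_of_words(docs, n_max=1):
    """Count how often each term (word or n-gram) occurs in each document."""
    return [Counter(ngrams(clean(doc), n_max)) for doc in docs]


def tf_idf(bows):
    """Weight each count by term frequency times inverse document frequency."""
    n_docs = len(bows)
    # Document frequency: number of documents containing each term.
    df = Counter(term for bow in bows for term in bow)
    weighted = []
    for bow in bows:
        total = sum(bow.values())
        weighted.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in bow.items()}
        )
    return weighted


# Illustrative sentences only, not taken from the project's data set.
docs = [
    "The whale rose from the sea.",
    "The sea was calm and the ship sailed on.",
]
bows = bag_of_words(docs, n_max=2)  # unigrams and bigrams
weights = tf_idf(bows)
```

Terms that occur in every document, such as "the" here, receive a TF-IDF weight of zero, while terms unique to one document keep a positive weight. Setting `n_max=1` gives the variant without n-grams.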

For classification, Decision Tree, SVM, and KNN models were tested. The SVM model with the "linear" kernel gave the best accuracy, 98.7%, and the linear kernel also provides faster performance.
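A comparison of this kind might look like the following sketch, assuming scikit-learn is used. The toy sentences, the `author_a`/`author_b` labels, the bigram TF-IDF features, and the hyperparameters are all hypothetical placeholders, so the scores it produces say nothing about the 98.7% result reported above.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy corpus: two invented "authors" with six samples each.
texts = [
    "the ship sailed across the dark sea",
    "the captain watched the sea from the deck",
    "waves struck the ship through the night",
    "the crew hauled the sails in the storm",
    "the harbour faded behind the ship",
    "salt wind filled the sails at dawn",
    "she danced at the ball in the great house",
    "the letter arrived at the house that morning",
    "her sister spoke warmly of the gentleman",
    "the gentleman called upon the family",
    "tea was served in the drawing room",
    "the family walked in the garden after dinner",
]
labels = ["author_a"] * 6 + ["author_b"] * 6

# The three classifier families named in the text; settings are assumptions.
models = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "svm_linear": SVC(kernel="linear"),
    "knn": KNeighborsClassifier(n_neighbors=3),
}

scores = {}
for name, clf in models.items():
    # TF-IDF on unigrams and bigrams feeds each classifier.
    pipeline = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), clf)
    # Mean accuracy over 3 stratified folds.
    scores[name] = cross_val_score(pipeline, texts, labels, cv=3).mean()

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

Wrapping the vectorizer and classifier in one pipeline means the vocabulary is fitted only on the training folds, which avoids leaking test text into the features.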

The accuracies obtained from each model were collected:
