- Python 3.x.
- Libraries:
  - pandas.
  - matplotlib.
  - seaborn.
In this project, I explore the factors that motivate people to read a book, using data from Goodreads, a website and social network where users share reviews and discover new books. I tried to answer the following questions:
- Are classic books better than modern books?
- What is the most popular genre?
- Do the number of reviews per book and the average rating influence users' choices of books to read?
- Notebook file
- Data files:
  - to_read.csv lists the IDs of the books each user marked "to read", as user_id,book_id pairs sorted by time. There are close to a million pairs.
  - books.csv has metadata for each book (Goodreads IDs, authors, title, average rating, etc.).
  - book_tags.csv contains the tags/shelves/genres that users assigned to books. Tags in this file are represented by their IDs. Rows are sorted by goodreads_book_id ascending and count descending.
  - tags.csv translates tag IDs to names.
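As a rough illustration of how these files fit together, here is a minimal pandas sketch. It is not the notebook's actual code. The column names `tag_id`, `tag_name`, `average_rating`, and `ratings_count`, and the presence of a `book_id` column in books.csv, are assumptions that should be checked against the real files. Summing tag counts is only one possible way to read "most popular genre".

```python
import pandas as pd


def top_tags(book_tags: pd.DataFrame, tags: pd.DataFrame, n: int = 10) -> pd.Series:
    """Return the n tag names with the highest total count across all books."""
    # Translate tag IDs to names ("tag_id" and "tag_name" are assumed column names).
    named = book_tags.merge(tags, on="tag_id", how="left")
    # Sum how often each tag was applied, then keep the n largest totals.
    return named.groupby("tag_name")["count"].sum().nlargest(n)


def popularity_correlations(books: pd.DataFrame, to_read: pd.DataFrame) -> pd.Series:
    """Pearson correlation of each book's "to read" count with its rating statistics."""
    # Number of "to read" marks per book.
    marks = to_read.groupby("book_id").size().rename("to_read_count").reset_index()
    # Keep only books present in both tables (assumes books.csv has a "book_id" column).
    merged = books.merge(marks, on="book_id", how="inner")
    cols = ["average_rating", "ratings_count"]  # assumed column names
    return merged[cols].corrwith(merged["to_read_count"])
```

With the CSV files in the working directory, the functions could be called as `top_tags(pd.read_csv("book_tags.csv"), pd.read_csv("tags.csv"))` and `popularity_correlations(pd.read_csv("books.csv"), pd.read_csv("to_read.csv"))`. A correlation alone does not show that ratings cause users to add a book, so the result is best treated as a starting point for the plots.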
Please check the following blog post Here
Data credits Here