The goal is to extract valuable insights and answer business questions based on the dataset. This report covers the project's objectives, business problems, solutions, findings, and conclusions.
The dataset was sourced from Kaggle.
LINK: https://www.kaggle.com/code/roninvers/eda-on-netflix-data-of-movie-and-tv-show
The dataset contains a total of 7 columns. Data columns:
- Title: Unique titles for each entry
- Content_type: Whether the entry is a movie or TV show.
- Genre: Genre assigned randomly from a list.
- Release_year: A random year between 1950 and 2023.
- Rating: Various content ratings (G, PG, etc.).
- Duration: Duration for movies in minutes or number of seasons for TV shows.
- Country: Randomly assigned production country.
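As a rough illustration of the columns listed above, the following is a minimal sketch of how the table could be declared and loaded. It uses Python's built-in `sqlite3` module so that it runs without a database server. The table name `netflix`, the column types, the `'Movie'` / `'TV Show'` spellings, and the sample rows are all assumptions made for this example; the source only names the seven columns.

```python
import sqlite3

# Hypothetical table name and column types; only the column names come from the dataset description.
SCHEMA = """
CREATE TABLE netflix (
    title        TEXT PRIMARY KEY,  -- unique title for each entry
    content_type TEXT NOT NULL,     -- 'Movie' or 'TV Show' (assumed spellings)
    genre        TEXT,
    release_year INTEGER,           -- between 1950 and 2023
    rating       TEXT,              -- G, PG, etc.
    duration     INTEGER,           -- minutes for movies, seasons for TV shows (assumed numeric)
    country      TEXT
);
"""

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# Made-up sample rows for illustration only; not taken from the dataset.
sample_rows = [
    ("Title 1", "Movie", "Drama", 2023, "PG", 120, "India"),
    ("Title 2", "TV Show", "Comedy", 2022, "R", 3, "USA"),
]
conn.executemany("INSERT INTO netflix VALUES (?, ?, ?, ?, ?, ?, ?)", sample_rows)

# PRAGMA table_info returns one row per column; index 1 holds the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(netflix)")]
row_count = conn.execute("SELECT COUNT(*) FROM netflix").fetchone()[0]
print(columns, row_count)
```

Storing `duration` as a plain integer keeps comparisons simple, but it only makes sense when read together with `content_type`, since the unit differs between movies and TV shows.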
Business problems:
- Total number of shows (movies + TV shows)
- Total number of movies and total number of TV shows
- Find the ratings for movies and TV shows
- Find the top three countries with the most content on Netflix
- List all TV shows and movies released in 2023
- Total number of movies and TV shows released in 2023
- Number of TV shows and number of movies released in 2023
- Find the number of genres and the number of movies or TV shows released in each genre
- Find the genre with the maximum number of shows released
- Find the total number of countries and the number of movies released in each country
- Identify the longest movie duration.
- List and count all TV shows with 3 seasons, with their respective genres
- Maximum number of movies released in a specific year, i.e. 2022
- TV shows rated "R", grouped by their respective genre
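Several of the problems above reduce to `COUNT`, `GROUP BY`, `ORDER BY`, and `LIMIT`. The sketch below shows one possible query for a handful of them, run through Python's built-in `sqlite3` module. The table name `netflix`, the column types, the `'Movie'` / `'TV Show'` spellings, and the six sample rows are assumptions for illustration only; they are not the project's actual queries or data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE netflix (
    title TEXT, content_type TEXT, genre TEXT, release_year INTEGER,
    rating TEXT, duration INTEGER, country TEXT
)""")

# Made-up rows for illustration only; not taken from the dataset.
rows = [
    ("A", "Movie",   "Drama",  2023, "PG", 120, "India"),
    ("B", "Movie",   "Comedy", 2022, "R",   95, "USA"),
    ("C", "TV Show", "Drama",  2023, "R",    3, "USA"),
    ("D", "TV Show", "Horror", 2021, "R",    3, "UK"),
    ("E", "Movie",   "Drama",  2022, "G",  150, "USA"),
    ("F", "TV Show", "Comedy", 2023, "PG",   1, "India"),
]
conn.executemany("INSERT INTO netflix VALUES (?, ?, ?, ?, ?, ?, ?)", rows)


def run(sql, params=()):
    """Execute a query and return all result rows as a list of tuples."""
    return conn.execute(sql, params).fetchall()


# Total number of shows (movies + TV shows)
total = run("SELECT COUNT(*) FROM netflix")[0][0]

# Number of movies and number of TV shows
by_type = run("""SELECT content_type, COUNT(*) FROM netflix
                 GROUP BY content_type ORDER BY content_type""")

# Top three countries with the most content
top_countries = run("""SELECT country, COUNT(*) AS n FROM netflix
                       GROUP BY country ORDER BY n DESC, country LIMIT 3""")

# Number of movies and TV shows released in 2023
released_2023 = run("""SELECT content_type, COUNT(*) FROM netflix
                       WHERE release_year = ?
                       GROUP BY content_type ORDER BY content_type""", (2023,))

# Genre with the maximum number of shows
top_genre = run("""SELECT genre, COUNT(*) AS n FROM netflix
                   GROUP BY genre ORDER BY n DESC, genre LIMIT 1""")

# Longest movie duration
longest_movie = run("""SELECT title, duration FROM netflix
                       WHERE content_type = 'Movie'
                       ORDER BY duration DESC LIMIT 1""")

# TV shows with 3 seasons, with their genre
three_seasons = run("""SELECT title, genre FROM netflix
                       WHERE content_type = 'TV Show' AND duration = 3
                       ORDER BY title""")

# R-rated TV shows per genre
r_tv_by_genre = run("""SELECT genre, COUNT(*) FROM netflix
                       WHERE content_type = 'TV Show' AND rating = 'R'
                       GROUP BY genre ORDER BY genre""")

print(total, by_type, top_countries, top_genre, longest_movie)
```

The secondary sort keys (`country`, `genre`) are there only to make ties deterministic in this sketch. Whether ties should be broken that way, or reported together, depends on the question being asked.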
This project is part of my portfolio, showcasing the SQL skills essential for data analyst roles. If you have any questions, feedback, or would like to collaborate, feel free to get in touch!