Dataset:- The dataset is an export from Strava, an online social networking site for cycling and other sports, and from GoldenCheetah.
Plotting and visualization:- This portfolio summarizes the cycling data using histograms, a heatmap, and subplots to give a clear and concise view of the data.
Dataset:- The data is made available by Johns Hopkins University in a GitHub repository. The plots use the confirmed cases. The portfolio visualizes and analyzes COVID cases across several countries using line graphs. The following features have also been implemented:-
Running bar graph:- Top 10 Countries With COVID-19 Cases Per Million (YouTube). It is also available in the dataset.
Simple linear model to predict COVID cases, particularly in the US and China:- It uses a simple linear model to predict the log of the number of cases in the US. It also discusses why China's data does not show exponential growth and what China did to stop the virus.
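The idea of fitting a straight line to the log of the case counts can be sketched as follows. This is a minimal illustration, not the portfolio's actual code: the counts are synthetic (made up to grow about 20% per day, not taken from the Johns Hopkins data), and the function names fit_log_linear and predict_cases are hypothetical.

```python
import math

def fit_log_linear(counts):
    """Ordinary least-squares fit of log(count) = intercept + slope * day."""
    days = list(range(len(counts)))
    logs = [math.log(c) for c in counts]
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict_cases(slope, intercept, day):
    """Map the fitted log-scale line back to a case count."""
    return math.exp(intercept + slope * day)

# Synthetic counts for illustration only (assumption: 20% daily growth).
synthetic_counts = [100 * 1.2 ** d for d in range(10)]

slope, intercept = fit_log_linear(synthetic_counts)
doubling_time_days = math.log(2) / slope  # days needed for cases to double
next_day_estimate = predict_cases(slope, intercept, day=10)
```

On a log scale, exponential growth appears as a straight line with a positive slope. A series whose log-scale slope falls toward zero is no longer growing exponentially, which is the pattern the China discussion refers to.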
Dataset:- We have used a set of book summaries from the CMU Book Summaries Corpus. It contains a large number of summaries (16,559) and includes metadata about the genre of the books taken from Freebase.
Model used:- We have used Multinomial Naive Bayes (MultinomialNB) and Logistic Regression models to predict the genre of a book from its summary.
Libraries:-
TfidfVectorizer:- We have used this scikit-learn class to convert a collection of raw documents to a matrix of TF-IDF (term frequency-inverse document frequency) features.
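How TfidfVectorizer feeds the two classifiers can be sketched as below. This is a minimal example under stated assumptions, not the portfolio's actual pipeline: the four summaries and their genre labels are invented for illustration (they are not from the CMU corpus), and the stop_words and max_iter settings are illustrative choices.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Made-up toy summaries and genre labels, for illustration only.
summaries = [
    "A detective investigates a murder in a quiet village",
    "The inspector hunts a killer through the city at night",
    "A starship crew explores a distant alien planet",
    "Robots and humans fight a war on a space station",
]
genres = ["mystery", "mystery", "science fiction", "science fiction"]

# Convert raw text to a sparse TF-IDF matrix of shape (n_documents, n_terms).
vectorizer = TfidfVectorizer(stop_words="english")
tfidf_matrix = vectorizer.fit_transform(summaries)

# Each pipeline vectorizes the text and then fits one classifier on the result.
nb_model = make_pipeline(TfidfVectorizer(stop_words="english"), MultinomialNB())
lr_model = make_pipeline(
    TfidfVectorizer(stop_words="english"), LogisticRegression(max_iter=1000)
)
nb_model.fit(summaries, genres)
lr_model.fit(summaries, genres)

# Predict the genre of an unseen (also made-up) summary.
new_summary = ["An alien ship lands on a remote planet"]
nb_prediction = nb_model.predict(new_summary)
lr_prediction = lr_model.predict(new_summary)
```

Wrapping the vectorizer and the classifier in one pipeline means new text goes through the same TF-IDF vocabulary that was learned during training, so the vectorizer does not need to be applied by hand at prediction time.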