On March 11, 2020, the World Health Organization (WHO) declared Covid-19 (a.k.a. the new coronavirus disease) a pandemic. Since January 22, 2020, the Johns Hopkins CSSE has maintained a data repository to track Covid-19 incidence worldwide. To better understand how this disease will affect my country (Brazil), I analyzed this data.
For Portuguese speakers, I wrote a post on my blog about this analysis: O que os dados dizem sobre o Coronavírus? ("What does the data say about the Coronavirus?")
Country/Region | Confirmed | Deaths | Deaths (% of confirmed) | Confirmed (% of population) |
---|---|---|---|---|
US | 337072 | 9619 | 2.85369 | 0.103027 |
Spain | 131646 | 12641 | 9.60227 | 0.281754 |
Italy | 128948 | 15887 | 12.3205 | 0.21338 |
Germany | 100123 | 1584 | 1.58205 | 0.120735 |
France | 93773 | 8093 | 8.63042 | 0.139986 |
China | 82602 | 3333 | 4.03501 | 0.00593094 |
Iran | 58226 | 3603 | 6.18796 | 0.0711807 |
United Kingdom | 48436 | 4943 | 10.2052 | 0.0728482 |
Turkey | 27069 | 574 | 2.12051 | 0.0328828 |
Switzerland | 21100 | 715 | 3.38863 | 0.247753 |
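
The two percentage columns in the table above can be reproduced from raw counts. The following is a minimal sketch, assuming the first percentage is deaths divided by confirmed cases and the second is confirmed cases divided by total population (the table values are consistent with this reading, but the notebook's actual code may differ). The helper `pct` and the `counts` dictionary are hypothetical names used only for illustration.

```python
def pct(part, whole):
    """Return `part` as a percentage of `whole`."""
    if whole == 0:
        raise ValueError("whole must be non-zero")
    return 100.0 * part / whole


# Counts copied from the table above: (confirmed, deaths).
counts = {
    "US": (337072, 9619),
    "Spain": (131646, 12641),
    "Italy": (128948, 15887),
}

# Share of confirmed cases that ended in death, per country.
death_pct = {
    country: pct(deaths, confirmed)
    for country, (confirmed, deaths) in counts.items()
}

for country, value in death_pct.items():
    print(f"{country}: {value:.4f}% of confirmed cases were fatal")
```

The population share would follow the same pattern, `pct(confirmed, population)`, with the population figure taken from whatever source the analysis uses; no population values are assumed here.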
The analysis was coded in Python using Jupyter Notebook. To install the requirements:
pip install -r requirements.txt
First, run the get_data.ipynb notebook to get the latest data from the Johns Hopkins repository.
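
For readers who want to see what this download step might involve, here is a rough sketch. It is not the notebook's actual code. It assumes the data comes from the global confirmed-cases time series CSV in the Johns Hopkins CSSE GitHub repository, where each row is a region and each date is a column of cumulative counts. The function names `fetch_csv` and `latest_totals_by_country` are hypothetical.

```python
import csv
import io
import urllib.request
from collections import defaultdict

# Assumed location of the global confirmed-cases time series;
# the notebook may read a different file from the repository.
CONFIRMED_URL = (
    "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/"
    "csse_covid_19_data/csse_covid_19_time_series/"
    "time_series_covid19_confirmed_global.csv"
)


def fetch_csv(url=CONFIRMED_URL):
    """Download the CSV and return its content as text."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8-sig")


def latest_totals_by_country(csv_text):
    """Sum the most recent cumulative count over all rows of each country."""
    reader = csv.DictReader(io.StringIO(csv_text))
    latest_date = reader.fieldnames[-1]  # the last column is the newest date
    totals = defaultdict(int)
    for row in reader:
        totals[row["Country/Region"]] += int(float(row[latest_date]))
    return dict(totals)
```

With network access, `latest_totals_by_country(fetch_csv())` would return a dictionary mapping each country to its latest confirmed count. Countries reported by province (China, for example) are summed across their rows.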
Next, run the analysis.ipynb notebook and have fun.
Other people are also working with this data and sharing insightful analyses on Kaggle. You may want to check them out as well:
- COVID-19 - Analysis, Viz, Prediction & Comparisons
- Coronavirus (COVID-19) Visualization & Prediction
- Novel Corona Virus 2019 Dataset
If you find a bug or have any further questions, please let me know.