Global.health's mission is to enable rapid sharing of trusted and open public health data to advance the response to infectious diseases.
This repository contains the servers and scripts that support its data curation efforts.
Should you have any questions, please feel free to get in touch via [email protected]
The data exposed on Global.health was curated using two methods. About 60,000 cases were curated manually by people who analyzed sources and entered the data into spreadsheets. This data was ported from the spreadsheets into the Curator Portal as described here. The rest of the data was automatically ingested from sources through a process described here. Each case is marked as VERIFIED if a human has confirmed that its data is valid, or UNVERIFIED if it has not yet been reviewed.
You can tell that a case was imported from the manually curated spreadsheet data in two ways: the case is marked as created by [email protected], and its source URL links to this documentation. The source URL that was originally used to find data about such a case is listed in the additional sources section of the detailed case view (opened by clicking on the table row).
A daily export of case data can be downloaded from the data portal. The data is generated using this script, with this data dictionary.
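As a rough illustration of working with an export, the sketch below counts cases by verification status. It assumes the export is a CSV file and uses a hypothetical column name, verification_status; the actual field name should be taken from the data dictionary. The sample rows are made up for the example and are not real case data.

```python
import csv
import io
from collections import Counter

# Hypothetical column name (an assumption for this sketch);
# check the data dictionary for the real field name.
STATUS_COLUMN = "verification_status"


def count_by_status(csv_file, status_column=STATUS_COLUMN):
    """Count cases per verification status in an export read from a file object."""
    reader = csv.DictReader(csv_file)
    counts = Counter()
    for row in reader:
        # Rows with a missing or empty status are grouped under "UNKNOWN".
        status = (row.get(status_column) or "").strip().upper() or "UNKNOWN"
        counts[status] += 1
    return counts


# Tiny in-memory sample standing in for a downloaded export (made-up rows).
sample = io.StringIO(
    "case_id,verification_status\n"
    "1,VERIFIED\n"
    "2,UNVERIFIED\n"
    "3,VERIFIED\n"
)
print(count_by_status(sample))
```

To run this against a downloaded file, pass an open file handle instead of the in-memory sample, for example `open(path, newline="")`, where `path` points at wherever the export was saved.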
- Docker images
- Tests
- Monitoring
- The data service in data-serving/data-service facilitates CRUD operations with the MongoDB database storing case data.
- The curator service in verification/curator-service/api serves as the backend for the curator portal, which enables curators to view, enter, update, and verify cases; manage data sources and their ingestion; and manage portal access.
- The geocoding service geocodes locations and is used by the data service, but can be used standalone as well.
- The curator UI in verification/curator-service/ui is the frontend for the curator portal.
- Getting set up
- Component documentation
- Scripts
- How do I...
This repository and daily data exports are published under the MIT license.
Each automatically ingested data source must have a license and terms of use attached, which requires curators to look up the sources they are setting up for ingestion.
If you are the owner of a data source included here and would like us to remove data, add or alter an attribution, or add or alter license information, please open an issue on this repository and we will happily consider your request.