Skip to content

The goal of ghedata is to share data resources that document the work of the Global Health Engineering group at ETH Zurich.

License

Notifications You must be signed in to change notification settings

Global-Health-Engineering/ghedata

Repository files navigation

ghedata

License: CC BY 4.0

The goal of ghedata is to share data resources that document the work of the Global Health Engineering group at ETH Zurich. The data can be used for research, teaching, and learning purposes. The data is available as an R data package, in CSV and XLSX formats.

Installation

You can install the development version of ghedata from GitHub with:

# install.packages("devtools")
devtools::install_github("global-health-engineering/ghedata")
## Run the following code in console if you don't have the packages
## install.packages(c("dplyr", "knitr", "readr", "stringr", "gt", "kableExtra"))
library(dplyr)
library(knitr)
library(readr)
library(stringr)
library(gt)
library(kableExtra)
library(ggthemes)
library(ggplot2)

Alternatively, you can download the individual datasets as a CSV or XLSX file from the table below.

  1. Click Download CSV. A window opens that displays the CSV in your browser.
  2. Right-click anywhere inside the window and select “Save Page As…”.
  3. Save the file in a folder of your choice.
dataset CSV XLSX
people Download CSV Download XLSX

Data

The package provides access to one data resource.

library(ghedata)

people

The dataset people contains data about people that have worked for the group. This includes staff, as well as undergraduate student projects.

It has 96 observations and 7 variables.

people |> 
  head(3) |> 
  gt::gt() |>
  gt::as_raw_html()
project_id degree type b_m_student start_date year thesis_title
2021-bsc-thesis-141444 bsc thesis yes 2021-02-01 2021 Self-cleaning Preliminary Screening Mechanisms to Improve Health and Wastewater Treatment Efficiency in Low-income Countries
2022-msc-sem-proj-565554 msc sem-proj yes 2022-04-04 2022 The state of open research data publishing practices and policies in the fields of global health engineering
2022-bsc-thesis-466263 bsc thesis yes 2022-03-11 2022 Feasibility of waste maize husk valorization as a plastic plate replacement in Durban, South Africa

For an overview of the variable names, see the following table.

variable_name

variable_type

description

project_id

character

Unique identifier for each person using a combination of other metadata. Folder name on Google Drive.

degree

character

Categorical variable with four levels: bsc, msc, phd, staff.

type

character

Categorical variable with six levels: hiwi, intern, post-doc, scientific-assitant, sem-proj, thesis.

b_m_student

character

Binary variable to identify if person is BSc or MSc students. Levels: yes, no.

start_date

Date

Start date of the person.

year

numeric

Year of the start date.

thesis_title

character

Title of the thesis.

Example

library(ghedata)

undergrad_students <- people |> 
  filter(b_m_student == "yes") |>
  filter(!is.na(project_id)) 

So far, GHE has supervised 81 projects of which 19 were done by BSc and 62 by MSc students.

undergrad_students |> 
  count(degree, year) |> 
  ggplot(aes(x = year, y = n, label = n, fill = degree, color = degree)) +
  geom_col(position = "dodge") +
  geom_text(position=position_dodge(width=0.9), 
            vjust=-0.5,
            show.legend = FALSE) +
  labs(x = "",
       y = "Number of students") +
  scale_fill_colorblind() +
  scale_color_colorblind() +
  theme_minimal() +
  theme(panel.grid = element_blank(),
        axis.text.y = element_blank())

License

Data are available as CC-BY.

Citation

Please cite this package using:

citation("ghedata")
#> To cite package 'ghedata' in publications use:
#> 
#>   Schöbitz L (2024). _ghedata: What the Package Does (One Line, Title
#>   Case)_. R package version 0.0.0.9000, commit
#>   c7010eceb5a6d1d0bfbdc41a9682107e26b77c51,
#>   <https://github.com/global-health-engineering/ghedata>.
#> 
#> A BibTeX entry for LaTeX users is
#> 
#>   @Manual{,
#>     title = {ghedata: What the Package Does (One Line, Title Case)},
#>     author = {Lars Schöbitz},
#>     year = {2024},
#>     note = {R package version 0.0.0.9000, commit c7010eceb5a6d1d0bfbdc41a9682107e26b77c51},
#>     url = {https://github.com/global-health-engineering/ghedata},
#>   }

About

The goal of ghedata is to share data resources that document the work of the Global Health Engineering group at ETH Zurich.

Resources

License

Stars

Watchers

Forks

Languages