This project uses the following datasets:
- National Sample Survey (NSS)
- Village-level shapefiles and census data from The SHRUG
- Coal plant data from Global Energy Monitor
- Agricultural yield estimates for 2001 through 2013 from Gangopadhyay et al. (2022)
- Daily wind data (four estimates per day) from the National Center for Atmospheric Research
- Monthly pollution (particulate) estimates from [Hammer et al. (2020)](https://pubs.acs.org/doi/full/10.1021/acs.est.0c01764)
- I clean the National Sample Survey (NSS) data in this script.
- I clean the coal plants data in this script. This script does the following:
  - Creates the location of coal plants (lat/lon)
  - Creates a matrix of distances between coal plants and all the villages in the SHRUG data
  - Calculates the angle from each coal plant to each village
  - Downloads daily wind data from 1990-01-01 through 2015 that has information on the previous WEEK
  - For each of these data points, calculates whether the wind is blowing in the direction of any given village (there are four wind values per day, so the result is the proportion of the day the wind blows in that direction)
  - Aggregates these daily values to the month level
- This script downloads the temperature and precipitation data.
- I extract the pollution data to villages in this script.
- I extract the agricultural productivity data to villages in this script.
- I match the village-level data and aggregate up to NSS districts in this script.
- I estimate regressions looking at correlations between some variables and coal plant openings in this script.
- I validate that wind direction predicts pollution levels in this script.
- All regressions related to agricultural yield are in this script.
- All regressions related to NSS data are in this script.
- You can find my most recent HTML slides here.
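
To illustrate the coal plant steps listed above (distance, angle, wind exposure, and monthly aggregation), here is a minimal Python sketch. It is not the code used in this project. All function names are hypothetical, and it rests on three assumptions: wind arrives as eastward/northward components (`u`, `v`), great-circle (haversine) distance is an acceptable distance measure, and the wind counts as "blowing toward" a village when it is within an arbitrary 45-degree tolerance of the plant-to-village bearing.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial bearing from point 1 to point 2, degrees clockwise from north, in [0, 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlmb)
    return math.degrees(math.atan2(y, x)) % 360.0


def wind_toward_deg(u, v):
    """Direction the wind blows TOWARD, from eastward (u) and northward (v) components."""
    return math.degrees(math.atan2(u, v)) % 360.0


def share_blowing_toward(bearing, wind_dirs, tolerance_deg=45.0):
    """Share of wind readings within tolerance_deg of the plant-to-village bearing.

    With four readings per day this is the proportion of the day the wind
    blows toward the village. The 45-degree default is a hypothetical choice.
    """
    hits = 0
    for d in wind_dirs:
        # Smallest absolute angular difference, handling wrap-around at 360.
        diff = abs((d - bearing + 180.0) % 360.0 - 180.0)
        if diff <= tolerance_deg:
            hits += 1
    return hits / len(wind_dirs)


def monthly_mean(daily_shares):
    """Average daily shares by month; keys are 'YYYY-MM-DD' date strings."""
    by_month = defaultdict(list)
    for date, share in daily_shares.items():
        by_month[date[:7]].append(share)
    return {month: sum(vals) / len(vals) for month, vals in by_month.items()}


# Example: one plant, one village due north of it, four wind readings in a day.
plant = (20.0, 80.0)
village = (21.0, 80.0)
distance = haversine_km(*plant, *village)
bearing = bearing_deg(*plant, *village)
readings = [wind_toward_deg(u, v) for u, v in [(0, 5), (5, 0), (0, -5), (1, 5)]]
daily_share = share_blowing_toward(bearing, readings)
```

For the full plant-by-village matrix, the same functions would be applied to every pair, or vectorized with an array library.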