-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME.Rmd
130 lines (100 loc) · 4.56 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
---
output: github_document
bibliography: "references.bib"
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# independenceWeights
<!-- badges: start -->
[![version](http://www.r-pkg.org/badges/version/independenceWeights)](https://cran.r-project.org/package=independenceWeights)
<!-- badges: end -->
The `independenceWeights` package constructs weights designed to minimize the weighted statistical dependence between a continuous exposure variable and a vector of confounder variables and implements the methods of @huling2021independence for doing so. In estimating a causal dose-response function, confounding bias is a function of the dependence between confounders and the exposure/treatment, so weights that minimize the dependence aim to mitigate confounding bias directly.
## Installation
You can install the released version of independenceWeights from [CRAN](https://CRAN.R-project.org) with:
``` r
install.packages("independenceWeights")
```
Install the development version from [GitHub](https://github.com/) with:
``` r
# install.packages("devtools")
devtools::install_github("jaredhuling/independenceWeights")
```
## Example
This is a basic example which shows how to estimate and utilize the distance covariance optimal weights (DCOWs) of @huling2021independence:
```{r load, message=FALSE, warning=FALSE}
library(independenceWeights)
```
Simulate data with a continuous treatment that has a confounded relationship with a response. Data are simulated according to the simulation setup of @vegetabile2021nonparametric.
```{r sim}
simdat <- simulate_confounded_data(seed = 999, nobs = 500)
y <- simdat$data$Y ## response
A <- simdat$data$A ## treatment
X <- as.matrix(simdat$data[c("Z1", "Z2", "Z3", "Z4", "Z5")]) ## confounders
```
Now estimate weights to adjust for confounders using the distance covariance optimal weights (DCOWs), which aim to mitigate the dependence between $A$ and $X$:
```{r dcows}
dcows <- independence_weights(A, X)
dcows
```
Alternatively, information about any set of weights can be printed via `weighted_energy_stats()`.
The unweighted distance covariance is a measure of the unadjusted dependence between confounders
and treatment. The weighted dependence distance is a measure of this dependence *after* weighting.
The weighted energy distance between A and weighted A measures how well the marginal distribution
of A is preserved after weighting. Similarly for X. The effective sample size estimates the effective
sample size after weighting.
```{r wtstats}
weighted_energy_stats(A, X, dcows$weights)
```
We can assess traditional measures of balance such as marginal weighted correlations with `cobalt`:
```{r message=FALSE,warning=FALSE}
library(cobalt)
```
Even though the DCOWs aim to mitigate joint dependence between X and A, they result in very small
marginal weighted correlations between them as well. (changing `poly` to 10 below would reveal that
even up to 10th degree polynomials of covariates as well-decorrelated):
```{r}
bal.tab(A ~ Z1 + Z2 + Z3 + Z4 + Z5, data = data.frame(A = A, X),
weights = list(DCOW = dcows$weights),
un = TRUE, int = TRUE, poly = 1, stats = "cor")
```
We can also minimize the weighted dependence subject to the constraint that
marginal moments of X and A are decorrelated:
```{r dcowsdecor}
dcows_decor <- independence_weights(A, X,
decorrelate_moments = TRUE)
dcows_decor
```
```{r}
bal.tab(A ~ Z1 + Z2 + Z3 + Z4 + Z5, data = data.frame(A = A, X),
weights = list(DCOW = dcows$weights,
DCOW_decor = dcows_decor$weights),
un = TRUE, int = TRUE, stats = "cor")
```
Now use the weights to estimate the causal average dose response function (ADRF)
```{r adrf, fig.height = 6}
## create grid
trt_vec <- seq(min(simdat$data$A), 50, length.out=500)
## estimate ADRF
adrf_hat <- weighted_kernel_est(A, y, dcows$weights, trt_vec)$est
## estimate naively without weights
adrf_hat_unwtd <- weighted_kernel_est(A, y, rep(1, length(y)), trt_vec)$est
ylims <- c(-4.75, 4.75)
plot(x = simdat$data$A, y = simdat$data$Y, ylim = ylims,
xlim = c(0,50),
xlab = "A", ylab = "Y")
## true ADRF
lines(x = trt_vec, y = simdat$true_adrf(trt_vec), col = "blue", lwd=2)
## estimated ADRF
lines(x = trt_vec, y = adrf_hat, col = "red", lwd=2)
## naive estimate
lines(x = trt_vec, y = adrf_hat_unwtd, col = "green", lwd=2)
legend("bottomleft", c("True ADRF", "Unweighted Est.", "DCOW-weighted Est."),
col = c("blue", "green", "red"), lty = 1, lwd = 2)
```
## References