Commit

information about the breaking changes, updated readme
BERENZ committed Jan 23, 2025
1 parent faebafd commit c2659f6
Showing 3 changed files with 34 additions and 28 deletions.
7 changes: 6 additions & 1 deletion NEWS.md
@@ -2,10 +2,15 @@

 ------------------------------------------------------------------------

+### Breaking changes
+
+- functions `pop.size`, `controlSel`, `controlOut` and `controlInf` were renamed to `pop_size`, `control_sel`, `control_out` and `control_inf` respectively.
+
 ### Features

 - two additional datasets have been included: `jvs` (Job Vacancy Survey; a probability sample survey) and `admin` (Central Job Offers Database; a non-probability sample survey). The units and auxiliary variables have been aligned in a way that allows the data to be integrated using the methods implemented in this package.
 - a `nonprobsvycheck` function was added to check the balance in the totals of the variables based on the weighted weights between the non-probability and probability samples.
-- Important - the functions `controlSel`, `controlOut` and `controlInf` have been replaced by their counterparts `control_sel`, `control_out` and `control_inf`.
+
+
 ### Bugfixes
 - basic methods and functions related to variance estimation, weights and probability linking methods have been rewritten in a more optimal and readable way.
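Review note: scripts written against the old API will stop working after this rename. A minimal sketch of a text-level migration helper is given below. It assumes only the four renames listed in the "Breaking changes" entry; `migrate_names` and `renames` are hypothetical names used for illustration and are not part of the package.

```r
# Old-to-new function names, taken from the "Breaking changes" entry.
# The keys are regular expressions, so the dot in pop.size is escaped.
renames <- c(
  "pop\\.size" = "pop_size",
  "controlSel" = "control_sel",
  "controlOut" = "control_out",
  "controlInf" = "control_inf"
)

# Hypothetical helper: rewrite old function names in a character vector
# of source code. Word boundaries (\b) prevent partial matches, so a
# longer identifier such as controlOutcome is left untouched.
migrate_names <- function(code, map = renames) {
  for (old in names(map)) {
    pattern <- paste0("\\b", old, "\\b")
    code <- gsub(pattern, map[[old]], code, perl = TRUE)
  }
  code
}

old_call <- 'control_selection = controlSel(est_method_sel = "gee", h = 1)'
new_call <- migrate_names(old_call)
print(new_call)
```

The helper operates on plain text, so it could be applied to the result of `readLines()` on a script before writing it back. Names outside the rename list are not changed and would need to be reviewed by hand.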
17 changes: 8 additions & 9 deletions README.Rmd
@@ -46,8 +46,7 @@ population or probability sample is available:
 constraints [@chen2020],
 - mass imputation estimators based on nearest neighbours [@yang2021],
 predictive mean matching and regression imputation [@kim2021],
-- doubly robust estimators with bias minimization [@chen2020,
-@yang2020].
+- doubly robust estimators [@chen2020] with bias minimization [@yang2020].

 The package allows for:

@@ -193,7 +192,7 @@ nonprob(
 ...,
 xk = tau_xk),
 method_selection = "logit",
-control_selection = controlSel(est_method_sel = "gee", h = 1)
+control_selection = control_sel(est_method_sel = "gee", h = 1)
 )
 ```
 </td>
@@ -253,7 +252,7 @@ nonprob(
 svydesign = prob,
 method_outcome = "nn",
 family_outcome = "gaussian",
-control_outcome = controlOutcome(k = 2)
+control_outcome = control_outcome(k = 2)
 )
 ```
 </td>
@@ -286,8 +285,8 @@ nonprob(
 svydesign = prob,
 method_outcome = "pmm",
 family_outcome = "gaussian",
-control_outcome = controlOut(penalty = "lasso"),
-control_inference = controlInf(vars_selection = TRUE)
+control_outcome = control_out(penalty = "lasso"),
+control_inference = control_inf(vars_selection = TRUE)
 )
 ```
 </td>
@@ -320,7 +319,7 @@ nonprob(
 data = nonprob,
 svydesign = prob,
 method_selection = "logit",
-control_selection = controlSel(est_method_sel = "gee", h = 1)
+control_selection = control_sel(est_method_sel = "gee", h = 1)
 )
 ```
 </td>
@@ -338,7 +337,7 @@ nonprob(
 svydesign = prob,
 method_outcome = "pmm",
 family_outcome = "gaussian",
-control_inference = controlInf(vars_selection = TRUE)
+control_inference = control_inf(vars_selection = TRUE)
 )
 ```
 </td>
@@ -373,7 +372,7 @@ nonprob(
 svydesign = prob,
 method_outcome = "glm",
 family_outcome = "gaussian",
-control_inference = controlInf(
+control_inference = control_inf(
 vars_selection = TRUE,
 bias_correction = TRUE
 )
38 changes: 20 additions & 18 deletions README.md
@@ -31,8 +31,8 @@ population or probability sample is available:
 - mass imputation estimators based on nearest neighbours ([Yang, Kim,
 and Hwang 2021](#ref-yang2021)), predictive mean matching and
 regression imputation ([Kim et al. 2021](#ref-kim2021)),
-- doubly robust estimators with bias minimization Yang, Kim, and Song
-([2020](#ref-yang2020)).
+- doubly robust estimators ([Chen, Li, and Wu 2020](#ref-chen2020)) with
+bias minimization ([Yang, Kim, and Song 2020](#ref-yang2020)).

 The package allows for:

@@ -85,14 +85,14 @@ where set of auxiliary variables (denoted as $\boldsymbol{X}$) is
 available for both sources while $Y$ and $\boldsymbol{d}$ (or
 $\boldsymbol{w}$) is present only in probability sample.

-| Sample | | Auxiliary variables $\boldsymbol{X}$ | Target variable $Y$ | Design ($\boldsymbol{d}$) or calibrated ($\boldsymbol{w}$) weights |
-|-------------------------|----------:|:------------------------------------:|:-------------------:|:------------------------------------------------------------------:|
-| $S_A$ (non-probability) | 1 | $\checkmark$ | $\checkmark$ | ? |
-| | | $\checkmark$ | $\checkmark$ | ? |
-| | $n_A$ | $\checkmark$ | $\checkmark$ | ? |
-| $S_B$ (probability) | $n_A+1$ | $\checkmark$ | ? | $\checkmark$ |
-| | | $\checkmark$ | ? | $\checkmark$ |
-| | $n_A+n_B$ | $\checkmark$ | ? | $\checkmark$ |
+| Sample | | Auxiliary variables $\boldsymbol{X}$ | Target variable $Y$ | Design ($\boldsymbol{d}$) or calibrated ($\boldsymbol{w}$) weights |
+|----|---:|:--:|:--:|:--:|
+| $S_A$ (non-probability) | 1 | $\checkmark$ | $\checkmark$ | ? |
+| | | $\checkmark$ | $\checkmark$ | ? |
+| | $n_A$ | $\checkmark$ | $\checkmark$ | ? |
+| $S_B$ (probability) | $n_A+1$ | $\checkmark$ | ? | $\checkmark$ |
+| | | $\checkmark$ | ? | $\checkmark$ |
+| | $n_A+n_B$ | $\checkmark$ | ? | $\checkmark$ |

 ## Basic functionalities

@@ -189,7 +189,7 @@ nonprob(
 ...,
 xk = tau_xk),
 method_selection = "logit",
-control_selection = controlSel(est_method_sel = "gee", h = 1)
+control_selection = control_sel(est_method_sel = "gee", h = 1)
 )
 ```

@@ -262,7 +262,7 @@ nonprob(
 svydesign = prob,
 method_outcome = "nn",
 family_outcome = "gaussian",
-control_outcome = controlOutcome(k = 2)
+control_outcome = control_outcome(k = 2)
 )
 ```

@@ -300,8 +300,8 @@ nonprob(
 svydesign = prob,
 method_outcome = "pmm",
 family_outcome = "gaussian",
-control_outcome = controlOut(penalty = "lasso"),
-control_inference = controlInf(vars_selection = TRUE)
+control_outcome = control_out(penalty = "lasso"),
+control_inference = control_inf(vars_selection = TRUE)
 )
 ```

@@ -338,7 +338,7 @@ nonprob(
 data = nonprob,
 svydesign = prob,
 method_selection = "logit",
-control_selection = controlSel(est_method_sel = "gee", h = 1)
+control_selection = control_sel(est_method_sel = "gee", h = 1)
 )
 ```

@@ -359,7 +359,7 @@ nonprob(
 svydesign = prob,
 method_outcome = "pmm",
 family_outcome = "gaussian",
-control_inference = controlInf(vars_selection = TRUE)
+control_inference = control_inf(vars_selection = TRUE)
 )
 ```

@@ -399,7 +399,7 @@ nonprob(
 svydesign = prob,
 method_outcome = "glm",
 family_outcome = "gaussian",
-control_inference = controlInf(
+control_inference = control_inf(
 vars_selection = TRUE,
 bias_correction = TRUE
 )
@@ -418,6 +418,7 @@ International Statistical Review 87 (2019): S177-S191 \[section 5.2\]

 ``` r
 library(survey)
+#> Warning: package 'survival' was built under R version 4.3.3
 library(nonprobsvy)

 set.seed(1234567890)
@@ -625,7 +626,8 @@ Work on this package is supported by the National Science Centre, OPUS

 ## References (selected)

-<div id="refs" class="references csl-bib-body hanging-indent">
+<div id="refs" class="references csl-bib-body hanging-indent"
+entry-spacing="0">

 <div id="ref-chen2020" class="csl-entry">
