diff --git a/NEWS.md b/NEWS.md
index fcc6031..3cc785f 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -2,10 +2,15 @@
 
 ------------------------------------------------------------------------
 
+### Breaking changes
+
+- functions `pop.size`, `controlSel`, `controlOut` and `controlInf` were renamed to `pop_size`, `control_sel`, `control_out` and `control_inf` respectively.
+
 ### Features
+
 - two additional datasets have been included: `jvs` (Job Vacancy Survey; a probability sample survey) and `admin` (Central Job Offers Database; a non-probability sample survey). The units and auxiliary variables have been aligned in a way that allows the data to be integrated using the methods implemented in this package.
 - a `nonprobsvycheck` function was added to check the balance in the totals of the variables based on the weighted weights between the non-probability and probability samples.
-- Important - the functions `controlSel`, `controlOut` and `controlInf` have been replaced by their counterparts `control_sel`, `control_out` and `control_inf`.
+
 
 ### Bugfixes
 - basic methods and functions related to variance estimation, weights and probability linking methods have been rewritten in a more optimal and readable way.
diff --git a/README.Rmd b/README.Rmd
index f1a939e..dc3e995 100644
--- a/README.Rmd
+++ b/README.Rmd
@@ -46,8 +46,7 @@ population or probability sample is available:
   constraints [@chen2020],
 - mass imputation estimators based on nearest neighbours [@yang2021],
   predictive mean matching and regression imputation [@kim2021],
-- doubly robust estimators with bias minimization [@chen2020,
-  @yang2020].
+- doubly robust estimators [@chen2020] with bias minimization [@yang2020].
 
 The package allows for:
 
@@ -193,7 +192,7 @@ nonprob(
                  ...,
                  xk = tau_xk),
   method_selection = "logit",
-  control_selection = controlSel(est_method_sel = "gee", h = 1)
+  control_selection = control_sel(est_method_sel = "gee", h = 1)
 )
 ```
 
@@ -253,7 +252,7 @@ nonprob(
   svydesign = prob,
   method_outcome = "nn",
   family_outcome = "gaussian",
-  control_outcome = controlOutcome(k = 2)
+  control_outcome = control_outcome(k = 2)
 )
 ```
 
@@ -286,8 +285,8 @@ nonprob(
   svydesign = prob,
   method_outcome = "pmm",
   family_outcome = "gaussian",
-  control_outcome = controlOut(penalty = "lasso"),
-  control_inference = controlInf(vars_selection = TRUE)
+  control_outcome = control_out(penalty = "lasso"),
+  control_inference = control_inf(vars_selection = TRUE)
 )
 ```
 
@@ -320,7 +319,7 @@ nonprob(
   data = nonprob,
   svydesign = prob,
   method_selection = "logit",
-  control_selection = controlSel(est_method_sel = "gee", h = 1)
+  control_selection = control_sel(est_method_sel = "gee", h = 1)
 )
 ```
 
@@ -338,7 +337,7 @@ nonprob(
   svydesign = prob,
   method_outcome = "pmm",
   family_outcome = "gaussian",
-  control_inference = controlInf(vars_selection = TRUE)
+  control_inference = control_inf(vars_selection = TRUE)
 )
 ```
 
@@ -373,7 +372,7 @@ nonprob(
   svydesign = prob,
   method_outcome = "glm",
   family_outcome = "gaussian",
-  control_inference = controlInf(
+  control_inference = control_inf(
     vars_selection = TRUE,
     bias_correction = TRUE
   )
diff --git a/README.md b/README.md
index a29c5ee..6799c51 100644
--- a/README.md
+++ b/README.md
@@ -31,8 +31,8 @@ population or probability sample is available:
 - mass imputation estimators based on nearest neighbours ([Yang, Kim,
   and Hwang 2021](#ref-yang2021)), predictive mean matching and
   regression imputation ([Kim et al. 2021](#ref-kim2021)),
-- doubly robust estimators with bias minimization Yang, Kim, and Song
-  ([2020](#ref-yang2020)).
+- doubly robust estimators ([Chen, Li, and Wu 2020](#ref-chen2020)) with
+  bias minimization ([Yang, Kim, and Song 2020](#ref-yang2020)).
 
 The package allows for:
 
@@ -85,14 +85,14 @@ where set of auxiliary variables (denoted as $\boldsymbol{X}$) is
 available for both sources while $Y$ and $\boldsymbol{d}$ (or
 $\boldsymbol{w}$) is present only in probability sample.
 
-| Sample | | Auxiliary variables $\boldsymbol{X}$ | Target variable $Y$ | Design ($\boldsymbol{d}$) or calibrated ($\boldsymbol{w}$) weights |
-|-------------------------|----------:|:------------------------------------:|:-------------------:|:------------------------------------------------------------------:|
-| $S_A$ (non-probability) | 1 | $\checkmark$ | $\checkmark$ | ? |
-| | … | $\checkmark$ | $\checkmark$ | ? |
-| | $n_A$ | $\checkmark$ | $\checkmark$ | ? |
-| $S_B$ (probability) | $n_A+1$ | $\checkmark$ | ? | $\checkmark$ |
-| | … | $\checkmark$ | ? | $\checkmark$ |
-| | $n_A+n_B$ | $\checkmark$ | ? | $\checkmark$ |
+| Sample | | Auxiliary variables $\boldsymbol{X}$ | Target variable $Y$ | Design ($\boldsymbol{d}$) or calibrated ($\boldsymbol{w}$) weights |
+|----|---:|:--:|:--:|:--:|
+| $S_A$ (non-probability) | 1 | $\checkmark$ | $\checkmark$ | ? |
+| | … | $\checkmark$ | $\checkmark$ | ? |
+| | $n_A$ | $\checkmark$ | $\checkmark$ | ? |
+| $S_B$ (probability) | $n_A+1$ | $\checkmark$ | ? | $\checkmark$ |
+| | … | $\checkmark$ | ? | $\checkmark$ |
+| | $n_A+n_B$ | $\checkmark$ | ? | $\checkmark$ |
 
 ## Basic functionalities
 
@@ -189,7 +189,7 @@ nonprob(
                  ...,
                  xk = tau_xk),
   method_selection = "logit",
-  control_selection = controlSel(est_method_sel = "gee", h = 1)
+  control_selection = control_sel(est_method_sel = "gee", h = 1)
 )
 ```
 
@@ -262,7 +262,7 @@ nonprob(
   svydesign = prob,
   method_outcome = "nn",
   family_outcome = "gaussian",
-  control_outcome = controlOutcome(k = 2)
+  control_outcome = control_outcome(k = 2)
 )
 ```
 
@@ -300,8 +300,8 @@ nonprob(
   svydesign = prob,
   method_outcome = "pmm",
   family_outcome = "gaussian",
-  control_outcome = controlOut(penalty = "lasso"),
-  control_inference = controlInf(vars_selection = TRUE)
+  control_outcome = control_out(penalty = "lasso"),
+  control_inference = control_inf(vars_selection = TRUE)
 )
 ```
 
@@ -338,7 +338,7 @@ nonprob(
   data = nonprob,
   svydesign = prob,
   method_selection = "logit",
-  control_selection = controlSel(est_method_sel = "gee", h = 1)
+  control_selection = control_sel(est_method_sel = "gee", h = 1)
 )
 ```
 
@@ -359,7 +359,7 @@ nonprob(
   svydesign = prob,
   method_outcome = "pmm",
   family_outcome = "gaussian",
-  control_inference = controlInf(vars_selection = TRUE)
+  control_inference = control_inf(vars_selection = TRUE)
 )
 ```
 
@@ -399,7 +399,7 @@ nonprob(
   svydesign = prob,
   method_outcome = "glm",
   family_outcome = "gaussian",
-  control_inference = controlInf(
+  control_inference = control_inf(
     vars_selection = TRUE,
     bias_correction = TRUE
   )
@@ -418,6 +418,7 @@ International Statistical Review 87 (2019): S177-S191 \[section 5.2\]
 
 ``` r
 library(survey)
+#> Warning: package 'survival' was built under R version 4.3.3
 library(nonprobsvy)
 
 set.seed(1234567890)
@@ -625,7 +626,8 @@ Work on this package is supported by the National Science Centre, OPUS
 
 ## References (selected)
 
-