R/main_function_documentation.R
+29-10
Original file line number
Diff line number
Diff line change
@@ -6,8 +6,12 @@ NULL
6
6
#' \loadmathjax
7
7
#' @description \code{nonprob} fits model for inference based on non-probability surveys (including big data) using various methods.
8
8
#' The function allows you to estimate the population mean having access to a reference probability sample as well as total/mean values of covariates.
9
-
#' In the package implemented state-of-the-art approaches recently proposed in the literature: Chen et al. (2020), Yang et al. (2020), Wu (2022) and use `survey` package [Lumley 2004](https://CRAN.R-project.org/package=survey) for inference.
10
-
#' Provided propensity score weighting (e.g. with calibration constraints), mass imputation (e.g. nearest neighbour) and doubly robust estimators that take into account minimization of the asymptotic bias of the population mean estimators,
9
+
#'
10
+
#' The package implements state-of-the-art approaches recently proposed in the literature: Chen et al. (2020),
11
+
#' Yang et al. (2020), Wu (2022) and uses the `survey` package [Lumley 2004](https://CRAN.R-project.org/package=survey) for inference.
12
+
#'
13
+
#' It provides propensity score weighting (e.g. with calibration constraints), mass imputation (e.g. nearest neighbour) and
14
+
#' doubly robust estimators that take into account minimization of the asymptotic bias of the population mean estimators,
11
15
#' variable selection or overlap between random and non-random sample.
12
16
#' The package uses `survey` package functionalities when a probability sample is available.
13
17
#'
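Review note: the description above names three estimator families (propensity score weighting, mass imputation, doubly robust). The base R sketch below illustrates how they relate on simulated data. It is not the package's implementation: the population, the logistic propensity model, the linear outcome model and every object name are assumptions made for illustration, and the pseudo-likelihood is written as I understand it from Chen et al. (2020).

```r
# Illustration on simulated data only; none of this is package code.
set.seed(123)
N <- 20000
x <- rnorm(N)                          # auxiliary covariate
y <- 1 + 2 * x + rnorm(N)              # study variable
true_mean <- mean(y)

# Non-probability sample: inclusion depends on x, so the naive mean is biased.
in_np <- rbinom(N, 1, plogis(-2 + 0.8 * x)) == 1
np <- data.frame(x = x[in_np], y = y[in_np])

# Reference probability sample: simple random sample with known design weights.
# The study variable y is treated as unobserved here.
n_prob <- 1000
idx_prob <- sample.int(N, n_prob)
pr <- data.frame(x = x[idx_prob], d = N / n_prob)

# Propensity model: maximise the pseudo log-likelihood
#   sum_{S_A} x'theta - sum_{S_B} d * log(1 + exp(x'theta)).
X_np <- cbind(1, np$x)
X_pr <- cbind(1, pr$x)
softplus <- function(eta) pmax(eta, 0) + log1p(exp(-abs(eta)))  # stable log(1 + e^eta)
neg_pll <- function(theta) {
  -(sum(X_np %*% theta) - sum(pr$d * softplus(drop(X_pr %*% theta))))
}
neg_pll_grad <- function(theta) {
  -(colSums(X_np) - drop(crossprod(X_pr, pr$d * plogis(drop(X_pr %*% theta)))))
}
theta_hat <- optim(c(0, 0), neg_pll, neg_pll_grad, method = "BFGS")$par
pi_np <- plogis(drop(X_np %*% theta_hat))

# Inverse probability weighting (Hajek form).
ipw <- sum(np$y / pi_np) / sum(1 / pi_np)

# Mass imputation: fit on the non-probability sample, predict on the probability sample.
fit_out <- lm(y ~ x, data = np)
m_np <- fitted(fit_out)
m_pr <- predict(fit_out, newdata = pr)
mi <- sum(pr$d * m_pr) / sum(pr$d)

# Doubly robust: weighted residual correction plus the mass imputation term.
dr <- sum((np$y - m_np) / pi_np) / sum(1 / pi_np) + mi

estimates <- c(naive = mean(np$y), ipw = ipw, mi = mi, dr = dr)
print(round(estimates - true_mean, 3))  # estimation errors relative to the true mean
```

In this setup both working models are correctly specified, so all three corrected estimators should land near the true mean while the naive mean does not. Misspecifying one of the two models is a natural follow-up experiment.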
@@ -84,7 +88,7 @@ NULL
84
88
#' The inverse probability approach is based on the assumption that a reference probability sample
85
89
#' is available and therefore we can estimate the propensity score of the selection mechanism.
#' It opens the door to a very flexible method for the imputation model. The package uses generalized linear models from [stats::glm()],
106
110
#' nearest neighbour algorithm using [RANN::nn2()] and predictive mean matching.
107
111
#'
108
112
#' 3. Doubly robust estimation -- The IPW and MI estimators are sensitive to misspecification of the propensity score and outcome models, respectively.
109
113
#' To address this, so-called doubly robust methods, which take these problems into account, are provided.
110
114
#' The idea is quite simple: combine the propensity score and imputation models during inference, which leads to the following estimator
#' In addition, an approach based directly on bias minimisation has been implemented. The following formula
113
117
#' \mjsdeqn{
114
118
#' \begin{aligned}
@@ -191,14 +195,29 @@ NULL
191
195
#' \item{\code{nonprob_size} -- size of non-probability sample}
192
196
#' \item{\code{prob_size} -- size of probability sample}
193
197
#' \item{\code{pop_size} -- estimated population size derived from estimated weights (non-probability sample) or known design weights (probability sample)}
194
-
#' \item{\code{outcome} -- list containing information about fitting of mass imputation model, in case of regression model, object containing list returned by the function.
195
-
#' Additionaly when variables selection model for outcome variable is fitting, list includes of \code{cve} -- the error for each value of `lambda`, averaged accross the cross-validation folds.
196
-
#' [stats::glm()], in case of nearest neigbour imputation object containing list returned by [RANN::nn2()].}
198
+
#' \item{\code{outcome} -- list containing information about the fitting of the mass imputation model. For a regression model, this is the list returned by
199
+
#' [stats::glm()]; for nearest neighbour imputation, it is the list returned by [RANN::nn2()]. If `bias_correction` in [controlInf()] is set to `TRUE`, then estimation is based on
200
+
#' the joint estimating equations for the `selection` and `outcome` models and therefore the list differs from the one returned by the [stats::glm()] function and contains elements such as
201
+
#' \itemize{
202
+
#' \item{\code{coefficients} -- estimated coefficients of the regression model}
203
+
#' \item{\code{std_err} -- standard errors of the estimated coefficients}
204
+
#' \item{\code{residuals} -- The response residuals}
205
+
#' \item{\code{variance_covariance} -- The variance-covariance matrix of the coefficient estimates}
206
+
#' \item{\code{df_residual} -- The degrees of freedom for residuals}
207
+
#' \item{\code{family} -- specifies the error distribution and link function to be used in the model}
208
+
#' \item{\code{fitted.values} -- The predicted values of the response variable based on the fitted model}
209
+
#' \item{\code{linear.predictors} -- The linear fit on link scale}
210
+
#' \item{\code{X} -- The design matrix}
211
+
#' \item{\code{method} -- set to `glm`, since it is the regression method}
212
+
#' }
213
+
#' }
214
+
#' Additionally, when a variable selection model for the outcome variable is fitted, the list includes \code{cve} -- the error for each value of `lambda`, averaged across the cross-validation folds.
197
215
#' \item{\code{selection} -- list containing information about fitting of propensity score model, such as
198
216
#' \itemize{
199
217
#' \item{\code{coefficients} -- a named vector of coefficients}
200
218
#' \item{\code{std_err} -- standard errors of the estimated model coefficients}
201
-
#' \item{\code{residuals} -- the working residuals}
219
+
#' \item{\code{residuals} -- the response residuals}
220
+
#' \item{\code{variance} -- the root mean square error}
202
221
#' \item{\code{fitted_values} -- the fitted mean values, obtained by transforming the linear predictors by the inverse of the link function.}
203
222
#' \item{\code{link} -- the `link` object used.}
204
223
#' \item{\code{linear_predictors} -- the linear fit on link scale.}
@@ -208,7 +227,7 @@ NULL
208
227
#' \item{\code{formula} -- the formula supplied.}
209
228
#' \item{\code{df_residual} -- the residual degrees of freedom.}
210
229
#' \item{\code{log_likelihood} -- value of log-likelihood function if `mle` method, in the other case `NULL`.}
211
-
#' \item{\code{cve} -- the error for each value of `lambda`, averaged accross the cross-validation folds for variables selection model
230
+
#' \item{\code{cve} -- the error for each value of `lambda`, averaged across the cross-validation folds for the variable selection model
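Review note: the new `outcome` documentation lists elements such as `coefficients`, `std_err`, `residuals` and `variance_covariance`. As a rough guide to what those names correspond to, the sketch below assembles a list with the same element names from an ordinary `stats::glm()` fit. The helper `outcome_summary()` is hypothetical and is not part of the package; the list the package actually returns under `bias_correction = TRUE` comes from the joint estimating equations and may be computed differently.

```r
# Hypothetical helper (not part of the package): collects from a glm fit the
# quantities named in the documented `outcome` list.
outcome_summary <- function(fit) {
  vc <- vcov(fit)
  list(
    coefficients        = coef(fit),                           # estimated coefficients
    std_err             = sqrt(diag(vc)),                      # standard errors
    residuals           = residuals(fit, type = "response"),   # y - fitted mean
    variance_covariance = vc,                                  # covariance of the estimates
    df_residual         = df.residual(fit),                    # residual degrees of freedom
    family              = family(fit),                         # error distribution and link
    fitted.values       = fitted(fit),                         # predictions on the response scale
    linear.predictors   = fit$linear.predictors,               # predictions on the link scale
    X                   = model.matrix(fit),                   # design matrix
    method              = "glm"
  )
}

# Usage on a built-in data set, chosen only for illustration.
fit <- glm(mpg ~ wt + hp, data = mtcars, family = gaussian())
out <- outcome_summary(fit)
str(out, max.level = 1)
```

Note that `residuals` here are response residuals, matching the wording this change introduces for both the `outcome` and `selection` lists.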