diff --git a/NAMESPACE b/NAMESPACE index 5436481..42f8292 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -2,29 +2,29 @@ S3method(AIC,nonprobsvy) S3method(BIC,nonprobsvy) +S3method(check_balance,nonprobsvy) S3method(confint,nonprobsvy) S3method(cooks.distance,nonprobsvy) S3method(deviance,nonprobsvy) S3method(hatvalues,nonprobsvy) S3method(logLik,nonprobsvy) S3method(nobs,nonprobsvy) -S3method(nonprobsvycheck,nonprobsvy) -S3method(pop.size,nonprobsvy) +S3method(pop_size,nonprobsvy) S3method(print,nonprobsvy) S3method(print,nonprobsvycheck) S3method(print,summary_nonprobsvy) S3method(residuals,nonprobsvy) S3method(summary,nonprobsvy) S3method(vcov,nonprobsvy) +export(check_balance) export(cloglog_model_nonprobsvy) -export(controlInf) -export(controlOut) -export(controlSel) +export(control_inf) +export(control_out) +export(control_sel) export(genSimData) export(logit_model_nonprobsvy) export(nonprob) -export(nonprobsvycheck) -export(pop.size) +export(pop_size) export(probit_model_nonprobsvy) import(Rcpp) import(mathjaxr) diff --git a/NEWS.md b/NEWS.md index 44ff054..fcc6031 100644 --- a/NEWS.md +++ b/NEWS.md @@ -5,10 +5,15 @@ ### Features - two additional datasets have been included: `jvs` (Job Vacancy Survey; a probability sample survey) and `admin` (Central Job Offers Database; a non-probability sample survey). The units and auxiliary variables have been aligned in a way that allows the data to be integrated using the methods implemented in this package. - a `nonprobsvycheck` function was added to check the balance in the totals of the variables based on the weighted weights between the non-probability and probability samples. +- Important: the functions `controlSel`, `controlOut` and `controlInf` have been replaced by their counterparts `control_sel`, `control_out` and `control_inf`. ### Bugfixes - basic methods and functions related to variance estimation, weights and probability linking methods have been rewritten in a more optimal and readable way. +### Documentation + +- a note has been added that the arguments `strata`, `subset` and `na_action` are not supported for the time being. + # nonprobsvy 0.1.1 ------------------------------------------------------------------------ - bug Fix related to storing `vector` in `model_frame` when predicting `y_hat` in mass imputation `glm` model when X is based in one auxiliary variable only - fix provided converting it to `data.frame` object. ### Features -- add information to `summary` about quality of estimation basing on difference between estimated and known total values of auxiliary variables -- add estimation of exact standard error for k-nearest neighbor estimator. -- add breaking change to `controlOut` function by switching values for `predictive_match` argument. From now on, the `predictive_match = 1` means $\hat{y}-\hat{y}$ in predictive mean matching imputation and `predictive_match = 2` corresponds to $\hat{y}-y$ matching. -- implement `div` option when variable selection (more in documentation) for doubly robust estimation. -- add more insights to `nonprob` output such as gradient, hessian and jacobian derived from IPW estimation for `mle` and `gee` methods when `IPW` or `DR` model executed. -- add estimated inclusion probabilities and its derivatives for probability and non-probability samples to `nonprob` output when `IPW` or `DR` model executed. -- add `model_frame` matrix data from probability sample used for mass imputation to `nonprob` when `MI` or `DR` model executed.
+- added information to `summary` about the quality of estimation based on the difference between estimated and known total values of auxiliary variables +- added estimation of exact standard error for k-nearest neighbor estimator. +- added breaking change to `controlOut` function by switching values for `predictive_match` argument. From now on, the `predictive_match = 1` means $\hat{y}-\hat{y}$ in predictive mean matching imputation and `predictive_match = 2` corresponds to $\hat{y}-y$ matching. +- implemented the `div` option for variable selection in doubly robust estimation (more in the documentation). +- added more insights to `nonprob` output such as the gradient, hessian and jacobian derived from IPW estimation for `mle` and `gee` methods when the `IPW` or `DR` model is executed. +- added estimated inclusion probabilities and their derivatives for probability and non-probability samples to `nonprob` output when the `IPW` or `DR` model is executed. +- added `model_frame` matrix data from the probability sample used for mass imputation to `nonprob` when the `MI` or `DR` model is executed. ### Unit tests - added unit tests for variable selection models and mi estimation with vector of population totals available diff --git a/R/bias_correction_ipw.R b/R/bias_correction_ipw.R index 3d76939..2df0ae3 100644 --- a/R/bias_correction_ipw.R +++ b/R/bias_correction_ipw.R @@ -57,10 +57,10 @@ mm <- function(X, "2" = warning("Relatively convergent algorithm when fitting selection model by nleqslv, but user must check if function values are acceptably small."), "3" = warning("Algorithm did not find suitable point - has stalled cannot find an acceptable new point when fitting selection model by nleqslv."), "4" = warning("Iteration limit exceeded when fitting selection model by nleqslv."), - "5" = warning("ill-conditioned Jacobian when fitting selection model by nleqslv."), + "5" = warning("Ill-conditioned Jacobian when fitting selection model by nleqslv."), "6" = warning("Jacobian is singular when fitting selection model by nleqslv."), "7" = warning("Jacobian is unusable when fitting selection model by nleqslv."), - "-10" = warning("user specified Jacobian is incorrect when fitting selection model by nleqslv.") + "-10" = warning("User-specified Jacobian is incorrect when fitting selection model by nleqslv.") ) } diff --git a/R/cloglogModel.R b/R/cloglogModel.R index 7b3d667..b7b2aa3 100644 --- a/R/cloglogModel.R +++ b/R/cloglogModel.R @@ -13,6 +13,7 @@ #' @importFrom maxLik maxLik #' @importFrom Matrix Matrix #' @importFrom survey svyrecvar +#' @keywords internal #' @export # must be exported to be visible in c++ script, to consider any other option cloglog_model_nonprobsvy <- function(...) { @@ -153,9 +154,9 @@ cloglog_model_nonprobsvy <- function(...)
{ if (maxLik_an$convergence %in% c(1, 10, 51, 52)) { switch(as.character(maxLik_an$convergence), "1" = warning("Warning in fitting selection model with optim: the iteration limit maxit had been reached."), - "10" = warning("degeneracy of the Nelder Mead simplex in fitting selection model by optim."), # TODO - - "51" = warning("warning from the L BFGS B when fitting by optim."), # TODO - - "52" = stop("indicates an error from the L-BFGS-B method when fitting by optim.") + "10" = warning("Degeneracy of the Nelder Mead simplex in fitting selection model by optim."), # TODO - + "51" = warning("Warning from the L-BFGS-B method when fitting by optim."), # TODO - + "52" = stop("Indicates an error from the L-BFGS-B method when fitting by optim.") ) } theta <- maxLik_an$par diff --git a/R/control_inference.R b/R/control_inference.R index a2d1199..165dad0 100644 --- a/R/control_inference.R +++ b/R/control_inference.R @@ -1,31 +1,33 @@ #' @title Control parameters for inference -#' @description \code{controlInf} constructs a list with all necessary control parameters +#' +#' @description \code{control_inf} constructs a list with all necessary control parameters #' for statistical inference. -#' @param vars_selection If `TRUE`, then variables selection model is used. -#' @param var_method variance method. -#' @param rep_type replication type for weights in the bootstrap method for variance estimation passed to [survey::as.svrepdesign()]. +#' +#' @param vars_selection If `TRUE`, then the variables selection model is used. +#' @param var_method the variance method. +#' @param rep_type the replication type for weights in the bootstrap method for variance estimation passed to [survey::as.svrepdesign()]. #' Default is `subbootstrap`. -#' @param bias_inf inference method in the bias minimization. +#' @param bias_inf the inference method in the bias minimization. #' \itemize{ -#' \item if \code{union} then final model is fitting on union of selected variables for selection and outcome models -#' \item if \code{div} then final model is fitting separately on division of selected variables into relevant ones for +#' \item if \code{union}, then the final model is fitted on the union of selected variables for selection and outcome models +#' \item if \code{div}, then the final model is fitted separately on the division of selected variables into relevant ones for #' selection and outcome model. #' } -#' @param bias_correction if `TRUE`, then bias minimization estimation used during fitting the model. -#' @param num_boot number of iteration for bootstrap algorithms. -#' @param alpha Significance level, Default is 0.05. -#' @param cores Number of cores in parallel computing. -#' @param keep_boot Logical indicating whether statistics from bootstrap should be kept. +#' @param bias_correction if `TRUE`, then the bias minimization estimation is used during model fitting. +#' @param num_boot the number of iterations for bootstrap algorithms. +#' @param alpha the significance level, 0.05 by default. +#' @param cores the number of cores in parallel computing. +#' @param keep_boot a logical value indicating whether statistics from bootstrap should be kept. #' By default set to \code{TRUE} -#' @param nn_exact_se Logical value indicating whether to compute the exact +#' @param nn_exact_se a logical value indicating whether to compute the exact #' standard error estimate for \code{nn} or \code{pmm} estimator.
The variance estimator for #' estimation based on \code{nn} or \code{pmm} can be decomposed into three parts, with the -#' third being computed using covariance between imputed values for units in -#' probability sample using predictive matches from non-probability sample. +#' third computed using covariance between imputed values for units in +#' the probability sample using predictive matches from the non-probability sample. #' In most situations this term is negligible and is very computationally -#' expensive so by default this is set to \code{FALSE}, but it is recommended to -#' set this value to \code{TRUE} before submitting final results. -#' @param pi_ij TODO, either matrix or \code{ppsmat} class object. +#' expensive so by default it is set to \code{FALSE}, but the recommended option is to +#' set this value to \code{TRUE} before submitting the final results. +#' @param pi_ij TODO, either a matrix or a \code{ppsmat} class object. #' #' #' @return List with selected parameters. @@ -36,7 +38,7 @@ #' #' @export -controlInf <- function(vars_selection = FALSE, +control_inf <- function(vars_selection = FALSE, var_method = c( "analytic", "bootstrap" diff --git a/R/control_outcome.R b/R/control_outcome.R index ed4ea58..039d19f 100644 --- a/R/control_outcome.R +++ b/R/control_outcome.R @@ -1,6 +1,8 @@ #' @title Control parameters for outcome model -#' @description \code{controlOut} constructs a list with all necessary control parameters +#' +#' @description \code{control_out} constructs a list with all necessary control parameters #' for outcome model. +#' #' @param epsilon Tolerance for fitting algorithms. Default is \code{1e-6}. #' @param maxit Maximum number of iterations. #' @param trace logical value. If `TRUE` trace steps of the fitting algorithms. Default is `FALSE`. @@ -44,7 +46,7 @@ #' #' @export -controlOut <- function(epsilon = 1e-4, +control_out <- function(epsilon = 1e-4, maxit = 100, trace = FALSE, k = 1, diff --git a/R/control_selection.R b/R/control_selection.R index 940a316..be3ca16 100644 --- a/R/control_selection.R +++ b/R/control_selection.R @@ -1,8 +1,6 @@ #' @title Control parameters for selection model -#' @author Łukasz Chrostowski, Maciej Beręsewicz -#' \loadmathjax #' -#' @description \code{controlSel} constructs a list with all necessary control parameters +#' @description \code{control_sel} constructs a list with all necessary control parameters #' for selection model. #' #' @@ -22,7 +20,7 @@ #' \frac{\pi(\mathbf{x}, \boldsymbol{\theta})}{\mathbf{x}}} #' \item if \code{2} then \mjseqn{ \mathbf{h}\left(\mathbf{x}, \boldsymbol{\theta}\right) = \mathbf{x}} #' } -#' @param penalty The penanlization function used during variables selection. +#' @param penalty The penalization function used during variables selection. #' @param a_SCAD The tuning parameter of the SCAD penalty for selection model. Default is 3.7. #' @param a_MCP The tuning parameter of the MCP penalty for selection model. Default is 3. #' @param lambda A user-specified \mjseqn{\lambda} value during variable selection model fitting. 
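For reference, a minimal sketch of how the renamed constructor is called after this change (a sketch only: the `penalty` and `nfolds` values mirror the package's own tests, and `a_SCAD` merely restates its documented default of 3.7):

# Sketch: build selection-model controls under the new snake_case API.
# Arguments not listed keep the defaults shown in the signature below.
sel_ctrl <- control_sel(
  penalty = "SCAD", # penalization used during variable selection
  a_SCAD = 3.7,     # tuning parameter of the SCAD penalty (documented default)
  nfolds = 5        # number of folds for the cross-validation algorithm
)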
@@ -48,7 +46,7 @@ #' #' @export -controlSel <- function(method = "glm.fit", # perhaps another control function for model with variables selection +control_sel <- function(method = "glm.fit", # perhaps another control function for model with variables selection epsilon = 1e-4, maxit = 500, trace = FALSE, diff --git a/R/data.R b/R/data.R index 8f965a4..0e65194 100644 --- a/R/data.R +++ b/R/data.R @@ -1,7 +1,7 @@ #' Job Vacancy Survey #' #' @description -#' This is a subset of the subset of the Job Vacancy Survey from Poland (for one quarter). +#' This is a subset of the Job Vacancy Survey from Poland (for one quarter). #' The data has been subject to slight manipulation, but the relationships in the data have been preserved. #' For further details on the JVS, please refer to the following link: #' \url{https://stat.gov.pl/obszary-tematyczne/rynek-pracy/popyt-na-prace/zeszyt-metodologiczny-popyt-na-prace,3,1.html}. diff --git a/R/internals.R b/R/internals.R index 92dd935..22d1581 100644 --- a/R/internals.R +++ b/R/internals.R @@ -124,7 +124,7 @@ start_fit <- function(X, weights, weights_rand, method_selection, - control_selection = controlSel()) { + control_selection = control_sel()) { weights_to_glm <- c(weights_rand, weights) start_model <- stats::glm.fit( x = X, # glm model for initial values in propensity score estimation @@ -207,7 +207,7 @@ nonprobMI_fit <- function(outcome, svydesign = NULL, family_outcome = "gaussian", start = NULL, - control_outcome = controlOut(), + control_outcome = control_out(), verbose = FALSE, model = TRUE, x = FALSE, diff --git a/R/logitModel.R b/R/logitModel.R index 4bdb339..a1ea3d3 100644 --- a/R/logitModel.R +++ b/R/logitModel.R @@ -18,6 +18,7 @@ #' @importFrom stats qlogis #' #' +#' @keywords internal #' @export # must be exported to be visible in c++ script, to consider any other option logit_model_nonprobsvy <- function(...) { @@ -133,8 +134,8 @@ logit_model_nonprobsvy <- function(...) { switch(as.character(maxLik_an$convergence), "1" = warning("Warning in fitting selection model with optim: the iteration limit maxit had been reached."), "10" = warning("degeneracy of the Nelder Mead simplex in fitting selection model by optim."), # TODO - - "51" = warning("warning from the L BFGS B when fitting by optim."), # TODO - - "52" = stop("indicates an error from the L BFGS B method when fitting by optim.") + "51" = warning("Warning from the L-BFGS-B method when fitting by optim."), # TODO - + "52" = stop("Indicates an error from the L-BFGS-B method when fitting by optim.") ) } diff --git a/R/main_function_documentation.R b/R/main_function_documentation.R index c16d6ce..d4d63d5 100644 --- a/R/main_function_documentation.R +++ b/R/main_function_documentation.R @@ -1,10 +1,10 @@ #' @import mathjaxr NULL -#' @title Inference with the non-probability survey samples +#' @title Inference with non-probability survey samples #' @author Łukasz Chrostowski, Maciej Beręsewicz #' #' \loadmathjax -#' @description \code{nonprob} fits model for inference based on non-probability surveys (including big data) using various methods. +#' @description \code{nonprob} fits a model for inference based on non-probability surveys (including big data) using various methods. #' The function allows you to estimate the population mean with access to a reference probability sample, as well as sums and means of covariates. #' #' The package implements state-of-the-art approaches recently proposed in the literature: Chen et al.
(2020), @@ -13,32 +13,32 @@ NULL #' It provides propensity score weighting (e.g. with calibration constraints), mass imputation (e.g. nearest neighbour) and #' doubly robust estimators that take into account minimisation of the asymptotic bias of the population mean estimators or #' variable selection. -#' The package uses `survey` package functionality when a probability sample is available. +#' The package uses the `survey` package functionality when a probability sample is available. #' #' -#' @param data `data.frame` with data from the non-probability sample. -#' @param selection `formula`, the selection (propensity) equation. -#' @param outcome `formula`, the outcome equation. -#' @param target `formula` with target variables. -#' @param svydesign an optional `svydesign` object (from the survey package) containing probability sample and design weights. +#' @param data a `data.frame` with data from the non-probability sample. +#' @param selection a `formula`, the selection (propensity) equation. +#' @param outcome a `formula`, the outcome equation. +#' @param target a `formula` with target variables. +#' @param svydesign an optional `svydesign` object (from the survey package) containing a probability sample and design weights. #' @param pop_totals an optional `named vector` with population totals of the covariates. #' @param pop_means an optional `named vector` with population means of the covariates. -#' @param pop_size an optional `double` with population size. -#' @param method_selection a `character` with method for propensity scores estimation. -#' @param method_outcome a `character` with method for response variable estimation. -#' @param family_outcome a `character` string describing the error distribution and link function to be used in the model. Default is "gaussian". Currently supports: gaussian with identity link, poisson and binomial. -#' @param subset an optional `vector` specifying a subset of observations to be used in the fitting process. -#' @param strata an optional `vector` specifying strata. +#' @param pop_size an optional `double` value with population size. +#' @param method_selection a `character` indicating the method for propensity scores estimation. +#' @param method_outcome a `character` indicating the method for response variable estimation. +#' @param family_outcome a `character` string describing the error distribution and the link function to be used in the model, set to `gaussian` by default. Currently supports: gaussian with identity link, poisson and binomial. +#' @param subset an optional `vector` specifying a subset of observations to be used in the fitting process - not yet supported. +#' @param strata an optional `vector` specifying strata - not yet supported. #' @param weights an optional `vector` of prior weights to be used in the fitting process. Should be NULL or a numeric vector. It is assumed that this vector contains frequency or analytic weights. -#' @param na_action a function which indicates what should happen when the data contain `NAs`. -#' @param control_selection a `list` indicating parameters to use in fitting selection model for propensity scores. -#' @param control_outcome a `list` indicating parameters to use in fitting model for outcome variable. -#' @param control_inference a `list` indicating parameters to use in inference based on probability and non-probability samples, contains parameters such as estimation method or variance method. 
+#' @param na_action a function which indicates what should happen when the data contain `NAs` - not yet supported. +#' @param control_selection a `list` indicating parameters to be used when fitting the selection model for propensity scores. +#' @param control_outcome a `list` indicating parameters to be used when fitting the model for the outcome variable. +#' @param control_inference a `list` indicating parameters to be used for inference based on probability and non-probability samples; it contains parameters such as the estimation method or the variance method. #' @param start_selection an optional `vector` with starting values for the parameters of the selection equation. #' @param start_outcome an optional `vector` with starting values for the parameters of the outcome equation. #' @param verbose verbose, numeric. -#' @param x Logical value indicating whether to return model matrix of covariates as a part of output. -#' @param y Logical value indicating whether to return vector of outcome variable as a part of output. +#' @param x a logical value indicating whether to return the model matrix of covariates as a part of the output. +#' @param y a logical value indicating whether to return the vector of the outcome variable as a part of the output. #' @param se Logical value indicating whether to calculate and return standard error of estimated mean. #' @param ... Additional, optional arguments. #' @@ -104,7 +104,7 @@ NULL #' Using the imputed values for the probability sample and the (known) design weights, #' we can build a population mean estimator of the form: #' \mjsdeqn{\hat{\mu}_{MI} = \frac{1}{N^B}\sum_{i \in S_{B}} d_{i}^{B} \hat{y}_i.} -#' It opens the the door to a very flexible method for imputation models. The package uses generalized linear models from [stats::glm()], +#' It opens the door to a very flexible method for imputation models. The package uses generalized linear models from [stats::glm()], #' the nearest neighbour algorithm using [RANN::nn2()] and predictive mean matching. #' #' 3. Doubly robust estimation -- The IPW and MI estimators are sensitive to misspecified models for the propensity score and outcome variable, respectively. @@ -153,7 +153,7 @@ NULL #' } #' where \mjseqn{\lambda_{\theta}} and \mjseqn{q_{\lambda_{\beta}}} are some smooth functions. We let \mjseqn{q_{\lambda} \left(x\right) = \frac{\partial p_{\lambda}}{\partial x}}, where \mjseqn{p_{\lambda}} is some penalization function. #' Details of penalization functions and techniques for solving this type of equation can be found [here](https://ncn-foreigners.github.io/nonprobsvy-book/variableselection.html). -#' To use the variable selection model, set the `vars_selection` parameter in the [controlInf()] function to `TRUE`. In addition, in the other control functions such as [controlSel()] and [controlOut()] +#' To use the variable selection model, set the `vars_selection` parameter in the [control_inf()] function to `TRUE`. In addition, in the other control functions such as [control_sel()] and [control_out()] #' you can set parameters for the selection of the relevant variables, such as the number of folds during cross-validation algorithm or the lambda value for penalizations. Details can be found #' in the documentation of the control functions for `nonprob`.
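As a concrete illustration, here is a minimal sketch of enabling the variable selection model; the call pattern mirrors the package's own tests, while `sample_nonprob`, `pop_totals_vec` and the formulas are hypothetical placeholders:

# Sketch: doubly robust estimation with penalized variable selection.
fit_dr <- nonprob(
  selection = ~ x1 + x2,       # hypothetical selection equation
  outcome = y1 ~ x1 + x2,      # hypothetical outcome equation
  data = sample_nonprob,       # hypothetical non-probability sample
  pop_totals = pop_totals_vec, # hypothetical named vector of population totals
  method_selection = "logit",
  family_outcome = "gaussian",
  control_selection = control_sel(penalty = "SCAD", nfolds = 5),
  control_outcome = control_out(penalty = "SCAD", nfolds = 5),
  control_inference = control_inf(vars_selection = TRUE)
)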
#' @@ -194,7 +194,7 @@ NULL #' \item{\code{pop_size} -- estimated population size derived from estimated weights (non-probability sample) or known design weights (probability sample).} #' \item{\code{pop_totals} -- the total values of the auxiliary variables derived from a probability sample or vector of total/mean values.} #' \item{\code{outcome} -- list containing information about the fitting of the mass imputation model, in the case of regression model the object containing the list returned by -#' [stats::glm()], in the case of the nearest neighbour imputation the object containing list returned by [RANN::nn2()]. If `bias_correction` in [controlInf()] is set to `TRUE`, the estimation is based on +#' [stats::glm()], in the case of the nearest neighbour imputation the object containing the list returned by [RANN::nn2()]. If `bias_correction` in [control_inf()] is set to `TRUE`, the estimation is based on #' the joint estimating equations for the `selection` and `outcome` model and therefore, the list is different from the one returned by the [stats::glm()] function and contains elements such as #' \itemize{ #' \item{\code{coefficients} -- estimated coefficients of the regression model.} @@ -244,7 +244,7 @@ NULL #' } #' \item{\code{stat} -- matrix of the estimated population means in each bootstrap iteration. #' Returned only if a bootstrap method is used to estimate the variance and \code{keep_boot} in -#' [controlInf()] is set on `TRUE`.} +#' [control_inf()] is set to `TRUE`.} #' } #' @seealso #' [stats::optim()] -- For more information on the \code{optim} function used in the #' @@ -263,11 +263,11 @@ NULL #' #' [RANN::nn2()] -- For more information about the nearest neighbour algorithm used during mass imputation process. #' -#' [controlSel()] -- For the control parameters related to selection model. +#' [control_sel()] -- For the control parameters related to selection model. #' -#' [controlOut()] -- For the control parameters related to outcome model. +#' [control_out()] -- For the control parameters related to outcome model. #' -#' [controlInf()] -- For the control parameters related to statistical inference. +#' [control_inf()] -- For the control parameters related to statistical inference. #' @examples #' \donttest{ diff --git a/R/nonprob.R b/R/nonprob.R index b0e4c01..60309d6 100644 --- a/R/nonprob.R +++ b/R/nonprob.R @@ -15,9 +15,9 @@ nonprob <- function(data, strata = NULL, weights = NULL, na_action = NULL, - control_selection = controlSel(), - control_outcome = controlOut(), - control_inference = controlInf(), + control_selection = control_sel(), + control_outcome = control_out(), + control_inference = control_inf(), start_selection = NULL, start_outcome = NULL, verbose = FALSE, @@ -38,7 +38,8 @@ nonprob <- function(data, if (missing(method_outcome)) method_outcome <- "glm" if (!(method_outcome %in% c("glm", "nn", "pmm"))) stop("Invalid method for outcome variable.") if (!is.null(svydesign)) { - if (class(svydesign)[2] != "survey.design") stop("svydesign must be a survey.design object.") + if ("svyrep.design" %in% class(svydesign)) stop("We do not currently support the `svyrep.design` class. Provide the survey data in the `survey.design2` class.") + if ("pps" %in% class(svydesign)) stop("The `as.svrepdesign` function does not allow `pps` designs.
For more details, see the `survey` package.") } if (!is.null(pop_totals)) { if (!is.vector(pop_totals)) stop("pop_totals must be a vector.") @@ -50,7 +51,7 @@ nonprob <- function(data, if (!(family_outcome %in% c("gaussian", "binomial", "poisson"))) stop("Invalid family for outcome formula.") if (!is.null(control_selection$key)) { if (!(control_selection$key %in% colnames(data)) || !(control_selection$key %in% colnames(svydesign$variables))) { - stop("key variable for overlapping units must be defined with this same name in prob and nonprob sample.") + stop("Key variable for overlapping units must be defined with the same name in the prob and nonprob samples.") } } diff --git a/R/nonprobDR.R b/R/nonprobDR.R index ce4bb13..c6f198f 100644 --- a/R/nonprobDR.R +++ b/R/nonprobDR.R @@ -27,9 +27,9 @@ nonprobDR <- function(selection, strata, weights, na_action, - control_selection = controlSel(), - control_outcome = controlOut(), - control_inference = controlInf(), + control_selection, + control_outcome, + control_inference, start_outcome, start_selection, verbose, diff --git a/R/nonprobIPW.R b/R/nonprobIPW.R index 511586d..ff0f14e 100644 --- a/R/nonprobIPW.R +++ b/R/nonprobIPW.R @@ -23,8 +23,8 @@ nonprobIPW <- function(selection, strata, weights, na_action, - control_selection = controlSel(), - control_inference = controlInf(), + control_selection, + control_inference, start_selection, verbose, x, diff --git a/R/nonprobMI.R b/R/nonprobMI.R index cc5e0f8..45c83cc 100644 --- a/R/nonprobMI.R +++ b/R/nonprobMI.R @@ -26,7 +26,7 @@ nonprobMI <- function(outcome, weights, na_action, control_outcome, - control_inference = controlInf(var_method = "analytic"), + control_inference, start_outcome, verbose, x, diff --git a/R/probitModel.R b/R/probitModel.R index b00344b..eae8435 100644 --- a/R/probitModel.R +++ b/R/probitModel.R @@ -15,6 +15,7 @@ #' @importFrom stats dnorm #' @importFrom Matrix Matrix #' @importFrom survey svyrecvar +#' @keywords internal #' @export # must be exported to be visible in c++ script, to consider any other option probit_model_nonprobsvy <- function(...) { @@ -115,7 +116,7 @@ probit_model_nonprobsvy <- function(...) { if (maxLik_an$code %in% c(3:7, 100)) { switch(as.character(maxLik_an$code), - "3" = warning("warning in fitting selection model with maxLik: probably not converged."), + "3" = warning("Warning in fitting selection model with maxLik: probably not converged."), "4" = warning("Maxiteration limit reached in fitting selection model by maxLik."), "5" = stop("Inifinite value of log_like in fitting selection model by maxLik, error code 5"), "6" = stop("Inifinite value of gradient in fitting selection model by maxLik, error code 6"), @@ -143,10 +144,10 @@ probit_model_nonprobsvy <- function(...)
{ ) if (maxLik_an$convergence %in% c(1, 10, 51, 52)) { switch(as.character(maxLik_an$convergence), - "1" = warning("warning in fitting selection model with optim: the iteration limit maxit had been reached."), - "10" = warning("degeneracy of the Nelder Mead simplex in fitting selection model by optim."), # TODO - - "51" = warning("warning from the L BFGS B when fitting by optim."), # TODO - - "52" = stop("indicates an error from the L-BFGS-B method when fitting by optim.") + "1" = warning("Warning in fitting selection model with optim: the iteration limit maxit had been reached."), + "10" = warning("Degeneracy of the Nelder Mead simplex in fitting selection model by optim."), # TODO - + "51" = warning("Warning from the L-BFGS-B method when fitting by optim."), # TODO - + "52" = stop("Indicates an error from the L-BFGS-B method when fitting by optim.") ) } diff --git a/R/simple_methods.R b/R/simple_methods.R index 11bcfa2..1dc902f 100644 --- a/R/simple_methods.R +++ b/R/simple_methods.R @@ -7,9 +7,9 @@ nobs.nonprobsvy <- function(object, ...) { c("prob" = object$prob_size, "nonprob" = object$nonprob_size) } -#' @method pop.size nonprobsvy +#' @method pop_size nonprobsvy #' @exportS3Method -pop.size.nonprobsvy <- function(object, +pop_size.nonprobsvy <- function(object, ...) { object$pop_size } @@ -19,8 +19,8 @@ pop.size.nonprobsvy <- function(object, #' @param ... additional parameters #' @return Vector returning the value of the estimated population size. #' @export -pop.size <- function(object, ...) { - UseMethod("pop.size") +pop_size <- function(object, ...) { + UseMethod("pop_size") } #' @method residuals nonprobsvy #' @importFrom stats residuals @@ -321,9 +321,9 @@ deviance.nonprobsvy <- function(object, if (class(object)[2] == "nonprobsvy_dr") res <- c("selection" = res_sel, "outcome" = res_out) res } -#' @method nonprobsvycheck nonprobsvy +#' @method check_balance nonprobsvy #' @exportS3Method -nonprobsvycheck.nonprobsvy <- function(x, object, dig = 10) { +check_balance.nonprobsvy <- function(x, object, dig = 10) { # Input validation if (!inherits(x, "formula")) { stop("'x' must be a formula") } @@ -443,8 +443,8 @@ nonprobsvycheck.nonprobsvy <- function(x, object, dig = 10) { #' @importFrom survey svytotal #' @importFrom stats setNames #' @export -nonprobsvycheck <- function(x, object, dig) { - UseMethod("nonprobsvycheck", object) +check_balance <- function(x, object, dig) { + UseMethod("check_balance", object) } # Internal function - not exported in CRAN version # Will be exported in future releases after variance estimation is implemented diff --git a/R/summary.R b/R/summary.R index 3406c6d..836eea2 100644 --- a/R/summary.R +++ b/R/summary.R @@ -111,7 +111,7 @@ summary.nonprobsvy <- function(object, cnf_int = object$confidence_interval ), sample_size = nobs(object, ...), - population_size = pop.size(object, ...), + population_size = pop_size(object, ...), totals = object$pop_totals, test = test, control = object$control, diff --git a/R/theta_funcs.R b/R/theta_funcs.R index e5251ec..b07dab9 100644 --- a/R/theta_funcs.R +++ b/R/theta_funcs.R @@ -138,10 +138,10 @@ theta_h_estimation <- function(R, "2" = warning("Relatively convergent algorithm when fitting selection model by nleqslv, but user must check if function values are acceptably small."), "3" = warning("Algorithm did not find suitable point - has stalled cannot find an acceptable new point when fitting selection model by nleqslv."), "4" = warning("Iteration limit exceeded when fitting selection model by nleqslv."), - "5" = 
warning("ill-conditioned Jacobian when fitting selection model by nleqslv."), + "5" = warning("Ill-conditioned Jacobian when fitting selection model by nleqslv."), "6" = warning("Jacobian is singular when fitting selection model by nleqslv."), "7" = warning("Jacobian is unusable when fitting selection model by nleqslv."), - "-10" = warning("user specified Jacobian is incorrect when fitting selection model by nleqslv.") + "-10" = warning("User specified Jacobian is incorrect when fitting selection model by nleqslv.") ) } theta_h <- as.vector(theta_root) diff --git a/README.Rmd b/README.Rmd index 3cb942e..f1a939e 100644 --- a/README.Rmd +++ b/README.Rmd @@ -44,7 +44,7 @@ population or probability sample is available: - inverse probability weighting estimators with possible calibration constraints [@chen2020], -- mass imputation estimators based in nearest neighbours [@yang2021], +- mass imputation estimators based on nearest neighbours [@yang2021], predictive mean matching and regression imputation [@kim2021], - doubly robust estimators with bias minimization [@chen2020, @yang2020]. @@ -61,12 +61,12 @@ The package allows for: - different links for selection (`logit`, `probit` and `cloglog`) and outcome (`gaussian`, `binomial` and `poisson`) variables. -Details on use of the package be found: +Details on the use of the package can be found: -- on the draft (and not proofread) version the book [Modern inference +- in the draft (and not proofread) version of the book [Modern inference methods for non-probability samples with R](https://ncn-foreigners.github.io/nonprobsvy-book/), -- example codes that reproduce papers are available at github in the +- in example codes that reproduce papers available on github in the repository [software tutorials](https://github.com/ncn-foreigners/software-tutorials). diff --git a/README.md b/README.md index b3bab90..a29c5ee 100644 --- a/README.md +++ b/README.md @@ -28,7 +28,7 @@ population or probability sample is available: - inverse probability weighting estimators with possible calibration constraints ([Chen, Li, and Wu 2020](#ref-chen2020)), -- mass imputation estimators based in nearest neighbours ([Yang, Kim, +- mass imputation estimators based on nearest neighbours ([Yang, Kim, and Hwang 2021](#ref-yang2021)), predictive mean matching and regression imputation ([Kim et al. 2021](#ref-kim2021)), - doubly robust estimators with bias minimization Yang, Kim, and Song @@ -46,12 +46,12 @@ The package allows for: - different links for selection (`logit`, `probit` and `cloglog`) and outcome (`gaussian`, `binomial` and `poisson`) variables. -Details on use of the package be found: +Details on the use of the package can be found: -- on the draft (and not proofread) version the book [Modern inference +- in the draft (and not proofread) version of the book [Modern inference methods for non-probability samples with R](https://ncn-foreigners.github.io/nonprobsvy-book/), -- example codes that reproduce papers are available at github in the +- in example codes that reproduce papers available on github in the repository [software tutorials](https://github.com/ncn-foreigners/software-tutorials). 
@@ -579,13 +579,13 @@ summary(result_ipw) #> target = ~y1, svydesign = sample_prob) #> #> ------------------------- -#> Estimated population mean: 2.925 with overall std.err of: 0.05 +#> Estimated population mean: 2.925 with overall std.err of: 0.04999 #> And std.err for nonprobability and probability samples being respectively: -#> 0.001586 and 0.04997 +#> 0.001325 and 0.04997 #> #> 95% Confidence inverval for popualtion mean: #> lower_bound upper_bound -#> y1 2.82679 3.022776 +#> y1 2.826805 3.022761 #> #> #> Based on: Inverse probability weighted method diff --git a/inst/tinytest/test-2-ipw-totals.R b/inst/tinytest/test-2-ipw-totals.R index cfd6ac1..9b76125 100644 --- a/inst/tinytest/test-2-ipw-totals.R +++ b/inst/tinytest/test-2-ipw-totals.R @@ -208,8 +208,8 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") data = sample_B1, pop_totals = X_totals[1:11], method_selection = "logit", - control_inference = controlInf(vars_selection = TRUE), - control_selection = controlSel(penalty = "SCAD", nfolds = 5)) + control_inference = control_inf(vars_selection = TRUE), + control_selection = control_sel(penalty = "SCAD", nfolds = 5)) ) expect_equal(y11_corr_scad$output$mean, 3.063926, tolerance = 0.0001) ## true value for this sim @@ -225,8 +225,8 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") data = sample_B1, pop_totals = X_totals[1:11], method_selection = "logit", - control_inference = controlInf(vars_selection = TRUE), - control_selection = controlSel(penalty = "SCAD", nfolds = 5)) + control_inference = control_inf(vars_selection = TRUE), + control_selection = control_sel(penalty = "SCAD", nfolds = 5)) ) expect_equal(y12_corr_scad$output$mean, 6.9530644, tolerance = 0.0001) ## true value for this sim @@ -242,8 +242,8 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") data = sample_B1, pop_totals = X_totals[1:11], method_selection = "logit", - control_inference = controlInf(vars_selection = TRUE), - control_selection = controlSel(penalty = "SCAD", nfolds = 5)) + control_inference = control_inf(vars_selection = TRUE), + control_selection = control_sel(penalty = "SCAD", nfolds = 5)) ) expect_equal(y21_corr_scad$output$mean, 0.78264707, tolerance = 0.0001) ## true value for this sim @@ -259,8 +259,8 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") data = sample_B1, pop_totals = X_totals[1:11], method_selection = "logit", - control_inference = controlInf(vars_selection = TRUE), - control_selection = controlSel(penalty = "SCAD", nfolds = 5)) + control_inference = control_inf(vars_selection = TRUE), + control_selection = control_sel(penalty = "SCAD", nfolds = 5)) ) expect_equal(y22_corr_scad$output$mean, 0.57680653, tolerance = 0.0001) ## true value for this sim @@ -278,8 +278,8 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") # data = sample_B1, # pop_totals = X_totals[1:11], # method_selection = "logit", - # control_inference = controlInf(vars_selection = TRUE), - # control_selection = controlSel(penalty = "lasso")) + # control_inference = control_inf(vars_selection = TRUE), + # control_selection = control_sel(penalty = "lasso")) # ) # # expect_equal(y11_corr_lasso$output$mean, 3.063926, tolerance = 0.0001) ## true value for this sim @@ -296,8 +296,8 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") # data = sample_B1, # pop_totals = X_totals[1:11], # method_selection = "logit", - # 
control_inference = controlInf(vars_selection = TRUE), - # control_selection = controlSel(penalty = "MCP")) + # control_inference = control_inf(vars_selection = TRUE), + # control_selection = control_sel(penalty = "MCP")) # ) # # expect_equal(y11_corr_lasso$output$mean, 3.063926, tolerance = 0.0001) ## true value for this sim @@ -314,8 +314,8 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") # data = sample_B1, # pop_totals = X_totals[1:11], # method_selection = "logit", - # control_inference = controlInf(vars_selection = TRUE), - # control_selection = controlSel(penalty = "SCAD", nfolds = 5), + # control_inference = control_inf(vars_selection = TRUE), + # control_selection = control_sel(penalty = "SCAD", nfolds = 5), # verbose = T) # ) @@ -517,8 +517,8 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") data = sample_B2, pop_totals = X_totals[1:11], method_selection = "logit", - control_inference = controlInf(vars_selection = TRUE), - control_selection = controlSel(penalty = "SCAD", nfolds = 5)) + control_inference = control_inf(vars_selection = TRUE), + control_selection = control_sel(penalty = "SCAD", nfolds = 5)) ) expect_equal(y11_corr_scad$output$mean, 1.8810431, tolerance = 0.0001) ## true value for this sim @@ -534,8 +534,8 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") data = sample_B2, pop_totals = X_totals[1:11], method_selection = "logit", - control_inference = controlInf(vars_selection = TRUE), - control_selection = controlSel(penalty = "SCAD", nfolds = 5)) + control_inference = control_inf(vars_selection = TRUE), + control_selection = control_sel(penalty = "SCAD", nfolds = 5)) ) expect_equal(y12_corr_scad$output$mean, 5.796713, tolerance = 0.0001) ## true value for this sim @@ -551,8 +551,8 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") data = sample_B2, pop_totals = X_totals[1:11], method_selection = "logit", - control_inference = controlInf(vars_selection = TRUE), - control_selection = controlSel(penalty = "SCAD", nfolds = 5)) + control_inference = control_inf(vars_selection = TRUE), + control_selection = control_sel(penalty = "SCAD", nfolds = 5)) ) expect_equal(y21_corr_scad$output$mean, 0.6060074, tolerance = 0.0001) ## true value for this sim @@ -568,8 +568,8 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") data = sample_B2, pop_totals = X_totals[1:11], method_selection = "logit", - control_inference = controlInf(vars_selection = TRUE), - control_selection = controlSel(penalty = "SCAD", nfolds = 5)) + control_inference = control_inf(vars_selection = TRUE), + control_selection = control_sel(penalty = "SCAD", nfolds = 5)) ) expect_equal(y22_corr_scad$output$mean, 0.64707641, tolerance = 0.0001) ## true value for this sim @@ -587,8 +587,8 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") # data = sample_B1, # pop_totals = X_totals[1:11], # method_selection = "logit", - # control_inference = controlInf(vars_selection = TRUE), - # control_selection = controlSel(penalty = "lasso")) + # control_inference = control_inf(vars_selection = TRUE), + # control_selection = control_sel(penalty = "lasso")) # ) # # expect_equal(y11_corr_lasso$output$mean, 3.063926, tolerance = 0.0001) ## true value for this sim @@ -605,8 +605,8 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") # data = sample_B1, # pop_totals = X_totals[1:11], # method_selection = 
"logit", - # control_inference = controlInf(vars_selection = TRUE), - # control_selection = controlSel(penalty = "MCP")) + # control_inference = control_inf(vars_selection = TRUE), + # control_selection = control_sel(penalty = "MCP")) # ) # # expect_equal(y11_corr_lasso$output$mean, 3.063926, tolerance = 0.0001) ## true value for this sim @@ -623,8 +623,8 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") # data = sample_B1, # pop_totals = X_totals[1:11], # method_selection = "logit", - # control_inference = controlInf(vars_selection = TRUE), - # control_selection = controlSel(penalty = "SCAD", nfolds = 5), + # control_inference = control_inf(vars_selection = TRUE), + # control_selection = control_sel(penalty = "SCAD", nfolds = 5), # verbose = T) # ) @@ -825,8 +825,8 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") data = sample_B1, pop_totals = X_totals[1:11], method_selection = "probit", - control_inference = controlInf(vars_selection = TRUE), - control_selection = controlSel(penalty = "SCAD", nfolds = 5)) + control_inference = control_inf(vars_selection = TRUE), + control_selection = control_sel(penalty = "SCAD", nfolds = 5)) ) expect_equal(y11_corr_scad$output$mean, 3.0633399, tolerance = 0.0001) ## true value for this sim @@ -842,8 +842,8 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") data = sample_B1, pop_totals = X_totals[1:11], method_selection = "probit", - control_inference = controlInf(vars_selection = TRUE), - control_selection = controlSel(penalty = "SCAD", nfolds = 5)) + control_inference = control_inf(vars_selection = TRUE), + control_selection = control_sel(penalty = "SCAD", nfolds = 5)) ) expect_equal(y12_corr_scad$output$mean, 6.9420676, tolerance = 0.0001) ## true value for this sim @@ -859,8 +859,8 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") data = sample_B1, pop_totals = X_totals[1:11], method_selection = "probit", - control_inference = controlInf(vars_selection = TRUE), - control_selection = controlSel(penalty = "SCAD", nfolds = 5)) + control_inference = control_inf(vars_selection = TRUE), + control_selection = control_sel(penalty = "SCAD", nfolds = 5)) ) expect_equal(y21_corr_scad$output$mean, 0.78324543, tolerance = 0.0001) ## true value for this sim @@ -876,8 +876,8 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") data = sample_B1, pop_totals = X_totals[1:11], method_selection = "probit", - control_inference = controlInf(vars_selection = TRUE), - control_selection = controlSel(penalty = "SCAD", nfolds = 5)) + control_inference = control_inf(vars_selection = TRUE), + control_selection = control_sel(penalty = "SCAD", nfolds = 5)) ) expect_equal(y22_corr_scad$output$mean, 0.57672297, tolerance = 0.0001) ## true value for this sim @@ -895,8 +895,8 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") # data = sample_B1, # pop_totals = X_totals[1:11], # method_selection = "probit", - # control_inference = controlInf(vars_selection = TRUE), - # control_selection = controlSel(penalty = "lasso")) + # control_inference = control_inf(vars_selection = TRUE), + # control_selection = control_sel(penalty = "lasso")) # ) # # expect_equal(y11_corr_lasso$output$mean, 3.063926, tolerance = 0.0001) ## true value for this sim @@ -913,8 +913,8 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") # data = sample_B1, # pop_totals = X_totals[1:11], # 
method_selection = "probit", - # control_inference = controlInf(vars_selection = TRUE), - # control_selection = controlSel(penalty = "MCP")) + # control_inference = control_inf(vars_selection = TRUE), + # control_selection = control_sel(penalty = "MCP")) # ) # # expect_equal(y11_corr_lasso$output$mean, 3.063926, tolerance = 0.0001) ## true value for this sim @@ -931,8 +931,8 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") # data = sample_B1, # pop_totals = X_totals[1:11], # method_selection = "probit", - # control_inference = controlInf(vars_selection = TRUE), - # control_selection = controlSel(penalty = "SCAD", nfolds = 5), + # control_inference = control_inf(vars_selection = TRUE), + # control_selection = control_sel(penalty = "SCAD", nfolds = 5), # verbose = T) # ) @@ -1133,8 +1133,8 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") data = sample_B2, pop_totals = X_totals[1:11], method_selection = "probit", - control_inference = controlInf(vars_selection = TRUE), - control_selection = controlSel(penalty = "SCAD", nfolds = 5)) + control_inference = control_inf(vars_selection = TRUE), + control_selection = control_sel(penalty = "SCAD", nfolds = 5)) ) expect_equal(y11_corr_scad$output$mean, 1.8810431, tolerance = 0.0001) ## true value for this sim @@ -1150,8 +1150,8 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") data = sample_B2, pop_totals = X_totals[1:11], method_selection = "probit", - control_inference = controlInf(vars_selection = TRUE), - control_selection = controlSel(penalty = "SCAD", nfolds = 5)) + control_inference = control_inf(vars_selection = TRUE), + control_selection = control_sel(penalty = "SCAD", nfolds = 5)) ) expect_equal(y12_corr_scad$output$mean, 5.7967136, tolerance = 0.0001) ## true value for this sim @@ -1167,8 +1167,8 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") data = sample_B2, pop_totals = X_totals[1:11], method_selection = "probit", - control_inference = controlInf(vars_selection = TRUE), - control_selection = controlSel(penalty = "SCAD", nfolds = 5)) + control_inference = control_inf(vars_selection = TRUE), + control_selection = control_sel(penalty = "SCAD", nfolds = 5)) ) expect_equal(y21_corr_scad$output$mean, 0.60600756, tolerance = 0.0001) ## true value for this sim @@ -1184,8 +1184,8 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") data = sample_B2, pop_totals = X_totals[1:11], method_selection = "probit", - control_inference = controlInf(vars_selection = TRUE), - control_selection = controlSel(penalty = "SCAD", nfolds = 5)) + control_inference = control_inf(vars_selection = TRUE), + control_selection = control_sel(penalty = "SCAD", nfolds = 5)) ) expect_equal(y22_corr_scad$output$mean, 0.64707626, tolerance = 0.0001) ## true value for this sim @@ -1203,8 +1203,8 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") # data = sample_B1, # pop_totals = X_totals[1:11], # method_selection = "probit", - # control_inference = controlInf(vars_selection = TRUE), - # control_selection = controlSel(penalty = "lasso")) + # control_inference = control_inf(vars_selection = TRUE), + # control_selection = control_sel(penalty = "lasso")) # ) # # expect_equal(y11_corr_lasso$output$mean, 3.063926, tolerance = 0.0001) ## true value for this sim @@ -1221,8 +1221,8 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") # data = sample_B1, # 
pop_totals = X_totals[1:11], # method_selection = "probit", - # control_inference = controlInf(vars_selection = TRUE), - # control_selection = controlSel(penalty = "MCP")) + # control_inference = control_inf(vars_selection = TRUE), + # control_selection = control_sel(penalty = "MCP")) # ) # # expect_equal(y11_corr_lasso$output$mean, 3.063926, tolerance = 0.0001) ## true value for this sim @@ -1239,8 +1239,8 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") # data = sample_B1, # pop_totals = X_totals[1:11], # method_selection = "probit", - # control_inference = controlInf(vars_selection = TRUE), - # control_selection = controlSel(penalty = "SCAD", nfolds = 5), + # control_inference = control_inf(vars_selection = TRUE), + # control_selection = control_sel(penalty = "SCAD", nfolds = 5), # verbose = T) # ) @@ -1439,8 +1439,8 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") data = sample_B1, pop_totals = X_totals[1:11], method_selection = "cloglog", - control_inference = controlInf(vars_selection = TRUE), - control_selection = controlSel(penalty = "SCAD", nfolds = 5)) + control_inference = control_inf(vars_selection = TRUE), + control_selection = control_sel(penalty = "SCAD", nfolds = 5)) ) # expect_equal(y11_corr_scad$output$mean, 3.063926, tolerance = 0.0001) ## true value for this sim @@ -1456,8 +1456,8 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") data = sample_B1, pop_totals = X_totals[1:11], method_selection = "cloglog", - control_inference = controlInf(vars_selection = TRUE), - control_selection = controlSel(penalty = "SCAD", nfolds = 5)) + control_inference = control_inf(vars_selection = TRUE), + control_selection = control_sel(penalty = "SCAD", nfolds = 5)) ) # expect_equal(y12_corr_scad$output$mean, 6.9530644, tolerance = 0.0001) ## true value for this sim @@ -1473,8 +1473,8 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") data = sample_B1, pop_totals = X_totals[1:11], method_selection = "cloglog", - control_inference = controlInf(vars_selection = TRUE), - control_selection = controlSel(penalty = "SCAD", nfolds = 5)) + control_inference = control_inf(vars_selection = TRUE), + control_selection = control_sel(penalty = "SCAD", nfolds = 5)) ) # expect_equal(y21_corr_scad$output$mean, 0.78264707, tolerance = 0.0001) ## true value for this sim @@ -1490,8 +1490,8 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") data = sample_B1, pop_totals = X_totals[1:11], method_selection = "cloglog", - control_inference = controlInf(vars_selection = TRUE), - control_selection = controlSel(penalty = "SCAD", nfolds = 5)) + control_inference = control_inf(vars_selection = TRUE), + control_selection = control_sel(penalty = "SCAD", nfolds = 5)) ) # expect_equal(y22_corr_scad$output$mean, 0.57680653, tolerance = 0.0001) ## true value for this sim @@ -1694,8 +1694,8 @@ expect_silent( data = sample_B2, pop_totals = X_totals[1:11], method_selection = "cloglog", - control_inference = controlInf(vars_selection = TRUE), - control_selection = controlSel(penalty = "SCAD", nfolds = 5)) + control_inference = control_inf(vars_selection = TRUE), + control_selection = control_sel(penalty = "SCAD", nfolds = 5)) ) # # expect_equal(y11_corr_scad$output$mean, 1.992688, tolerance = 0.0001) ## true value for this sim @@ -1711,8 +1711,8 @@ expect_silent( data = sample_B2, pop_totals = X_totals[1:11], method_selection = "cloglog", - control_inference = 
controlInf(vars_selection = TRUE), - control_selection = controlSel(penalty = "SCAD", nfolds = 5)) + control_inference = control_inf(vars_selection = TRUE), + control_selection = control_sel(penalty = "SCAD", nfolds = 5)) ) # # expect_equal(y12_corr_scad$output$mean, 5.712705, tolerance = 0.0001) ## true value for this sim @@ -1728,8 +1728,8 @@ expect_silent( data = sample_B2, pop_totals = X_totals[1:11], method_selection = "cloglog", - control_inference = controlInf(vars_selection = TRUE), - control_selection = controlSel(penalty = "SCAD", nfolds = 5)) + control_inference = control_inf(vars_selection = TRUE), + control_selection = control_sel(penalty = "SCAD", nfolds = 5)) ) # # expect_equal(y21_corr_scad$output$mean, 0.5955036, tolerance = 0.0001) ## true value for this sim @@ -1745,8 +1745,8 @@ expect_silent( data = sample_B2, pop_totals = X_totals[1:11], method_selection = "cloglog", - control_inference = controlInf(vars_selection = TRUE), - control_selection = controlSel(penalty = "SCAD", nfolds = 5)) + control_inference = control_inf(vars_selection = TRUE), + control_selection = control_sel(penalty = "SCAD", nfolds = 5)) ) # # expect_equal(y22_corr_scad$output$mean, 0.621987, tolerance = 0.0001) ## true value for this sim diff --git a/inst/tinytest/test-3-mi-totals.R b/inst/tinytest/test-3-mi-totals.R index 54a6367..ff7f395 100644 --- a/inst/tinytest/test-3-mi-totals.R +++ b/inst/tinytest/test-3-mi-totals.R @@ -170,8 +170,8 @@ expect_silent( pop_totals = X_totals[1:11], method_outcome = "glm", family_outcome = "gaussian", - control_outcome = controlOut(penalty = "SCAD", nfolds = 5), - control_inference = controlInf(vars_selection = TRUE) + control_outcome = control_out(penalty = "SCAD", nfolds = 5), + control_inference = control_inf(vars_selection = TRUE) ) ) @@ -189,8 +189,8 @@ expect_silent( pop_totals = X_totals[1:11], method_outcome = "glm", family_outcome = "gaussian", - control_outcome = controlOut(penalty = "SCAD", nfolds = 5), - control_inference = controlInf(vars_selection = TRUE) + control_outcome = control_out(penalty = "SCAD", nfolds = 5), + control_inference = control_inf(vars_selection = TRUE) ) ) @@ -209,8 +209,8 @@ expect_silent( pop_totals = X_totals[1:11], method_outcome = "glm", family_outcome = "binomial", - control_outcome = controlOut(penalty = "SCAD", nfolds = 5), - control_inference = controlInf(vars_selection = TRUE) + control_outcome = control_out(penalty = "SCAD", nfolds = 5), + control_inference = control_inf(vars_selection = TRUE) ) ) @@ -229,8 +229,8 @@ expect_silent( pop_totals = X_totals[1:11], method_outcome = "glm", family_outcome = "binomial", - control_outcome = controlOut(penalty = "SCAD", nfolds = 5), - control_inference = controlInf(vars_selection = TRUE) + control_outcome = control_out(penalty = "SCAD", nfolds = 5), + control_inference = control_inf(vars_selection = TRUE) ) ) diff --git a/inst/tinytest/test-4-dr-totals.R b/inst/tinytest/test-4-dr-totals.R index 61e57f7..68f4892 100644 --- a/inst/tinytest/test-4-dr-totals.R +++ b/inst/tinytest/test-4-dr-totals.R @@ -185,9 +185,9 @@ expect_silent( method_selection = "logit", method_outcome = "glm", family_outcome = "gaussian", - control_selection = controlSel(penalty = "SCAD", nfolds = 5), - control_outcome = controlOut(penalty = "SCAD", nfolds = 5), - control_inference = controlInf(vars_selection = TRUE) + control_selection = control_sel(penalty = "SCAD", nfolds = 5), + control_outcome = control_out(penalty = "SCAD", nfolds = 5), + control_inference = control_inf(vars_selection = TRUE) ) ) @@ 
-208,9 +208,9 @@ expect_silent( method_selection = "logit", method_outcome = "glm", family_outcome = "gaussian", - control_selection = controlSel(penalty = "SCAD", nfolds = 5), - control_outcome = controlOut(penalty = "SCAD", nfolds = 5), - control_inference = controlInf(vars_selection = TRUE) + control_selection = control_sel(penalty = "SCAD", nfolds = 5), + control_outcome = control_out(penalty = "SCAD", nfolds = 5), + control_inference = control_inf(vars_selection = TRUE) ) ) @@ -231,9 +231,9 @@ expect_silent( method_selection = "logit", method_outcome = "glm", family_outcome = "binomial", - control_selection = controlSel(penalty = "SCAD", nfolds = 5), - control_outcome = controlOut(penalty = "SCAD", nfolds = 5), - control_inference = controlInf(vars_selection = TRUE) + control_selection = control_sel(penalty = "SCAD", nfolds = 5), + control_outcome = control_out(penalty = "SCAD", nfolds = 5), + control_inference = control_inf(vars_selection = TRUE) ) ) @@ -255,9 +255,9 @@ expect_silent( method_selection = "logit", method_outcome = "glm", family_outcome = "binomial", - control_selection = controlSel(penalty = "SCAD", nfolds = 5), - control_outcome = controlOut(penalty = "SCAD", nfolds = 5), - control_inference = controlInf(vars_selection = TRUE) + control_selection = control_sel(penalty = "SCAD", nfolds = 5), + control_outcome = control_out(penalty = "SCAD", nfolds = 5), + control_inference = control_inf(vars_selection = TRUE) ) ) diff --git a/inst/tinytest/test_methods.R b/inst/tinytest/test_methods.R index 136fc3e..a326df4 100644 --- a/inst/tinytest/test_methods.R +++ b/inst/tinytest/test_methods.R @@ -1,216 +1,216 @@ # S3methods tests # test simulate # set.seed(123) -# source_nonprob_p <- read.csv("test1_nonprob.csv") -# sample_a <- read.csv("test1_prob.csv") -# svy_a <- svydesign(ids= ~1, weights = ~ w_a, data = sample_a) -# -# test1a <- nonprob(selection = ~ x, -# target = ~ y1, -# data = source_nonprob_p, -# method_selection = "logit", -# svydesign = svy_a) -# -# expect_silent( -# summary(test1a) -# ) -# -# expect_silent( -# nobs(test1a) -# ) -# -# expect_silent( -# pop.size(test1a) -# ) -# -# expect_silent( -# residuals(test1a) -# ) -# -# expect_silent( -# residuals(test1a, type = "pearson") -# ) -# -# expect_silent( -# residuals(test1a, type = "working") -# ) -# -# expect_silent( -# residuals(test1a, type = "deviance") -# ) -# -# expect_silent( -# residuals(test1a, "pearsonSTD") -# ) - -# expect_silent( -# cooks.distance(test1a) -# ) -# -# expect_silent( -# hatvalues(test1a) -# ) - -# expect_silent( -# logLik(test1a) -# ) -# -# expect_silent( -# AIC(test1a) -# ) -# -# expect_silent( -# BIC(test1a) -# ) -# -# expect_silent( -# confint(test1a) -# ) -# -# expect_silent( -# vcov(test1a) -# ) -# -# expect_silent( -# deviance(test1a) -# ) - -# test2a <- nonprob(outcome = y1 ~ x, -# data = source_nonprob_p, -# method_selection = "logit", -# svydesign = svy_a) -# expect_silent( -# summary(test2a) -# ) -# -# expect_silent( -# nobs(test2a) -# ) -# -# expect_silent( -# pop.size(test2a) -# ) -# -# expect_silent( -# residuals(test2a) -# ) -# -# expect_silent( -# residuals(test2a, type = "pearson") -# ) -# -# expect_silent( -# residuals(test2a, type = "working") -# ) - -# expect_silent( -# residuals(test2a, type = "deviance") -# ) -# -# expect_silent( -# residuals(test2a, "pearsonSTD") -# ) - -# if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true")) { -# expect_silent( -# cooks.distance(test2a) -# ) -# expect_silent( -# hatvalues(test2a) -# ) -# } - -# 
expect_silent( -# logLik(test2a) -# ) -# -# expect_silent( -# AIC(test2a) -# ) -# -# expect_silent( -# BIC(test2a) -# ) -# -# expect_silent( -# confint(test2a) -# ) -# -# expect_silent( -# vcov(test2a) -# ) -# -# expect_silent( -# deviance(test2a) -# ) -# -# test3a <- nonprob(outcome = y1 ~ x, -# selection = ~ x, -# data = source_nonprob_p, -# method_selection = "logit", -# svydesign = svy_a) -# expect_silent( -# summary(test3a) -# ) -# -# expect_silent( -# nobs(test3a) -# ) -# -# expect_silent( -# pop.size(test3a) -# ) -# -# expect_silent( -# residuals(test3a) -# ) -# -# expect_silent( -# residuals(test3a, type = "pearson") -# ) -# -# expect_silent( -# residuals(test3a, type = "working") -# ) -# -# expect_silent( -# residuals(test3a, type = "deviance") -# ) -# -# expect_silent( -# residuals(test3a, "pearsonSTD") -# ) - -# if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true")) { -# expect_silent( -# cooks.distance(test3a) -# ) -# -# expect_silent( -# hatvalues(test3a) -# ) -# } - -# expect_silent( -# logLik(test3a) -# ) -# -# expect_silent( -# AIC(test3a) -# ) -# -# expect_silent( -# BIC(test3a) -# ) -# -# expect_silent( -# confint(test3a) -# ) -# -# expect_silent( -# vcov(test3a) -# ) -# -# expect_silent( -# deviance(test3a) -# ) +source_nonprob_p <- read.csv("test1_nonprob.csv") +sample_a <- read.csv("test1_prob.csv") +svy_a <- svydesign(ids= ~1, weights = ~ w_a, data = sample_a) + +test1a <- nonprob(selection = ~ x, + target = ~ y1, + data = source_nonprob_p, + method_selection = "logit", + svydesign = svy_a) + +expect_silent( + summary(test1a) +) + +expect_silent( + nobs(test1a) +) + +expect_silent( + pop_size(test1a) +) + +expect_silent( + residuals(test1a) +) +# +expect_silent( + residuals(test1a, type = "pearson") +) + +expect_silent( + residuals(test1a, type = "working") +) + +expect_silent( + residuals(test1a, type = "deviance") +) + +expect_silent( + residuals(test1a, "pearsonSTD") +) + +expect_silent( + cooks.distance(test1a) +) + +expect_silent( + hatvalues(test1a) +) + +expect_silent( + logLik(test1a) +) + +expect_silent( + AIC(test1a) +) +# +expect_silent( + BIC(test1a) +) +# +expect_silent( + confint(test1a) +) +# +expect_silent( + vcov(test1a) +) +# +expect_silent( + deviance(test1a) +) + +test2a <- nonprob(outcome = y1 ~ x, + data = source_nonprob_p, + method_selection = "logit", + svydesign = svy_a) +expect_silent( + summary(test2a) +) +# +expect_silent( + nobs(test2a) +) + +expect_silent( + pop_size(test2a) +) +# +expect_silent( + residuals(test2a) +) + +expect_silent( + residuals(test2a, type = "pearson") +) +# +expect_silent( + residuals(test2a, type = "working") +) + +expect_silent( + residuals(test2a, type = "deviance") +) + +expect_silent( + residuals(test2a, "pearsonSTD") +) + +if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true")) { + expect_silent( + cooks.distance(test2a) + ) + expect_silent( + hatvalues(test2a) + ) +} + +expect_silent( + logLik(test2a) +) + +expect_silent( + AIC(test2a) +) +# +expect_silent( + BIC(test2a) +) +# +expect_silent( + confint(test2a) +) +# +expect_silent( + vcov(test2a) +) +# +expect_silent( + deviance(test2a) +) +# +test3a <- nonprob(outcome = y1 ~ x, + selection = ~ x, + data = source_nonprob_p, + method_selection = "logit", + svydesign = svy_a) +expect_silent( + summary(test3a) +) +# +expect_silent( + nobs(test3a) +) + +expect_silent( + pop_size(test3a) +) +# +expect_silent( + residuals(test3a) +) + +expect_silent( + residuals(test3a, type = "pearson") +) +# +expect_silent( + 
residuals(test3a, type = "working") +) + +expect_silent( + residuals(test3a, type = "deviance") +) + +expect_silent( + residuals(test3a, "pearsonSTD") +) + +if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true")) { + expect_silent( + cooks.distance(test3a) + ) + + expect_silent( + hatvalues(test3a) + ) +} + +expect_silent( + logLik(test3a) +) + +expect_silent( + AIC(test3a) +) + +expect_silent( + BIC(test3a) +) + +expect_silent( + confint(test3a) +) + +expect_silent( + vcov(test3a) +) + +expect_silent( + deviance(test3a) +) diff --git a/inst/tinytest/test_nonprobsvy.R b/inst/tinytest/test_nonprobsvy.R index 816a9e0..f606afe 100644 --- a/inst/tinytest/test_nonprobsvy.R +++ b/inst/tinytest/test_nonprobsvy.R @@ -260,7 +260,7 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") data = source_nonprob_p, method_selection = "logit", svydesign = svy_a, - control_inference = controlInf(var_method = "bootstrap", cores = 1)) + control_inference = control_inf(var_method = "bootstrap", cores = 1)) ) @@ -270,7 +270,7 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") data = source_nonprob_p, method_selection = "logit", svydesign = svy_a, - control_inference = controlInf(var_method = "bootstrap", cores = 1)) + control_inference = control_inf(var_method = "bootstrap", cores = 1)) ) @@ -278,7 +278,7 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") test3a_bootstrap <- nonprob(outcome = y1 ~ x, data = source_nonprob_p, svydesign = svy_a, - control_inference = controlInf(var_method = "bootstrap", cores = 1)) + control_inference = control_inf(var_method = "bootstrap", cores = 1)) ) expect_silent( @@ -288,7 +288,7 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") method_selection = "logit", method_outcome = "nn", svydesign = svy_a, - control_inference = controlInf(var_method = "bootstrap", cores = 1)) + control_inference = control_inf(var_method = "bootstrap", cores = 1)) ) @@ -297,7 +297,7 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") data = source_nonprob_p, method_outcome = "nn", svydesign = svy_a, - control_inference = controlInf(var_method = "bootstrap", cores = 1)) + control_inference = control_inf(var_method = "bootstrap", cores = 1)) ) expect_silent( @@ -307,7 +307,7 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") method_selection = "logit", method_outcome = "pmm", svydesign = svy_a, - control_inference = controlInf(var_method = "bootstrap", cores = 1)) + control_inference = control_inf(var_method = "bootstrap", cores = 1)) ) @@ -316,8 +316,8 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") data = source_nonprob_p, method_outcome = "pmm", svydesign = svy_a, - control_inference = controlInf(var_method = "bootstrap", cores = 1), - control_outcome = controlOut(predictive_match = 1)) + control_inference = control_inf(var_method = "bootstrap", cores = 1), + control_outcome = control_out(predictive_match = 1)) ) expect_silent( @@ -325,8 +325,8 @@ if (isTRUE(tolower(Sys.getenv("TEST_NONPROBSVY_MULTICORE_DEVELOPER")) == "true") data = source_nonprob_p, method_outcome = "pmm", svydesign = svy_a, - control_inference = controlInf(var_method = "bootstrap", cores = 1), - control_outcome = controlOut(predictive_match = 2)) + control_inference = control_inf(var_method = "bootstrap", cores = 1), + control_outcome = control_out(predictive_match = 2)) ) diff 
--git a/inst/tinytest/test_nonprobsvy_main.R b/inst/tinytest/test_nonprobsvy_main.R index 5fa4a68..22086bf 100644 --- a/inst/tinytest/test_nonprobsvy_main.R +++ b/inst/tinytest/test_nonprobsvy_main.R @@ -71,8 +71,8 @@ expect_silent( data = cbop_df_long, svydesign = popyt_svy, method_selection = "logit", - control_inference = controlInf(vars_selection = TRUE), - control_selection = controlSel(penalty = "SCAD")) + control_inference = control_inf(vars_selection = TRUE), + control_selection = control_sel(penalty = "SCAD")) ) expect_equivalent(test_ipw_1_scad$output$mean, @@ -90,8 +90,8 @@ expect_silent( data = cbop_df_long, svydesign = popyt_svy, method_selection = "logit", - control_inference = controlInf(vars_selection = TRUE), - control_selection = controlSel(penalty = "lasso")) + control_inference = control_inf(vars_selection = TRUE), + control_selection = control_sel(penalty = "lasso")) ) expect_equivalent(test_ipw_1_lasso$output$mean, @@ -109,8 +109,8 @@ expect_silent( data = cbop_df_long, svydesign = popyt_svy, method_selection = "logit", - control_inference = controlInf(vars_selection = TRUE), - control_selection = controlSel(penalty = "MCP")) + control_inference = control_inf(vars_selection = TRUE), + control_selection = control_sel(penalty = "MCP")) ) expect_equivalent(test_ipw_1_mcp$output$mean, @@ -143,7 +143,7 @@ test_dr_2 <- nonprob(selection = ~ klasa_pr, outcome = jedna_zmiana ~ klasa_pr, data = cbop_df, pop_totals = pop_totals, - control_inference = controlInf(var_method = "analytic")) # TODO warning to connected to algorithm convergence + control_inference = control_inf(var_method = "analytic")) # TODO: warning connected to algorithm convergence expect_equivalent(test_dr_2$output$mean, 0.6747754, @@ -191,7 +191,7 @@ expect_silent( selection = ~ klasa_pr + sek + zawod_kod2 + woj, data = cbop_df_long, svydesign = popyt_svy, - control_inference = controlInf(bias_correction = TRUE)) + control_inference = control_inf(bias_correction = TRUE)) ) expect_equivalent(test_dr_1_bm$output$mean, @@ -209,9 +209,9 @@ expect_silent( selection = ~ klasa_pr + sek + zawod_kod2 + woj, data = cbop_df_long, svydesign = popyt_svy, - control_inference = controlInf(vars_selection = TRUE), - control_selection = controlSel(penalty = "SCAD"), - control_outcome = controlOut(penalty = "SCAD")) + control_inference = control_inf(vars_selection = TRUE), + control_selection = control_sel(penalty = "SCAD"), + control_outcome = control_out(penalty = "SCAD")) ) expect_equivalent(test_dr_1_scad$output$mean, @@ -229,9 +229,9 @@ expect_silent( selection = ~ klasa_pr + sek + zawod_kod2 + woj, data = cbop_df_long, svydesign = popyt_svy, - control_inference = controlInf(vars_selection = TRUE), - control_selection = controlSel(penalty = "lasso"), - control_outcome = controlOut(penalty = "lasso")) + control_inference = control_inf(vars_selection = TRUE), + control_selection = control_sel(penalty = "lasso"), + control_outcome = control_out(penalty = "lasso")) ) expect_equivalent(test_dr_1_lasso$output$mean, @@ -249,9 +249,9 @@ expect_silent( selection = ~ klasa_pr + sek + zawod_kod2 + woj, data = cbop_df_long, svydesign = popyt_svy, - control_inference = controlInf(vars_selection = TRUE), - control_selection = controlSel(penalty = "MCP"), - control_outcome = controlOut(penalty = "MCP")) + control_inference = control_inf(vars_selection = TRUE), + control_selection = control_sel(penalty = "MCP"), + control_outcome = control_out(penalty = "MCP")) ) expect_equivalent(test_dr_1_mcp$output$mean, @@ -334,14 +334,14 @@ 
test_mi_1_pmm_2 <- nonprob(outcome = jedna_zmiana ~ klasa_pr + sek + zawod_kod2 data = cbop_df_long, svydesign = popyt_svy, method_outcome = "pmm", - control_outcome = controlOut(predictive_match = 2)) + control_outcome = control_out(predictive_match = 2)) expect_silent( test_mi_1_pmm_2 <- nonprob(outcome = jedna_zmiana ~ klasa_pr + sek + zawod_kod2 + woj, data = cbop_df_long, svydesign = popyt_svy, method_outcome = "pmm", - control_outcome = controlOut(predictive_match = 2)) + control_outcome = control_out(predictive_match = 2)) ) expect_equivalent(test_mi_1_pmm_2$output$mean, @@ -358,16 +358,16 @@ test_mi_1_scad <- nonprob(outcome = jedna_zmiana ~ klasa_pr + sek + zawod_kod2 + data = cbop_df_long, svydesign = popyt_svy, method_outcome = "glm", - control_inference = controlInf(vars_selection = TRUE), - control_outcome = controlOut(penalty = "SCAD")) + control_inference = control_inf(vars_selection = TRUE), + control_outcome = control_out(penalty = "SCAD")) expect_silent( test_mi_1_scad <- nonprob(outcome = jedna_zmiana ~ klasa_pr + sek + zawod_kod2 + woj, data = cbop_df_long, svydesign = popyt_svy, method_outcome = "glm", - control_inference = controlInf(vars_selection = TRUE), - control_outcome = controlOut(penalty = "SCAD")) + control_inference = control_inf(vars_selection = TRUE), + control_outcome = control_out(penalty = "SCAD")) ) expect_equivalent(test_mi_1_scad$output$mean, @@ -384,16 +384,16 @@ test_mi_1_lasso <- nonprob(outcome = jedna_zmiana ~ klasa_pr + sek + zawod_kod2 data = cbop_df_long, svydesign = popyt_svy, method_outcome = "glm", - control_inference = controlInf(vars_selection = TRUE), - control_outcome = controlOut(penalty = "lasso")) + control_inference = control_inf(vars_selection = TRUE), + control_outcome = control_out(penalty = "lasso")) expect_silent( test_mi_1_lasso <- nonprob(outcome = jedna_zmiana ~ klasa_pr + sek + zawod_kod2 + woj, data = cbop_df_long, svydesign = popyt_svy, method_outcome = "glm", - control_inference = controlInf(vars_selection = TRUE), - control_outcome = controlOut(penalty = "lasso")) + control_inference = control_inf(vars_selection = TRUE), + control_outcome = control_out(penalty = "lasso")) ) expect_equivalent(test_mi_1_lasso$output$mean, @@ -410,16 +410,16 @@ test_mi_1_mcp <- nonprob(outcome = jedna_zmiana ~ klasa_pr + sek + zawod_kod2 + data = cbop_df_long, svydesign = popyt_svy, method_outcome = "glm", - control_inference = controlInf(vars_selection = TRUE), - control_outcome = controlOut(penalty = "MCP")) + control_inference = control_inf(vars_selection = TRUE), + control_outcome = control_out(penalty = "MCP")) expect_silent( test_mi_1_mcp <- nonprob(outcome = jedna_zmiana ~ klasa_pr + sek + zawod_kod2 + woj, data = cbop_df_long, svydesign = popyt_svy, method_outcome = "glm", - control_inference = controlInf(vars_selection = TRUE), - control_outcome = controlOut(penalty = "MCP")) + control_inference = control_inf(vars_selection = TRUE), + control_outcome = control_out(penalty = "MCP")) ) expect_equivalent(test_mi_1_mcp$output$mean, diff --git a/man/nonprobsvycheck.Rd b/man/check_balance.Rd similarity index 86% rename from man/nonprobsvycheck.Rd rename to man/check_balance.Rd index 9c3835d..1aa8b27 100644 --- a/man/nonprobsvycheck.Rd +++ b/man/check_balance.Rd @@ -1,10 +1,10 @@ % Generated by roxygen2: do not edit by hand % Please edit documentation in R/simple_methods.R -\name{nonprobsvycheck} -\alias{nonprobsvycheck} +\name{check_balance} +\alias{check_balance} \title{Check the balance between probability and non-probability 
samples} \usage{ -nonprobsvycheck(x, object, dig) +check_balance(x, object, dig) } \arguments{ \item{x}{Formula specifying variables to check}
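(For readers updating from `nonprobsvycheck()`, a minimal usage sketch of the renamed `check_balance()` may help. The data frames, the covariates `x1` and `x2`, the target `y1` and the weight column `w` below are hypothetical stand-ins rather than objects from this patch, and `dig` is assumed to control the rounding of the reported totals.)

library(nonprobsvy)
library(survey)

set.seed(123)
# hypothetical probability sample with design weights
prob_df <- data.frame(x1 = rnorm(200), x2 = rbinom(200, 1, 0.5),
                      w = runif(200, 1, 10))
# hypothetical non-probability sample sharing the auxiliary variables
nonprob_df <- data.frame(x1 = rnorm(100), x2 = rbinom(100, 1, 0.5),
                         y1 = rnorm(100))

svy <- svydesign(ids = ~ 1, weights = ~ w, data = prob_df)
fit <- nonprob(selection = ~ x1 + x2, target = ~ y1,
               data = nonprob_df, svydesign = svy,
               method_selection = "logit")

# compare the weighted totals of x1 and x2 between both samples,
# rounding the reported values to 2 digits
check_balance(~ x1 + x2, fit, dig = 2)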
diff --git a/man/cloglog_model_nonprobsvy.Rd b/man/cloglog_model_nonprobsvy.Rd index 6bf4585..cbde79d 100644 --- a/man/cloglog_model_nonprobsvy.Rd +++ b/man/cloglog_model_nonprobsvy.Rd @@ -21,3 +21,4 @@ List with selected methods/objects/functions. \author{ Łukasz Chrostowski, Maciej Beręsewicz } +\keyword{internal} diff --git a/man/controlInf.Rd b/man/controlInf.Rd deleted file mode 100644 index 7ed1c89..0000000 --- a/man/controlInf.Rd +++ /dev/null @@ -1,68 +0,0 @@ -% Generated by roxygen2: do not edit by hand -% Please edit documentation in R/control_inference.R -\name{controlInf} -\alias{controlInf} -\title{Control parameters for inference} -\usage{ -controlInf( - vars_selection = FALSE, - var_method = c("analytic", "bootstrap"), - rep_type = c("auto", "JK1", "JKn", "BRR", "bootstrap", "subbootstrap", "mrbbootstrap", - "Fay"), - bias_inf = c("union", "div"), - num_boot = 500, - bias_correction = FALSE, - alpha = 0.05, - cores = 1, - keep_boot, - nn_exact_se = FALSE, - pi_ij -) -} -\arguments{ -\item{vars_selection}{If \code{TRUE}, then variables selection model is used.} - -\item{var_method}{variance method.} - -\item{rep_type}{replication type for weights in the bootstrap method for variance estimation passed to \code{\link[survey:as.svrepdesign]{survey::as.svrepdesign()}}. -Default is \code{subbootstrap}.} - -\item{bias_inf}{inference method in the bias minimization. -\itemize{ -\item if \code{union} then final model is fitting on union of selected variables for selection and outcome models -\item if \code{div} then final model is fitting separately on division of selected variables into relevant ones for -selection and outcome model. -}} - -\item{num_boot}{number of iteration for bootstrap algorithms.} - -\item{bias_correction}{if \code{TRUE}, then bias minimization estimation used during fitting the model.} - -\item{alpha}{Significance level, Default is 0.05.} - -\item{cores}{Number of cores in parallel computing.} - -\item{keep_boot}{Logical indicating whether statistics from bootstrap should be kept. -By default set to \code{TRUE}} - -\item{nn_exact_se}{Logical value indicating whether to compute the exact -standard error estimate for \code{nn} or \code{pmm} estimator. The variance estimator for -estimation based on \code{nn} or \code{pmm} can be decomposed into three parts, with the -third being computed using covariance between imputed values for units in -probability sample using predictive matches from non-probability sample. -In most situations this term is negligible and is very computationally -expensive so by default this is set to \code{FALSE}, but it is recommended to -set this value to \code{TRUE} before submitting final results.} - -\item{pi_ij}{TODO, either matrix or \code{ppsmat} class object.} -} -\value{ -List with selected parameters. -} -\description{ -\code{controlInf} constructs a list with all necessary control parameters -for statistical inference. -} -\seealso{ -\code{\link[=nonprob]{nonprob()}} -- for fitting procedure with non-probability samples. -} diff --git a/man/control_inf.Rd b/man/control_inf.Rd new file mode 100644 index 0000000..d40f468 --- /dev/null +++ b/man/control_inf.Rd @@ -0,0 +1,68 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/control_inference.R +\name{control_inf} +\alias{control_inf} +\title{Control parameters for inference} +\usage{ +control_inf( + vars_selection = FALSE, + var_method = c("analytic", "bootstrap"), + rep_type = c("auto", "JK1", "JKn", "BRR", "bootstrap", "subbootstrap", "mrbbootstrap", - "Fay"), + bias_inf = c("union", "div"), + num_boot = 500, + bias_correction = FALSE, + alpha = 0.05, + cores = 1, + keep_boot, + nn_exact_se = FALSE, + pi_ij +) +} +\arguments{ +\item{vars_selection}{If \code{TRUE}, then the variable selection model is used.} + +\item{var_method}{the variance method.} + +\item{rep_type}{the replication type for weights in the bootstrap method for variance estimation passed to \code{\link[survey:as.svrepdesign]{survey::as.svrepdesign()}}. +Default is \code{subbootstrap}.} + +\item{bias_inf}{the inference method in the bias minimization. +\itemize{ +\item if \code{union}, then the final model is fitted on the union of selected variables for selection and outcome models +\item if \code{div}, then the final model is fitted separately on the division of selected variables into those relevant for the +selection and the outcome model. +}} + +\item{num_boot}{the number of iterations for bootstrap algorithms.} + +\item{bias_correction}{if \code{TRUE}, then bias minimization estimation is used during model fitting.} + +\item{alpha}{significance level, 0.05 by default.} + +\item{cores}{the number of cores in parallel computing.} + +\item{keep_boot}{a logical value indicating whether statistics from the bootstrap should be kept. +By default set to \code{TRUE}.} + +\item{nn_exact_se}{a logical value indicating whether to compute the exact +standard error estimate for the \code{nn} or \code{pmm} estimator. The variance estimator for +estimation based on \code{nn} or \code{pmm} can be decomposed into three parts, with the +third computed using the covariance between imputed values for units in +the probability sample using predictive matches from the non-probability sample. +In most situations this term is negligible and is very computationally +expensive, so by default it is set to \code{FALSE}, but the recommended option is to +set this value to \code{TRUE} before submitting the final results.} + +\item{pi_ij}{TODO, either a matrix or a \code{ppsmat} class object.} +} +\value{ +List with selected parameters. +} +\description{ +\code{control_inf} constructs a list with all necessary control parameters +for statistical inference. +} +\seealso{ +\code{\link[=nonprob]{nonprob()}} -- for fitting procedure with non-probability samples. +}
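(A short sketch of how the renamed `control_inf()` plugs into `nonprob()`, reusing the hypothetical `nonprob_df` and `svy` objects from the sketch above; the replicate count of 200 is illustrative only.)

# bootstrap variance with 200 replicates on a single core, keeping
# the per-replicate statistics (returned in the `stat` element)
fit_boot <- nonprob(selection = ~ x1 + x2, target = ~ y1,
                    data = nonprob_df, svydesign = svy,
                    control_inference = control_inf(var_method = "bootstrap",
                                                    num_boot = 200,
                                                    keep_boot = TRUE,
                                                    cores = 1))
summary(fit_boot)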
diff --git a/man/controlOut.Rd b/man/control_out.Rd similarity index 96% rename from man/controlOut.Rd rename to man/control_out.Rd index 4230429..fe15d7f 100644 --- a/man/controlOut.Rd +++ b/man/control_out.Rd @@ -1,10 +1,10 @@ % Generated by roxygen2: do not edit by hand % Please edit documentation in R/control_outcome.R -\name{controlOut} -\alias{controlOut} +\name{control_out} +\alias{control_out} \title{Control parameters for outcome model} \usage{ -controlOut( +control_out( epsilon = 1e-04, maxit = 100, trace = FALSE, @@ -77,7 +77,7 @@ this control list will be chosen as starting point.} List with selected parameters. } \description{ -\code{controlOut} constructs a list with all necessary control parameters +\code{control_out} constructs a list with all necessary control parameters for outcome model. } \seealso{ diff --git a/man/controlSel.Rd b/man/control_sel.Rd similarity index 93% rename from man/controlSel.Rd rename to man/control_sel.Rd index f151a26..8de7095 100644 --- a/man/controlSel.Rd +++ b/man/control_sel.Rd @@ -1,10 +1,10 @@ % Generated by roxygen2: do not edit by hand % Please edit documentation in R/control_selection.R -\name{controlSel} -\alias{controlSel} +\name{control_sel} +\alias{control_sel} \title{Control parameters for selection model} \usage{ -controlSel( +control_sel( method = "glm.fit", epsilon = 1e-04, maxit = 500, @@ -60,7 +60,7 @@ controlSel( \item if \code{2} then \mjseqn{ \mathbf{h}\left(\mathbf{x}, \boldsymbol{\theta}\right) = \mathbf{x}} }} -\item{penalty}{The penanlization function used during variables selection.} +\item{penalty}{The penalization function used during variable selection.} \item{a_SCAD}{The tuning parameter of the SCAD penalty for selection model. Default is 3.7.} @@ -95,13 +95,9 @@ controlSel( List with selected parameters. } \description{ -\code{controlSel} constructs a list with all necessary control parameters +\code{control_sel} constructs a list with all necessary control parameters for selection model. } \seealso{ \code{\link[=nonprob]{nonprob()}} -- for fitting procedure with non-probability samples. } -\author{ -Łukasz Chrostowski, Maciej Beręsewicz -\loadmathjax -}
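(In the same spirit, a sketch of the renamed `control_sel()` and `control_out()` in a doubly robust fit with penalised variable selection, again reusing the hypothetical `nonprob_df` and `svy` objects from the first sketch.)

# SCAD-penalised variable selection in both working models,
# with 5-fold cross-validation for the tuning parameter
fit_dr <- nonprob(selection = ~ x1 + x2,
                  outcome = y1 ~ x1 + x2,
                  data = nonprob_df, svydesign = svy,
                  method_selection = "logit",
                  control_inference = control_inf(vars_selection = TRUE),
                  control_selection = control_sel(penalty = "SCAD", nfolds = 5),
                  control_outcome = control_out(penalty = "SCAD", nfolds = 5))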
diff --git a/man/jvs.Rd b/man/jvs.Rd index 8966df1..d3f4ebe 100644 --- a/man/jvs.Rd +++ b/man/jvs.Rd @@ -20,7 +20,7 @@ A single data.frame with 6,523 rows and 6 columns jvs } \description{ -This is a subset of the subset of the Job Vacancy Survey from Poland (for one quarter). +This is a subset of the Job Vacancy Survey from Poland (for one quarter). The data has been subject to slight manipulation, but the relationships in the data have been preserved. For further details on the JVS, please refer to the following link: \url{https://stat.gov.pl/obszary-tematyczne/rynek-pracy/popyt-na-prace/zeszyt-metodologiczny-popyt-na-prace,3,1.html}. diff --git a/man/logit_model_nonprobsvy.Rd b/man/logit_model_nonprobsvy.Rd index c9f0a92..0a3248c 100644 --- a/man/logit_model_nonprobsvy.Rd +++ b/man/logit_model_nonprobsvy.Rd @@ -21,3 +21,4 @@ List with selected methods/objects/functions. \author{ Łukasz Chrostowski, Maciej Beręsewicz } +\keyword{internal} diff --git a/man/nonprob.Rd b/man/nonprob.Rd index 358e7d0..8d75d51 100644 --- a/man/nonprob.Rd +++ b/man/nonprob.Rd @@ -2,7 +2,7 @@ % Please edit documentation in R/main_function_documentation.R, R/nonprob.R \name{nonprob} \alias{nonprob} -\title{Inference with the non-probability survey samples} +\title{Inference with non-probability survey samples} \usage{ nonprob( data, @@ -20,9 +20,9 @@ nonprob( strata = NULL, weights = NULL, na_action = NULL, - control_selection = controlSel(), - control_outcome = controlOut(), - control_inference = controlInf(), + control_selection = control_sel(), + control_outcome = control_out(), + control_inference = control_inf(), start_selection = NULL, start_outcome = NULL, verbose = FALSE, @@ -33,41 +33,41 @@ ) } \arguments{ -\item{data}{\code{data.frame} with data from the non-probability sample.} +\item{data}{a \code{data.frame} with data from the non-probability sample.} -\item{selection}{\code{formula}, the selection (propensity) equation.} +\item{selection}{a \code{formula}, the selection (propensity) equation.} -\item{outcome}{\code{formula}, the outcome equation.} +\item{outcome}{a \code{formula}, the outcome equation.} -\item{target}{\code{formula} with target variables.} +\item{target}{a \code{formula} with target variables.} -\item{svydesign}{an optional \code{svydesign} object (from the survey package) containing probability sample and design weights.} +\item{svydesign}{an optional \code{svydesign} object (from the survey package) containing a probability sample and design weights.} \item{pop_totals}{an optional \verb{named vector} with population totals of the covariates.} \item{pop_means}{an optional \verb{named vector} with population means of the covariates.} -\item{pop_size}{an optional \code{double} with population size.} +\item{pop_size}{an optional \code{double} value with the population size.} -\item{method_selection}{a \code{character} with method for propensity scores estimation.} +\item{method_selection}{a \code{character} indicating the method for propensity score estimation.} -\item{method_outcome}{a \code{character} with method for response variable estimation.} +\item{method_outcome}{a \code{character} indicating the method for response variable estimation.} -\item{family_outcome}{a \code{character} string describing the error distribution and link function to be used in the model. Default is "gaussian". Currently supports: gaussian with identity link, poisson and binomial.} +\item{family_outcome}{a \code{character} string describing the error distribution and the link function to be used in the model, set to \code{gaussian} by default. Currently supports: gaussian with identity link, poisson and binomial.} -\item{subset}{an optional \code{vector} specifying a subset of observations to be used in the fitting process.} +\item{subset}{an optional \code{vector} specifying a subset of observations to be used in the fitting process -- not yet supported.} -\item{strata}{an optional \code{vector} specifying strata.} +\item{strata}{an optional \code{vector} specifying strata -- not yet supported.} \item{weights}{an optional \code{vector} of prior weights to be used in the fitting process. Should be NULL or a numeric vector. 
It is assumed that this vector contains frequency or analytic weights.} -\item{na_action}{a function which indicates what should happen when the data contain \code{NAs}.} +\item{na_action}{a function which indicates what should happen when the data contain \code{NAs} -- not yet supported.} -\item{control_selection}{a \code{list} indicating parameters to use in fitting selection model for propensity scores.} +\item{control_selection}{a \code{list} indicating parameters to be used when fitting the selection model for propensity scores.} -\item{control_outcome}{a \code{list} indicating parameters to use in fitting model for outcome variable.} +\item{control_outcome}{a \code{list} indicating parameters to be used when fitting the model for the outcome variable.} -\item{control_inference}{a \code{list} indicating parameters to use in inference based on probability and non-probability samples, contains parameters such as estimation method or variance method.} +\item{control_inference}{a \code{list} indicating parameters to be used for inference based on probability and non-probability samples, such as the estimation method or the variance method.} \item{start_selection}{an optional \code{vector} with starting values for the parameters of the selection equation.} @@ -75,9 +75,9 @@ nonprob( \item{verbose}{verbose, numeric.} -\item{x}{Logical value indicating whether to return model matrix of covariates as a part of output.} +\item{x}{a logical value indicating whether to return the model matrix of covariates as a part of the output.} -\item{y}{Logical value indicating whether to return vector of outcome variable as a part of output.} +\item{y}{a logical value indicating whether to return the vector of the outcome variable as a part of the output.} \item{se}{Logical value indicating whether to calculate and return standard error of estimated mean.} @@ -103,7 +103,7 @@ with type \code{list} containing:\cr \item{\code{pop_size} -- estimated population size derived from estimated weights (non-probability sample) or known design weights (probability sample).} \item{\code{pop_totals} -- the total values of the auxiliary variables derived from a probability sample or vector of total/mean values.} \item{\code{outcome} -- list containing information about the fitting of the mass imputation model, in the case of regression model the object containing the list returned by -\code{\link[stats:glm]{stats::glm()}}, in the case of the nearest neighbour imputation the object containing list returned by \code{\link[RANN:nn2]{RANN::nn2()}}. If \code{bias_correction} in \code{\link[=controlInf]{controlInf()}} is set to \code{TRUE}, the estimation is based on +\code{\link[stats:glm]{stats::glm()}}, in the case of the nearest neighbour imputation the object containing the list returned by \code{\link[RANN:nn2]{RANN::nn2()}}. If \code{bias_correction} in \code{\link[=control_inf]{control_inf()}} is set to \code{TRUE}, the estimation is based on the joint estimating equations for the \code{selection} and \code{outcome} model and therefore, the list is different from the one returned by the \code{\link[stats:glm]{stats::glm()}} function and contains elements such as \itemize{ \item{\code{coefficients} -- estimated coefficients of the regression model.} @@ -153,11 +153,11 @@ when the propensity score model is fitting. Returned only if selection of variab } \item{\code{stat} -- matrix of the estimated population means in each bootstrap iteration. 
Returned only if a bootstrap method is used to estimate the variance and \code{keep_boot} in -\code{\link[=controlInf]{controlInf()}} is set on \code{TRUE}.} +\code{\link[=control_inf]{control_inf()}} is set to \code{TRUE}.} } } \description{ -\code{nonprob} fits model for inference based on non-probability surveys (including big data) using various methods. +\code{nonprob} fits a model for inference based on non-probability surveys (including big data) using various methods. The function allows you to estimate the population mean with access to a reference probability sample, as well as sums and means of covariates. The package implements state-of-the-art approaches recently proposed in the literature: Chen et al. (2020), @@ -166,7 +166,7 @@ Yang et al. (2020), Wu (2022) and uses the \href{https://CRAN.R-project.org/pack It provides propensity score weighting (e.g. with calibration constraints), mass imputation (e.g. nearest neighbour) and doubly robust estimators that take into account minimisation of the asymptotic bias of the population mean estimators or variable selection. -The package uses \code{survey} package functionality when a probability sample is available. +The package uses the \code{survey} package functionality when a probability sample is available. } \details{ Let \mjseqn{y} be the response variable for which we want to estimate the population mean, @@ -232,7 +232,7 @@ Notice that for \mjseqn{ \mathbf{h}\left(\mathbf{x}_i, \boldsymbol{\theta}\right Using the imputed values for the probability sample and the (known) design weights, we can build a population mean estimator of the form: \mjsdeqn{\hat{\mu}_{MI} = \frac{1}{N^B}\sum_{i \in S_{B}} d_{i}^{B} \hat{y}_i.} -It opens the the door to a very flexible method for imputation models. The package uses generalized linear models from \code{\link[stats:glm]{stats::glm()}}, +It opens the door to a very flexible method for imputation models. The package uses generalized linear models from \code{\link[stats:glm]{stats::glm()}}, the nearest neighbour algorithm using \code{\link[RANN:nn2]{RANN::nn2()}} and predictive mean matching. \item Doubly robust estimation -- The IPW and MI estimators are sensitive to misspecified models for the propensity score and outcome variable, respectively. To this end, so-called doubly robust methods are presented that take these problems into account. @@ -281,7 +281,7 @@ penalized estimating functions as } where \mjseqn{\lambda_{\theta}} and \mjseqn{q_{\lambda_{\beta}}} are some smooth functions. We let \mjseqn{q_{\lambda} \left(x\right) = \frac{\partial p_{\lambda}}{\partial x}}, where \mjseqn{p_{\lambda}} is some penalization function. Details of penalization functions and techniques for solving this type of equation can be found \href{https://ncn-foreigners.github.io/nonprobsvy-book/variableselection.html}{here}. -To use the variable selection model, set the \code{vars_selection} parameter in the \code{\link[=controlInf]{controlInf()}} function to \code{TRUE}. In addition, in the other control functions such as \code{\link[=controlSel]{controlSel()}} and \code{\link[=controlOut]{controlOut()}} +To use the variable selection model, set the \code{vars_selection} parameter in the \code{\link[=control_inf]{control_inf()}} function to \code{TRUE}. 
In addition, in the other control functions such as \code{\link[=control_sel]{control_sel()}} and \code{\link[=control_out]{control_out()}} you can set parameters for the selection of the relevant variables, such as the number of folds during cross-validation algorithm or the lambda value for penalizations. Details can be found in the documentation of the control functions for \code{nonprob}. } @@ -393,11 +393,11 @@ estimation process of the bias minimization approach. \code{\link[RANN:nn2]{RANN::nn2()}} -- For more information about the nearest neighbour algorithm used during mass imputation process. -\code{\link[=controlSel]{controlSel()}} -- For the control parameters related to selection model. +\code{\link[=control_sel]{control_sel()}} -- For the control parameters related to selection model. -\code{\link[=controlOut]{controlOut()}} -- For the control parameters related to outcome model. +\code{\link[=control_out]{control_out()}} -- For the control parameters related to outcome model. -\code{\link[=controlInf]{controlInf()}} -- For the control parameters related to statistical inference. +\code{\link[=control_inf]{control_inf()}} -- For the control parameters related to statistical inference. } \author{ Łukasz Chrostowski, Maciej Beręsewicz diff --git a/man/pop.size.Rd b/man/pop_size.Rd similarity index 86% rename from man/pop.size.Rd rename to man/pop_size.Rd index ad80f5f..8cddf7f 100644 --- a/man/pop.size.Rd +++ b/man/pop_size.Rd @@ -1,10 +1,10 @@ % Generated by roxygen2: do not edit by hand % Please edit documentation in R/simple_methods.R -\name{pop.size} -\alias{pop.size} +\name{pop_size} +\alias{pop_size} \title{Estimate size of population} \usage{ -pop.size(object, ...) +pop_size(object, ...) } \arguments{ \item{object}{object returned by \code{nonprobsvy}.} diff --git a/man/probit_model_nonprobsvy.Rd b/man/probit_model_nonprobsvy.Rd index 9ff6553..89f176a 100644 --- a/man/probit_model_nonprobsvy.Rd +++ b/man/probit_model_nonprobsvy.Rd @@ -21,3 +21,4 @@ List with selected methods/objects/functions. \author{ Łukasz Chrostowski, Maciej Beręsewicz } +\keyword{internal}
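(Finally, since `pop.size()` is now `pop_size()`, a closing sketch on the same hypothetical `nonprob_df` and `svy` objects ties the renames together.)

# mass imputation fit under the new naming scheme
fit_mi <- nonprob(outcome = y1 ~ x1 + x2,
                  data = nonprob_df, svydesign = svy,
                  method_outcome = "glm", family_outcome = "gaussian")

pop_size(fit_mi) # estimated population size; was pop.size() before this release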