Weights not working in datasummary_balance #840

Closed
kleuveld opened this issue Nov 29, 2024 · 3 comments

@kleuveld

datasummary_balance's help file states: "If data includes columns named "blocks", "clusters", or "weights", this information will be taken into account automatically by estimatr::difference_in_means."

However, this doesn't seem to be the case:

set.seed(1)
library(randomizr)
library(estimatr)
library(dplyr)
library(modelsummary)


dat <- tibble(x = rnorm(n = 1000),
       clusters = simple_ra(N = 1000,num_arms = 50),
       Z = cluster_ra(clusters = clusters),
       y = x + Z * 5,
       weights = if_else(y > 5,0,1)) # making sure the weights actually make a difference


# reports means of 0 and 5, and a difference of 5
datasummary_balance(y ~ Z, data = dat, output = "flextable")


# however, the weighted mean of the treatment group is 4.2
weighted.mean(dat$y[dat$Z == 1], dat$weights[dat$Z == 1])


# and these should be the differences
difference_in_means(y ~ Z, 
                    clusters = clusters,
                    weights = weights,
                    data = dat)


# this is what datasummary_balance reports
difference_in_means(y ~ Z, 
                    data = dat)

I suspect the clusters aren't applied correctly either, but I couldn't construct an example that shows it.
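
As a package-free sanity check of what the weights should do to the point estimate, here is a minimal base-R sketch. It is not how datasummary_balance or estimatr compute anything internally; the data and the names Z, w, unweighted_diff, weighted_diff, and wls_diff are made up for illustration, and clustering is ignored because it affects only the standard errors, not the estimate.

```r
# Hypothetical data: same weighting rule as in the example above,
# but with a simple half/half treatment assignment and no clusters.
set.seed(1)
n <- 1000
Z <- rep(c(0, 1), each = n / 2)   # treatment indicator
x <- rnorm(n)
y <- x + 5 * Z
w <- ifelse(y > 5, 0, 1)          # weight 0 for observations with y > 5

# Difference in plain group means (what an unweighted comparison reports)
unweighted_diff <- mean(y[Z == 1]) - mean(y[Z == 0])

# Difference in weighted group means
weighted_diff <- weighted.mean(y[Z == 1], w[Z == 1]) -
  weighted.mean(y[Z == 0], w[Z == 0])

# The same point estimate from weighted least squares on the indicator
wls_diff <- unname(coef(lm(y ~ Z, weights = w))["Z"])

print(c(unweighted = unweighted_diff,
        weighted   = weighted_diff,
        wls        = wls_diff))
```

If the weights are being used, the reported difference should track the weighted figure rather than the unweighted one.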

Session info:
R version 4.4.1 (2024-06-14 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 22631)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.utf8 LC_CTYPE=English_United Kingdom.utf8 LC_MONETARY=English_United Kingdom.utf8
[4] LC_NUMERIC=C LC_TIME=English_United Kingdom.utf8

time zone: Europe/Amsterdam
tzcode source: internal

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] modelsummary_2.2.0.4 dplyr_1.1.4 estimatr_1.0.4 randomizr_1.0.0

loaded via a namespace (and not attached):
[1] utf8_1.2.4 generics_0.1.3 fontLiberation_0.1.0 xml2_1.3.6 httpcode_0.3.0
[6] digest_0.6.35 magrittr_2.0.3 evaluate_1.0.1 grid_4.4.1 estimability_1.5.1
[11] mvtnorm_1.3-1 flextable_0.9.6 fastmap_1.2.0 jsonlite_1.8.9 zip_2.3.1
[16] backports_1.5.0 crul_1.5.0.91 Formula_1.2-5 promises_1.3.0 fansi_1.0.6
[21] box_1.2.0 fontBitstreamVera_0.1.1 textshaping_0.4.0 cli_3.6.2 shiny_1.9.1
[26] rlang_1.1.3 fontquiver_0.2.1 crayon_1.5.3 gfonts_0.2.0 gdtools_0.3.7
[31] officer_0.6.6 tools_4.4.1 uuid_1.2-0 checkmate_2.3.2 httpuv_1.6.15
[36] curl_5.2.1 vctrs_0.6.5 R6_2.5.1 mime_0.12 lifecycle_1.0.4
[41] emmeans_1.10.5 ragg_1.3.2 insight_1.0.0.1 pkgconfig_2.0.3 pillar_1.9.0
[46] later_1.3.2 data.table_1.15.4 glue_1.8.0 Rcpp_1.0.12 systemfonts_1.1.0
[51] xfun_0.45 tibble_3.2.1 tidyselect_1.2.1 rstudioapi_0.17.1 knitr_1.48
[56] xtable_1.8-4 htmltools_0.5.8.1 tables_0.9.28 rmarkdown_2.28 compiler_4.4.1
[61] askpass_1.2.1 openssl_2.2.0

@vincentarelbundock
Owner

Thanks a lot for the detailed report @kleuveld. I really appreciate that you took the time to craft a reproducible example.

I can replicate and confirm the problem on my computer.

Life is kind of crazy right now, but I plan to prep a new modelsummary release by the end of the month, and expect to look into a fix for this.

@kleuveld
Author

kleuveld commented Dec 4, 2024

Hi,

Don't worry, even with a bug report that's waaaaay more elaborate than this, modelsummary would have still saved me so much time! Thanks for the good work!

Perhaps the quickest fix is simply removing mention of weights in the help file?

vincentarelbundock added a commit that referenced this issue Dec 30, 2024
issue #840 blocks, clusters, and weights in datasummary_balance()
@vincentarelbundock
Owner

Thanks again for the report!

This should be fixed in the development version on GitHub, so you can install and use it now. I plan to release a new version to CRAN before the end of January.
