Lab 05a_sampling_distributions: replicates filtered out in sample_props_small #107

Open
mamcisaac opened this issue Oct 31, 2022 · 1 comment · May be fixed by #108

Comments

@mamcisaac

sample_props_small often has fewer than the requested 25 rows (one per replicate).

The call to
filter(scientist_work == "Doesn't benefit")
removes every replicate whose small sample contains no "Doesn't benefit" responses. As a result, any replicate with p_hat = 0 is dropped and never displayed.

This happens because the sample size is small and the true proportion is fairly close to 0 (p = 0.2), so samples with no "Doesn't benefit" responses are not rare.
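As a rough check on how often this happens, the chance of an all-"Benefits" sample can be computed directly. This is a sketch that assumes sampling with replacement, the sample size of 10 and the 25 replicates used in the code later in this thread, and p = 0.2 exactly; the variable names are only illustrative.

```r
n    <- 10    # sample size of the small samples (from the code in this thread)
p    <- 0.2   # true proportion of "Doesn't benefit"
reps <- 25    # number of replicates requested

# Probability that a sample of size n contains zero "Doesn't benefit" responses.
# Equivalent to (1 - p)^n.
p_zero <- dbinom(0, size = n, prob = p)
p_zero

# Expected number of the 25 replicates that the filter would drop.
reps * p_zero
```

Under these assumptions p_zero is about 0.107, so on average between two and three of the 25 replicates would be missing from sample_props_small.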

mamcisaac changed the title from "Lab 05a_sampling_distributions:" to "Lab 05a_sampling_distributions: replicates filtered out in sample_props_small" Oct 31, 2022

mamcisaac commented Oct 31, 2022

Code like the following should be used throughout to avoid the edge case where replicates with p_hat = 0 are filtered out:

sample_props_small <- global_monitor %>%
  rep_sample_n(size = 10, reps = 25, replace = TRUE) %>%
  group_by(replicate) %>%
  # The mean of a logical vector is the sample proportion; it is 0 when nothing matches
  summarize(p_hat = mean(scientist_work == "Doesn't benefit"))
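To make the difference concrete, here is a minimal side-by-side sketch. Several things in it are assumptions rather than quotes from the lab: the global_monitor tibble is a stand-in population with exactly 20% "Doesn't benefit", the count/mutate steps in front of the filter() call are a reconstruction of what the filter-based pipeline probably looks like, and the names samples, props_filtered and props_summarized are made up for illustration. It needs the dplyr and infer packages.

```r
library(dplyr)
library(infer)

set.seed(107)  # arbitrary seed, only so the run is repeatable

# Assumed stand-in for the lab's population: 20% "Doesn't benefit".
global_monitor <- tibble(
  scientist_work = c(rep("Benefits", 80000), rep("Doesn't benefit", 20000))
)

# Draw the 25 small samples once so both approaches see the same data.
samples <- global_monitor %>%
  rep_sample_n(size = 10, reps = 25, replace = TRUE)

# Filter-based approach (reconstructed): a replicate with no
# "Doesn't benefit" rows has nothing left after the filter.
props_filtered <- samples %>%
  count(scientist_work) %>%
  mutate(p_hat = n / sum(n)) %>%
  filter(scientist_work == "Doesn't benefit")

# group_by + summarize approach: every replicate yields exactly one row.
props_summarized <- samples %>%
  group_by(replicate) %>%
  summarize(p_hat = mean(scientist_work == "Doesn't benefit"))

nrow(props_filtered)     # at most 25
nrow(props_summarized)   # always 25

# Replicates that the filter dropped; each of them has p_hat == 0.
dropped <- setdiff(props_summarized$replicate, props_filtered$replicate)
props_summarized %>% filter(replicate %in% dropped)
```

Whether any replicates are dropped in a particular run depends on the seed, but the summarize version returns one row per replicate regardless.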

mamcisaac added a commit to mamcisaac/oilabs-tidy that referenced this issue Oct 31, 2022
Replaces the call to
filter(scientist_work == "Doesn't benefit")
which drops any replicate with p_hat = 0 (see the issue description above), with the following:

group_by(replicate) %>% summarize(p_hat = mean(scientist_work == "Doesn't benefit"))

Fixes OpenIntroStat#107.
mamcisaac added a commit to mamcisaac/oilabs-tidy that referenced this issue Oct 31, 2022
Replaces code based on filtering (which breaks down in the edge case where the sample proportion is 0, since there is then nothing to filter on) with code based on group_by + summarize.