Stochastic tutorial added #185

Open
wants to merge 7 commits into
base: main
from

Conversation

weisscharlesj

Added the tutorial-stochastic-simulation.ipynb tutorial, which uses the NumPy random number generator to simulate various processes and solve problems. This tutorial was proposed in issue #184.

Added tutorial-stochastic-simulation.ipynb file which contains a tutorial demonstrating using NumPy random number generators to stochastically simulate a variety of processes. This tutorial was proposed in issue numpy#184.
Added tutorial-stochastic-simulations to README file
Co-authored-by: Mukulika <[email protected]>
@bsipocz
Member

bsipocz commented Jun 28, 2023

Hi @weisscharlesj 👋 Thank you for the PR.

Before diving into trying to sort out the build issues, I wonder whether you're willing to agree for us to use this PR as a platform for updating our contributing guide. I've noticed a few things that are missing from the How-to-contribute, and some of those are causing the issues with the build/tests here.
This is all independent from the content of your tutorial, and I'm happy to help or do these infrastructural changes to the PR.

@weisscharlesj
Author

@bsipocz

You are welcome to use this PR for whatever you need, including updates to the contribution guide.

@numpy numpy deleted a comment from review-notebook-app bot Jul 31, 2023
@rossbar added the "content" label (Issues relevant to tutorial content) Jul 31, 2023
Collaborator

@rossbar rossbar left a comment

Thanks for your submission @weisscharlesj . Sorry for the slow response, it's been a busy summer so far!

I've taken the liberty of making a couple structural changes to get this into a reviewable state, namely:

  • Converted the .ipynb to a myst-markdown notebook (a necessary step for review)
  • Added the tutorial to the features toctree to fix the build errors and make the tutorial accessible on the site.

My goal in pushing these up is to get over the red-x on CI and make this reviewable - I haven't (yet) modified the content itself in any way!

I'll aim for a review of the tutorial itself ASAP. If you want to make any changes in the meantime, be sure to git pull first!

You may have guessed by looking at the returned values that `random()` produces
values in the 0 $\rightarrow$ 1 range, but what happens if we need values in a
different range?
We can modify these values by multiplying them by a coefficient to increase the
Member

Here is an opportunity to recommend using the uniform() method. It's the idiomatic way to accomplish this.
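
A minimal sketch of what this suggestion could look like in the tutorial, comparing hand-scaled `random()` output with a direct `uniform()` call. The seed, the `[-1, 1)` range, and the names `low`, `high`, `scaled`, and `direct` are assumptions chosen for illustration, not taken from the tutorial.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # seed chosen only for reproducibility

low, high = -1.0, 1.0  # illustrative target range [low, high)

# Manual approach: rescale values from [0, 1) by hand.
scaled = low + (high - low) * rng.random(size=5)

# Idiomatic approach: let uniform() draw from [low, high) directly.
direct = rng.uniform(low, high, size=5)
```

Both arrays are drawn from the same distribution; `uniform()` simply states the intended range at the call site instead of encoding it in arithmetic.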


+++

## Calculating pi
Member

Personally, I would omit this example entirely. I understand why it's included in various treatments of Monte Carlo approximation methods, but I think the things that make it an accessible example are exactly the reasons why one shouldn't use Monte Carlo techniques with pseudorandomness here. Of course, no one actually needs to calculate pi to this rough level of accuracy at all, but even the kinds of practical problems that look enough like this one should be solved with other techniques, like Quasi-Monte Carlo, if not straight-up numerical integration.

I think we're on much firmer ground to use PRNGs when we are simulating actual stochastic processes or evaluating probability puzzles like in the other examples. The difference is that in this example, the property of the sequence that we're looking for is just (asymptotic) uniformity. PRNG sequences have that, but other sequences, like those from QMC techniques or even just grids, have that much better. The other examples also rely on independence, which PRNGs have (for practical purposes) and QMC sequences don't.
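
To make the uniformity point concrete, here is a rough sketch that estimates pi twice over the unit square: once with pseudorandom points and once with a plain grid of cell midpoints. It uses only NumPy; the point count, the seed, and all variable names are assumptions for illustration, and a grid stands in for a proper QMC sequence.

```python
import numpy as np

n_side = 500              # points per axis (illustrative)
n_points = n_side ** 2    # same total number of points for both estimates

# Pseudorandom points in the unit square [0, 1) x [0, 1).
rng = np.random.default_rng(seed=0)
xy = rng.random(size=(n_points, 2))
pi_prng = 4 * np.mean((xy ** 2).sum(axis=1) < 1)

# Regular grid of cell midpoints covering the same square.
mid = (np.arange(n_side) + 0.5) / n_side
gx, gy = np.meshgrid(mid, mid)
pi_grid = 4 * np.mean(gx ** 2 + gy ** 2 < 1)

# Absolute errors of the two estimates.
err_prng = abs(pi_prng - np.pi)
err_grid = abs(pi_grid - np.pi)
```

Comparing `err_prng` and `err_grid` at equal point counts is one way to see the argument above: both point sets are uniform over the square, and independence between points is not what the estimate relies on. SciPy's `scipy.stats.qmc` module provides actual low-discrepancy sequences if the tutorial wanted to go further.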

the circle a radius = 1.
This requires the coordinates to fall in the [-1, 1) ranges along both the $x$-
and $y$-axes.
We have no random number generator that produces values in this range, but we
Member

Please be careful with these claims. We do indeed have a method for exactly this purpose: uniform().

undecayed_array = np.full(t_final + 1, n)

for second in range(1, t_final + 1):
    decays = rng.binomial(1, p=k, size=n_undecayed).sum()
Member

What exactly did you want to show here? decays is binomial-distributed, but the idiomatic way to compute this with the binomial() method would be:

decays = rng.binomial(n_undecayed, p=k)

If you wanted to show how the binomial is constructed out of a sum of Bernoulli trials, you can do that, but I think it's confusing to use binomial() to make Bernoulli trials only to sum them up. The idiom for getting Bernoulli trials with a probability of k, and then summing up the successes looks like this:

decays = (rng.random(n_undecayed) < k).sum()
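
For context, a sketch of how the full decay loop might read with the single-call `binomial()` form. The names `n`, `k`, `t_final`, `n_undecayed`, and `undecayed_array` follow the quoted snippet; their values, the seed, and the bookkeeping inside the loop are assumptions, since the rest of the tutorial cell is not shown here.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

n = 1000        # initial number of undecayed nuclei (illustrative)
k = 0.05        # per-second decay probability (illustrative)
t_final = 50    # seconds to simulate (illustrative)

undecayed_array = np.full(t_final + 1, n)
n_undecayed = n

for second in range(1, t_final + 1):
    # One binomial draw: n_undecayed trials, each succeeding with probability k.
    decays = rng.binomial(n_undecayed, p=k)
    n_undecayed -= decays
    undecayed_array[second] = n_undecayed
```

This draws one number per time step instead of allocating an array of `n_undecayed` Bernoulli outcomes and summing it.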

all_unique_class = 0 # number of classrooms with students NOT sharing birthdays

for classroom in range(n_classrooms):
    birthdays = rng.integers(0, high=365, size=class_size)
Member

Overall, I think this tutorial is an opportunity to demonstrate good practice for passing PRNG state to functions that consume pseudorandomness. Namely, each function should take an rng=None argument and execute rng = np.random.default_rng(rng) before calling any methods. @albertcthomas has a good article on this, though we are converging on using rng as the name for the argument instead of seed.

I think that's more critical information for writing stochastic simulations than details about any particular Generator method call.
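
A sketch of the `rng=None` pattern applied to the birthday example. The function name `fraction_all_unique`, its signature, and the use of `np.unique` to detect shared birthdays are hypothetical choices for illustration; only the `rng = np.random.default_rng(rng)` idiom and the `rng.integers(0, high=365, size=class_size)` call come from the discussion above.

```python
import numpy as np


def fraction_all_unique(n_classrooms, class_size, rng=None):
    """Estimate the fraction of classrooms where no two students share a birthday."""
    # Accepts None (fresh entropy), an integer seed, or an existing Generator.
    rng = np.random.default_rng(rng)
    all_unique_class = 0  # number of classrooms with students NOT sharing birthdays
    for classroom in range(n_classrooms):
        birthdays = rng.integers(0, high=365, size=class_size)
        if np.unique(birthdays).size == class_size:
            all_unique_class += 1
    return all_unique_class / n_classrooms


# The same integer seed reproduces the same result.
a = fraction_all_unique(2000, 23, rng=123)
b = fraction_all_unique(2000, 23, rng=123)

# A shared Generator can be threaded through several calls;
# each call advances its state, so the calls use different draws.
shared = np.random.default_rng(123)
c = fraction_all_unique(2000, 23, rng=shared)
d = fraction_all_unique(2000, 23, rng=shared)
```

Because `default_rng()` returns an existing `Generator` unchanged, callers can choose between a reproducible seed, a shared stream, or fresh entropy without the function needing separate code paths.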

integers.
If we changed the number of layers to an odd number, we'd only get odd positions
in the result, and if we used +1/2 and -1/2 for our horizontal movement, we'd get
both even and odd integers.
Member

I think you can probably omit a lot of this explanation if you just leave the values as 0s and 1s and say that 0 means the left path was taken and 1 means the right path was taken.
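
A sketch of the 0/1 encoding, assuming the example tracks many objects each making a left/right choice at every layer. The names `n_balls`, `n_layers`, `paths`, `bins`, and `counts`, along with their values and the seed, are hypothetical; the tutorial's own variable names are not shown in this thread.

```python
import numpy as np

rng = np.random.default_rng(seed=7)

n_balls = 10_000   # number of simulated objects (illustrative)
n_layers = 10      # number of left/right decisions each one makes (illustrative)

# 0 = left path taken, 1 = right path taken; one column per layer.
paths = rng.integers(0, 2, size=(n_balls, n_layers))

# The number of right turns is the index of the final position, 0..n_layers.
bins = paths.sum(axis=1)

# Tally how many objects end in each position.
counts = np.bincount(bins, minlength=n_layers + 1)
```

With this encoding every final position is a non-negative integer regardless of whether `n_layers` is even or odd, so the parity discussion is no longer needed.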

@Mukulikaa
Contributor

Hi, @weisscharlesj. Just a gentle ping to see if you have had the time to address the comments on the tutorial.
