Don't inline map the full array when printing a slice #1162

Open
manopapad opened this issue Dec 17, 2024 · 0 comments

Comments

@manopapad
Contributor

As @glentner points out, the first thing an ipython/jupyter user is going to do is:

import cupynumeric as np
a = np.ones((100_000, 100_000))
b = np.ones((100_000, 100_000))
c = a @ b
c[:10,:10]  # print out some elements

Currently this would cause the entire global array to be inline-mapped (gathered) into each node's system memory before anything gets printed, possibly running out of memory (OOM) along the way.
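For a sense of scale, here is a back-of-the-envelope calculation of what gets gathered versus what the user actually asked to see. It assumes cupynumeric's `ones` defaults to float64 like NumPy's does; that is an assumption, not something verified here.

```python
import numpy as np

shape = (100_000, 100_000)
# Assumption: float64 elements, matching NumPy's default dtype for ones().
itemsize = np.dtype(np.float64).itemsize  # 8 bytes

full_bytes = shape[0] * shape[1] * itemsize  # what a full inline mapping gathers
slice_bytes = 10 * 10 * itemsize             # what c[:10,:10] actually needs

print(f"full array: {full_bytes / 1e9:.0f} GB")  # 80 GB
print(f"c[:10,:10]: {slice_bytes} bytes")        # 800 bytes
```

So each node would try to hold roughly 80 GB in system memory in order to print 800 bytes' worth of elements.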

If we were to inline-map subsets of the full array, then we would run the risk of having multiple overlapping read-write views over the same data, whose coherence we would have to manage explicitly.

A saving grace for this case is that printing (i.e. calling __str__) only requires a temporary read-only view over the (sub-)array. So we could (a) do a one-shot inline mapping of just the sub-region (this requires a Legate extension), (b) wrap the PhysicalStore (i.e. the memory buffer returned by the inline mapping) in a NumPy array, (c) print the NumPy array, and (d) immediately unmap. We should only do this in the deferred case (in the eager case the array is already fully in Python memory), and only when we're not printing the full array (there is no reason to make a copy if we need to inline-map the full array anyway).
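Steps (a) through (d) could look roughly like the sketch below. None of this is real Legate or cupynumeric API: `FakeStore` and `map_region_readonly` are hypothetical names, and a plain NumPy array stands in for the distributed store, so the "mapping" is just a local copy. The point is only the shape of the control flow: map a sub-region, expose it read-only, print, and release right away.

```python
from contextlib import contextmanager

import numpy as np


class FakeStore:
    """Hypothetical stand-in for the store behind a deferred array."""

    def __init__(self, data):
        self._data = data
        self.mapped_regions = 0  # number of currently live one-shot mappings

    @contextmanager
    def map_region_readonly(self, region):
        # (a) one-shot mapping of only the requested sub-region
        self.mapped_regions += 1
        try:
            # (b) wrap the mapped buffer in a NumPy array; the copy here
            # stands in for gathering the sub-region into local memory
            view = np.array(self._data[region])
            view.flags.writeable = False  # printing only needs read access
            yield view
        finally:
            # (d) unmap as soon as the caller is done with the view
            self.mapped_regions -= 1


def str_of_slice(store, region):
    """Format a sub-region without mapping the whole array."""
    with store.map_region_readonly(region) as view:
        return str(view)  # (c) let NumPy do the formatting


store = FakeStore(np.ones((1000, 1000)))
print(str_of_slice(store, (slice(0, 10), slice(0, 10))))
```

Because the view is read-only and never outlives the `with` block, there is no long-lived overlapping read-write view whose coherence would need to be managed.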

Some caveats:

  • We don't actually know for sure that this is the issue @glentner ran into in his experiments. We would like to confirm that e.g. printing out c.max() works, and to find out what error output, if any, the original c[:10,:10] produced.
  • This won't help if the user just tries to print the whole array, so __str__ should avoid pulling in the whole array if it's too big. Instead it should selectively pull a subset of elements (through a series of partial one-shot mappings) and print only those, similar to what NumPy does when printing out a large array.
  • This won't help if the user (indirectly) asks to inline-map the whole array for a reason other than printing, e.g. to pass it to h5py. We probably want to emit a warning in this case and suggest that the user switch to an alternative API that doesn't need to pull the entire array into one address space. Such an API doesn't always exist, but it does for (some) HDF5 functionality.
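For the second caveat, here is a minimal sketch of pulling only the elements a summarized printout would show. NumPy summarizes arrays larger than its `threshold` print option by showing `edgeitems` elements at each end of every axis (3 by default). `edge_regions`, `pull_summary_2d` and the `fetch` callback are hypothetical names; `fetch` stands in for a partial one-shot mapping, and the 2-D helper assumes both axes are longer than `2 * edgeitems`.

```python
from itertools import product

import numpy as np


def edge_regions(shape, edgeitems=3):
    """Slice tuples covering only the head/tail edge items of each axis."""
    per_axis = []
    for n in shape:
        if n > 2 * edgeitems:
            per_axis.append([slice(0, edgeitems), slice(n - edgeitems, n)])
        else:
            per_axis.append([slice(0, n)])  # short axis: take all of it
    return list(product(*per_axis))


def pull_summary_2d(fetch, shape, edgeitems=3):
    """Assemble the four corner blocks of a large 2-D array.

    `fetch(region)` is a hypothetical callback that maps and returns one
    sub-region. Assumes both axes are longer than 2 * edgeitems.
    """
    tl, tr, bl, br = (fetch(r) for r in edge_regions(shape, edgeitems))
    return np.block([[tl, tr], [bl, br]])


# Small stand-in for the distributed array; fetch just slices it locally.
full = np.arange(10_000).reshape(100, 100)
summary = pull_summary_2d(lambda r: full[r], full.shape)
print(summary.shape)  # (6, 6)
```

For the 100_000 x 100_000 array from the example, this would mean four 3 x 3 mappings (36 elements) instead of gathering the full array. Inserting the "..." markers between the corner blocks when formatting is left out of the sketch.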
@manopapad assigned bryevdv and unassigned bryevdv Dec 17, 2024