Bug: unexpected error from COO.from_numpy when using idx_dtype kwarg #810

Open
2 of 3 tasks
Ganar-lab opened this issue Nov 18, 2024 · 3 comments

Comments

@Ganar-lab

sparse version checks

  • I checked that this issue has not been reported before in the list of issues.

  • I have confirmed this bug exists on the latest version of sparse.

  • I have confirmed this bug exists on the main branch of sparse.

Describe the bug

If you create a multidimensional sparse array with idx_dtype=np.uint8 from a NumPy array whose size is larger than 256 but whose max(shape) is smaller, you get an unexpected error.

Steps or code to reproduce the bug

This code raises the error:

import numpy as np
import sparse

x = np.empty((25, 25))  # idem for x = np.zeros((25, 25))
idx_dtype = np.uint8
assert max(x.shape) < 256
sparse.COO.from_numpy(x, idx_dtype=idx_dtype)

Expected results

I would have expected no error, and an output with the correct shape and the correct idx_dtype.

Actual results

{
	"name": "ValueError",
	"message": "cannot cast array with shape (625,) to dtype <class 'numpy.uint8'>.",
	"stack": "---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[17], line 6
      4 x = np.empty((25, 25))
      5 idx_dtype = np.uint8
----> 6 sparse.COO.from_numpy(x, idx_dtype=idx_dtype)
      7 x.shape, x.size

File /usr/local/lib/python3.11/site-packages/sparse/_coo/core.py:400, in COO.from_numpy(cls, x, fill_value, idx_dtype)
    398 coords = np.atleast_2d(np.flatnonzero(~equivalent(x, fill_value)))
    399 data = x.ravel()[tuple(coords)]
--> 400 return cls(
    401     coords,
    402     data,
    403     shape=x.size,
    404     has_duplicates=False,
    405     sorted=True,
    406     fill_value=fill_value,
    407     idx_dtype=idx_dtype,
    408 ).reshape(x.shape)

File /usr/local/lib/python3.11/site-packages/sparse/_coo/core.py:272, in COO.__init__(self, coords, data, shape, has_duplicates, sorted, prune, cache, fill_value, idx_dtype)
    270 if idx_dtype:
    271     if not can_store(idx_dtype, max(shape)):
--> 272         raise ValueError(
    273             \"cannot cast array with shape {} to dtype {}.\".format(
    274                 shape, idx_dtype
    275             )
    276         )
    277     self.coords = self.coords.astype(idx_dtype)
    279 if self.shape:

ValueError: cannot cast array with shape (625,) to dtype <class 'numpy.uint8'>."
}

Please describe your system.

  1. OS and version: Codespace from GitHub (Linux)
  2. sparse version 0.14.0
  3. NumPy version 1.23.5
  4. Numba version: not used

Relevant log output

No response

@Ganar-lab added the bug and needs triage labels Nov 18, 2024
@hameerabbasi
Collaborator

A couple of notes:

  • sparse 0.15.4 has been released on PyPI. Numba is installed alongside sparse.
  • You need the product of the shape to fit inside the index dtype for many operations, not just the shape elements.
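
To make the second note concrete, here is a minimal NumPy-only sketch (it does not call sparse, and the variable names are illustrative). It assumes that some operations work on flattened, linear indices, as the from_numpy traceback above suggests when it builds the array with shape=x.size before reshaping. For a (25, 25) array each axis index fits in uint8, but the largest linear index does not, and a plain cast would wrap around silently:

```python
import numpy as np

shape = (25, 25)
idx_dtype = np.uint8
dtype_max = int(np.iinfo(idx_dtype).max)  # 255 for uint8

# Per-axis indices run from 0 to max(shape) - 1, so they fit in uint8.
fits_per_axis = max(shape) - 1 <= dtype_max

# Linear (flattened) indices run from 0 to prod(shape) - 1, so they do not.
max_linear_index = int(np.prod(shape)) - 1
fits_linear = max_linear_index <= dtype_max

# Linear index of the last element (24, 24) in row-major order.
linear = np.ravel_multi_index(([24], [24]), shape)

# astype() does not raise here; the value wraps modulo 256.
wrapped = linear.astype(idx_dtype)

print(fits_per_axis, fits_linear)        # True False
print(int(linear[0]), int(wrapped[0]))   # 624 112
```

The silent wrap-around in the last step is the kind of corruption that an up-front check on the index dtype prevents.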

@hameerabbasi added the wontfix label and removed the bug and needs triage labels Nov 19, 2024
@Ganar-lab
Author

Hi, thank you for your quick answer. I understand now why this is enforced. It might be worth adding a note to the documentation of sparse.COO (where the idx_dtype parameter is currently missing).
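
As a rough illustration of what such a note could show, the sketch below picks the smallest unsigned integer dtype that can hold the product of the shape. The helper name smallest_index_dtype is hypothetical and is not part of sparse; it relies only on math.prod and np.min_scalar_type, and it assumes the rule stated above (the product of the shape must fit in the index dtype):

```python
import math

import numpy as np


def smallest_index_dtype(shape):
    """Hypothetical helper: smallest unsigned dtype that can hold prod(shape)."""
    return np.min_scalar_type(math.prod(shape))


print(smallest_index_dtype((15, 15)))  # 225 elements
print(smallest_index_dtype((25, 25)))  # 625 elements
```

Under this rule a (15, 15) array can use np.uint8, while the (25, 25) array from the report needs at least np.uint16.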

Note that we can nonetheless change the dtype of the coords attribute after initialization, which is useful when the array only needs to be stored.
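
A minimal sketch of that storage-only idea, written with plain NumPy rather than sparse so that it makes no claim about how COO itself behaves after such a cast. It assumes the COO-style layout of one row of coordinates per dimension, and the names compact and restored are illustrative:

```python
import numpy as np

x = np.zeros((25, 25))
x[3, 7] = 1.0
x[24, 24] = 2.0

# COO-style coordinates: one row per dimension, one column per stored value.
coords = np.stack(np.nonzero(x))
data = x[tuple(coords)]

# Each per-axis index is at most max(shape) - 1 = 24, so uint8 is enough
# for storage even though the flattened size (625) is not.
assert coords.max() <= np.iinfo(np.uint8).max
compact = coords.astype(np.uint8)

# Upcast again before using the coordinates for any index arithmetic.
restored = np.zeros(x.shape)
restored[tuple(compact.astype(np.intp))] = data

print(coords.nbytes, compact.nbytes)
print(np.array_equal(restored, x))
```

The downcast copy is only safe to keep on disk or in memory; any operation that linearizes the coordinates should first go back to a wider dtype.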

@hameerabbasi added the documentation label and removed the wontfix label Nov 20, 2024
@hameerabbasi
Collaborator

I'd be happy to accept a PR for a documentation update.
