What does `compressed_axes = None` mean? #535

hameerabbasi · 2022-01-06T09:18:41Z

I was going through the GCXS code trying to fix bugs when I came across a few places where compressed_axes was being set to None. That, in my opinion, should never happen. It should always be a tuple of integers.

@daletovar Could you specify the context here, and what could be needed to remove this?

The text was updated successfully, but these errors were encountered:

daletovar · 2022-01-06T21:49:04Z

There are a few cases where compressed_axes is set to None. In the case of 0- or 1-dimensional GCXS array, there are no axes to compress so we set compressed_axes to None. In the constructor and many of the array methods like resize and reshape, compressed_axes defaults to None. This is a way for the user to indicate that they don't have a preference. This does not mean that the resulting array will have arr.compressed_axes is None. Instead, we generally do something like the following:

if compressed_axes is None:
    compressed_axes = (np.argmin(shape),) # use compressed_axes that results in best compression

I think that is makes sense to have a default value for the cases that users don't have a preference. I used None for that value.

On the one hand I don't think it's really an issue because any array with ndim > 1 is going to have a compressed_axes that is a tuple of integers:

>>> a = sparse.random((5,6,7), format='gcxs', compressed_axes=None) 
>>> a.compressed_axes
(0,)

On the other hand I can see how such behavior could be confusing. In terms of removing it, I think that it should stay for 0- and 1-dimensional arrays. For higher dimensional arrays, we could pick a different default value - something like compressed_axes='opt' to indicate compressed_axes with the optimal compression.

hameerabbasi · 2022-01-07T08:15:41Z

There are a few cases where compressed_axes is set to None. In the case of 0- or 1-dimensional GCXS array, there are no axes to compress so we set compressed_axes to None.

I think this may mislead people, and make it hard to get at the underlying format. Would compressed_axes in this case be () or (0,)? I think you mean there's very little choice in how things are stored, but that doesn't necessarily mean that there isn't a valid representation in terms of tuples of integers.

In the constructor and many of the array methods like resize and reshape, compressed_axes defaults to None. This is a way for the user to indicate that they don't have a preference. This does not mean that the resulting array will have arr.compressed_axes is None.

This seems reasonable to me, as long as there is a concrete compressed_axes specified in the underlying array.

I think that it should stay for 0- and 1-dimensional arrays.

As said above we could use either () or (0,) as appropriate.

For higher dimensional arrays, we could pick a different default value - something like compressed_axes='opt' to indicate compressed_axes with the optimal compression.

I think None is a perfectly reasonable value to indicate "I don't have a preference here."

daletovar · 2022-01-07T17:52:52Z

Okay, that makes sense to me. I like () for 0- and 1-dimensional arrays.

mangecoeur · 2022-11-20T22:52:50Z

Just bumped into this, there are cases e.g. in to_scipy_sparse that assume non-None compressed axes, but a GCXS array can still be constructed with compressed_axes=None

hameerabbasi added the discussion label Jan 6, 2022

hameerabbasi assigned daletovar Jan 6, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

What does `compressed_axes = None` mean? #535

What does `compressed_axes = None` mean? #535

hameerabbasi commented Jan 6, 2022

daletovar commented Jan 6, 2022

hameerabbasi commented Jan 7, 2022

daletovar commented Jan 7, 2022

mangecoeur commented Nov 20, 2022

What does compressed_axes = None mean? #535

What does compressed_axes = None mean? #535

Comments

hameerabbasi commented Jan 6, 2022

daletovar commented Jan 6, 2022

hameerabbasi commented Jan 7, 2022

daletovar commented Jan 7, 2022

mangecoeur commented Nov 20, 2022

What does `compressed_axes = None` mean? #535

What does `compressed_axes = None` mean? #535