Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Growable ALTREP vectors #6697

Draft
wants to merge 7 commits into
base: truehash
Choose a base branch
from
Draft

Growable ALTREP vectors #6697

wants to merge 7 commits into from

Conversation

aitap
Copy link
Contributor

@aitap aitap commented Dec 29, 2024

Ongoing work towards #6180. Depends on #6694. Detailed motivation in a pending blog post.

Good news:

* checking compiled code ... NOTE
File ‘data.table/libs/data_table.so’:
  Found non-API call to R: ‘LEVELS’ # <-- and that's it!

Bad news:

Test 2289.2 ran without errors but failed check that x equals y:
> x = dt[, b, a] 
   a b  [Key= Types=int,lis Classes=int,lis]
1: 1 1                                      
2: 2 2                                      
> y = dt 
   a b  [Key= Types=int,exp Classes=int,exp]
1: 1 1                                      
2: 2 2                                      
Datasets have different column modes. First 3: b(list!=expression)

Unfortunately, since there is no altexpression ALTREP class, best I was able to support EXPRSXP columns was by creating VECSXP lists for their contents.

How long do we have and does anyone have a plan B?

The resulting data.tables now have GROWABLE_BIT set, therefore:
 - the finalizer is not needed on R >= 3.4
 - duplicates of data.tables (which are not over-allocated) now have
   TRUELENGTH of 0 instead of whatever it was before, which is detected
   earlier in selfrefok()

As a result, assign.c only uses TRUELENGTH on R < 3.4.
Since dogroups() relies on being able to shrink vectors it hasn't
allocated, introduce make_growable() and is_growable() to adapt. Since
dogroups() relies on SD and SDall having shared columns, use the
setgrowable() wrapper on the R side at the time when SD and SDall are
being created.

(In the ALTREP case, setgrowable() will re-create the columns.)
This helps avoid false positive warnings about unreachable places in the code.
Currently broken:
- 1 errors out of 11637
 - won't work with expression vectors at all
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant