Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fast path for vec_slice(x, <vector of only TRUE>)? #1400

Closed
mgirlich opened this issue Jun 22, 2021 · 4 comments
Closed

fast path for vec_slice(x, <vector of only TRUE>)? #1400

mgirlich opened this issue Jun 22, 2021 · 4 comments

Comments

@mgirlich
Copy link
Contributor

mgirlich commented Jun 22, 2021

I'm not sure how often this is actually relevant in real life but it might be worth to have a "fast path" in vec_slice() when the i is a logical vector where every element is TRUE.

vec_slice2 <- function(x, i) {
  if (is.logical(i) && all(i)) {
    return(x)
  }
  
  vec_slice(x, i)
}

This would avoid copying all of x and therefore be much faster and memory friendly. At first it might seem unlikely to have all TRUE but I can see two likely scenarios:

  • remove "edge case" values like NA, e.g. vec_slice(out, !vec_equal_na(vals)),
  • looping over medium sized elements of a list and filtering each element, e.g.
library(purrr)
library(vctrs)

map(
  list(1:10, 11:20, 21:30),
  ~ vec_slice(.x, .x <= 28)
)

Having a quick look at the source code the check might be relatively cheap to do in lgl_as_location() resp. in r_lgl_which().

@mgirlich mgirlich changed the title fast path for vec_slice(x, TRUE)? fast path for vec_slice(x, <vector of only TRUE>)? Jun 22, 2021
@DavisVaughan
Copy link
Member

Your second bullet point is similar to what list_compact() would do right?

@mgirlich
Copy link
Contributor Author

I added an example to the second bullet point. I really meant looping over a list and slicing every element of a list under some (rare) condition. Though, now that I think about it I'm not sure how much more efficient this is as the result is a new list anyway.

@lionel-
Copy link
Member

lionel- commented Jun 22, 2021

I think this makes sense, and the x[!is.na(x)] example is convincing.

Given the modularity of the internals, it might be best to implement this via the compact subscripts. The C-level vec_as_location() would return a compact sequence of the form 1:n that vec_slice() could then recognise to exit early. Since vec_as_location() is exposed on the R side, this would have to be implemented via ALTREP.

@lionel-
Copy link
Member

lionel- commented Oct 3, 2022

Implemented in #1408 but we decided to close.

@lionel- lionel- closed this as not planned Won't fix, can't repro, duplicate, stale Oct 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants