Implement vec_recode_values() and vec_replace_values() #2027
Pairs with #2024
Big set of benchmarks for future us to refer back to.
Note that the intention is not really to compete with `replace(x, x == 1L, NA)` in terms of direct performance, because these functions can do so much more than that (remember that internally we are doing `vec_match(x, 1L)` instead, to account for >1 `table` size too). But it is at least interesting to compare against them, and I do think that many people will probably use this to one-off recode problematic values like `replace_values(x, 1 ~ NA)`, so it's worth making it pretty fast there.

Also, the intention is not really to compete with `to[match(x, from)]` either, even though this is roughly what `recode_values(x, from ~ to)` does. I've included benchmarks against that or `case_match()` as it is again interesting to compare against them, and we are often competitive with or much better than the base R approach (and remember we can take >1 `from` and `to` values).

The massive benefits really kick in once you start having >1 `from` vector and >1 `to` vector. Then you really get a huge reduction in memory usage compared to typical `case_match()` or base R (like 8.5 GB down to 1 GB in some cases).
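
For future readers, here is a minimal sketch of the two base R idioms the benchmarks compare against. It uses only base R. The values of `x`, `from`, and `to` are made up for illustration and are not taken from the benchmarks.

```r
# Illustrative input (an assumption, not benchmark data).
x <- c(1L, 2L, 3L, 1L, 4L)

# One-off replacement: turn every 1 into NA.
replaced <- replace(x, x == 1L, NA)

# Lookup-style recode: match each element of `x` against `from` and
# pull the value at the same position in `to`. Elements of `x` with
# no match in `from` come back as NA.
from <- c(1L, 2L, 3L)
to <- c("a", "b", "c")
recoded <- to[match(x, from)]
```

The `match()` approach handles one `from`/`to` pair of vectors. Mapping several `from` values to one `to` value means building the lookup vectors by hand.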