Consider .sort = FALSE for summarise(), reframe(), and slice_sample() #6663

@DavisVaughan

With the introduction of .by, we no longer sort group keys automatically. There are a whole host of good reasons for this, as outlined in #5664 (comment), and I am mostly confident this is the right long-term default for dplyr.

However, I am empathetic to the fact that users do often like to see their summary results sorted in ascending order. Right now, our recommendation is:

df %>%
  summarise(..., .by = c(a, b, c)) %>%
  arrange(a, b, c) # could also come before `summarise()`

This is nice because you get the full power of arrange() including desc() and .locale.
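As a minimal sketch of the current recommendation, assuming dplyr 1.1.0 or later (needed for .by) and a made-up toy data frame whose column names are purely illustrative:

```r
library(dplyr)

# Hypothetical toy data; the column names are illustrative only.
df <- tibble(
  a = c("b", "a", "b", "a"),
  b = c(2, 1, 1, 1),
  x = c(10, 20, 30, 40)
)

# With `.by`, group keys come back in order of first appearance.
unsorted <- df %>%
  summarise(total = sum(x), .by = c(a, b))

# An explicit arrange() sorts the result and allows mixed directions,
# here ascending `a` and descending `b`.
sorted <- unsorted %>%
  arrange(a, desc(b))
```

The arrange() step is where desc() (and, if needed, .locale) can be applied per call, which a plain TRUE/FALSE flag could not express.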

I think we should consider a .sort argument like:

df %>%
  summarise(..., .by = c(a, b, c), .sort = TRUE)
  • .sort = FALSE would be the default for reasons mentioned above.
  • We'd document this as the 100% backwards compatible way to transition from group_by() to .by (even though most of the time the ordering isn't important).
  • You must accept that you get ascending order and the C locale. That makes it compatible with group_by(). If you need anything fancier, call arrange().
  • I do like that you won't have to repeat the group names.
  • Obviously .sort = TRUE errors on unorderable types like clock's year-month-weekday.
  • This would probably only be an argument for the .data.frame method, as opposed to the generic, because dbplyr probably won't want to enforce a sort order? Uncertain.
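To illustrate the backwards-compatibility point, here is a rough sketch of what .sort = TRUE would presumably be equivalent to today. It assumes dplyr 1.1.0 or later with default options, where both group_by() and arrange() order character keys in the C locale. The data frame is a made-up example, and .sort itself is only a proposal, so it is not used in the code:

```r
library(dplyr)

# Hypothetical toy data with mixed-case keys to make the C locale visible.
df <- tibble(g = c("b", "a", "B", "a"), x = 1:4)

# group_by() + summarise() returns the group keys sorted in ascending order.
via_group_by <- df %>%
  group_by(g) %>%
  summarise(total = sum(x))

# `.by` keeps first-appearance order; arranging by the same keys
# afterwards should recover the group_by() ordering.
via_by <- df %>%
  summarise(total = sum(x), .by = g) %>%
  arrange(g)
```

Under those assumptions the two results should contain the same rows in the same order; the proposed argument would mainly save repeating the group names in arrange().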

Basically, this leaves the idea of a groupby + summarise operation theoretically pure (because it shouldn't require orderable keys), but also gives users a convenient way to optionally opt in to sorted results.


There are 3 functions that would get this argument: summarise(), reframe(), and slice_sample().

The following would not get .sort because they aren't about row ordering:
