Add API for counting all objects in a collection #243

Open
amomchilov opened this issue Oct 26, 2024 · 2 comments
Comments

@amomchilov
Contributor

Goal

It's pretty common to want to take a collection of items and count the number of occurrences of each item, e.g.

let input = ["a", "b", "c", "b", "a"]

let desiredOutput = ["a": 2, "b": 2, "c": 1] 

Today

There are two relatively short ways to achieve this today:

  1. Using reduce: input.reduce(into: [:]) { $0[$1, default: 0] += 1 }

    Reduce is really general, and isn't particularly readable, especially for beginners. The performance here is good though, allocating a single dictionary and mutating it in-place.

  2. Using grouped(by:): input.grouped(by: { $0 }).mapValues(\.count)

    We could use the grouped(by:) helper that I added to Swift Algorithms, but it allocates an intermediate array for every group, when all we need is their counts.
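To make the trade-off concrete, here is a minimal sketch of both approaches on the example input. It uses the standard library's Dictionary(grouping:by:) as a stand-in for the Swift Algorithms grouping helper so that it runs without extra dependencies; the variable names are illustrative only.

```swift
let input = ["a", "b", "c", "b", "a"]

// Approach 1: one dictionary, mutated in place as each element is visited.
let viaReduce = input.reduce(into: [String: Int]()) { counts, item in
    counts[item, default: 0] += 1
}

// Approach 2: build an array for every group, then keep only each array's count.
// Dictionary(grouping:by:) stands in here for the Swift Algorithms helper.
let viaGrouping = Dictionary(grouping: input, by: { $0 }).mapValues(\.count)

print(viaReduce == viaGrouping) // prints "true"
```

Both produce the same dictionary; the difference is only in the intermediate allocations.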

Proposed solution

The exact name is TBD, but I'm proposing a function like:

extension Sequence where Element: Hashable {
    func tallied() -> [Element: Int] {
        return reduce(into: [:]) { $0[$1, default: 0] += 1 }
    }
}

We could also consider taking a by: parameter, to count things by a value other than themselves. Though perhaps .lazy.map would be better. E.g. input.tallied(by: \.foo) could be expressed as input.lazy.map(\.foo).tallied()
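As a rough sketch of how the two spellings compare, the following defines tallied() along the lines proposed above, plus a hypothetical tallied(by:) overload. The name, the closure-based signature, and the sample data are assumptions for illustration, not a settled design.

```swift
extension Sequence where Element: Hashable {
    /// Counts the occurrences of each distinct element.
    func tallied() -> [Element: Int] {
        reduce(into: [Element: Int]()) { counts, element in
            counts[element, default: 0] += 1
        }
    }
}

extension Sequence {
    /// Hypothetical `by:` variant: counts elements by a derived key.
    func tallied<Key: Hashable>(
        by key: (Element) throws -> Key
    ) rethrows -> [Key: Int] {
        try reduce(into: [Key: Int]()) { counts, element in
            let k = try key(element)
            counts[k, default: 0] += 1
        }
    }
}

let words = ["apple", "avocado", "banana", "blueberry", "cherry"]

// Count words by their length, spelled both ways.
let direct = words.tallied(by: \.count)
let viaLazyMap = words.lazy.map(\.count).tallied()

print(direct == viaLazyMap) // prints "true"
```

The lazy.map spelling avoids a second overload and does not allocate an intermediate array, at the cost of being slightly less discoverable.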

Alternatives

A more general "collectors" API

Similar to Java collectors, which let you express transformations over streams, collecting into Arrays, Dictionaries, Counters, or anything else you might like.

This could pair well with Swift Collections, e.g. if we added a new CountedSet (a native Swift alternative to NSCountedSet). For example, we could have:

input.grouping(by: \.foo, collectingInto: { CountedSet() })
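For illustration only, a counted set could be as small as a wrapper around a dictionary of counts. The type below is a hypothetical sketch: CountedSet does not exist in Swift Collections at the time of this issue, and every member name here is an assumption.

```swift
/// Hypothetical minimal counted set; not an existing Swift Collections type.
struct CountedSet<Element: Hashable> {
    /// Number of times each element has been inserted.
    private(set) var counts: [Element: Int] = [:]

    init() {}

    /// Builds a counted set by inserting every element of a sequence.
    init<S: Sequence>(_ elements: S) where S.Element == Element {
        for element in elements {
            insert(element)
        }
    }

    /// Records one more occurrence of `element`.
    mutating func insert(_ element: Element) {
        counts[element, default: 0] += 1
    }

    /// Returns how many times `element` was inserted (0 if never).
    func count(of element: Element) -> Int {
        counts[element, default: 0]
    }
}

let tally = CountedSet(["a", "b", "c", "b", "a"])
print(tally.count(of: "a")) // prints "2"
print(tally.count(of: "z")) // prints "0"
```

With a type like this, a tallying method on Sequence would amount to a convenience spelling of the initializer.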

Prior art

| Language | Name |
| --- | --- |
| Python | collections.Counter |
| Ruby | tally |
| Java | java.util.stream.Collectors.counting() |
| JavaScript (Lodash) | countBy |

C# and Rust don't have helpers for this.

@xwu
Contributor

xwu commented Oct 27, 2024

Would this API be just a different spelling for a CountedSet initializer, or would there be meaningful differences?

@amomchilov
Contributor Author

@xwu Oh cool, I didn't notice there was already an issue for a counted set.

Even if it did exist, I think we should still expose a method on Sequence for producing it, for the same reason that input.grouped(by: ...) is nicer than Dictionary(grouping: input, by: ...) (see rationale and PR). In that case, it would make more sense for the method to live in Swift Collections.
