Sum of Multiples, Dig Deepter: Possible Improvement?

chezwicker · April 29, 2024, 7:33pm

I’ve just looked at the “dig deeper” section for “sum of multiples” and I wondered if the current approach for " Calculating sum from factors" could possibly be improved - or if the following would actually make it worse?

    factors.iter()
        .filter(|&&factor| factor != 0)
        .flat_map(|&factor| (factor..limit).step_by(factor as usize))
        .collect::<HashSet<_>>().iter()
        .sum()

My assumption is that collecting into a HashSet should be faster than sorting and deduplicating, no? And the code is shorter and imho not less readable. Of course, I’m new to rust, so I don’t know if that’s not idiomatic for some reason.

Another version which uses a set all the way might be

    factors.iter()
        .filter(|&&factor| factor != 0)
        .fold(HashSet::<_>::new(), |mut agg, &curr| {
            (curr..limit).step_by(curr as usize).for_each(|f| { agg.insert(f); });
            agg
        })
        .into_iter().sum()

No idea if that would seem readable to a rustacean and if there’s a better way to express this ;-)

Happy to create a PR if any of this seems like a good idea.

IsaacG · April 29, 2024, 7:54pm

Running some benchmarks is a really good way to get some data about that!

senekor · April 29, 2024, 8:24pm

For reference, here’s the relevant discussion on the original PR.

An earlier version of the article included the statement that the vector version uses less memory, but this is the case only for small n. Since performance depends on input shape anyway, the performance discussion was simply removed.

I think the only information a benchmark could give us here is for which input size HashSet starts outperforming Vec. I don’t think that’s terribly interesting, since it depends on hardware anyway.

Your first version seems like the best to me. It is the most straight-forward, idiomatic approach without thinking about performance. And because we can’t say much with certainty about performance, that’s what I would prefer.

I don’t see the point of the second version, seems like the same as the first but with manual collecting?

chezwicker · April 29, 2024, 8:46pm

Thanks for your responses!

The idea of the second version was to use less space since duplicates are eliminated on the fly. But to be honest, I don’t know if collection happens once at the end or if this is a stream being processed. If it’s the latter, then the second version makes no sense at all ;-)

senekor · April 29, 2024, 9:11pm

It is indeed the latter. So you may feel at ease when using collect, there usually is no alternative, more efficient way to do it

To figure that out on our own, we would check the documentation of collect, which defers its implementation to FromIterator. This is a trait that allows specialized collecting-logic for each concrete collection type. Since we’re collecting into a HashSet, we can check its implementation of FromIterator. The source of from_iter calls a method extend, but we sadly don’t have a link to click on. We can type HashSet::extend in the search bar to find what we’re looking for. Wait, the source of extends just calls extends again? Confusing, but it turns out there are two implementation for the Extend trait, one for references and one for owned values. We need to look at the implementation for owned values. Aaand that just calls self.base.extends and at this point I’m giving up the goose chase!

I was trying to show how to find information in the std documentation and failed miserably

Anyway it should be intuitive that any concrete collection type can make arbitrary optimizations about how to be constructed from an iterator inside its FromIterator implementation. For example, if iterators give a size_hint, a vector will pre-allocate the precise amount of memory necessary to store all items of the iterator.

senekor · April 29, 2024, 9:14pm

I would welcome a PR with the first version.

@clechasseur what is your opinion?

clechasseur · April 29, 2024, 11:28pm

As you said, without benchmarks, I think the proposed solution is actually simpler, which is the only metric we have. So I’d say it would indeed be better for the Dig deeper section.

I’d recomment a slight formatting change to make it more in line with what rustfmt would recommend:

factors
    .iter()
    .filter(|&&factor| factor != 0)
    .flat_map(|&factor| (factor..limit).step_by(factor as usize))
    .collect::<HashSet<_>>()
    .iter()
    .sum()

Of course it’s the same code so feel free to reformat it if you don’t like my formatting

chezwicker · April 30, 2024, 6:20am

Happy to do it either way ;-)

For me (doing mostly Kotlin these days), the iter()-call feels like missing syntactic sugar: I’s expect to be able to call filter (or at least sum if there’s a difference on what can be added to primitives and traits) directly, so I inlined iter(): it doesn’t add any “logic” to the code.

Let me know which format you agree on and I’ll open the PR. Thanks for your time!

senekor · April 30, 2024, 8:05am

I’ll take whatever rustfmt is happy with ;-) Rust is still a sysprog language at its core, there is no culture of hiding low-level details like transformations between iterators and collections. Thank you!

chezwicker · April 30, 2024, 11:13am

All right, I’ll go with that one.

While editing, I’ve noticed that the explanation of when to choose which approach may not be correct any longer: As we eliminate the sorting / deduplication, are there really still cases when the other approach is faster? I have trouble thinking of anything where (a) it"s faster and (b) it matters (if we did the same micky-mouse cases over and over again, we would just use caching…). What are your thoughts?

senekor · April 30, 2024, 11:28am

I think the general point still stands. By collecting into a HashSet, the deduplication still happens in the form of computing and comparing hashes. So a slight rewording may be in order, e.g. “small number of multiples to deduplicate” → “small number of hashes to compute” etc.

I wouldn’t change anything in the last paragraph, which generally recommends the second approach. It makes it pretty clear that if you actually care about performance, you have to run your own benchmarks.

chezwicker · April 30, 2024, 11:51am

PR is here - let me know what you think!

mpg · April 30, 2024, 8:12pm

That was still informative. And also, I’m glad to know I’m not the only one finding the doc difficult to navigate at times. I mean, I really like Rust’s trait system, but it means you often have to jump through a couple indirections when looking for documentation. It was nice “seeing” you go through this process, thanks

senekor · April 30, 2024, 8:55pm

That’s what I was hoping

But your comment gave me a little motivation to continue the goose chase. HashSet’s field base seems to be a hashbrown::HashSet. For context, hashbrown is a really fast hashmap/-set library that came out some time ago. Eventually, the standard library implementation was just replaced with that. Digging further into the docs of hashbrown, we can find out that hashbrown::HashSet is a newtype around hashbrown::HashMap. The implementation of Extend for hashbrown::HashMap finally contains the actual code:

fn extend<T: IntoIterator<Item = (K, V)>>(&mut self, iter: T) {
    // Keys may be already present or show multiple times in the iterator.
    // Reserve the entire hint lower bound if the map is empty.
    // Otherwise reserve half the hint (rounded up), so the map
    // will only resize twice in the worst case.
    let iter = iter.into_iter();
    let reserve = if self.is_empty() {
        iter.size_hint().0
    } else {
        (iter.size_hint().0 + 1) / 2
    };
    self.reserve(reserve);
    iter.for_each(move |(k, v)| {
        self.insert(k, v);
    });
}

Just like with vectors, Iterator::size_hint is being used to pre-allocate some amount of memory.

Phew! It’s tough, but it can be done

I also just discovered that the source code on docs.rs is starting to contain links between items. Which is absolutely amazing, it’s staring to feel like navigating code in an IDE.