1GB container image for clippy?

For a while, I’ve thought about running clippy in the rust-analyzer to give automated feedback to students. I think that would be awesome, because clippy doesn’t just complain. It gives very detailed and educational explanations of why something is better done differently.

The current implementation of the rust-analyzer is nice, in that it allows to make quite sophisticated custom suggestions for improving some exercise. However, that is basically the idea of the rust-representer: Mentors being able to give feedback on many solutions that look similar at once. In the rust-analyzer, it takes quite a bit of custom Rust code to add just one suggestion. As far as I can tell, there hasn’t been any significant development on the analyzer for years and the existing lints are very basic and only apply to easy exercises. (clock, gigasecond, reverse-string) It seems to me whatever was planned for the analyzer turned out not to be sustainable.

clippy would give lots of great suggestions on all exercises, but there’s one problem: It builds on top of the rest of the Rust toolchain (rustc, cargo). That means including clippy in the rust-analyzer container image would increase its size by about 1GB.

If it was possible to merge the test-runner and the analyzer in a single container image, this problem would be solved. But I’m assuming that’s not easily doable on the infra side.

Would that size increase be acceptable or simply too much for an analyzer? @ErikSchierboom @iHiD

Ehm, it’s quite a lot! We’ll discuss this internally!

So the Rust compiler + toolchain + cargo equals 1GB? Is that correct?

Yeah, I expect it to be similar to the rust-test-runner image. I just took a look over there again. The lion’s share really is rustc, there’s no getting around that. That image is 902MB in total. It also includes the local-registry, which the analyzer wouldn’t, but that’s only about 50MB at this point. The rust-test-runner doesn’t include clippy, but looking at my local install, that seems to be only 15MB on top of the rest of the toolchain.

I’m guessing total image size should be pretty similar to the test runner, so around 900MB.

Okay go for it

Context Tests fail with no output · Issue #1761 · exercism/rust · GitHub plus whether the entire registry should be compile and thus cached in the image to avoid being hit by the 3 second test time.
I would be curious to know the thinking about the image size, why it matters and where the bottleneck is. E.g. where would it hurt (cost, time to start etc.), if we had to “explode” it to 2 GB? My thinking is that caching would be so good in AWS that is would not matter much.
/Per

I can’t rewatch the whole stream to confirm, but I think this was touched upon in this stream on the exercism infra. For some reason the fancy AWS lambdas cannot be used and regular servers are used instead. (Forgive any mistakes, I’m not an infra guy.) The normal servers need to have enough storage space allocated for the container images of the tooling of each track. There are some 60 languages, imagine each having a 1GB image for test-runner, representer and analyzer. That would add up to 180GB → expensive. So there is good reason for every language track to be considerate about the size of their tooling images and optimize it as much as possible.

180GB per server. ~20 servers = 3600GB = $thousands per year.

So, after a long time, I started working on this a bit. When I got to the point where I wanted to understand the output format spec precisely, I noticed that the interface doesn’t seem to be intended for dynamically generated comments (e.g. from clippy).

As a workaround, I suppose it should work to have a single comment in the website-copy repo that’s just a single template parameter and then use that to inject the entire message from clippy.

Is that a reasonable way forward?

It totally is!

1 Like

PR for reference

IT WORKS !!! :heart_eyes::heart_eyes::heart_eyes:

5 Likes

Yay! Awesome work.

2 Likes

Nice work!

1 Like