Docker image sizes

Hey everybody! We’re working on reducing the sizes of our Docker images. The first step is identifying the existing sizes. Here is the list of images ordered by size, largest first:

Image Size
exercism/haskell-test-runner 6560MB
exercism/swift-test-runner 2290MB
exercism/rust-test-runner 2070MB
exercism/ocaml-test-runner 1750MB
exercism/fsharp-test-runner 1220MB
exercism/scala-test-runner 1200MB
exercism/perl5-test-runner 1030MB
exercism/go-test-runner 835MB
exercism/r-test-runner 803MB
exercism/dart-test-runner 770MB
exercism/julia-test-runner 765MB
exercism/clojurescript-test-runner 742MB
exercism/vbnet-test-runner 739MB
exercism/pony-test-runner 713MB
exercism/pharo-smalltalk-test-runner 706MB
exercism/cfml-test-runner 674MB
exercism/vlang-test-runner 660MB
exercism/java-test-runner 596MB
exercism/d-test-runner 561MB
exercism/ballerina-test-runner 561MB
exercism/php-test-runner 546MB
exercism/kotlin-test-runner 535MB
exercism/emacs-lisp-test-runner 508MB
exercism/cpp-test-runner 506MB
exercism/purescript-test-runner 504MB
exercism/groovy-test-runner 495MB
exercism/crystal-test-runner 493MB
exercism/crystal-representer 486MB
exercism/sml-test-runner 476MB
exercism/abap-test-runner 458MB
exercism/vimscript-test-runner 433MB
exercism/cobol-test-runner 433MB
exercism/java-representer 431MB
exercism/typescript-test-runner 417MB
exercism/scheme-test-runner 416MB
exercism/mips-test-runner 410MB
exercism/clojure-analyzer 401MB
exercism/reasonml-test-runner 396MB
exercism/javascript-test-runner 393MB
exercism/wasm-test-runner 350MB
exercism/unison-test-runner 342MB
exercism/ruby-analyzer 319MB
exercism/ruby-test-runner 317MB
exercism/zig-test-runner 314MB
exercism/ruby-representer 314MB
exercism/bash-analyzer 278MB
exercism/lua-test-runner 266MB
exercism/go-analyzer 262MB
exercism/javascript-representer 258MB
exercism/elm-analyzer 257MB
exercism/typescript-analyzer 252MB
exercism/javascript-analyzer 248MB
exercism/typescript-representer 244MB
exercism/csharp-test-runner 241MB
exercism/raku-test-runner 236MB
exercism/powershell-test-runner 236MB
exercism/elixir-test-runner 226MB
exercism/tcl-test-runner 224MB
exercism/elixir-analyzer 223MB
exercism/elm-test-runner 222MB
exercism/elixir-representer 220MB
exercism/fortran-test-runner 216MB
exercism/prolog-test-runner 214MB
exercism/red-test-runner 210MB
exercism/wren-representer 209MB
exercism/c-test-runner 195MB
exercism/elm-representer 185MB
exercism/python-analyzer 162MB
exercism/x86-64-assembly-test-runner 161MB
exercism/racket-test-runner 146MB
exercism/python-test-runner 145MB
exercism/python-representer 134MB
exercism/coffeescript-test-runner 134MB
exercism/j-test-runner 111MB
exercism/j-representer 111MB
exercism/gleam-test-runner 108MB
exercism/8th-test-runner 104MB
exercism/csharp-analyzer 103MB
exercism/clojure-representer 101MB
exercism/clojure-test-runner 100MB
exercism/php-representer 98.5MB
exercism/haxe-test-runner 96.8MB
exercism/bash-test-runner 96.7MB
exercism/fsharp-representer 93.9MB
exercism/awk-test-runner 93.8MB
exercism/jq-test-runner 89.5MB
exercism/java-analyzer 89.2MB
exercism/lfe-test-runner 85.3MB
exercism/wren-test-runner 61.7MB
exercism/erlang-analyzer 59.4MB
exercism/erlang-test-runner 59.3MB
exercism/common-lisp-test-runner 50.9MB
exercism/common-lisp-representer 49MB
exercism/common-lisp-analyzer 49MB
exercism/csharp-representer 40.4MB
exercism/nim-test-runner 29.5MB
exercism/rust-analyzer 13.4MB
exercism/rust-representer 12.7MB

If you have an idea for how to reduce a certain image’s size, please let me know (so we don’t do any duplicate work).

I’m working on the PHP test runner BTW

Edit: managed to reduce the image to 16% of its original size: Reduce image size by ErikSchierboom · Pull Request #81 · exercism/php-test-runner · GitHub

The PR to the CoffeeScript test runner pushed today should reduce the storage required by around 30 MB.

I also have a patch for the Crystal representer which will reduce the storage required.

For Python, I am tempted to use the Alpine image, which would save 40 MB on average per tooling image, but I’m still worried about the issues raised here.

The way I test that is by using time in run.sh. So for the Python test runner, I’d do:

- python3 bin/run.py "$@"
+ time python3 bin/run.py "$@"

And then use ./bin/run-in-docker on a Python solution.

The VB test runner also shrunk to about 33% of its original size: Reduce image size by ErikSchierboom · Pull Request #35 · exercism/vbnet-test-runner · GitHub

BTW, if you’re on a *Nix system, you can use the following command from within the root of your tooling repo to build the Docker image and then print its size:

TAG="exercism/$(basename "$PWD")"; docker build --quiet --tag "$TAG" . &> /dev/null && docker images "$TAG" --format '{{.Size}}'

People might also want to try dive, which shows where the image size comes from.

To be able to properly compare measurements, what was the command you used to check image size?

According to this command, my changes decreased the size a lot, even though I am not really a fan of using the Alpine-based Erlang image. On the other hand, the most common situation in which it causes problems is FFI, which we do not use on Exercism.

I think I will create a PR with my current state.

$ docker image inspect erlang_test_runner:common_test | jq '.[].Size' | numfmt --to-unit=Mi --format='%.2f MiB'
56.60 MiB

I originally ran:

docker image ls --format "{{.Repository}}:{{.Tag}} {{.Size}}" | awk '{if ($2~/GB/) print substr($2, 1, length($2)-2) * 1000 "MB - " $1 ; else print $2 " - " $1 }' | sed '/^0/d' | sort -n | sed -E -e 's/:latest//'

However, docker images $TAG --format '{{.Size}}' should work fine locally.
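
For anyone who wants to reproduce the list, the longer pipeline above can be sketched a bit more readably. This is only a rough sketch: the function name normalize_sizes is made up for illustration, and it assumes Docker prints sizes with kB, MB, or GB suffixes. It runs on sample lines here so it can be tried without Docker:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of any Exercism repo): reads lines of
# "<repository> <size>" on stdin, converts every size to MB (base 1000),
# and prints the result sorted largest first.
# Assumption: sizes end in kB, MB, or GB; other lines are skipped.
normalize_sizes() {
  awk '{
    size = $2
    value = substr(size, 1, length(size) - 2)
    if (size ~ /GB$/)      mb = value * 1000
    else if (size ~ /MB$/) mb = value + 0
    else if (size ~ /kB$/) mb = value / 1000
    else next
    printf "%s %gMB\n", $1, mb
  }' | LC_ALL=C sort -k2,2 -n -r
}

# Sample input, so the sketch can be run without a Docker daemon.
normalize_sizes <<'EOF'
exercism/nim-test-runner 29.5MB
exercism/haskell-test-runner 6.56GB
exercism/go-test-runner 835MB
EOF
```

With Docker available, the same function could be fed from docker image ls --format "{{.Repository}} {{.Size}}" instead of the sample lines.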

That’s formatting with a thousand as the base for the prefixes… So, give or take rounding errors, I am in the same range as that command.
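
To make the unit difference concrete, here is a small sketch of the conversion between the two conventions. The helper name mib_to_mb is made up for illustration; it only does the arithmetic and does not talk to Docker:

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a size in MiB (base 1024) to MB (base 1000).
mib_to_mb() {
  awk -v mib="$1" 'BEGIN { printf "%.2f\n", mib * 1024 * 1024 / (1000 * 1000) }'
}

# The 56.60 MiB measurement of the Erlang test runner quoted above.
mib_to_mb 56.60   # prints 59.35
```

So 56.60 MiB is roughly 59.35 MB, which lines up with the 59.3MB listed for exercism/erlang-test-runner in the original table.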

I had a look at the Rust image: it contains two toolchains, stable and nightly. Since nightly is needed anyway for JSON test output, we might get rid of the stable toolchain. That should bring it down from 2GB to around 1.5GB; it’s a start. I’ll open an issue for myself on the repo.

Thanks @ee7-1282 for the tip, dive is a handy tool!

I would suggest taking a look at the libraries. At the moment, the Rust Docker image has around 600-700 MB of just libraries, many of which are not useful for solving exercises. I suggest making a whitelist of libraries instead of downloading libraries from a GitHub repo that lists development crates (which has things like Git, XML parsing, JSON parsing, etc.).

I agree with the suggestion by @Meatball. Let’s try to come up with a whitelist of crates.

Thanks, I noted it in my issue as well.

I think a good idea may be to make a forum post, take suggestions, check that each library may reasonably be used within an exercise, and then add it to the list.

There already is a whitelist: https://github.com/exercism/rust-test-runner/blob/d74d3a58667f37bdf9d63691d61a4f5110daff85/local-registry/supported_crates

Yeah, but I mean the storage used by these libraries is quite big, and they were chosen simply by taking the most popular crates on a website.

The list includes template engines, UUID libraries, database libraries, Git libraries, libraries for parsing images and working with them, benchmarking tools, and various other types of “development” libraries.

My point is that someone should make a curated list of libraries that actually make sense.

Yeah, at a glance, things like rusqlite could probably go and save us some $$$ :slight_smile:

In the PR that introduced that file, it is reported that the top 100 packages together make for 17 MB; the file specifies 232 packages, so assuming the trend continues, that would make for only about 40 MB.
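
The estimate is a plain linear extrapolation, sketched below. The helper name estimate_mb is made up for illustration, and the assumption that the remaining crates average the same size as the top 100 is exactly that, an assumption:

```shell
#!/usr/bin/env bash
# Hypothetical helper: scale a measured size linearly to a larger count.
# Args: measured size in MB, number of packages measured, total packages.
estimate_mb() {
  awk -v mb="$1" -v n="$2" -v total="$3" 'BEGIN { printf "%.1f\n", mb / n * total }'
}

# 17 MB for the top 100 crates, extrapolated to the 232 crates in the file.
estimate_mb 17 100 232   # prints 39.4
```

That gives about 39.4 MB, which is where the figure of roughly 40 MB comes from.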

It seems implausible that the length of that file is behind the Rust image being so big.