+++
title = "Multi-Stage Docker container Build"
author = ["Elia el Lazkani"]
date = 2025-03-05
lastmod = 2025-03-05
tags = ["docker", "linux", "container", "multi-stage", "podman", "dockerfile"]
categories = ["container"]
draft = false
+++

One of the hidden gems of _Docker containers_ is _multi-stage_ builds. Whether they never made any sense to you, you've heard of them but have no clue what they are, or you're just passing by, we're going to use one in a practical example.

## go-cmw {#go-cmw}

A while ago, I wrote a small utility in _golang_ which fetches the weather for me and displays it in the terminal. _[go-cmw](https://scm.project42.io/elia/go-cmw)_ is, basically, a [wttr.in](https://wttr.in/) terminal client. It simplifies the usage of the **API** and makes it easier, for me, to integrate into other terminal tools. Who's not a big fan of the shell, huh? Am I right?

## Let's containerize it {#let-s-containerize-it}

Let's say we would like to write a `Dockerfile` for the project to build the code and create a container for it. The `Dockerfile` would probably look something like this.

```dockerfile
# Yes, we're smart, we used a small image because it's all we need
FROM docker.io/library/golang:alpine

# Copy the directory of the code into /cmw
ADD . /cmw

# Install git as a dependency
RUN apk add git && \
    # Navigate to the directory where we copied the code to
    cd /cmw && \
    # Get the dependencies of the project
    go get -u . && \
    # Build that bad boy !
    go build -o cmw && \
    # Move it into a path we know is in $PATH
    mv cmw /usr/bin/cmw && \
    # Clean up, we have security in mind
    cd / && rm -rf /cmw

# Aight run it over one day !
CMD ["cmw", "-o"]
```

We've tried to use as few layers as possible to keep this image small. Let's take a look at the image we built.

```shell
$ podman build -t cmw .
$ podman run -e GO_CMW_LOCATION="Dublin" cmw
Weather report: Dublin

   \  /       Partly cloudy
 _ /"".-.     +12(11) °C
   \_(   ).   ↑ 16 km/h
   /(___(__)  10 km
              0.0 mm
┌──────────────────────────────┬───────────────────────┤  Wed 05 Mar ├───────────────────────┬──────────────────────────────┐
│            Morning           │             Noon      └──────┬──────┘     Evening           │             Night            │
├──────────────────────────────┼──────────────────────────────┼──────────────────────────────┼──────────────────────────────┤
│     \   /     Sunny          │     \   /     Sunny          │     \   /     Sunny          │     \   /     Clear          │
│      .-.      +9(7) °C       │      .-.      +12(10) °C     │      .-.      +8(6) °C       │      .-.      +6(3) °C       │
│   ― (   ) ―   ↑ 14-20 km/h   │   ― (   ) ―   ↗ 19-22 km/h   │   ― (   ) ―   ↑ 14-29 km/h   │   ― (   ) ―   ↑ 14-29 km/h   │
│      `-’      10 km          │      `-’      10 km          │      `-’      10 km          │      `-’      10 km          │
│     /   \     0.0 mm | 0%    │     /   \     0.0 mm | 0%    │     /   \     0.0 mm | 0%    │     /   \     0.0 mm | 0%    │
└──────────────────────────────┴──────────────────────────────┴──────────────────────────────┴──────────────────────────────┘
Location: Dublin, County Dublin, Leinster, Ireland [53.3497645,-6.2602731]
```

Okay, it works. So what now? Well now, we look deeper. Let's look at the size...

```shell
$ podman images cmw
REPOSITORY     TAG         IMAGE ID      CREATED        SIZE
localhost/cmw  latest      db1730d690ed  2 minutes ago  398 MB
```

Ah! That's quite big for a tiny client that shouldn't be bigger than a few MBs. It would take forever to download on a server that has to run it constantly (hypothetically). How many layers does it have...

```shell
$ podman inspect cmw | jq '.[].RootFS.Layers[]' | wc -l
7
```

That's quite a few layers. The more we have to download, the slower the download is. And finally a quick vulnerability scan...

```shell
$ trivy image localhost/cmw
...
```

Okay, the _Trivy_ output is quite big, but the summary is **2 Critical**, **1 High** and **2 Medium** severity vulnerabilities, all coming from the image even though we're using the latest one available.

## Multi-stage build {#multi-stage-build}

We saw a few issues in the previous part of this post; let's see if we can fix them. The first thing I'm going to do is start thinking about my application.
My client is written in _golang_, which means that the binary should work without any dependencies. I could build the binary on my machine and _then_ **copy** it to the container. This path would definitely reduce our layer count, but it is not easy to package and reproduce on a different machine. Besides, we said we're using _containers_ for this.

Another thing to think about is the vulnerabilities in the built image. All of the vulnerabilities identified are related to _golang_, which makes sense. We're using a _golang_ container image after all, even though the image is based on the security-oriented _alpine_ distribution. We can do better: we can go with a container that contains almost nothing, which should definitely be more secure.

```dockerfile
# Stage 1: build the binary with the full Go toolchain
FROM docker.io/library/golang:alpine AS builder

ADD . /cmw

RUN apk add git && \
    cd /cmw && \
    go get -u . && \
    go build -o cmw

# Stage 2: start from a plain alpine image and copy only the binary
FROM docker.io/library/alpine:latest

COPY --from=builder /cmw/cmw /cmw

CMD ["/cmw", "-o"]
```

Let me explain a bit what changed.

The first change is that we named our _first stage_ **builder** to make it easier to reference later. The _dependency_ installation and the code build stay exactly the same. The cleanups were removed as they no longer serve a purpose, since nothing from this stage ends up in the final image unless we copy it explicitly.

The _second_ `FROM` is where the magic starts to happen. In this _second stage_, we're using a plain `alpine` image. This container does not have any _golang_ compiler, library or dependencies. We _then_ `COPY` the `cmw` _binary_ from the **builder** stage into our _alpine_ container. The rest stays basically the same.

Now, let's take a deeper look at the image.

```shell
$ podman images cmw
REPOSITORY     TAG         IMAGE ID      CREATED        SIZE
localhost/cmw  latest      978342ca6735  8 minutes ago  19.5 MB
```

The difference, in size, between the old image and this new one is **extremely significant**, down from `398 MB` to _just_ `19.5 MB`. And the layers...
```shell
$ podman inspect cmw | jq '.[].RootFS.Layers[]' | wc -l
2
```

Only `2`, all the way down from `7`. And finally, the icing on the cake...

```shell
$ trivy image localhost/cmw

localhost/cmw (alpine 3.21.3)

Total: 0 (UNKNOWN: 0, LOW: 0, MEDIUM: 0, HIGH: 0, CRITICAL: 0)
```

That's right, no vulnerabilities at all.
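A side note on the named stage: because the first stage is called **builder**, it can also be built on its own, which is handy when the build itself needs debugging. The following is a minimal sketch and not part of the original project. `build_stage`, the `CONTAINER_TOOL` override and the `cmw-builder` tag used below are hypothetical names; `--target` is the build flag that stops at the given stage.

```shell
# Hypothetical helper: build only one named stage of the Dockerfile found in
# the current directory. CONTAINER_TOOL is an assumed override (it defaults to
# podman), useful for swapping in docker or for a dry run with echo.
build_stage() {
    stage="$1"   # stage name as declared with AS, e.g. "builder"
    tag="$2"     # tag for the resulting image, e.g. "cmw-builder"
    "${CONTAINER_TOOL:-podman}" build --target "$stage" -t "$tag" .
}
```

Calling `build_stage builder cmw-builder` from the project directory should leave us with an image that still contains the Go toolchain and the source, while the final `cmw` image stays untouched.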
**Warning:** There are no known vulnerabilities at the time of building this image. This does not mean that the image will stay this way. Over time, vulnerabilities will eventually be found. This is why it is advisable to rebuild your images frequently to keep them updated.
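As a rough illustration of what such a rebuild could look like, here is a small sketch under a few assumptions. `rebuild_image` and the `CONTAINER_TOOL` override are hypothetical names that are not part of the project. `--pull=always` and `--no-cache` are `podman build` flags that fetch the base images again and skip previously cached layers (with `docker build`, the first one is spelled `--pull`).

```shell
# Hypothetical helper: rebuild an image from scratch so that it picks up the
# latest base images. CONTAINER_TOOL is an assumed override (it defaults to
# podman), mostly useful for a dry run with echo.
rebuild_image() {
    tag="$1"             # image tag to rebuild, e.g. "cmw"
    context="${2:-.}"    # build context, defaults to the current directory
    # --pull=always re-fetches the base images, --no-cache ignores old layers
    "${CONTAINER_TOOL:-podman}" build --pull=always --no-cache -t "$tag" "$context"
}
```

Running `rebuild_image cmw` on a schedule, followed by another `trivy image localhost/cmw`, is one possible way to keep an eye on the image over time.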