+++
title = "Multi-Stage Docker container Build"
author = ["Elia el Lazkani"]
date = 2025-03-05
lastmod = 2025-03-05
tags = ["docker", "linux", "container", "multi-stage", "podman", "dockerfile"]
categories = ["container"]
draft = false
+++
One of the hidden gems of Docker containers is multi-stage builds. If they never made any sense to you, if you've heard of them but have no clue what they are, or if you're just passing by, stick around: we're going to use one in a practical example.
go-cmw
A while ago, I wrote a small utility in golang which fetches the weather for me and displays it in the terminal. go-cmw is, basically, a wttr.in terminal client. It simplifies the usage of the API and makes it easier, for me, to integrate into other terminal tools. Who's not a big fan of the shell, huh? Am I right?
Let's containerize it
Let's say we would like to write a Dockerfile for the project to build the code and create a container for it. The Dockerfile would probably look something like this.
# Yes, we're smart, we used a small image because it's all we need
FROM docker.io/library/golang:alpine
# Copy the directory of the code into /cmw
ADD . /cmw
# Install git as a dependency
RUN apk add git && \
# Navigate to the directory where we copied the code to
cd /cmw && \
# Get the dependencies of the project
go get -u . && \
# Build that bad boy !
go build -o cmw && \
# Move it into a path we know is in $PATH
mv cmw /usr/bin/cmw && \
# Clean up, we have security in mind
cd / && rm -rf /cmw
# Alright, run it over one day!
CMD ["cmw", "-o"]
We've tried to use as few layers as possible to keep this image small. Let's take a look at the image we built.
$ podman build -t cmw .
$ podman run -e GO_CMW_LOCATION="Dublin" cmw
Weather report: Dublin
\ / Partly cloudy
_ /"".-. +12(11) °C
\_( ). ↑ 16 km/h
/(___(__) 10 km
0.0 mm
┌──────────────────────────────┬───────────────────────┤ Wed 05 Mar ├───────────────────────┬──────────────────────────────┐
│ Morning │ Noon └──────┬──────┘ Evening │ Night │
├──────────────────────────────┼──────────────────────────────┼──────────────────────────────┼──────────────────────────────┤
│ \ / Sunny │ \ / Sunny │ \ / Sunny │ \ / Clear │
│ .-. +9(7) °C │ .-. +12(10) °C │ .-. +8(6) °C │ .-. +6(3) °C │
│ ― ( ) ― ↑ 14-20 km/h │ ― ( ) ― ↗ 19-22 km/h │ ― ( ) ― ↑ 14-29 km/h │ ― ( ) ― ↑ 14-29 km/h │
│ `-’ 10 km │ `-’ 10 km │ `-’ 10 km │ `-’ 10 km │
│ / \ 0.0 mm | 0% │ / \ 0.0 mm | 0% │ / \ 0.0 mm | 0% │ / \ 0.0 mm | 0% │
└──────────────────────────────┴──────────────────────────────┴──────────────────────────────┴──────────────────────────────┘
Location: Dublin, County Dublin, Leinster, Ireland [53.3497645,-6.2602731]
Okay, it works. So what now? Well, now we look deeper.
Let's look at the size...
$ podman images cmw
REPOSITORY TAG IMAGE ID CREATED SIZE
localhost/cmw latest db1730d690ed 2 minutes ago 398 MB
Ah! That's quite big for a tiny client that shouldn't be bigger than a few MB. Pulling it onto a server every time we want to run it would (hypothetically) take forever.
How many layers does it have...
$ podman inspect cmw | jq '.[].RootFS.Layers[]' | wc -l
7
That's quite a few layers. The more we have to download, the slower the download is.
And finally a quick vulnerability scan...
$ trivy image localhost/cmw
...
Okay, the Trivy output is quite big, but the summary is 2 Critical, 1 High and 2 Medium severity vulnerabilities, all coming from the base image, even though we're using the latest one available.
Multi-stage build
We saw a few issues in the previous part of this post. Let's see if we can fix them.
The first thing I'm going to do is start thinking about my application. My client is written in golang, which means that the binary should work without any dependencies. I could build the binary on my machine and then copy it to the container. This path would definitely reduce our layer count, but it is not easily packaged and reproduced on a different machine. Besides, we said we're using containers for this.
Another thing to think about is the vulnerabilities in the built image. All of the vulnerabilities identified are related to golang, which makes sense. We're using a golang container image after all, even though the image is based on the security-oriented alpine distribution. We can do better. We can go with a container that contains almost nothing, which should leave far less to be vulnerable.
FROM docker.io/library/golang:alpine AS builder
ADD . /cmw
RUN apk add git && \
cd /cmw && \
go get -u . && \
go build -o cmw
FROM docker.io/library/alpine:latest
COPY --from=builder /cmw/cmw /cmw
CMD ["/cmw", "-o"]
Let me explain a bit what changed. The first change is that we named our first stage builder to make it easier to reference later. The dependency installation and the code build stay exactly the same. The cleanup steps were removed as they serve no purpose anymore: nothing from this stage ends up in the final image unless we copy it over.
The second FROM is where the magic starts to happen. In this second stage, we're using a plain alpine image. This container does not have any golang compiler, library or dependencies. We then COPY the cmw binary from the builder container into our alpine container. The rest does basically the same.
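Because the first stage has a name, it can also be built on its own, which helps when the compile step fails and you want to look around inside it. The following is a minimal sketch, assuming the multi-stage Dockerfile sits in the current directory; the `build_stage` helper and the `cmw-builder` tag are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: build only one named stage of a multi-stage Dockerfile.
# Usage: build_stage <stage-name> <image-tag> [context-dir]
build_stage() {
    stage="$1"
    tag="$2"
    context="${3:-.}"
    # --target stops the build at the named stage instead of the last one
    podman build --target "$stage" -t "$tag" "$context"
}
```

Calling `build_stage builder cmw-builder` would leave you with an image of the first stage only, compiler and source code included, while a plain `podman build -t cmw .` still produces the small final image.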
Now, let's take a deeper look at the image.
$ podman images cmw
REPOSITORY TAG IMAGE ID CREATED SIZE
localhost/cmw latest 978342ca6735 8 minutes ago 19.5 MB
The difference in size between the old image and this new one is extremely significant, down from 398 MB to just under 20 MB.
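To put a number on that, here is a small sketch that computes the reduction from the two sizes reported by `podman images` (398 MB and 19.5 MB). The `size_reduction` helper is a hypothetical name, not part of podman.

```shell
#!/bin/sh
# Hypothetical helper: percentage by which an image shrank.
# Usage: size_reduction <old-size-MB> <new-size-MB>
size_reduction() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}
```

Running `size_reduction 398 19.5` prints `95.1`, meaning the new image is roughly 95% smaller than the old one.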
And the layers...
$ podman inspect cmw | jq '.[].RootFS.Layers[]' | wc -l
2
Only 2, all the way down from 7.
And finally, the icing on the cake...
$ trivy image localhost/cmw
localhost/cmw (alpine 3.21.3)
Total: 0 (UNKNOWN: 0, LOW: 0, MEDIUM: 0, HIGH: 0, CRITICAL: 0)
That's right, no vulnerabilities at all.
warning
There are no vulnerabilities at the time of building this image. This does not mean that this image will stay this way. Over time, vulnerabilities will eventually be found. This is the reason why it is advisable to rebuild your images frequently to keep them updated.
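One way to act on this is to rebuild from fresh base images on a schedule and scan the result. This is a rough sketch, assuming podman and trivy are installed and the image is tagged under `localhost/`; the `rebuild_and_scan` helper is a hypothetical name, and the severity threshold is an arbitrary choice you may want to adjust.

```shell
#!/bin/sh
# Hypothetical helper: rebuild an image from fresh base layers, then scan it.
# Usage: rebuild_and_scan <image-name> [context-dir]
rebuild_and_scan() {
    image="$1"
    context="${2:-.}"
    # --pull=always re-fetches the base images, --no-cache ignores old layers
    podman build --pull=always --no-cache -t "$image" "$context" || return 1
    # --exit-code 1 makes trivy return non-zero when matching findings exist
    trivy image --severity HIGH,CRITICAL --exit-code 1 "localhost/$image"
}
```

Something like `rebuild_and_scan cmw` could then run from a cron job or a CI pipeline, failing whenever a High or Critical finding shows up.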
Conclusion
Docker container multi-stage builds are not a hard concept to grasp. As you can see, they help a lot in creating a small and safe container. More stages can be added on top, for example one to build a frontend written in TypeScript. This opens up a wide range of opportunities for us to use to our advantage.