When building services, we often need to build Docker images, usually several times a day, and it can be time-consuming. Locally we barely notice it, but in CI/CD pipelines it can become a problem.
In this post, I will show you how to speed up the process by using the build cache, ordering the layers in your Dockerfile, and using multi-stage builds.
For this, I’ll use a simple Go application. You can use any other application you have; the stack, language, or framework does not matter, because the principles are the same.
Everything is executed on my local machine with Docker Desktop for Mac. I do not use any CI/CD tools.
Cleaning Up
Just to be sure that we are starting from a clean state, we can remove all unused images, containers, volumes, and networks:
$ docker system prune -a
WARNING! This will remove:
- all stopped containers
- all networks not used by at least one container
- all images without at least one container associated to them
- all build cache
Are you sure you want to continue? [y/N] y
...gone with the wind...
Starting Point
I started with a simple Dockerfile (Dockerfile_1):
FROM golang:buster
WORKDIR /app
COPY app /app/
ENTRYPOINT [ "/app/app" ]
To use this Dockerfile, I have to build the application first:
$ go build -o app
And then build the image:
$ docker build . -f Dockerfile_1
Sending build context to Docker daemon 22.84MB
Step 1/4 : FROM golang:buster
---> f8c6c6bf3e26
Step 2/4 : WORKDIR /app
---> Running in 62eb8791ace1
Removing intermediate container 62eb8791ace1
---> d586151d2813
Step 3/4 : COPY app /app/
---> 25b4f091cba7
Step 4/4 : ENTRYPOINT [ "/app/app" ]
---> Running in 7853090f8c3b
Removing intermediate container 7853090f8c3b
---> 0e3d3835a61b
Successfully built 0e3d3835a61b
I want to start it, but I need to know the image name. I can use docker images to find it:
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
<none> <none> 0e3d3835a61b 48 seconds ago 739MB
excalidraw/excalidraw latest d6392f9c5191 2 days ago 34.8MB
golang buster f8c6c6bf3e26 4 days ago 720MB
moby/buildkit buildx-stable-1 4dc9f4d5bf89 2 weeks ago 168MB
slimdotai/dd-ext 0.8.2 56f11b815b6c 7 months ago 153MB
The image has no name (it is listed as <none>), so I use its ID to start the container:
$ docker run 0e3d3835a61b
exec /app/app: exec format error
What happened? Go back to Dockerfile_1 and look at it. There are several issues with it:
- I’m building the application for OSX, but I want to run it in Linux.
- I do not specify which Go version I’m using. Locally I can have Go 1.16, but the image has the latest Go version (currently 1.20).
- My application uses port 9999, but I do not expose it.
- My image does not have a name and version.
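The exec format error comes from the first issue: a Go binary is compiled for one operating system and CPU architecture. A minimal sketch, separate from the echo application, that prints the platform a binary was built for (the target function is a hypothetical helper used only for illustration):

```go
package main

import (
	"fmt"
	"runtime"
)

// target reports the OS/architecture pair this binary was compiled for.
func target() string {
	return runtime.GOOS + "/" + runtime.GOARCH
}

func main() {
	// A plain `go build` on a Mac produces a darwin binary, which a Linux
	// container cannot execute. Setting GOOS=linux (and a matching GOARCH)
	// in the environment of `go build` cross-compiles for the container.
	fmt.Println("built for", target())
}
```

Built on a Mac without extra settings, it reports a darwin target. Built with GOOS=linux GOARCH=amd64 go build -o app, it reports linux/amd64. The right GOARCH depends on the architecture of the image, which is an assumption here, since it is not stated in this post.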
Multi-stage Builds
To fix the first issue, I can use multi-stage builds. I’ll create a new Dockerfile (Dockerfile_2):
ARG GO_VERSION=1.20.3
FROM golang:${GO_VERSION}-buster as builder
WORKDIR /app
COPY . /app/
RUN go mod tidy
RUN go build -o app
FROM debian:buster as final
WORKDIR /app
COPY --from=builder /app/app /app/
EXPOSE ${PORT:-9999}
ENTRYPOINT [ "/app/app" ]
In the new Dockerfile, I’m pinning the Go version with the ARG instruction. You do not have to do it this way; you can hardcode the version as well. But with ARG, you can override it when you build the image.
Building the app is moved to the first stage, builder. Once the application is built, the binary is copied to the second stage, final.
Both stages are based on Debian Buster. The debian:buster base of the final stage is much smaller than the golang image, and it is enough for my app.
I also expose a port with a default value of 9999. Because PORT is not declared with ARG or ENV in this Dockerfile, the expression always resolves to 9999.
Now I can build the image:
$ docker build . -t rnemet/echo:0.0.1 -f Dockerfile_2
Sending build context to Docker daemon 22.84MB
Step 1/11 : ARG GO_VERSION=1.20.3
Step 2/11 : FROM golang:${GO_VERSION}-buster as builder
1.20.3-buster: Pulling from library/golang
Digest: sha256:413cd9e04db86fee3f5c667de293f37d9199b74880771c37dcfeb165cefaf424
Status: Downloaded newer image for golang:1.20.3-buster
---> f8c6c6bf3e26
Step 3/11 : WORKDIR /app
---> Using cache
---> d586151d2813
Step 4/11 : COPY . /app/
---> 331d288c0f19
Step 5/11 : RUN go mod tidy
---> Running in 2657122aa7fe
go: downloading github.com/prometheus/client_golang v1.14.0
...snip...
go: downloading github.com/rogpeppe/go-internal v1.8.0
Removing intermediate container 2657122aa7fe
---> 48197d27f8ab
Step 6/11 : RUN go build -o app
---> Running in 7e593ea7ffb4
Removing intermediate container 7e593ea7ffb4
---> d086687f4f17
Step 7/11 : FROM debian:buster
buster: Pulling from library/debian
4e2befb7f5d1: Already exists
Digest: sha256:235f2a778fbc0d668c66afa9fd5f1efabab94c1d6588779ea4e221e1496f89da
Status: Downloaded newer image for debian:buster
---> 4591634d6289
Step 8/11 : WORKDIR /app
---> Running in a79e19ed4815
Removing intermediate container a79e19ed4815
---> b316081e2c13
Step 9/11 : COPY --from=builder /app/app /app/
---> 6fdc4f84223f
Step 10/11 : EXPOSE ${PORT:-9999}
---> Running in e5bf1bc188b9
Removing intermediate container e5bf1bc188b9
---> 8da39c1270c4
Step 11/11 : ENTRYPOINT [ "/app/app" ]
---> Running in 421008b145ee
Removing intermediate container 421008b145ee
---> 159ca8b29354
Successfully built 159ca8b29354
Successfully tagged rnemet/echo:0.0.1
Now I can see that the image has a name and a version:
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
rnemet/echo 0.0.1 159ca8b29354 4 minutes ago 133MB
<none> <none> d086687f4f17 4 minutes ago 1.17GB
<none> <none> 0e3d3835a61b 40 minutes ago 739MB
excalidraw/excalidraw latest d6392f9c5191 2 days ago 34.8MB
golang 1.20.3-buster f8c6c6bf3e26 5 days ago 720MB
golang buster f8c6c6bf3e26 5 days ago 720MB
moby/buildkit buildx-stable-1 4dc9f4d5bf89 2 weeks ago 168MB
debian buster 4591634d6289 2 weeks ago 114MB
slimdotai/dd-ext 0.8.2 56f11b815b6c 7 months ago 153MB
And I can run the container:
$ docker run rnemet/echo:0.0.1
2021/12/05 20:56:05 Starting server on port 9999
If you want to override the Go version, you can do it like this:
$ docker build . -t rnemet/echo:0.0.1 -f Dockerfile_2 --build-arg GO_VERSION=1.16.10
Layering and Cache
Look again at Dockerfile_2. Each instruction in the Dockerfile creates a new layer, and each layer is cached. If an instruction or the files it copies change, Docker rebuilds that layer and all the layers after it.
Look at the output of the docker build command:
Sending build context to Docker daemon 22.84MB
Step 1/11 : ARG GO_VERSION=1.20.3
Step 2/11 : FROM golang:${GO_VERSION}-buster as builder
1.20.3-buster: Pulling from library/golang
Digest: sha256:413cd9e04db86fee3f5c667de293f37d9199b74880771c37dcfeb165cefaf424
Status: Downloaded newer image for golang:1.20.3-buster
---> f8c6c6bf3e26
Step 3/11 : WORKDIR /app
---> Using cache <=== here cache is used
---> d586151d2813
Step 4/11 : COPY . /app/
---> 331d288c0f19
Step 5/11 : RUN go mod tidy
---> Running in 2657122aa7fe
go: downloading github.com/prometheus/client_golang v1.14.0
...snip...
go: downloading github.com/rogpeppe/go-internal v1.8.0
Removing intermediate container 2657122aa7fe
---> 48197d27f8ab
My goal is to write layers that stay the same between builds as much as possible. That way, I can use the cache and build images faster. In step 4, I copy all files from my local directory to the image. At first glance, that makes sense. But if I change the README file, or any other file that is not related to the application, that layer and every layer after it are rebuilt. That is not good. So, I either specify what to copy or what not to copy.
For the second option, I can use the .dockerignore file. It is similar to the .gitignore file. It contains a list of files that should not be copied to the image:
.gitignore
.dockerignore
**/compose*
Dockerfile
License
Makefile
Readme.md
Then COPY . /app/ will copy only the files that are not listed in the .dockerignore file.
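Why the ignore list helps the cache can be sketched with a toy model. The Go program below is not Docker’s real algorithm. It only assumes that the cache for a COPY step depends on a checksum of the copied files, and it uses a hypothetical contextDigest helper with made-up file contents:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// contextDigest hashes the names and contents of all files that are not
// ignored. It is a toy stand-in for the checksum Docker keeps for a COPY
// step, not Docker's real algorithm.
func contextDigest(files map[string]string, ignored map[string]bool) string {
	names := make([]string, 0, len(files))
	for name := range files {
		if !ignored[name] {
			names = append(names, name)
		}
	}
	sort.Strings(names) // map iteration order is random; sort for a stable hash

	h := sha256.New()
	for _, name := range names {
		// Length prefixes keep name and content boundaries unambiguous.
		fmt.Fprintf(h, "%d:%s%d:%s", len(name), name, len(files[name]), files[name])
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	ignored := map[string]bool{"Readme.md": true, "Makefile": true}

	// The only difference between the two builds is the README.
	before := map[string]string{"main.go": "package main", "Readme.md": "v1"}
	after := map[string]string{"main.go": "package main", "Readme.md": "v2"}

	// Without an ignore list, the README edit changes the digest (cache miss).
	fmt.Println("no ignore list, same digest:", contextDigest(before, nil) == contextDigest(after, nil))
	// With Readme.md ignored, the digest is unchanged (cache hit).
	fmt.Println("with ignore list, same digest:", contextDigest(before, ignored) == contextDigest(after, ignored))
}
```

In this model, the first comparison is false and the second is true: once Readme.md is ignored, editing it no longer affects the step.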
Let us consider one more thing. In step 5, I’m running go mod tidy. It downloads all dependencies. Dependencies do not change often, and only when they do change should they be downloaded again. Downloading dependencies is not a big issue for Go apps, but for other languages it can be (think about NodeJS). So, let’s first handle the dependencies and then copy the source code. This way, the dependency layer is served from the cache and is not rebuilt every time I change the source code. The result is Dockerfile_3:
ARG GO_VERSION=1.20.3
FROM golang:${GO_VERSION}-buster as builder
WORKDIR /app
COPY go.mod go.sum /app/
RUN go mod download -x
COPY . /app/
RUN go build -o app
FROM debian:buster
WORKDIR /app
COPY --from=builder /app/app /app/
EXPOSE ${PORT:-9999}
ENTRYPOINT [ "/app/app" ]
When initially running docker build . -t rnemet/echo:0.0.1 -f Dockerfile_3 it will take some time to download dependencies. Because I’m using option -x, I can see ALL downloaded dependencies. You can remove the -x option if it bothers you.
If you rerun it, it will be much faster, and you’ll notice that the dependencies come from the cache.
If you change only the source code, the dependencies will not be downloaded again, so building the image will be much faster.
Try it yourself. Compare build times for Dockerfile_2 and Dockerfile_3.
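The difference between the two Dockerfiles can also be sketched with a toy model of the layer cache. This is a simplification, not how Docker is implemented: each step gets a key derived from the previous key, the instruction, and a fingerprint of the files it reads. The step type, the helper functions, and the fingerprint strings are all hypothetical:

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"strings"
)

// step is one Dockerfile instruction plus a fingerprint of the files it reads.
type step struct {
	instruction string
	input       string
}

// layerKeys chains the steps: every key includes the previous key, so a
// change in one step also changes the keys of all steps after it.
func layerKeys(steps []step) []string {
	keys := make([]string, len(steps))
	prev := ""
	for i, s := range steps {
		sum := sha256.Sum256([]byte(prev + "\n" + s.instruction + "\n" + s.input))
		prev = fmt.Sprintf("%x", sum)
		keys[i] = prev
	}
	return keys
}

// rebuilt lists the instructions whose key differs between two builds.
func rebuilt(before, after []step) []string {
	a, b := layerKeys(before), layerKeys(after)
	var out []string
	for i, s := range after {
		if i >= len(a) || a[i] != b[i] {
			out = append(out, s.instruction)
		}
	}
	return out
}

// dockerfile2 models the builder stage of Dockerfile_2.
func dockerfile2(mods, src string) []step {
	return []step{
		{"WORKDIR /app", ""},
		{"COPY . /app/", mods + src},
		{"RUN go mod tidy", ""},
		{"RUN go build -o app", ""},
	}
}

// dockerfile3 models the builder stage of Dockerfile_3.
func dockerfile3(mods, src string) []step {
	return []step{
		{"WORKDIR /app", ""},
		{"COPY go.mod go.sum /app/", mods},
		{"RUN go mod download -x", ""},
		{"COPY . /app/", mods + src},
		{"RUN go build -o app", ""},
	}
}

func main() {
	// Only the source code changes between the two builds.
	r2 := rebuilt(dockerfile2("mods-v1", "src-v1"), dockerfile2("mods-v1", "src-v2"))
	r3 := rebuilt(dockerfile3("mods-v1", "src-v1"), dockerfile3("mods-v1", "src-v2"))
	fmt.Println("Dockerfile_2 re-runs:", strings.Join(r2, "; "))
	fmt.Println("Dockerfile_3 re-runs:", strings.Join(r3, "; "))
}
```

In this model, a source change makes Dockerfile_2 re-run the copy, go mod tidy, and go build steps, while Dockerfile_3 re-runs only the second copy and go build. The dependency download stays cached.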
Remote Cache
In CI/CD, you either rely on the cache implementation of your CI/CD tool or use a remote cache. A remote cache is stored in a remote location, such as a registry, so you can use it to speed up your builds and share it between different machines and users.
For this, I had to use BuildKit, the newer builder toolkit for Docker. You can use it like this:
docker buildx build -t rnemet/echo:0.0.1 . -f Dockerfile_3 --cache-to type=registry,ref=rnemet/echo:test --cache-from type=registry,ref=rnemet/echo:test --cache-from type=registry,ref=rnemet/echo:main [--push|--load]
If you want to use a remote cache, specify the --cache-to and --cache-from options. Option --cache-to specifies where to store the cache. Option --cache-from specifies where to get the cache from. You can specify multiple locations for both options. If you specify multiple locations for --cache-from, it will try to get the cache from all of them. If it finds the cache in one of them, it will use it.
A good practice is to keep a separate cache for each branch and for main. In the above example, I have the branches test and main. I’m using the test branch for testing and main for production, so I keep a cache for both. When I build the test branch, the build tries to get the cache from the test branch, and if that fails, it tries to get the cache from the main branch.
If you want to push the image to a registry, use the --push option. If you want to load the image to your local machine, use the --load option.
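In a pipeline, the command line above usually has to be assembled for whichever branch is being built. A minimal sketch of that step, assuming branch names are valid image tags and the image is pushed. The buildxArgs helper is hypothetical and only uses the flags shown above:

```go
package main

import (
	"fmt"
	"strings"
)

// buildxArgs assembles a `docker buildx build` command line that stores the
// cache under the current branch and reads it from that branch and from main.
func buildxArgs(image, tag, dockerfile, branch string) []string {
	// ref builds a registry cache reference such as type=registry,ref=image:branch.
	ref := func(b string) string { return "type=registry,ref=" + image + ":" + b }

	args := []string{
		"docker", "buildx", "build", "-t", image + ":" + tag, ".", "-f", dockerfile,
		"--cache-to", ref(branch),
		"--cache-from", ref(branch),
	}
	if branch != "main" {
		// Fall back to the cache of main when the branch has no cache yet.
		args = append(args, "--cache-from", ref("main"))
	}
	return append(args, "--push")
}

func main() {
	fmt.Println(strings.Join(buildxArgs("rnemet/echo", "0.0.1", "Dockerfile_3", "test"), " "))
}
```

For the test branch, this prints the same command as above with --push selected. For main, the second --cache-from is left out because it would repeat the first one.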
Conclusion
In this article, I showed you how to structure a Dockerfile to speed up the build process. I hope you found it helpful.