When building services, we often need to build Docker images, sometimes many times a day, and that can be time-consuming. Locally the delay is barely noticeable, but in CI/CD pipelines it can become a problem.

In this post, I will show you how to speed up the process by using the build cache, ordering the layers in your Dockerfile, and using multi-stage builds.

For this, I’ll use a simple Go application. You can use any other application you have. It does not matter which stack, language, or framework you use. The principles are the same.

Everything I do is executed on my local machine. I do not use any CI/CD tools. I use Docker Desktop for Mac.

Cleaning Up

Just to be sure that we are starting from a clean state, we can remove all unused images, containers, volumes, and networks:

$ docker system prune -a
WARNING! This will remove:
  - all stopped containers
  - all networks not used by at least one container
  - all images without at least one container associated to them
  - all build cache

Are you sure you want to continue? [y/N] y
...gone with the wind...

Starting Point

I started with a simple Dockerfile (Dockerfile_1):

FROM golang:buster

WORKDIR /app

COPY app /app/

ENTRYPOINT [ "/app/app" ]

To be able to use this Dockerfile, I have to build the application first:

$ go build -o app

And then build the image:

$ docker build  . -f Dockerfile_1
Sending build context to Docker daemon  22.84MB
Step 1/4 : FROM golang:buster
 ---> f8c6c6bf3e26
Step 2/4 : WORKDIR /app
 ---> Running in 62eb8791ace1
Removing intermediate container 62eb8791ace1
 ---> d586151d2813
Step 3/4 : COPY app /app/
 ---> 25b4f091cba7
Step 4/4 : ENTRYPOINT [ "/app/app" ]
 ---> Running in 7853090f8c3b
Removing intermediate container 7853090f8c3b
 ---> 0e3d3835a61b
Successfully built 0e3d3835a61b

I want to start it, but I need to know the image name. I can use docker images to find it:

$ docker images
REPOSITORY              TAG               IMAGE ID       CREATED          SIZE
<none>                  <none>            0e3d3835a61b   48 seconds ago   739MB
excalidraw/excalidraw   latest            d6392f9c5191   2 days ago       34.8MB
golang                  buster            f8c6c6bf3e26   4 days ago       720MB
moby/buildkit           buildx-stable-1   4dc9f4d5bf89   2 weeks ago      168MB
slimdotai/dd-ext        0.8.2             56f11b815b6c   7 months ago     153MB

I can see that the image has no name: both REPOSITORY and TAG are <none>. I can use the image ID instead to start the container:

$ docker run 0e3d3835a61b

exec /app/app: exec format error

What happened? Go back to Dockerfile_1 and look at it. There are several issues with it:

  • I’m building the application on macOS, so the binary targets macOS, but the container runs Linux.
  • I do not specify which Go version I’m using. Locally I might have Go 1.16, while the image has the latest Go version (1.20 at the time of writing).
  • My application uses port 9999, but I do not expose it.
  • My image does not have a name or a version.
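
The exec format error behind the first issue comes from the file format of the binary: a build on macOS produces a Mach-O executable, while the Linux kernel inside the container expects ELF. Here is a minimal sketch for checking a binary before copying it into an image. The helper names binaryFormat and readHeader are mine, not part of any Go or Docker tooling, and only these two formats are recognized:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
	"runtime"
)

// binaryFormat guesses the executable format from the first four bytes
// of a file. Only the two formats relevant to this post are recognized.
func binaryFormat(header []byte) string {
	switch {
	case bytes.HasPrefix(header, []byte{0x7f, 'E', 'L', 'F'}):
		return "ELF (Linux)"
	case bytes.HasPrefix(header, []byte{0xcf, 0xfa, 0xed, 0xfe}):
		return "Mach-O 64-bit (macOS)"
	default:
		return "unknown"
	}
}

// readHeader returns the first four bytes of the file at path.
func readHeader(path string) ([]byte, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	header := make([]byte, 4)
	_, err = io.ReadFull(f, header)
	return header, err
}

func main() {
	// The platform this checker itself was compiled for.
	fmt.Printf("this checker was built for %s/%s\n", runtime.GOOS, runtime.GOARCH)

	// Inspect every file passed on the command line, e.g. ./app.
	for _, path := range os.Args[1:] {
		header, err := readHeader(path)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			continue
		}
		fmt.Printf("%s: %s\n", path, binaryFormat(header))
	}
}
```

Building with GOOS=linux and a GOARCH that matches the container platform (for example GOOS=linux GOARCH=amd64 go build -o app) produces an ELF binary that the container can execute.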

Multi-stage Builds

To fix the first issue, I can use a multi-stage build. I’ll create a new Dockerfile (Dockerfile_2):

ARG GO_VERSION=1.20.3
FROM golang:${GO_VERSION}-buster as builder

WORKDIR /app
COPY . /app/

RUN go mod tidy
RUN go build -o app

FROM debian:buster as final

WORKDIR /app

COPY --from=builder /app/app /app/
EXPOSE ${PORT:-9999}

ENTRYPOINT [ "/app/app" ]

In the new Dockerfile, I pin the Go version with the ARG instruction. You do not have to do it this way; you can hardcode the version as well. But with ARG, you can override it when you build the image.

Building the app moves to the first stage, the builder stage. Once the application is built, the binary is copied into the second, final stage. Both stages are based on Debian Buster, but the final stage uses the plain debian:buster image, which is much smaller than the golang image (114MB versus 720MB in the listings in this post) and is enough for my app. I also expose a port, with a default value of 9999. Note that EXPOSE only documents the port; it does not publish it to the host.
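
The ${PORT:-9999} expression is resolved at build time and falls back to 9999 when PORT is not set. The application still has to listen on the same port at run time. Here is a minimal sketch of a matching fallback on the Go side, assuming the app reads a PORT environment variable at startup. The post does not show the app's code, so listenAddr and the variable name are hypothetical:

```go
package main

import (
	"fmt"
	"os"
)

// listenAddr returns the address to listen on. An empty port falls back
// to 9999, mirroring the default used in the EXPOSE instruction.
func listenAddr(port string) string {
	if port == "" {
		port = "9999"
	}
	return ":" + port
}

func main() {
	// os.Getenv returns "" when PORT is unset, which triggers the default.
	fmt.Println("Starting server on", listenAddr(os.Getenv("PORT")))
}
```

To reach the server from the host, the port also has to be published when the container starts, for example with docker run -p 9999:9999 rnemet/echo:0.0.1.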

Now I can build the image:

$ docker build . -t rnemet/echo:0.0.1 -f Dockerfile_2

Sending build context to Docker daemon  22.84MB
Step 1/11 : ARG GO_VERSION=1.20.3
Step 2/11 : FROM golang:${GO_VERSION}-buster as builder
1.20.3-buster: Pulling from library/golang
Digest: sha256:413cd9e04db86fee3f5c667de293f37d9199b74880771c37dcfeb165cefaf424
Status: Downloaded newer image for golang:1.20.3-buster
 ---> f8c6c6bf3e26
Step 3/11 : WORKDIR /app
 ---> Using cache
 ---> d586151d2813
Step 4/11 : COPY . /app/
 ---> 331d288c0f19
Step 5/11 : RUN go mod tidy
 ---> Running in 2657122aa7fe
go: downloading github.com/prometheus/client_golang v1.14.0
...snip...
go: downloading github.com/rogpeppe/go-internal v1.8.0
Removing intermediate container 2657122aa7fe
 ---> 48197d27f8ab
Step 6/11 : RUN go build -o app
 ---> Running in 7e593ea7ffb4
Removing intermediate container 7e593ea7ffb4
 ---> d086687f4f17
Step 7/11 : FROM debian:buster
buster: Pulling from library/debian
4e2befb7f5d1: Already exists
Digest: sha256:235f2a778fbc0d668c66afa9fd5f1efabab94c1d6588779ea4e221e1496f89da
Status: Downloaded newer image for debian:buster
 ---> 4591634d6289
Step 8/11 : WORKDIR /app
 ---> Running in a79e19ed4815
Removing intermediate container a79e19ed4815
 ---> b316081e2c13
Step 9/11 : COPY --from=builder /app/app /app/
 ---> 6fdc4f84223f
Step 10/11 : EXPOSE ${PORT:-9999}
 ---> Running in e5bf1bc188b9
Removing intermediate container e5bf1bc188b9
 ---> 8da39c1270c4
Step 11/11 : ENTRYPOINT [ "/app/app" ]
 ---> Running in 421008b145ee
Removing intermediate container 421008b145ee
 ---> 159ca8b29354
Successfully built 159ca8b29354
Successfully tagged rnemet/echo:0.0.1

Now I can see that the image has a name and a version:

$ docker images
REPOSITORY              TAG               IMAGE ID       CREATED          SIZE
rnemet/echo             0.0.1             159ca8b29354   4 minutes ago    133MB
<none>                  <none>            d086687f4f17   4 minutes ago    1.17GB
<none>                  <none>            0e3d3835a61b   40 minutes ago   739MB
excalidraw/excalidraw   latest            d6392f9c5191   2 days ago       34.8MB
golang                  1.20.3-buster     f8c6c6bf3e26   5 days ago       720MB
golang                  buster            f8c6c6bf3e26   5 days ago       720MB
moby/buildkit           buildx-stable-1   4dc9f4d5bf89   2 weeks ago      168MB
debian                  buster            4591634d6289   2 weeks ago      114MB
slimdotai/dd-ext        0.8.2             56f11b815b6c   7 months ago     153MB

And I can run the container:

$ docker run rnemet/echo:0.0.1
2021/12/05 20:56:05 Starting server on port 9999

If you want to override the Go version, you can do it like this:

$ docker build . -t rnemet/echo:0.0.1 -f Dockerfile_2 --build-arg GO_VERSION=1.16.10

Layering and Cache

Look again at Dockerfile_2. Each instruction in the Dockerfile is a build step, and the result of each step is cached as a layer. If a step changes, because the instruction itself or the files it copies have changed, Docker rebuilds that layer and all the layers after it.

Look at the output of the docker build command:

Sending build context to Docker daemon  22.84MB
Step 1/11 : ARG GO_VERSION=1.20.3
Step 2/11 : FROM golang:${GO_VERSION}-buster as builder
1.20.3-buster: Pulling from library/golang
Digest: sha256:413cd9e04db86fee3f5c667de293f37d9199b74880771c37dcfeb165cefaf424
Status: Downloaded newer image for golang:1.20.3-buster
 ---> f8c6c6bf3e26
Step 3/11 : WORKDIR /app
 ---> Using cache                            <=== here cache is used
 ---> d586151d2813
Step 4/11 : COPY . /app/
 ---> 331d288c0f19
Step 5/11 : RUN go mod tidy
 ---> Running in 2657122aa7fe
go: downloading github.com/prometheus/client_golang v1.14.0
...snip...
go: downloading github.com/rogpeppe/go-internal v1.8.0
Removing intermediate container 2657122aa7fe
 ---> 48197d27f8ab

My goal is to write layers that stay the same from build to build. That way, Docker can reuse the cache and build images faster. In step 4, I copy all files from my local directory into the image. At first glance, that makes sense. But if I change the README, or any other file unrelated to the application, the cache is invalidated at step 4 and every step after it runs again. That is not good. So I either specify what to copy or what not to copy.

For the second option, I can use a .dockerignore file. It is similar to a .gitignore file: it lists files that are excluded from the build context, so they are never copied into the image:

.gitignore
.dockerignore
**/compose*
Dockerfile*
License
Makefile
Readme.md

With that in place, COPY . /app/ copies only the files that do not match a pattern in .dockerignore.
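
Docker's documentation describes .dockerignore matching in terms of Go's filepath.Match rules, extended with ** wildcards and ! exceptions. Here is a minimal sketch that checks top-level file names against simple patterns only. The ignored helper and the file names main.go and README.md are illustrative, and the sketch does not handle ** or ! lines. Note that a bare Dockerfile pattern would not match Dockerfile_3, so the sketch uses Dockerfile*:

```go
package main

import (
	"fmt"
	"path/filepath"
)

// ignored reports whether name matches any of the patterns.
// Matching is case-sensitive, and a pattern must match the whole name.
func ignored(patterns []string, name string) bool {
	for _, p := range patterns {
		if ok, err := filepath.Match(p, name); err == nil && ok {
			return true
		}
	}
	return false
}

func main() {
	patterns := []string{".gitignore", ".dockerignore", "Dockerfile*", "Makefile", "Readme.md"}
	names := []string{"Dockerfile_3", "Readme.md", "README.md", "main.go", "go.mod"}
	for _, name := range names {
		fmt.Printf("%-14s ignored=%v\n", name, ignored(patterns, name))
	}
}
```

Because matching is case-sensitive here, Readme.md is ignored but README.md is not, so it is worth checking that the patterns match the file names exactly as they appear on disk.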

Let us consider one more thing. In step 5, I’m running go mod tidy, which downloads all dependencies. Dependencies do not change often; when they do change, the app has to be rebuilt. Downloading dependencies is not a big issue for Go apps, but for other languages it can be (think of Node.js). So, let’s handle dependencies first and copy the source code afterward. This way, the dependency layer comes from the cache, and dependencies are not downloaded again every time I change the source code. Here is the result (Dockerfile_3):

ARG GO_VERSION=1.20.3
FROM golang:${GO_VERSION}-buster as builder

WORKDIR /app

COPY go.mod go.sum /app/
RUN go mod download -x

COPY . /app/
RUN go build -o app

FROM debian:buster

WORKDIR /app

COPY --from=builder /app/app /app/
EXPOSE ${PORT:-9999}

ENTRYPOINT [ "/app/app" ]

The first run of docker build . -t rnemet/echo:0.0.1 -f Dockerfile_3 takes some time because it downloads the dependencies. Because I’m using the -x option, I can see every download as it happens. You can remove -x if the output bothers you. If you rerun the build, it will be much faster, and you’ll notice that the dependency step is served from the cache.

If you change the source code, dependencies will not be downloaded again. So building the image will be much faster.
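
The effect of instruction order can be illustrated with a toy model of the build cache. This is a simplification, not Docker's actual implementation: each step's cache key is derived from the previous step's key, the instruction text, and the contents of the files the step copies, so one miss forces a miss in every later step. The step type, the build function, and the file names and contents are made up for the illustration:

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
)

// step is one Dockerfile instruction plus the files it copies (if any).
type step struct {
	instruction string
	inputs      []string // files read by COPY; nil for RUN
}

// build replays the steps against cache, returns a hit/miss flag per step,
// and records every key it computes so the next build can reuse it.
func build(steps []step, files map[string]string, cache map[string]bool) []bool {
	hits := make([]bool, len(steps))
	parent := ""
	for i, s := range steps {
		h := sha256.New()
		fmt.Fprintf(h, "%s\n%s\n", parent, s.instruction)
		names := append([]string(nil), s.inputs...)
		sort.Strings(names)
		for _, n := range names {
			fmt.Fprintf(h, "%s=%s\n", n, files[n])
		}
		key := fmt.Sprintf("%x", h.Sum(nil))
		hits[i] = cache[key]
		cache[key] = true
		parent = key // the next step's key depends on this one
	}
	return hits
}

func main() {
	all := []string{"go.mod", "go.sum", "main.go", "README.md"}
	v2 := []step{ // instruction order of Dockerfile_2
		{"COPY . /app/", all},
		{"RUN go mod tidy", nil},
		{"RUN go build -o app", nil},
	}
	v3 := []step{ // instruction order of Dockerfile_3
		{"COPY go.mod go.sum /app/", []string{"go.mod", "go.sum"}},
		{"RUN go mod download -x", nil},
		{"COPY . /app/", all},
		{"RUN go build -o app", nil},
	}
	files := map[string]string{"go.mod": "m1", "go.sum": "s1", "main.go": "v1", "README.md": "r1"}

	c2, c3 := map[string]bool{}, map[string]bool{}
	build(v2, files, c2) // first builds warm the caches
	build(v3, files, c3)

	files["main.go"] = "v2" // edit the source code only
	fmt.Println("Dockerfile_2:", build(v2, files, c2))
	fmt.Println("Dockerfile_3:", build(v3, files, c3))
}
```

In this model, the second build with the Dockerfile_2 order misses on every step, while the Dockerfile_3 order still hits on the two dependency steps and misses only from COPY . /app/ onward.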

Try it yourself. Compare build times for Dockerfile_2 and Dockerfile_3.

Remote Cache

In CI/CD, you rely either on the cache implementation of your CI/CD tool or on a remote cache. A remote cache is stored in a remote location, such as an image registry, so you can use it to speed up your builds and share it between different machines and users.

For this, I had to use BuildKit, Docker’s newer build toolkit, through the docker buildx command. You can use it like this:

$ docker buildx build -t rnemet/echo:0.0.1 . -f Dockerfile_3 --cache-to type=registry,ref=rnemet/echo:test --cache-from type=registry,ref=rnemet/echo:test --cache-from type=registry,ref=rnemet/echo:main [--push|--load]

To use a remote cache, specify the --cache-to and --cache-from options. --cache-to specifies where to store the cache, and --cache-from specifies where to read it from. You can specify multiple locations for both options. With multiple --cache-from locations, the build tries all of them and uses whichever contains a matching cache.

A good practice is to keep a separate cache for each branch and for main. In the example above, I have the branches test and main: test for testing and main for production. So, I’m creating a cache for both. When I build the test branch, the build tries the test cache first, and if nothing matches there, it falls back to the main cache.

If you want to push the image to a registry, use the --push option. If you want to load the image into your local Docker image store, use the --load option.
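
In a pipeline, the cache flags usually have to be derived from the branch being built. Here is a minimal sketch of that, assuming the cache lives in the same repository as the image and is tagged with the branch name. The helpers tagFor and cacheArgs are hypothetical names, and tagFor covers neither the 128-character tag limit nor a leading "." or "-":

```go
package main

import (
	"fmt"
	"strings"
)

// tagFor turns a branch name into a string usable as an image tag by
// replacing every character outside [A-Za-z0-9_.-] with '-'.
func tagFor(branch string) string {
	return strings.Map(func(r rune) rune {
		switch {
		case r >= 'a' && r <= 'z', r >= 'A' && r <= 'Z', r >= '0' && r <= '9',
			r == '_', r == '.', r == '-':
			return r
		}
		return '-'
	}, branch)
}

// cacheArgs builds the cache flags for one branch: store the cache under
// the branch tag, read from the branch tag, and add main as a fallback.
func cacheArgs(image, branch string) []string {
	ref := func(tag string) string { return "type=registry,ref=" + image + ":" + tag }
	args := []string{
		"--cache-to", ref(tagFor(branch)),
		"--cache-from", ref(tagFor(branch)),
	}
	if branch != "main" {
		args = append(args, "--cache-from", ref("main"))
	}
	return args
}

func main() {
	args := []string{"buildx", "build", "-t", "rnemet/echo:0.0.1", ".", "-f", "Dockerfile_3"}
	args = append(args, cacheArgs("rnemet/echo", "test")...)
	// Print the command instead of running it; add --push or --load as needed.
	fmt.Println("docker", strings.Join(args, " "))
}
```

For the test branch this prints the same command as above, without the --push or --load option. A branch such as feature/login maps to the tag feature-login, because a slash is not allowed in a tag.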

Conclusion

In this article, I showed you how to structure a Dockerfile to speed up the build process. I hope you found it helpful.
