Docker multi-stage builds and BuildKit. Shaving off the megabytes.
There’s an earlier post here on Docker OS images from 2017. Most of it still holds. The interesting development since then is that two features — multi-stage builds and BuildKit — make small images much easier than they used to be.
This is a quick walk through both, because Dockerfiles in production still routinely skip them.
The naïve version
Take a Go web server. A first-pass Dockerfile usually looks like this:
```dockerfile
FROM golang:1.23
WORKDIR /src
COPY . .
RUN go build -o app ./cmd/server
EXPOSE 8080
CMD ["./app"]
```

It works. `docker build .` produces an image. The image is about 950 MB; the actual binary is 14 MB. We are shipping the Go compiler, the standard library source, and a Debian userland to production, every time, because we needed them once during the build.
Multi-stage builds
Multi-stage builds let you define more than one FROM in a Dockerfile. Only the final stage becomes the image. The earlier stages are scratch pads. You can copy artifacts out of them.
```dockerfile
# Stage 1: build
FROM golang:1.23 AS build
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -o /out/app ./cmd/server

# Stage 2: runtime
FROM gcr.io/distroless/static-debian12
COPY --from=build /out/app /app
EXPOSE 8080
ENTRYPOINT ["/app"]
```

The runtime stage is distroless: no shell, no package manager, not even a libc. Only static binaries can live there. The final image is around 17 MB. Same binary, same behavior, roughly 55× smaller.
A handful of details that matter:
- `CGO_ENABLED=0` — without this, Go dynamically links against the build image's glibc and your binary won't run in `distroless/static`. This catches people.
- Copy `go.mod` and `go.sum` before the rest of the source, then run `go mod download`. This caches dependencies separately from your code, so you don't re-download them every time you change a Go file.
- Pin the base image. `golang:1.23` is fine for a personal blog example. For production, pin to a digest.
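Pinning to a digest looks like the following. The digest shown is a placeholder, not a real one; `docker images --digests golang` prints the actual digest after a pull:

```dockerfile
# Placeholder digest: replace with the real sha256 from `docker images --digests`.
FROM golang:1.23@sha256:0000000000000000000000000000000000000000000000000000000000000000 AS build
```

A digest pins the exact image contents; the tag alone can silently move under you.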
BuildKit
The classic Docker builder built each stage sequentially, top to bottom. BuildKit is the modern builder that ships with Docker by default since 23.0. It does a few things the old one didn’t:
- Parallel stages — independent stages build at the same time.
- Mount caches — you can mount a directory across builds for things like `/root/.cache/go-build` or `/var/cache/apt`.
- Mount secrets — pass a secret into a build without it ending up in a layer.
- SSH forwarding for private dependencies, without baking your SSH key into a layer.
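Secret mounts are worth a quick sketch too. The secret file is exposed to a single RUN step and never written to a layer; the id `npmrc` here is my own naming, nothing standard:

```dockerfile
# syntax=docker/dockerfile:1.6
FROM node:20 AS build
WORKDIR /src
COPY package.json package-lock.json ./
# Mounted read-only at /root/.npmrc for this step only;
# it does not appear in any layer or in `docker history`.
RUN --mount=type=secret,id=npmrc,target=/root/.npmrc \
    npm ci
```

Build with `docker build --secret id=npmrc,src=$HOME/.npmrc .`.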
The mount cache is the one I would highlight if you have one minute. Here is how it looks for a Node build:
```dockerfile
# syntax=docker/dockerfile:1.6
FROM node:20 AS build
WORKDIR /src
COPY package.json package-lock.json ./
RUN --mount=type=cache,target=/root/.npm \
    npm ci
COPY . .
RUN npm run build
```

That `--mount=type=cache` gives the RUN step a persistent cache directory that survives across builds but does not end up in any image layer. On a CI runner this turns a 90-second `npm ci` into a 5-second one for unchanged dependencies.
The `# syntax=` line at the top is not a comment — it tells BuildKit which Dockerfile frontend to use. Without it you don't get the `--mount` flag, and you'll wonder why your Dockerfile errors out with a syntax error.
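The same cache-mount trick applies to the Go build from earlier. A sketch, using the default module and compiler cache locations in the official `golang` image:

```dockerfile
# syntax=docker/dockerfile:1.6
FROM golang:1.23 AS build
WORKDIR /src
COPY go.mod go.sum ./
# Persist the module cache across builds; it never lands in a layer.
RUN --mount=type=cache,target=/go/pkg/mod \
    go mod download
COPY . .
# The build cache makes incremental rebuilds fast even on a fresh runner.
RUN --mount=type=cache,target=/go/pkg/mod \
    --mount=type=cache,target=/root/.cache/go-build \
    CGO_ENABLED=0 go build -o /out/app ./cmd/server
```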
A quick checklist
Before you ship a Dockerfile, walk down this list:
- Is the runtime image `distroless`, `alpine`, or `scratch`? If it has `apt-get` in it, why?
- Is the dependency install step before the source copy, so it caches independently of your code?
- Do you have a `.dockerignore`? Without one, `COPY . .` happily copies your `node_modules` and `.git` into the image.
- Does the final stage run as a non-root user? If not, add `USER 1000`.
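A starting-point `.dockerignore` for the Node example, assuming a typical repo layout (adjust the entries to your project):

```
# Everything listed here is excluded from the build context.
.git
node_modules
dist
*.log
.env
```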
None of this is new. It has all been in the Docker docs since around 2019. It is, however, still missing from most of the Dockerfiles I look at.