Multi Stage Container Builds

When it comes to your images, bigger is never better. Larger images take longer to build, longer to push and pull over the network, and ultimately longer to deploy. In a data center with fast connections, pull time may not be noticed all that much, but in your CI/CD pipeline this slowness will be downright painful. From an ops perspective, there is no more annoying complaint than "deploys for my service are slow." While there are many reasons Docker images increase in size, we are going to focus on slimming them down with multi-stage Docker builds.

Multi-stage Docker builds can slim down your image in a hurry, especially if you are using a programming language that compiles to a single binary, like Go. Once the binary is built, most of the accompanying source files and OS packages are no longer needed. Keep in mind that there are some things you may still have to copy to your final image. In my experience, things like templates that are rendered on the fly and packages that pertain to things like certificate authorities will need to find their way into the final image. Things like your .go files and your README don't have much value in your deployment artifact.

For now we are going to focus on a small Go API that runs on the gin-gonic framework. All this service will do for us is respond "pong" to our "/ping" requests. Super useful, right?! For the purpose of this example we will refer to the following code snippet as our application.

  

    $ tree dkr-test
    dkr-test
    ├── Dockerfile
    ├── cmd
    │   └── main.go
    ├── go.mod
    └── go.sum

    1 directory, 4 files

    $ cat cmd/main.go
    package main

    import (
      "net/http"

      "github.com/gin-gonic/gin"
    )

    func main() {
      router := gin.Default()
      router.GET("/ping", func(c *gin.Context) {
        c.JSON(http.StatusOK, gin.H{"message": "pong"})
      })
      router.Run(":8090")
    }
  
  

There shouldn't be any surprises in the directory listed above. While it is a very stripped-down app, it allows us to focus on the fundamentals of a multi-stage Docker build. First, let's take a look at the familiar bit. We'll be using the Alpine distro for this exercise because it is already very much geared toward slim deployments. While it may not be right for your use, the pattern should map well to any other distro. The top section of our Dockerfile will look very familiar. There should only be two surprises in this section: the first stage doesn't define a CMD or ENTRYPOINT, and the FROM line ends with 'as factory'.

  

    FROM golang:1.20-alpine as factory
    COPY . /app
    WORKDIR /app
    RUN apk update && apk upgrade
    RUN apk add git g++
    RUN go mod vendor
    RUN cd /app/cmd && go build -tags musl -a -race -o /app/dkr-test
  
  

Nice, only a few small paragraphs in and we are into what we're all here for. Let's take a look at the FROM line and the missing CMD and/or ENTRYPOINT. The only new addition in the FROM line comes in the form of 'as factory'. You can swap out factory for almost any word that makes sense to you, but this is telling Docker "Hey, the things I'm going to build in this stage, I want to refer back to them as the 'factory' stage". Later on we will see how we call back to the factory stage to pull out the necessary parts. We also omit the CMD or ENTRYPOINT config here because we aren't done building yet; we have more config to write. If we stopped here, once Docker finished this build stage we would essentially have a single-stage Docker build, with everything inside the current directory copied into the container's "/app" directory. That, together with the OS packages and Go vendor files added in the RUN steps, is pretty much unused by our application at runtime, so why keep it around and let it slow down our CD pipeline?

The second stage of the build allows us to jettison the parts that were valuable for building our binary but are no longer needed for the container to run properly in production. We will use the following config to get rid of the main thruster so we can make it to deployment.

  

    ### Final image
    FROM alpine:latest
    WORKDIR /app
    COPY --from=factory /app/dkr-test /app/
    EXPOSE 8090
    CMD ["/app/dkr-test"]
  
  

This section should also look very familiar, like a standard Dockerfile, with the exception of the COPY line, so let's drill in there. "--from=factory" reaches back to the previous stage we named "factory". From there, in this case, we only care about the binary we built and wrote to '/app/dkr-test' in the factory stage. This config will only grab the binary, so if you need extra files you would have to copy them in a similar fashion. At this point your final image should only include the binary on top of the base Alpine image; all other files will be left behind in the factory stage.
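
As a rough sketch of what copying extra files might look like, the final stage below adds a CA certificate bundle and a templates directory. It assumes the factory stage from earlier sits above it in the same Dockerfile. The `templates` directory is hypothetical (our example app doesn't have one), and installing `ca-certificates` through apk is just one way to get root certificates into an Alpine image, so treat this as an illustration and not a drop-in config.

```dockerfile
### Final image (sketch with extra runtime files)
FROM alpine:latest
WORKDIR /app

# Root certificates, needed if the service makes outbound TLS calls.
RUN apk add --no-cache ca-certificates

# The binary built in the 'factory' stage.
COPY --from=factory /app/dkr-test /app/

# Hypothetical: templates rendered at runtime must ship alongside the binary.
COPY --from=factory /app/templates /app/templates

# Port taken from router.Run(":8090") in main.go.
EXPOSE 8090
CMD ["/app/dkr-test"]
```

Each extra COPY adds only what the running service needs, so the image stays close to the size of the binary plus the base image.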

Drum roll please... let's take a look at the size difference between the image built as a single stage and the one built as a multi-stage. From the following, you can see that our multi-stage build saved us a ton of space: 813MB down to 20.9MB. Your mileage may vary, but this pattern has been around for a while (since about 2017) and is a great way to trim off some excess deploy time.

  

    $ docker images
    REPOSITORY                                                TAG                                                                          IMAGE ID       CREATED             SIZE
    dkr-test-single                                           latest                                                                       e8acc6a5a691   About an hour ago   813MB
    dkr-test-multi                                            latest                                                                       d9e898fad32e   2 hours ago         20.9MB
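
The Dockerfile behind the single-stage image isn't shown here. A plausible version, assuming it is simply the factory stage with the run configuration attached, might look like the sketch below. Everything the build needed (source, vendored modules, compiler toolchain) stays in the image, which is where the extra size comes from.

```dockerfile
# Single-stage sketch (an assumption, not the exact file used for the numbers above)
FROM golang:1.20-alpine
COPY . /app
WORKDIR /app
RUN apk update && apk upgrade
RUN apk add git g++
RUN go mod vendor
RUN cd /app/cmd && go build -tags musl -a -race -o /app/dkr-test
# Port taken from router.Run(":8090") in main.go.
EXPOSE 8090
CMD ["/app/dkr-test"]
```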
  
  

For debugging purposes, Docker's build tooling includes some helpful options. You don't have to wait for all the stages to complete before testing, and you don't need to do any magic editing of your Dockerfile. The following command stops the build at a stage named builder (the stage and image names here are examples; for the Dockerfile above you would pass --target factory and your own tag). On top of this there are some other pretty interesting options to aid in multi-stage Docker builds. Check out the Docker docs.

  

    $ docker build --target builder -t alexellis2/href-counter:latest .
  
  
