When it comes to your images, bigger is never better. Larger images take longer to build and longer to push and pull over the network. In a data center with fast connections, pull time may go unnoticed, but in your CI/CD pipeline the slowness will be downright painful. From an ops perspective, there is no more annoying complaint than "deploys for my service are slow." While there are many reasons Docker images grow in size, we are going to focus on multi-stage Docker builds.
Multi-stage Docker builds can slim down your image in a hurry, especially if you are using a programming language that builds a single binary, like Go. Once the binary is built, most of the accompanying files and OS packages are no longer needed. Keep in mind that there are some things you may still have to copy into your final image. In my experience, things like templates that are rendered on the fly and packages that pertain to things like certificate authorities will need to find their way into the final image. Things like all of your .go files and your README, on the other hand, have no value in your deployment artifact.
For now we are going to focus on a small Go API that runs on the gin-gonic framework. All this service does is respond with "pong" to requests on "/ping". Super useful, right?! For the purpose of this example we will refer to the following code snippet as our application.
$ tree dkr-test
dkr-test
├── Dockerfile
├── cmd
│ └── main.go
├── go.mod
└── go.sum
1 directory, 4 files
$ cat cmd/main.go
package main

import (
	"net/http"

	"github.com/gin-gonic/gin"
)

func main() {
	router := gin.Default()
	router.GET("/ping", func(c *gin.Context) {
		c.JSON(http.StatusOK, gin.H{"message": "pong"})
	})
	router.Run(":8090")
}
There shouldn't be any surprises in the directory listed above. While it is a very stripped-down app, it allows us to focus on the fundamentals of a multi-stage Docker build. First let's take a look at the familiar bit. We'll be using the Alpine distro for this exercise because it is already very much geared toward slim deployments. While it may not be right for your use case, the pattern should map well to any other distro. The top section of our Dockerfile will look very familiar. There are only two surprises here: the first stage doesn't reference CMD or ENTRYPOINT, and the FROM line ends with 'as factory'.
FROM golang:1.20-alpine as factory
COPY . /app
WORKDIR /app
RUN apk update && apk upgrade
RUN apk add git g++
RUN go mod vendor
RUN cd /app/cmd && go build -tags musl -a -race -o /app/dkr-test
Nice, only a few short paragraphs in and we are into what we're all here for. Let's take a look at the FROM line and the missing CMD and/or ENTRYPOINT. The only new addition to the FROM line comes in the form of 'as factory'. You can swap 'factory' for almost any word that makes sense to you; this is telling Docker, "Hey, the things I build in this stage, I want to refer back to as the 'factory' stage." Later on we will see how we call back to the factory stage to pull out the necessary parts. We also omit the CMD or ENTRYPOINT config here because we aren't done building yet; we have more config to write. Once Docker finishes this build stage, we essentially have a single-stage Docker build with everything in the current directory copied into the container's "/app" directory. That, together with the OS packages and Go vendor files added by the RUN lines, is almost entirely unused by our running application, so why keep it around and let it slow down our CD pipeline?
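As an aside, naming the stage is a convenience, not a requirement: Docker also numbers stages from zero, top to bottom, so you can reach back into an unnamed stage by index. A sketch for illustration only (not the Dockerfile used in this post):

```dockerfile
# First stage is stage 0, even without 'as factory'.
FROM golang:1.20-alpine
# ... build steps ...

FROM alpine:latest
# --from=0 reaches back into the first (unnamed) stage.
COPY --from=0 /app/dkr-test /app/
```

Named stages are usually kinder to future readers, though, especially once a Dockerfile grows past two stages.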
The second stage of the build lets us jettison the parts that were valuable while building our binary but are no longer needed for the container to run properly in production. We will use the following config to drop the main thruster so we can make it to deployment.
### Final image
FROM alpine:latest
WORKDIR /app
COPY --from=factory /app/dkr-test /app/
EXPOSE 8090
CMD ["/app/dkr-test"]
This section should also look very familiar from a standard Dockerfile, with the exception of the COPY line, so let's drill in there. "--from=factory" reaches back to the stage we named "factory". From there, in this case, we only care about the binary we built and output to '/app/dkr-test' in the factory stage. This config grabs only the binary, so if you need extra files you have to copy them in a similar fashion. At this point your final image should include only the binary; all other files are left behind in the factory stage.
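For example, if your service needed the on-the-fly templates and certificate-authority packages mentioned earlier, the final stage might look something like this (the templates path here is hypothetical, just to show the shape):

```dockerfile
FROM alpine:latest
WORKDIR /app
# CA certificates so the binary can make outbound TLS calls.
RUN apk add --no-cache ca-certificates
COPY --from=factory /app/dkr-test /app/
# Hypothetical: templates the binary renders at runtime.
COPY --from=factory /app/templates /app/templates
EXPOSE 8090
CMD ["/app/dkr-test"]
```

Each COPY --from line is a deliberate decision about what survives into production, which is exactly the point of the pattern.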
Drum roll please... let's take a look at the size difference between the image built as a single stage and the one built as a multi-stage. From the following, you can see that our multi-stage build saved us a ton of space. Your mileage may vary, but this pattern has been around for a while (since about 2017) and is a great way to trim off some excess deploy time.
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
dkr-test-single latest e8acc6a5a691 About an hour ago 813MB
dkr-test-multi latest d9e898fad32e 2 hours ago 20.9MB
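To put a number on it, a quick bit of shell arithmetic on the sizes from the listing above:

```shell
# 813MB single-stage vs 20.9MB multi-stage, per `docker images` above
awk 'BEGIN { printf "%.1f%% smaller\n", (1 - 20.9/813) * 100 }'
# prints "97.4% smaller"
```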
For debugging purposes, Docker has added some helpful options via BuildKit. You don't have to wait for all the stages to complete before testing, or do some magic editing to your Dockerfile. The following command will stop at the stage we named factory. On top of this, there are some other pretty interesting options to aid in multi-stage Docker builds. Check out the Docker docs.
$ docker build --target factory -t dkr-test:latest .