How-tos

Container builds with multiple stages in IBM Cloud Container Registry


The IBM Cloud Container Registry team has been working to enable users to run their container builds in IBM Cloud. This capability was available to users of single containers or container groups, and we’re proud to announce that now cluster users can use it too. We’ve also taken the opportunity to add some new features. There’s a new command, bx cr build, and I’d like to highlight one of the new features that can help simplify your container builds.

When you’re building Docker images, it’s generally a good idea to keep the image as minimal as possible. Smaller images are faster to deploy, and fewer running components mean fewer vulnerabilities that could be used to break into your system. It’s best to keep anything that your app doesn’t need at runtime out of your final Docker image, and that includes your build environment.

It’s typical to use a shell script to perform some actions before the result is used in your container build. For example, you might build a binary in the shell script, and then add the built binary to your Docker image, so that you don’t need to put the source code inside the image. However, because your shell script doesn’t run inside a container, you must make sure that it works on both your developers’ machines and your continuous integration (CI) pipeline.

It doesn’t have to be this way.

Docker 17.05 Edge and 17.06 Stable include a new feature called multi-stage builds, which aims to do away with those shell scripts. You can specify multiple FROM lines in your Dockerfile as separate stages, build something in each stage, and then add the results into a minified image at the end of the build. Because everything runs in Docker, it ensures that the build result is repeatable, so that you can be sure that a container build that works on your workstation will work on someone else’s, or in your CI pipeline.

The setup

So let’s convert this build.sh and Dockerfile combination into a multi-stage build:

build.sh

git clone myamazinggolangproject.git
cd myamazinggolangproject
go get -t -u
go test ./...
CGO_ENABLED=0 GOOS=linux /usr/local/go/bin/go build -a
docker build --no-cache --pull -t registry.com/image:latest .

Dockerfile

FROM scratch
COPY myamazinggolangproject /myamazinggolangproject
ENTRYPOINT ["/myamazinggolangproject"]
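For context, the binary being packaged could be any pure-Go program. Here’s a hypothetical stand-in for myamazinggolangproject (the name and output are invented for illustration): because it avoids cgo, it compiles with CGO_ENABLED=0 into a statically linked binary that can run on the empty scratch base image.

```go
// main.go — a hypothetical stand-in for myamazinggolangproject.
// Any pure-Go program works the same way: with CGO_ENABLED=0 it
// builds a statically linked binary that runs on scratch.
package main

import "fmt"

// banner builds the startup message; it lives in its own function
// so that it is easy to unit test with `go test`.
func banner(name string) string {
	return fmt.Sprintf("%s is running in a minimal container", name)
}

func main() {
	fmt.Println(banner("myamazinggolangproject"))
}
```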

The process

In short, you want everything before the docker build command in the build.sh file to be in an intermediate step in the Dockerfile. You need to run go build in an environment that has Go installed, so let’s use the golang official Docker image as the base for your first stage – and let’s use a specific version of that image to make sure that you don’t accidentally switch versions when you’re not expecting to.

You could just have one build stage here, build from golang, put the source code in, run go build, and be done with it. But the golang image is fairly large, because it contains the entire build and test toolchain, and might result in a Docker image that is hundreds of megabytes in size at the end of your build. Plus, your Go source files end up inside the image. If you build from the scratch base image instead, a statically linked Go binary can run inside a container image that’s barely larger than the binary itself. Due to their smaller size, minified images can reduce your storage costs and improve your deployment times significantly.

The result


FROM golang:1.8 as gobuild
WORKDIR /go/src/github.com
RUN git clone https://github.com/myamazinggolangproject.git
WORKDIR /go/src/github.com/myamazinggolangproject
RUN go get -t -u
RUN go test ./...
RUN CGO_ENABLED=0 GOOS=linux /usr/local/go/bin/go build -a -o app

FROM scratch
COPY --from=gobuild /go/src/github.com/myamazinggolangproject/app /app
ENTRYPOINT ["/app"]

Notice that the first stage is named “gobuild”, and it is referred to by name in the second stage. This name tells Docker which image to refer to. If you don’t specify names for your stages, you can also refer to them by number, starting from zero:
COPY --from=0 [...]
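As an illustrative sketch (not from the original example, and with the build steps abbreviated), the same two-stage build with unnamed stages would look like this, where --from=0 refers back to the first FROM:

```dockerfile
# Stage 0: build environment (referenced later by index, not name)
FROM golang:1.8
WORKDIR /go/src/github.com/myamazinggolangproject
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -a -o app

# Stage 1: minimal runtime image
FROM scratch
COPY --from=0 /go/src/github.com/myamazinggolangproject/app /app
ENTRYPOINT ["/app"]
```

Named stages are generally easier to maintain, though, because stage indexes shift if you insert a new stage earlier in the Dockerfile.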

That’s it!

The build steps are effectively the same, but you’re running each line of the build.sh as a RUN instruction in the image. Now, on any system that supports Linux containers, you can build the image by using
docker build --no-cache --pull -t registry.com/image:latest .
and you’ll get reproducible results.

Here’s the great thing: those reproducible results even extend out to the IBM Cloud. This means that you can be sure that you’ll get the same result from running
bx cr build --no-cache --pull -t registry.com/image:latest .
as you would have done locally. Plus, more of your build runs in the cloud, so you get more benefit from using our powerful servers.

You can get started with multi-stage builds on your workstation, or by using bx cr build in IBM Cloud today! If you want to use Docker on your workstation, make sure you update to the latest version of Docker. If you want to use IBM Cloud to run your builds, you don’t have to install Docker on your workstation.

Useful links

IBM Cloud Container Registry docs
Docker docs
Compare notes with fellow Devs on the product team
