10 best practices for creating good Docker images


Today, a colleague asked me some questions concerning Docker, and I decided to sum them up in an article, since they might be useful to someone else, too.
The Docker docs already contain a section, Best practices for writing Dockerfiles, with a lot of useful information, but in my humble opinion it’s far from complete.
So here we go:

1 – Carefully look at how your PID 1 handles UNIX signals

Docker containers are intended to run a single process. The process ID (or PID) of this process is 1 by definition. When you issue a docker stop command to your container, the Docker daemon sends a SIGTERM signal to your PID 1 process. If the process hasn’t exited after a grace period (10 seconds by default), the daemon sends a SIGKILL, killing it forcibly. In that case your container cannot shut down cleanly, since none of its registered signal handlers fire.

A mysql container, for example, will not be able to shut down the database cleanly in this case, and a tomcat container will not call the Java shutdown hooks. The desired behavior for the PID 1 process is to (gracefully) shut down all child processes and then exit.

All containers running a shell (no matter if it’s sh or bash) as PID 1 fail to comply with this requirement, because a shell does not forward signals to its children by default. While it’s possible to configure a shell to react correctly to SIGTERM, it’s probably easier to use tools specifically designed for that purpose:

2 – Use dumb-init

dumb-init was created to tackle exactly this problem. Or, as its creator puts it: “dumb-init is a simple process supervisor and init system designed to run as PID 1 inside minimal container environments (such as Docker).”

dumb-init is quite simple to use: just make it your container’s entrypoint, like this:

# Runs "/usr/bin/dumb-init -- /my/script --with --args"
ENTRYPOINT ["/usr/bin/dumb-init", "--"]

# or if you use --rewrite or other cli flags
# ENTRYPOINT ["dumb-init", "--rewrite", "2:3", "--"]

CMD ["/my/script", "--with", "--args"]

When your container starts, dumb-init will forward the SIGTERM signal it receives on docker stop to its child processes, causing them to exit gracefully.

Note: Docker 1.13 and above ships with tini (tiny init), which does the same thing as dumb-init; you can enable it with the --init flag of docker run. I haven’t tried it myself, though.

3 – Concatenate statements in Dockerfiles

This one’s quite common knowledge already: in your Dockerfile, try to chain as many statements as possible with && \ , like in this snippet from an AIT Liferay Docker image:

RUN apk update && \
 apk add ca-certificates curl && \
 curl -o /tmp/liferay.zip https://netix.dl.sourceforge.net/project/lportal/Liferay%20Portal/7.0.3%20GA4/liferay-ce-portal-tomcat-7.0-ga4-20170613175008905.zip && \
 cd /opt && \
 unzip /tmp/liferay.zip && \
 mkdir ${LIFERAY_BASE}/scripts && \
 mkdir /var/liferay-home && \
 echo "liferay.home=/var/liferay-home" > /opt/liferay-ce-portal-7.0-ga4/tools/portal-tools-db-upgrade-client/portal-upgrade-ext.properties && \
 mv ${LIFERAY_DIR}/data /var/liferay-home/ && \
 mv ${LIFERAY_DIR}/osgi /var/liferay-home/ && \
 mv ${LIFERAY_DIR}/tomcat-8.0.32/conf /var/liferay-home/ && \
 mv ${LIFERAY_DIR}/tomcat-8.0.32/webapps /var/liferay-home/ && \
 rm /tmp/liferay.zip && \
 rm -rf /var/cache/apk/*

This minimizes the number of layers your Docker image will eventually consist of. Note, though, that one long RUN statement reduces cacheability during docker build (any change re-runs the whole chain), so you might split it up while developing, but be sure to use the combined form in production images.
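As a minimal sketch of why this matters (the apk packages here are just examples): every RUN statement creates a layer of its own, and files deleted in a later layer still occupy space in the earlier one:

```dockerfile
# Three layers – the apk cache survives in layer 2 even though
# layer 3 deletes it, so the image stays large:
RUN apk update
RUN apk add build-base
RUN rm -rf /var/cache/apk/*

# One layer – the cache is removed before the layer is committed:
RUN apk update && \
    apk add build-base && \
    rm -rf /var/cache/apk/*
```

You can compare the two variants yourself with docker history <image>, which lists each layer and its size.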

4 – Remove everything you don’t need

This one’s not new either, but it’s still important and ignored often enough, leading to unnecessarily large Docker images: at the end of your Dockerfile, remove everything you don’t need to run your container, which includes:

  1. Package manager caches (apt lists, or the apk cache in Alpine Linux)
  2. All packages needed at build time only (compilers, kernel headers, etc.)
  3. All packages of the default install that your container doesn’t need at runtime.

Following this practice, your Dockerfile could look like this:

FROM debian:8
# BUILD_PACKAGES and do-your-stuff are placeholders for your
# build-time dependencies and your actual build steps
RUN apt-get update && apt-get -y upgrade && \
    apt-get -y install $BUILD_PACKAGES && \
    do-your-stuff && \
    apt-get remove --purge -y $BUILD_PACKAGES && \
    apt-get autoremove -y && \
    rm -rf /var/lib/apt/lists/*

There are also a lot of good articles on making Docker images smaller; one of them, although a bit dated, is Dave Beckett’s well-written blog entry.

5 – Use supervisord when you need to run several services in a single container

supervisord is quite useful when it comes to running more than one process in your Docker container. It controls a number of processes via its supervisord.conf file and restarts them if necessary. With this setup, you could run a crond besides your httpd in a single container.

Note that it’s good practice to use one container per concern, so running multiple processes in a single container is generally not recommended, but in some cases it might still be necessary.
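A minimal sketch of such a setup (program names and paths are illustrative, not taken from a real image):

```ini
; supervisord.conf – run httpd and crond side by side in one container
[supervisord]
nodaemon=true                        ; stay in the foreground as PID 1

[program:httpd]
command=/usr/sbin/httpd -DFOREGROUND
autorestart=true

[program:crond]
command=/usr/sbin/crond -f
autorestart=true
```

The container would then start supervisord as its main process, e.g. with CMD ["supervisord", "-c", "/etc/supervisord.conf"] in the Dockerfile.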

6 – Use named volumes over host volumes whenever possible

Docker knows two types of volumes: named volumes and host volumes.

Prefer named volumes over host volumes wherever possible:

  1. Named volumes can be directly controlled (created, removed) by the docker volume command
  2. Named volumes are independent of any paths on the host
  3. Named volumes can be backed up and restored easily
  4. Named volumes declared in your docker-compose file are created automatically

In general, named volumes make your Docker setups much more portable, so sharing your containers and docker-compose files becomes frictionless.
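A minimal docker-compose sketch to illustrate the difference (the volume name dbdata is just an example); the named volume is created automatically on docker-compose up:

```yaml
version: "2"
services:
  db:
    image: mysql:5.7
    volumes:
      - dbdata:/var/lib/mysql            # named volume: portable
      # - /home/me/mysql:/var/lib/mysql  # host volume: tied to this machine

volumes:
  dbdata:
```

Afterwards you can manage the volume with docker volume ls, docker volume inspect dbdata, and so on.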

7 – Write a useful entrypoint script

Writing entrypoint scripts is a high art and usually requires wizardry in shell scripting. Especially when linking containers (e.g. a web app and a database) in a docker-compose file, there are a lot of potential problems for a good startup script to solve:

  1. Check preconditions before you fire up your main process
    Containers tend to take a certain amount of time until they’re ready to fulfill their purpose. When your web app container tries to access your database before it’s ready, it will most likely fail. So, if you depend on a database in a webapp container, check if it’s ready before you start your webapp.
  2. Make use of commands and args in your entrypoint script
    The docker run command takes optional commands or other args (docker run <image> cmd) which are available in your entrypoint script. Use them to provide different modes of operation for your container (‘upgrade’, ‘check’) to make its usage more versatile.
  3. Use exec in your entrypoint script
    exec replaces the current process (your entrypoint script) with the one you call at the end of the script. So “exec httpd” effectively terminates your entrypoint script and puts the httpd process in its place in the process hierarchy. This is necessary for correct handling of signals (SIGTERM).
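The three points above can be sketched in a single script. This is a hypothetical example: the mode names (‘upgrade’, ‘check’), the DB_HOST/DB_PORT defaults, the nc probe, and the httpd command are illustrative assumptions, not from a real image.

```shell
#!/bin/sh
# Sketch of an entrypoint script for a hypothetical web app container.

wait_for_db() {
  # 1. precondition check: poll until the database port accepts connections
  tries=0
  until nc -z "${DB_HOST:-db}" "${DB_PORT:-5432}" 2>/dev/null; do
    tries=$((tries + 1))
    if [ "$tries" -ge 30 ]; then
      echo "database not reachable, giving up"
      return 1
    fi
    sleep 1
  done
}

entrypoint() {
  # 2. optional commands make the container more versatile
  case "$1" in
    upgrade) echo "running upgrade scripts" ;;     # e.g. schema migrations
    check)   echo "running precondition checks" ;;
    *)
      wait_for_db || exit 1
      # 3. exec replaces this script, so the main process becomes PID 1
      # and receives SIGTERM directly on 'docker stop':
      # exec httpd -DFOREGROUND
      echo "would exec the main process here"
      ;;
  esac
}

# in the real image, the script would end with:
# entrypoint "$@"
```

With ENTRYPOINT ["/entrypoint.sh"] in the Dockerfile, docker run <image> upgrade would then run the upgrade mode instead of the main process.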

8 – Don’t use the root user, when possible

This is mostly a security practice. It takes a bit of extra effort in your Dockerfile, but it’s worth it.
Gerard Braad has written a very comprehensive article about that, so I won’t elaborate further on this.
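Still, a minimal sketch of the idea (the user name and UID are arbitrary) looks like this on Alpine:

```dockerfile
FROM alpine:3.6
# create an unprivileged user and group ...
RUN addgroup -g 1000 app && \
    adduser -D -u 1000 -G app app
# ... and run everything below (and the container itself) as that user
USER app
CMD ["sh"]
```

Everything after the USER statement, including the container’s main process, then runs without root privileges.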

9 – When using dev containers, use a welcome message

Dev containers are a special breed of containers. We at AIT use them mainly to provide development environments for specific scenarios. To develop web applications, for example, we use containers with all the huge toolchains (gulp, npm, …) installed in the correct versions. Developers can just fire up the dev container, exec into it, and start developing right away, without cluttering their machines with all the tools needed for a modern web app. When they’re done, they remove the container, and everything (including .npm caches and the like) is gone and no longer fills up their hard drives.

For these containers, it’s useful to provide welcome messages to give the developers all the necessary information right away:

~# docker exec -ti my-dev bash
* This is the XXX dev environment
* common commands here are:
* 'blade server start -b' starts a liferay instance in liferay workspace
* after startup, server is available under http://localhost:8280
* 'blade server stop' stops a running liferay instance in liferay workspace
* 'log_tomcat_catalina.sh' tails you on the catalina.sh of liferay
* 'log_tomcat_access.sh' tails you on the tomcat access log
* 'blade gw tasks' shows you the gradle tasks
* 'blade samples' shows you available project samples
* 'blade samples blade.portlet.ds' creates a samples module (must be in /development/workspace/modules for that)
* 'cd /development/workspace;./gradlew initBundle' creates a tomcat bundle with current configuration
* more information on Liferay 7 development: https://dev.liferay.com/develop/tutorials/-/knowledge_base/7-0/tooling
* more information on Workspace: https://dev.liferay.com/develop/tutorials/-/knowledge_base/7-0/liferay-workspace
* more information on blade: https://dev.liferay.com/develop/tutorials/-/knowledge_base/7-0/blade-cli

To display this message to developers when they exec into the container, a small snippet is necessary in your Dockerfile:

# copy the motd file into the home dir of the container user ('development')
COPY files/motd /development/
# at the end of your Dockerfile, append the command to .bashrc
RUN echo "cat /development/motd" >> /development/.bashrc

Small productivity boosters like this save huge amounts of time over the course of a year …

10 – Use Alpine Linux as much as possible

Alpine Linux has been hugely successful recently, riding on Docker’s tremendous wave of success. It is a close-to-perfect Linux distribution for containers: it has an incredibly small footprint and a package system, and it is the basis for an ever-increasing number of Docker images on Docker Hub. There are even complete alpine-oracle-java images, which are the basis for nearly all Java-related Docker images here at AIT. Use it wherever possible.

Summing up

Docker has rightfully revolutionized the way we run software. But getting into its deeper (and sometimes darker) secrets is hard, so my intention was to sum up some of the experience we’ve gathered here over three years of Docker adoption. Hope it helps!

About the author

Michael Hager is founder and managing director of artindustrial informationstechnologien GmbH in Vienna, Austria.

His fields of work include cloud computing, Angular2 application development, and Java application development.
