Cleaner Dockerfiles with HEREDOC

By Jay

Feb 7, 2024 | 8 minutes read

Series: Docker

Tags: blog, tech, docker

Today I want to talk about Dockerfiles. As a long-time Docker user, I’ve been writing, maintaining, and debugging these files for a decade. One downside of being so familiar with Dockerfiles is that I often forget to check on new features and enhancements. Assuming I’m not the only one who forgets, I want to look at HEREDOC support in Docker and how it can help clean up your Dockerfiles. We’ll do this via a common scenario: building a custom base image.

We have been tasked by the development lead to build a MongoDB image that the dev team can use as a base image. The goals are to:

  • Create an image that can easily be updated as updates come from the upstream packages that are in use.
  • Have an easily parsed/understood Dockerfile so that anyone familiar with Docker can understand what is being done.
  • Make sure that the image is as efficient as possible (minimal dead space).

This is the first attempt. It does everything we need, but it is overly complex and not very efficient:

  • Each RUN instruction results in a new layer for the image.
  • The delete at the end doesn’t free up any space; the files still exist in the earlier layers and are only hidden.
  • No comments/documentation in the file.

FROM ubuntu
MAINTAINER Jay Schmidt

RUN apt update
RUN apt -y install apt-utils
RUN apt -y install gnupg curl
RUN curl -fsSL https://pgp.mongodb.com/server-7.0.asc | \
    gpg -o /usr/share/keyrings/mongodb-server-7.0.gpg --dearmor
RUN echo "deb [ arch=amd64,arm64 signed-by=/usr/share/keyrings/mongodb-server-7.0.gpg ] https://repo.mongodb.org/apt/ubuntu jammy/mongodb-org/7.0 multiverse" | tee /etc/apt/sources.list.d/mongodb-org-7.0.list
RUN apt update
RUN apt -y install mongodb-org
RUN apt -y install htop vim git
RUN apt -y clean
RUN rm -rf /var/lib/apt/lists/*


CMD ["/usr/bin/mongod", "--config", "/etc/mongodb.conf"] 

file1

$ dive --ci new-directives:v1
  Using default CI config
Image Source: docker://new-directives:v1
Fetching image... (this can take a while for large images)
Analyzing image...
  efficiency: 94.4330 %
  wastedBytes: 51179794 bytes (51 MB)
  userWastedPercent: 6.2098 %
Inefficient Files:
Count  Wasted Space  File Path
    2         28 MB  /var/lib/apt/lists/ports.ubuntu.com_ubuntu-ports_dists_jammy_universe_binary-arm64_Packages.lz4
    5        3.1 MB  /var/cache/debconf/templates.dat
    2        2.9 MB  /var/lib/apt/lists/ports.ubuntu.com_ubuntu-ports_dists_jammy_main_binary-arm64_Packages.lz4
    2        2.4 MB  /var/lib/apt/lists/ports.ubuntu.com_ubuntu-ports_dists_jammy-updates_main_binary-arm64_Packages.lz4
    2        2.0 MB  /var/lib/apt/lists/ports.ubuntu.com_ubuntu-ports_dists_jammy-updates_universe_binary-arm64_Packages.lz4
    2        1.9 MB  /var/lib/apt/lists/ports.ubuntu.com_ubuntu-ports_dists_jammy-security_main_binary-arm64_Packages.lz4
    2        1.9 MB  /var/lib/apt/lists/ports.ubuntu.com_ubuntu-ports_dists_jammy-updates_restricted_binary-arm64_Packages.lz4
    2        1.9 MB  /var/lib/apt/lists/ports.ubuntu.com_ubuntu-ports_dists_jammy-security_restricted_binary-arm64_Packages.lz4
    2        1.6 MB  /var/lib/apt/lists/ports.ubuntu.com_ubuntu-ports_dists_jammy-security_universe_binary-arm64_Packages.lz4
    2        1.0 MB  /var/cache/debconf/templates.dat-old
    5        977 kB  /var/log/dpkg.log
    5        679 kB  /var/lib/dpkg/status
    4        582 kB  /var/lib/dpkg/status-old
    3        540 kB  /var/lib/apt/lists/ports.ubuntu.com_ubuntu-ports_dists_jammy_InRelease
    2        356 kB  /var/lib/apt/lists/ports.ubuntu.com_ubuntu-ports_dists_jammy_multiverse_binary-arm64_Packages.lz4
    3        238 kB  /var/lib/apt/lists/ports.ubuntu.com_ubuntu-ports_dists_jammy-updates_InRelease
    3        221 kB  /var/lib/apt/lists/ports.ubuntu.com_ubuntu-ports_dists_jammy-security_InRelease
    3        217 kB  /var/lib/apt/lists/ports.ubuntu.com_ubuntu-ports_dists_jammy-backports_InRelease
    5        104 kB  /var/log/apt/history.log
    2         80 kB  /var/lib/apt/lists/ports.ubuntu.com_ubuntu-ports_dists_jammy-backports_main_binary-arm64_Packages.lz4
    2         60 kB  /var/log/lastlog
    4         57 kB  /var/log/apt/term.log
    5         56 kB  /var/cache/debconf/config.dat
    2         47 kB  /var/lib/apt/lists/ports.ubuntu.com_ubuntu-ports_dists_jammy-updates_multiverse_binary-arm64_Packages.lz4
    2         42 kB  /var/lib/apt/lists/ports.ubuntu.com_ubuntu-ports_dists_jammy-backports_universe_binary-arm64_Packages.lz4
    2         42 kB  /var/lib/apt/lists/repo.mongodb.org_apt_ubuntu_dists_jammy_mongodb-org_7.0_multiverse_binary-amd64_Packages.lz4
    2         40 kB  /var/lib/apt/lists/ports.ubuntu.com_ubuntu-ports_dists_jammy-security_multiverse_binary-arm64_Packages.lz4
    2         40 kB  /var/lib/apt/lists/repo.mongodb.org_apt_ubuntu_dists_jammy_mongodb-org_7.0_multiverse_binary-arm64_Packages.lz4
    2         39 kB  /var/lib/apt/lists/ports.ubuntu.com_ubuntu-ports_dists_jammy_restricted_binary-arm64_Packages.lz4
    5         33 kB  /var/log/apt/eipp.log.xz
    3         20 kB  /etc/ld.so.cache
    3         20 kB  /var/cache/ldconfig/aux-cache
    3         20 kB  /var/log/alternatives.log
    2         19 kB  /var/cache/debconf/config.dat-old
    5        8.8 kB  /var/lib/apt/extended_states
    2        6.5 kB  /var/log/faillog
    2        2.1 kB  /var/lib/apt/lists/repo.mongodb.org_apt_ubuntu_dists_jammy_mongodb-org_7.0_Release
    2        1.9 kB  /etc/passwd
    3        1.4 kB  /etc/group
    3        1.2 kB  /etc/gshadow
    2        1.0 kB  /etc/shadow
    2         929 B  /etc/group-
    2         866 B  /var/lib/apt/lists/repo.mongodb.org_apt_ubuntu_dists_jammy_mongodb-org_7.0_Release.gpg
    2         779 B  /etc/gshadow-
    2         508 B  /var/lib/dpkg/diversions
    2         261 B  /var/lib/dpkg/alternatives/pager
    6           0 B  /var/cache/apt/archives/lock
    5           0 B  /var/lib/dpkg/updates
    5           0 B  /var/lib/dpkg/triggers/Lock
    8           0 B  /var/lib/apt/lists/auxfiles
    5           0 B  /var/cache/debconf/passwords.dat
    6           0 B  /var/cache/apt/archives/partial
    7           0 B  /tmp
    4           0 B  /var/lib/apt/lists/lock
    4           0 B  /var/lib/apt/lists/partial
    5           0 B  /var/lib/dpkg/lock-frontend
    2           0 B  /var/lib/dpkg/triggers/Unincorp
    3           0 B  /etc/.pwd.lock
    2           0 B  /etc/alternatives/pager
    5           0 B  /var/lib/dpkg/lock
Results:
  PASS: highestUserWastedPercent
  SKIP: highestWastedBytes: rule disabled
  PASS: lowestEfficiency
Result:PASS [Total:3] [Passed:2] [Failed:0] [Warn:0] [Skipped:1]

As the summary at the top of the dive output shows, this image is not very efficient:

Analyzing image...
efficiency: 94.4330 %
wastedBytes: 51179794 bytes (51 MB)
userWastedPercent: 6.2098 %
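
Most of that waste is the apt list files, which the final rm hides but cannot remove from the earlier layers. The script below is a toy model of that effect, not a Docker command: it treats each layer as a plain tar archive, and the paths and variable names are made up for illustration. The .wh. prefix mirrors the whiteout marker that OCI image layers use to record a deleted file.

```shell
#!/bin/sh
# Toy model of two image layers as tar archives. It does not call Docker;
# every path and name here is hypothetical and only illustrates the idea.
set -eu

work=$(mktemp -d)

# "Layer 1": an earlier RUN step writes 1 MiB of package index data.
mkdir -p "$work/layer1/var/lib/apt/lists"
dd if=/dev/zero of="$work/layer1/var/lib/apt/lists/Packages" bs=1024 count=1024 2>/dev/null

# "Layer 2": a later RUN step deletes the file. The new layer can only
# record a small deletion marker; it cannot shrink layer 1.
mkdir -p "$work/layer2/var/lib/apt/lists"
: > "$work/layer2/var/lib/apt/lists/.wh.Packages"

tar -cf "$work/layer1.tar" -C "$work/layer1" .
tar -cf "$work/layer2.tar" -C "$work/layer2" .

# The arithmetic expansion strips any padding that wc adds on some systems.
layer1_bytes=$(( $(wc -c < "$work/layer1.tar") ))
layer2_bytes=$(( $(wc -c < "$work/layer2.tar") ))
total_bytes=$(( layer1_bytes + layer2_bytes ))

rm -rf "$work"

echo "layer 1 (writes the file):  $layer1_bytes bytes"
echo "layer 2 (deletes the file): $layer2_bytes bytes"
echo "shipped in total:           $total_bytes bytes"
```

The second archive is small, yet the total is still larger than the first archive on its own: deleting in a later layer adds data instead of removing it. That is why the cleanup has to happen in the same RUN instruction that created the files.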

We can (and should) do better.

The second attempt is cleaned up quite a bit. Specifically:

  • The commands are all joined with && so that they run in a single RUN instruction, which minimizes layers.
  • Using && ensures the build fails as soon as any command in the chain fails.

We are still missing some documentation.

FROM ubuntu
MAINTAINER Jay Schmidt

RUN apt update && \
 apt -y install apt-utils && \
 apt -y install gnupg curl && \
 curl -fsSL https://pgp.mongodb.com/server-7.0.asc |  \
 gpg -o /usr/share/keyrings/mongodb-server-7.0.gpg --dearmor && \
 echo "deb [ arch=amd64,arm64 signed-by=/usr/share/keyrings/mongodb-server-7.0.gpg ] https://repo.mongodb.org/apt/ubuntu jammy/mongodb-org/7.0 multiverse" | tee /etc/apt/sources.list.d/mongodb-org-7.0.list && \
 apt update && \
 apt -y install mongodb-org && \
 apt -y install htop vim git && \
 apt -y clean && \
 rm -rf /var/lib/apt/lists/* 


CMD ["/usr/bin/mongod", "--config", "/etc/mongodb.conf"] 

file2
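
The fail-fast behavior of the && chain above can be seen in isolation with a short shell sketch. The step function is a hypothetical stand-in for real build commands such as apt or curl. The second half shows a caveat that applies to the curl | gpg line: for a pipeline, && only sees the exit status of the last command, so a failure on the left side of the pipe does not stop the chain by itself.

```shell
#!/bin/sh
# Stand-alone sketch of how an && chain behaves. "step" is a hypothetical
# helper that prints a label and returns the exit status it is given.

step() {
    # $1 = label, $2 = exit status the fake command should return
    echo "running $1"
    return "$2"
}

# The chain stops at the first failing step, and the chain's exit status
# is that step's status. In a Dockerfile, a nonzero status fails the RUN.
chain_status=0
step "update" 0 && step "install" 1 && step "cleanup" 0 || chain_status=$?
echo "chain exit status: $chain_status"

# Caveat: a pipeline reports the status of its LAST command only.
# Here the left side fails, but the pipeline as a whole succeeds.
pipe_status=0
false | cat || pipe_status=$?
echo "pipeline exit status: $pipe_status"
```

The "cleanup" step never runs and the chain reports status 1, while the pipeline reports 0 even though false failed. Whether gpg rejects the empty input left behind by a failed curl is not something this sketch verifies.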

$ dive --ci jasonschmidt617/new-directives:v2
  Using default CI config
Image Source: docker://jasonschmidt617/new-directives:v1
Fetching image... (this can take a while for large images)
Analyzing image...
  efficiency: 99.8415 %
  wastedBytes: 2187086 bytes (2.2 MB)
  userWastedPercent: 0.2819 %
Inefficient Files:
Count  Wasted Space  File Path
    2        1.3 MB  /var/cache/debconf/templates.dat
    2        400 kB  /var/log/dpkg.log
    2        296 kB  /var/lib/dpkg/status
    2         60 kB  /var/log/lastlog
    2         42 kB  /var/log/apt/history.log
    2         20 kB  /var/cache/debconf/config.dat
    2         15 kB  /var/log/alternatives.log
    2         15 kB  /var/log/apt/eipp.log.xz
    2         14 kB  /etc/ld.so.cache
    2         14 kB  /var/cache/ldconfig/aux-cache
    2        6.5 kB  /var/log/faillog
    2        4.6 kB  /var/lib/apt/extended_states
    2        1.9 kB  /etc/passwd
    2        1.0 kB  /etc/shadow
    2         926 B  /etc/group
    2         776 B  /etc/gshadow
    2         508 B  /var/lib/dpkg/diversions
    2         261 B  /var/lib/dpkg/alternatives/pager
    2           0 B  /var/cache/debconf/passwords.dat
    2           0 B  /var/lib/dpkg/lock
    2           0 B  /var/cache/apt/archives/partial
    2           0 B  /var/cache/apt/archives/lock
    2           0 B  /tmp
    2           0 B  /var/lib/dpkg/lock-frontend
    2           0 B  /etc/.pwd.lock
    2           0 B  /etc/alternatives/pager
    2           0 B  /var/lib/dpkg/updates
    2           0 B  /var/lib/apt/lists
    2           0 B  /var/lib/dpkg/triggers/Unincorp
    2           0 B  /var/lib/dpkg/triggers/Lock
Results:
  PASS: highestUserWastedPercent
  SKIP: highestWastedBytes: rule disabled
  PASS: lowestEfficiency
Result:PASS [Total:3] [Passed:2] [Failed:0] [Warn:0] [Skipped:1]

We’ve managed to clean up the image quite a bit, with very little space wasted:

Analyzing image...
  efficiency: 99.8415 %
  wastedBytes: 2187086 bytes (2.2 MB)
  userWastedPercent: 0.2819 %

If we want to clean up more space, we could go after the various log files shown in the CI output to reclaim that last 2.2 MB of space.

On the downside, all of the && lines make the Dockerfile somewhat difficult to read. Let’s check out how we can fix that with the HEREDOC support in the Dockerfile.

This Dockerfile builds on the lesson we’ve learned above - the image produced has very little wasted space due to our layer management. We are now going to use the HEREDOC support to clean up the Dockerfile, and add in some commentary.

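# syntax=docker/dockerfile:1.3-labs
# The syntax directive above selects Docker's 1.3 Labs frontend, which provides heredoc support.
# It must be the very first line of the file; after any other comment it is treated as a plain comment.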

# Base image is Ubuntu, the latest version available
FROM ubuntu

# Maintainer of the Dockerfile (MAINTAINER is deprecated, use LABEL instead)
LABEL MAINTAINER="Jay Schmidt"

# Set noninteractive to avoid prompts during package installation
ARG DEBIAN_FRONTEND=noninteractive

# Begin a multiline command
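RUN <<EOF
# Stop at the first failing command, as the && chain did; a heredoc script does not do this by default
set -e
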
# Update the package list
apt update 

# Install essential packages including utilities and Git
apt -y install apt-utils gnupg curl htop vim git 

# Add MongoDB's GPG key for verifying packages
curl -fsSL https://pgp.mongodb.com/server-7.0.asc | 
gpg -o /usr/share/keyrings/mongodb-server-7.0.gpg --dearmor

# Add MongoDB to the list of sources from which packages can be obtained
echo "deb [ arch=amd64,arm64 signed-by=/usr/share/keyrings/mongodb-server-7.0.gpg ] https://repo.mongodb.org/apt/ubuntu jammy/mongodb-org/7.0 multiverse" | tee /etc/apt/sources.list.d/mongodb-org-7.0.list

# Update the package list again, this time including MongoDB packages
apt update

# Install MongoDB
apt -y install mongodb-org

# Clean up to reduce the image size
apt -y clean 

# Remove the list of saved package repositories to reduce image size
rm -rf /var/lib/apt/lists/* 
EOF

# Define the default command to run MongoDB when the container starts
CMD ["/usr/bin/mongod", "--config", "/etc/mongodb.conf"]

file3

$ dive --ci jasonschmidt617/new-directives:v3
  Using default CI config
Image Source: docker://jasonschmidt617/new-directives:v3
Fetching image... (this can take a while for large images)
Analyzing image...
  efficiency: 99.8416 %
  wastedBytes: 2186265 bytes (2.2 MB)
  userWastedPercent: 0.2817 %
Inefficient Files:
Count  Wasted Space  File Path
    2        1.3 MB  /var/cache/debconf/templates.dat
    2        400 kB  /var/log/dpkg.log
    2        296 kB  /var/lib/dpkg/status
    2         60 kB  /var/log/lastlog
    2         42 kB  /var/log/apt/history.log
    2         20 kB  /var/cache/debconf/config.dat
    2         15 kB  /var/log/alternatives.log
    2         15 kB  /var/log/apt/eipp.log.xz
    2         14 kB  /etc/ld.so.cache
    2         14 kB  /var/cache/ldconfig/aux-cache
    2        6.5 kB  /var/log/faillog
    2        4.6 kB  /var/lib/apt/extended_states
    2        1.9 kB  /etc/passwd
    2        1.0 kB  /etc/shadow
    2         926 B  /etc/group
    2         776 B  /etc/gshadow
    2         508 B  /var/lib/dpkg/diversions
    2         261 B  /var/lib/dpkg/alternatives/pager
    2           0 B  /var/cache/debconf/passwords.dat
    2           0 B  /var/lib/dpkg/lock
    2           0 B  /var/cache/apt/archives/partial
    2           0 B  /var/cache/apt/archives/lock
    2           0 B  /tmp
    2           0 B  /var/lib/dpkg/lock-frontend
    2           0 B  /etc/.pwd.lock
    2           0 B  /etc/alternatives/pager
    2           0 B  /var/lib/dpkg/updates
    2           0 B  /var/lib/apt/lists
    2           0 B  /var/lib/dpkg/triggers/Unincorp
    2           0 B  /var/lib/dpkg/triggers/Lock
Results:
  PASS: highestUserWastedPercent
  SKIP: highestWastedBytes: rule disabled
  PASS: lowestEfficiency
Result:PASS [Total:3] [Passed:2] [Failed:0] [Warn:0] [Skipped:1]

This Dockerfile is good to go - we’ve cleaned up the file so it reads better and put in comments, and we are able to have a minimal amount of wasted space in the image.
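
One difference from the && version is worth checking in your own builds. A heredoc body runs as an ordinary shell script, and a shell script keeps going after a failed command unless set -e is in effect. The sketch below shows that behavior with plain sh, outside of Docker; it assumes the RUN heredoc is handed to the default shell in the same way.

```shell
#!/bin/sh
# Sketch: a heredoc body fed to sh behaves like any shell script, so
# without "set -e" a failing command in the middle does not stop it.

# Without set -e: "false" fails, the script carries on and exits 0.
lenient_status=0
sh <<'EOF' || lenient_status=$?
false
echo "still running after a failure"
EOF
echo "without set -e: exit status $lenient_status"

# With set -e: the script stops at the first failing command.
strict_status=0
sh <<'EOF' || strict_status=$?
set -e
false
echo "this line is never reached"
EOF
echo "with set -e: exit status $strict_status"
```

The first script reports success despite the failure; the second exits with a nonzero status at false. Starting a RUN heredoc with set -e restores the fail-fast behavior that the && chain provided.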

This is a pretty extreme example, as we were able to use the HEREDOC support to collapse all of the RUN statements into one. You will rarely be able to go this far, but you can use HEREDOC to wrap complex sets of commands together with their associated cleanup. Once that is done, check whether reordering any layers gains further efficiency.