If you've ever used Docker Hub to spin up containers, you've probably relied on ready-to-go images.
But when you need something custom, Dockerfiles are your go-to tool. In this post, we’ll explore what Dockerfiles are, how to use them, and the best practices to follow.
In a nutshell, creating a new image works as follows: You create a file named Dockerfile in a separate directory and describe the properties of your image there using keywords. docker build then creates the local image. If you want to, you can finally upload it to Docker's public image collection (Docker Hub) using docker push.
Dockerfile Syntax
The Dockerfile determines the features of custom images. Below we summarize the most important keywords.
- ADD: Copies files to the file system of the image
- CMD: Executes the specified command at container startup
- COPY: Copies files from the project directory to the image
- ENTRYPOINT: Always executes the specified command at container startup
- ENV: Sets an environment variable
- EXPOSE: Specifies the active ports of the container
- FROM: Specifies the base image
- LABEL: Sets a metadata string as a key-value pair (e.g., the maintainer)
- RUN: Executes the specified command
- USER: Specifies the account for RUN, CMD, and ENTRYPOINT
- VOLUME: Specifies volume directories
- WORKDIR: Sets the working directory for RUN, CMD, COPY, etc.
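To show how several of these keywords fit together, here's a minimal sketch. The file app.py, the account name appuser, the variable APP_ENV, and port 8000 are assumptions made up for this illustration; python:3.12-slim merely stands in for any suitable base image.

```dockerfile
# Sketch only: app.py, appuser, APP_ENV, and port 8000 are hypothetical
FROM python:3.12-slim
LABEL maintainer="name@somehost.com"

# Environment variable that is available in the container
ENV APP_ENV=production

# Working directory for the following COPY, RUN, and CMD statements
WORKDIR /app
COPY app.py .

# Create an unprivileged account and use it from here on
RUN useradd --create-home appuser
USER appuser

# Port on which the (hypothetical) application listens
EXPOSE 8000
CMD ["python3", "app.py"]
```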
Numerous other keywords and details can be found in the official documentation.
In addition, it’s certainly worth taking a look at the best practices for dealing with Dockerfiles.
An easy way to familiarize yourself with the Dockerfile syntax is to go to Docker Hub and look at the Dockerfiles of images that perform similar tasks to your own image.
Introductory Example
A minimal Dockerfile that extends the Ubuntu base image with the package of the joe editor looks like the following:
# File: Dockerfile
FROM ubuntu:20.04
LABEL maintainer="name@somehost.com"
RUN apt-get update && \
    apt-get install -y joe && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*
CMD ["/bin/bash"]
You can use RUN to specify commands that will be executed once when the new image gets created. These commands are executed in the container that has been temporarily set up for image creation, that is, in the guest system, not in the host system. Often these are commands for installing packages or compiling/configuring programs.
Specifying the Source Image (“FROM”)
Custom images are mostly derived from other images. You can specify the image name using FROM. If possible, you should use official, well-maintained images.
Adding Files (“ADD” versus “COPY”)
At first glance, the ADD and COPY commands seem to perform the same task: You copy files to the file system of the image you want to create. If the source parameter is a directory, both commands copy the entire contents of that directory to the image. ADD is the more flexible command, however, and differs from the simple COPY in two main aspects:
- As a source parameter, ADD not only accepts a local file but also a URL. This allows for downloading files from the internet and copying them to the image file system.
- If the source file with ADD is a local TAR archive, it will be unpacked automatically. This also works for archives compressed with the gzip, bzip2, or xz methods.
The Dockerfile documentation recommends using ADD only when one of these additional features is used. For the simple copying of a local file, you should use the COPY command instead.
For both ADD and COPY, you can use --chown=user:group to specify which account and group the file should be associated with in the image file system.
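The following fragment sketches how the two keywords might be used side by side. All file names and the URL are hypothetical; www-data is an account that already exists in the Ubuntu base image.

```dockerfile
FROM ubuntu:20.04

# Plain local file: COPY is sufficient; --chown sets the owner and group
COPY --chown=www-data:www-data index.html /var/www/html/

# Local TAR archive (hypothetical file): ADD unpacks it into the target directory
ADD site-assets.tar.gz /var/www/html/

# URL as the source (hypothetical address): ADD downloads the file,
# but archives loaded from a URL are not unpacked
ADD https://example.com/files/config.json /etc/myapp/config.json
```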
Container “start” Command (“CMD” and “ENTRYPOINT”)
The question of which program will be executed when a container is run for the first time using docker run or later on with docker start has a bewildering number of answers:
- You can specify a default command in the Dockerfile using the CMD and/or ENTRYPOINT keywords.
- When setting up the container, you can specify an alternate command for the CMD variant or add further parameters to it for the ENTRYPOINT variant.
- If necessary, you can also replace the ENTRYPOINT command with your own command if you pass it to docker run with the --entrypoint option.
- While a container is running, you can use docker exec to execute any other command.
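On the command line, the options from this list could look as follows. This is only a sketch: It uses the official ubuntu:20.04 image, whose default command is a shell, and the container name mycontainer is hypothetical.

```shell
# Default command from the image (CMD): starts an interactive shell
docker run -it --rm ubuntu:20.04

# Alternate command: "ls /var" is executed instead of the CMD of the image
docker run --rm ubuntu:20.04 ls /var

# Replace the ENTRYPOINT; the parameters after the image name are passed to it
docker run --rm --entrypoint /bin/ls ubuntu:20.04 -l /etc

# Run an additional command in a container that is already running
# (assumption: a container named "mycontainer" exists and is running)
docker exec -it mycontainer /bin/bash
```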
The recommended syntax for ENTRYPOINT or CMD is to put the complete file name of the command and its parameters in double quotes each and pass them in square brackets separated by commas:
CMD ["/bin/ls", "/var"]
CMD vs. ENTRYPOINT: What’s the Difference?
CMD and ENTRYPOINT seem to perform the same task. However, there’s one major difference:
- CMD: The command you pass to docker run after the image name is executed instead of CMD.
- ENTRYPOINT: The parameters passed to docker run are added to the ENTRYPOINT.
Specifying CMD or ENTRYPOINT more than once isn’t intended. If it happens anyway, only the last statement of each kind applies.
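A small sketch may make the difference more tangible. The image name demo/ls is hypothetical; the comments describe the command that results from the rules above.

```dockerfile
FROM ubuntu:20.04

# Fixed part of the command
ENTRYPOINT ["/bin/ls"]

# Default parameter, used only if docker run doesn't pass any parameters
CMD ["/var"]

# docker build -t demo/ls .
# docker run --rm demo/ls           -> executes /bin/ls /var
# docker run --rm demo/ls -l /etc   -> executes /bin/ls -l /etc
```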
Real-World Usage in Base Images
In practice, the ENTRYPOINT isn’t defined for many base images, which means it’s empty. CMD contains the name of a shell, that is, /bin/sh or /bin/bash. Similarly, with images for programming languages, ENTRYPOINT is mostly undefined, while CMD either starts a shell of the programming language or is also empty.
For images for server services, on the other hand, ENTRYPOINT often refers to a shell script that first takes care of various initialization tasks (see the table below). Once these have been completed, the command specified with CMD gets executed, which means that usually the server service gets started. If no initialization work is required, ENTRYPOINT is empty.
Image | ENTRYPOINT | CMD |
Alpine Linux | (empty) | ["/bin/sh"] |
Apache | (empty) | ["httpd-foreground"] |
Debian | (empty) | ["/bin/bash"] |
Nginx | (empty) | ["nginx", "-g", "daemon off;"] |
MySQL/MariaDB | ["docker-entrypoint.sh"] | ["mysqld"] |
Nextcloud | ["entrypoint.sh"] | ["apache2-foreground"] |
Node.js | (empty) | ["node"] |
OpenJDK (Java) | (empty) | ["jshell"] |
Oracle Linux | (empty) | ["/bin/bash"] |
PHP | ["docker-php-entrypoint"] | ["php", "-a"] |
Python | (empty) | ["python3"] |
Ubuntu | (empty) | ["/bin/bash"] |
WordPress | ["docker-entrypoint.sh"] | ["apache2-foreground"] |
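The initialization pattern used by such server images can be sketched in plain shell. The following is not the script of any of the images listed above; the file name entrypoint.sh, the DATA_DIR variable, and the setup step are assumptions for illustration. The key line is exec "$@", which replaces the script with the command received as parameters (in a container, that would be the CMD or the docker run parameters).

```shell
# Write a hypothetical entrypoint script to the current directory
cat > entrypoint.sh <<'EOF'
#!/bin/sh
set -e

# Hypothetical initialization step: make sure a data directory exists
DATA_DIR="${DATA_DIR:-./demo-data}"
mkdir -p "$DATA_DIR"
echo "initialization done, data directory: $DATA_DIR"

# Hand over to the command passed as parameters
exec "$@"
EOF
chmod +x entrypoint.sh

# Try it outside of a container; echo stands in for the server command
./entrypoint.sh echo "server would start now"
```

In a Dockerfile, a script like this would typically be copied into the image with COPY and then referenced by ENTRYPOINT, while CMD supplies the server command.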
The next table shows how different specifications for ENTRYPOINT, CMD, and the docker run parameter result in the command to be executed. Note that the RUN keyword has nothing to do with CMD and ENTRYPOINT! RUN specifies commands to be executed once when the image is created. CMD or ENTRYPOINT, on the other hand, specify the command to be executed later when the container is started.
ENTRYPOINT | CMD | docker run Parameter | To Be Executed |
["script.sh"] | (empty) | (none) | script.sh |
["script.sh"] | (empty) | /bin/bash | script.sh /bin/bash |
["script.sh"] | ["mysqld"] | (none) | script.sh mysqld |
["script.sh"] | ["mysqld"] | /bin/bash | script.sh /bin/bash |
(empty) | ["/bin/sh"] | (none) | /bin/sh |
(empty) | ["/bin/sh"] | /bin/bash | /bin/bash |
Shell Variant of CMD and ENTRYPOINT
The Docker documentation recommends passing the command to be executed and its parameters to CMD and ENTRYPOINT, respectively, in square brackets and double quotes, as in the preceding examples.
However, there’s a second type of syntax according to which you simply pass the command without brackets and quotes, such as CMD ls /etc/*. In that case, the command will be executed via a shell. When that happens, it’s not the command that receives process ID 1, but the shell.
The shell variant has its pros and cons. The advantages include the fact that the substitution mechanisms known from the shell work without a problem. For example, * is replaced by file names, $VAR is replaced by the contents of the environment variable, and so on. In addition, you can omit specifying the full path of the command—the shell will find the command if it’s located in one of the PATH directories.
The biggest drawback is that signals aren’t forwarded: (Ctrl)+(C) for an interactive container or docker stop for a noninteractive container sends its signal to the shell, not to the actual command. As a result, the command can’t perform the signal processing (for example, a clean shutdown) itself.
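A common way around this drawback is to start the command with exec, so that the shell is replaced by the command instead of remaining its parent process. The following sketch demonstrates the effect in plain shell, outside of Docker. In a Dockerfile, the analogous shell-form line would be something like CMD exec myserver, where myserver is a hypothetical program.

```shell
# Each call prints two process IDs: first that of the shell, then that of
# the command the shell starts.

# Without exec, the command runs as a child process with its own PID
without_exec=$(sh -c 'echo $$; sh -c "echo \$\$"; true' | tr '\n' ' ')

# With exec, the shell is replaced by the command, which keeps the PID
with_exec=$(sh -c 'echo $$; exec sh -c "echo \$\$"' | tr '\n' ' ')

echo "without exec: $without_exec"
echo "with exec:    $with_exec"
```

The second line of the output shows the same process ID twice. In a container, this means the command itself takes over process ID 1 and receives the signals directly.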
Running Commands (RUN)
The commands specified with RUN are executed when the image gets created. Often these are commands for installing packages. Because the process is automated, you need to avoid interactive queries. For many package-management commands, the -y option is sufficient for this (yes, that is, answer all queries in the affirmative).
RUN apt-get update -y && \
    apt-get install -y --no-install-recommends \
        subversion \
        joe \
        vim \
        less && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*
Each RUN command adds another layer to your image, that is, a file system layer. However, a large number of layers will make images inefficient. For this reason, it’s recommended to combine as many statements as possible in one RUN command, as in the preceding example.
To prevent images from becoming larger than necessary, all cache files should also be deleted immediately (here, with apt-get clean and rm). If these files end up in a layer, they’ll unnecessarily inflate the image.
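For packages whose installation scripts ask questions in spite of -y, Debian and Ubuntu provide the DEBIAN_FRONTEND variable. The sketch below is an added illustration rather than part of the original example: tzdata merely serves as a package that is known to prompt for input on Ubuntu. The variable is set for this one command only, so it doesn't end up in the environment of the finished image.

```dockerfile
FROM ubuntu:20.04

# DEBIAN_FRONTEND=noninteractive suppresses configuration dialogs;
# tzdata is just an example of a package that would otherwise ask questions
RUN apt-get update && \
    DEBIAN_FRONTEND=noninteractive \
    apt-get install -y --no-install-recommends tzdata && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*
```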
Volume Directories (VOLUME)
VOLUME specifies which directories are to be mapped as volumes in the file system of the host. As with CMD, you can specify multiple directories in square brackets and each in double quotes:
VOLUME ["/var/lib/mysql", "/var/log/mysql"]
Where the volumes actually end up in the host file system depends on how the container is set up. If you run docker run or docker create without the -v option, Docker sets up a directory with a random ID for each volume (/var/lib/docker/volumes/<id>). The user of the image can also specify the desired location in the host by using the -v option:
docker run ... -v /myvolumes/mysql:/var/lib/mysql \
               -v /myvolumes/log:/var/log/mysql imagename
Fedora and Red Hat Enterprise Linux (RHEL) users must note that Security-Enhanced Linux (SELinux) doesn’t typically allow volumes outside of /var/lib/docker. The solution in such cases is the additional flag :z, which you add to the volume option (e.g., -v /myvolumes/mysql/:/var/lib/mysql:z).
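To check where Docker has actually placed the volumes of a container, you can query the mount information. A brief sketch follows; the container name mysql-test is hypothetical.

```shell
# List all volumes known to Docker, including those with random IDs
docker volume ls

# Show the source and destination directories of all mounts of one container
# (assumption: a container named "mysql-test" exists)
docker inspect -f '{{ json .Mounts }}' mysql-test
```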
Creating and Testing an Image
You usually run docker build in the directory where the Dockerfile is located. As a parameter, you pass a dot, which designates the current directory (containing the Dockerfile) as the build context. Using -t, you can specify the desired image name. For accountname, you should use the name of your Docker account. If you haven’t set up an account yet, you can enter any name for the time being or omit the account name altogether.
cd project-directory
docker build -t accountname/imagename .
We’ll use the account name koflerinfo and the image name ubuntu-joe for the rest of this example:
docker build -t koflerinfo/ubuntu-joe .
Debugging Your Build
If errors occur while running docker build, you must correct the Dockerfile and repeat the build process. Docker takes a fairly intelligent approach to this by creating an interim image in a cache directory for each statement in the Dockerfile. If statements remain unchanged after the Dockerfile has been modified, docker build can continue to use the corresponding images from the cache. (You can prevent this with the --no-cache option if necessary.)
For debugging, it’s therefore recommended to avoid one complicated and error-prone RUN command and instead to provide a separate RUN command for each substep.
This approach can unfortunately lead to images that are larger than absolutely necessary. This is especially the case when something is installed or compiled in one RUN command and cleanup is performed in a second RUN command. Docker isn’t able to reasonably clean up files added in one delta image and deleted in another delta image. So, if your Dockerfile works fine, you should run all the commands together in one long RUN command.
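During debugging, the introductory example could temporarily be split up as follows. This is only a sketch of the approach described above; once everything works, the steps should be merged into a single RUN command again.

```dockerfile
FROM ubuntu:20.04

# One RUN command per substep: if a later step fails, the earlier ones
# are taken from the cache during the next build
RUN apt-get update
RUN apt-get install -y joe

# Cleanup in a separate layer doesn't make the image smaller, because the
# deleted files are still contained in the preceding layers
RUN apt-get clean && rm -rf /var/lib/apt/lists/*

CMD ["/bin/bash"]
```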
docker history imagename provides a list of the interim images and the commands executed for them:
docker history koflerinfo/ubuntu-joe
Verifying the Image with docker run
To try out the successfully created image, you should create a container from it and test it—in this tiny example, you can do that by calling the jmacs editor from the joe package:
docker run -it --rm koflerinfo/ubuntu-joe
root@b1c2c0a04f47:/# jmacs /etc/os-release
...
root@b1c2c0a04f47:/# <Ctrl>+<D>
“docker build” versus “docker buildx”
The docker build command provides the easiest way to create custom images. In parallel, Docker provides the docker buildx command, which, in turn, contains various subcommands where docker buildx build is largely compatible with docker build.
Among other things, using docker buildx is recommended if you want to create images in parallel for multiple CPU platforms. It also provides various optimization features that are missing in docker build. More details about docker buildx can be found here:
- https://docs.docker.com/buildx/working-with-buildx
- https://docs.docker.com/engine/reference/commandline/buildx
- https://github.com/docker/buildx
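As a rough sketch, a build for two CPU platforms might look like this. The builder name multibuilder and the chosen platforms are assumptions; which platforms are actually available depends on your Docker installation. Because the result is pushed to a registry here, you need to be logged in with docker login first.

```shell
# Create a builder instance that can handle multiple platforms and select it
docker buildx create --name multibuilder --use

# Build the image for two platforms and upload the result to the registry
docker buildx build \
    --platform linux/amd64,linux/arm64 \
    -t accountname/imagename \
    --push .
```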
Tidying Up
It often takes several attempts before the new image works as it should. Keep in mind that every now and then, you should delete all containers and images that are no longer needed. (In the following commands, you need to replace accountname/imagename with your designations, of course. If the commands return error messages, then there was nothing to delete.)
docker rm $(docker ps -a -q -f ancestor=accountname/imagename)
docker rmi \
    $(docker images accountname/imagename -f dangling=true -q)
Editor’s note: This post has been adapted from a section of the book Docker: Practical Guide for Developers and DevOps Teams by Bernd Öggl and Michael Kofler. Bernd is an experienced system administrator and web developer. Since 2001 he has been creating websites for customers, implementing individual development projects, and passing on his knowledge at conferences and in publications. Michael studied telematics at Graz University of Technology and is one of the most successful German-language IT specialist authors. In addition to Linux, his areas of expertise include IT security, Python, Swift, Java, and the Raspberry Pi. He is a developer, advises companies, and works as a lecturer.
This blog post was originally published 5/2025.