OCI Container Image and Build Guidelines

This document is a guide for building OCI container images for OpenShift based on lessons learned from enterprise container adoption delivery.

It all starts with a basic file

Before getting your feet wet with containers, we suggest you start here. This website will give you a good foundation for moving forward. As noted in Docker’s Dockerfile reference, "A Dockerfile is a text document that contains all the commands a user could call on the command line to assemble an image. Using docker build users can create an automated build that executes several command-line instructions in succession." With that definition as a backdrop, this guide walks through some best practices and considerations for writing this text file. The output of building a Dockerfile is an image; a container is a running instance of that image, isolated from the host by Linux kernel features such as namespaces and control groups, the same features that LXC (LinuX Containers) builds on. Further details of LXC can be found here.

We normally start with an existing image that is stored in a registry like the one provided by Red Hat here. The images stored there were themselves built from a “base” image. Each instruction we add in a new Dockerfile adds one or more layers on top of the starting base image. Let’s build our very first image by starting with an example Dockerfile and getting into the details as we go along. Create the file with an IDE or a plain text editor, as below:

vim Dockerfile

FROM rhel7.5
MAINTAINER Matt Witz <mattwitz@example.com>

The FROM line is required and names the existing image that we start from. The MAINTAINER line records who is responsible for this new “base” image. Now that we have these first two lines inside our file, we could build this image with a simple command: docker build --rm -t rhel . (notice the . for the current working directory, where our Dockerfile exists). This image does not really do anything, so let’s make it do something interesting, like run a web server:

FROM rhel7.5
MAINTAINER Matt Witz <mattwitz@example.com>
LABEL version="1.0"
LABEL description="First image with Dockerfile."
# Add Web server, update image, and clear cache
RUN yum -y install httpd && yum -y update; yum clean all
# Add some data to web server
RUN echo "This Web server is working." > /var/www/html/index.html
EXPOSE 80
ENTRYPOINT [ "/usr/sbin/httpd" ]
CMD [ "-D", "FOREGROUND" ]

We have added some very important lines to the file that we will now explain.

  • The LABEL line adds metadata to the image; we can read this information back at a later date with a command called docker inspect.

  • The RUN line executes a command inside the image to add, remove, or otherwise change its contents until the image reaches our desired state. We can have as many RUN lines as we like, but we suggest the minimum needed to get our container to run.

  • The 1st RUN line performs the package install, the update, and the cache cleanup in a single line, chained with && and a semicolon, because of how a docker image is layered. We will explain that in detail later in our guide and why that is important to consider.

  • The 2nd RUN line is adding some content to our web server.

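  • The EXPOSE line declares that our web server listens on port 80 of the container. It documents the port rather than opening it; the port still has to be published or routed when the container is run.

  • The ENTRYPOINT line names the command that is run when the container starts, as if we had typed it on the command line; this is the container’s main process.

  • The CMD line here supplies the default arguments passed to the ENTRYPOINT ( -D FOREGROUND keeps httpd running in the foreground ). When no ENTRYPOINT is set, CMD names the command to run along with any arguments.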

Note: Adding comments is a breeze just by prefacing the line with a # so others can easily understand our intent.
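
To try the example, the image can be built, inspected, and started from the command line. The following is a minimal sketch rather than a definitive procedure: the image tag example-httpd:1.0, the container name example-httpd, and host port 8080 are illustrative names that do not appear in this guide, and the sketch assumes that the Dockerfile above is in the current directory and that a docker daemon is available.

```shell
#!/bin/sh
# Illustrative names (assumptions, not from the guide).
IMAGE="example-httpd:1.0"
NAME="example-httpd"
PUBLISH="8080:80"   # host port 8080 -> container port 80

try_image() {
  # Build the image from the Dockerfile in the current directory.
  docker build --rm -t "$IMAGE" .

  # Read back the LABEL metadata stored in the image.
  docker inspect --format '{{ json .Config.Labels }}' "$IMAGE"

  # Arguments after the image name replace CMD: this runs "/usr/sbin/httpd -v".
  docker run --rm "$IMAGE" -v

  # Start the web server in the background. -p publishes the port on the
  # host, which EXPOSE alone does not do.
  docker run -d --name "$NAME" -p "$PUBLISH" "$IMAGE"

  # Give httpd a moment to start, then fetch the page.
  sleep 2
  curl -s --max-time 5 "http://localhost:${PUBLISH%%:*}/"

  # Stop and remove the background container.
  docker rm -f "$NAME"
}

# Only attempt the run where the docker CLI is installed.
if command -v docker >/dev/null 2>&1; then
  try_image || echo "docker commands did not complete" >&2
fi
```

The docker run line with -v after the image name shows how ENTRYPOINT and CMD combine: the ENTRYPOINT stays fixed, and only the default arguments from CMD are replaced.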

Onto the advanced use case and best practices

Size does matter

In our above example we started with a very basic RHEL 7 image, which should be obtained from a trusted source. Understanding where your images come from is just as important as what is in them, and the process that governs what is allowed into your network should be built on the trust-but-verify premise.

Images also vary in size. What we started with is (at the time of writing this document) large in comparison to an Atomic Host image: the Atomic Host image found here is ~ 29M using 87 RPMs, while the rhel7 base image found here is ~ 71M using 152 RPMs. Both figures, the size and the number of RPMs inside the image, can be found, in this case, at https://access.redhat.com on the main page that describes each image. Our design and intent should be taken into account to see which style of image best fits our use case.

Security is a further consideration. Atomic images do not include SUID programs, which temporarily let a user run a program with the permissions of the file owner rather than those of the user who runs it. Their package installer is microdnf rather than yum; microdnf does not require python, so it is lighter weight and reduces the number of RPMs in the image, which carries the extra benefit of a reduced attack surface.
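
As a sketch of what the smaller style could look like in practice, the web server example might start from the Atomic image and install its package with microdnf instead of yum. The image name rhel7-atomic is an assumption here, as is the availability of the httpd package in the repositories enabled for the build; depending on how repositories are configured, microdnf may need additional options.

```dockerfile
# Assumed name for the Atomic base image; check your registry.
FROM rhel7-atomic
LABEL description="Web server on the smaller Atomic base image."
# microdnf takes the place of yum; the cache is cleaned in the same layer.
RUN microdnf install httpd && microdnf clean all
RUN echo "This Web server is working." > /var/www/html/index.html
EXPOSE 80
ENTRYPOINT [ "/usr/sbin/httpd" ]
CMD [ "-D", "FOREGROUND" ]
```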

Base image

Once we have landed on our image style, it’s time to consider additional questions, such as: Is there already an image that serves our purpose? Or are we constructing this image for reuse by others in our corporation?

Security

Our third consideration for our image standard is security, which has several areas of concern when implementing container images. We mentioned the trusted source as one such aspect, and it is worth stressing why we must trust the source from which we get our images: it is paramount to have full traceability of how the image was created in the first place and of what is in that image. Having the backing of a company like Red Hat, which ensures that vulnerabilities are published, tracked, and fixed, provides a full lifecycle that you can trust. Images can be scanned before they enter production, as well as signed using a process like this, which gives your teams confidence that you are not just allowing anything into production.

Security does not stop there. We also have to consider who is ultimately going to run our container. When installing packages into the base image we need to be root, but once the tasks that require root privilege are complete, we should change the user the container runs as with a USER 1001 or similar line in our file. One example where that is not possible is our httpd example, where we do not use USER 1001 as the last line of the Dockerfile: httpd as configured there listens on port 80, and binding to a port below 1024 normally requires root.
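
A minimal sketch of that pattern follows. The application directory /opt/app and the placeholder command are hypothetical and do not come from this guide. The steps that need root come first, and USER 1001 is set before the command is defined. The directory is handed to group 0 in the same way as the fix-permissions script in the appendix does.

```dockerfile
FROM rhel7.5
# Steps that need root: package installation and directory setup.
RUN yum -y install tar && yum clean all && \
    mkdir -p /opt/app && \
    chgrp -R 0 /opt/app && \
    chmod -R g+rwX /opt/app
WORKDIR /opt/app
# From here on, including the running container, the user is non-root.
USER 1001
# Placeholder process; a real image would start its application here.
CMD [ "sleep", "infinity" ]
```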

Storage

Keep storage outside the image definition in the Dockerfile. If you need persistence, put the data in a volume and not within the container. Dockerfile authors with a background in VM management will often reproduce VM definition practices in their Dockerfiles, including accounting for storage volumes, but containers are not VMs. They are small, ephemeral, lightweight runtimes and should be treated as such. Think small; when we do, the result should translate to fast startups.
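
As a sketch of keeping state outside the image, a named volume can be created and mounted when the container is started, so that the data outlives any single container. The volume name web-content, the container names web1 and web2, and the image tag example-httpd:1.0 are illustrative assumptions; the mount path is the web server’s content directory from the earlier example.

```shell
#!/bin/sh
# Illustrative names (assumptions, not from the guide).
VOLUME_NAME="web-content"
IMAGE="example-httpd:1.0"
MOUNT_ARG="$VOLUME_NAME:/var/www/html"

run_with_volume() {
  # Create the volume once; it lives outside any container.
  docker volume create "$VOLUME_NAME"

  # First container: write some state into the mounted directory.
  docker run -d --name web1 -p 8080:80 -v "$MOUNT_ARG" "$IMAGE"
  docker exec web1 sh -c 'echo "Saved in the volume." > /var/www/html/index.html'

  # Throw the container away; the volume and its data remain.
  docker rm -f web1

  # Second container: mounts the same volume and serves the saved page.
  docker run -d --name web2 -p 8080:80 -v "$MOUNT_ARG" "$IMAGE"
  sleep 2
  curl -s --max-time 5 http://localhost:8080/
  docker rm -f web2
}

# Only attempt the run where the docker CLI is installed.
if command -v docker >/dev/null 2>&1; then
  run_with_volume || echo "docker commands did not complete" >&2
fi
```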

Process of the container

As we have touched on speed, that brings us to the next consideration: don’t run more than one process in our single container unless absolutely needed. This goes back to the small is fast idea and it will reduce the complexity of our container as well. Reusability and maintainability are very important to our efficiency as well as the efficiency of others who can use our image for their needs.

Reduced entries

Each instruction in the Dockerfile adds a layer to the image. As mentioned above for our RUN command that installs packages, we can bundle several commands into one RUN instruction, and therefore one layer, by joining them with double ampersands ( && ), which run the next command only if the previous one succeeded, or with a semicolon ( ; ), which runs the next command regardless. A backslash ( \ ) at the end of a line continues the instruction on the next line for multiline readability. These concatenated RUN commands should be used when the commands share a common goal, such as the installation of packages and the cleaning out of the resulting cache required for the package install, as we see in the example:

RUN yum -y install httpd && yum -y update; yum clean all
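
The choice between && and ; affects more than layering, because it decides which exit status the RUN line reports. The following sketch uses the shell built-ins true and false as stand-ins for succeeding and failing steps (an illustration, not the actual yum commands) to show the difference:

```shell
#!/bin/sh
# "false" stands in for a failing step such as "yum -y update".

# Same shape as the RUN line above: the failure is followed by ";",
# so the last command still runs and its status is what gets reported.
status_semicolon=0
sh -c 'true && false; true' || status_semicolon=$?

# Joined with "&&" throughout: the chain stops at the failing step.
status_and=0
sh -c 'true && false && true' || status_and=$?

echo "with ';'  the exit status is $status_semicolon"
echo "with '&&' the exit status is $status_and"
```

This prints an exit status of 0 for the first form and 1 for the second. Because docker build stops when a RUN command exits with a non-zero status, a failed update in the RUN line above would go unnoticed, while joining all three steps with && would make the build fail instead.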

Another common way to keep a layer compact is to put all our desired tasks in a script that is kept outside the Dockerfile, then copy it into the image and run it from a single RUN line.
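
A sketch of that approach, assuming a hypothetical script named setup.sh that sits next to the Dockerfile and contains the install and cleanup steps:

```dockerfile
FROM rhel7.5
# setup.sh is a hypothetical script kept in the build context.
COPY setup.sh /tmp/setup.sh
# Run every task in one RUN instruction, then remove the script.
RUN sh /tmp/setup.sh && rm -f /tmp/setup.sh
```

COPY adds a layer of its own that holds the script, so the script should stay small; the heavy work happens in the single RUN layer.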

Conclusion

In conclusion we have the following considerations for constructing our image.

  • Size of the base image → Use the smallest trusted base image that satisfies the containerized workloads

  • Which style to use, RHEL or Atomic → For most implementations and quick starts, a base RHEL image will be used. As the platform control plane becomes more containerized, more Atomic images will become available.

  • Trusted source of the starting point image → The ability to trust the source of our images, backed by published and tracked CVEs, is paramount to our security.

  • Keep our image stateless ( attach a volume when state is required ) → Its ephemeral nature allows for rapid deployment and creation.

  • Run one process per container → In this case less is more, one process per container embraces our desired end goal of a microservices architecture.

  • Design for reusability and maintainability → Let’s not reinvent the wheel by creating images that already exist, and when we do create images, let’s share and evangelize them.

  • Reduce the number of components to execute within our configuration file → Consider using scripts that are called within the image to reduce the layers created.

  • Clean unwanted cache and build artifacts out of our image → A clean image is just good practice; removing unneeded libraries reduces the size, increases usability, and reduces the possible attack surface.

Appendix

Dockerfile Advanced Use Case

FROM rhel:7.4
ENV JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk \
KAFKA_VERSION=1.0.0 \
SCALA_VERSION=2.11 \
KAFKA_HOME=/opt/kafka
# The fix-permissions script (listed below) must be executable in the build context.
COPY fix-permissions /usr/local/bin/
RUN INSTALL_PKGS="gettext tar zip unzip hostname java-1.8.0-openjdk" && \
  yum install -y $INSTALL_PKGS && \
  rpm -V $INSTALL_PKGS && \
  yum clean all  && \
  mkdir -p $KAFKA_HOME && \
  curl -fsSL https://archive.apache.org/dist/kafka/${KAFKA_VERSION}/kafka_${SCALA_VERSION}-${KAFKA_VERSION}.tgz | tar xzf - --strip 1 -C $KAFKA_HOME/ && \
  mkdir -p $KAFKA_HOME/logs && \
  /usr/local/bin/fix-permissions $KAFKA_HOME
WORKDIR "/opt/kafka"
EXPOSE 9092
USER 1001

Code for fix-permissions script

#!/bin/sh

# Fix permissions on the given directory to allow group read/write of
# regular files and execute of directories.

# "$1" is quoted so that directory names containing spaces are handled.
find "$1" -exec chgrp 0 {} \;
find "$1" -exec chmod g+rw {} \;
find "$1" -type d -exec chmod g+x {} +
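
To see what the permission changes do, the two chmod steps can be tried on a scratch directory. This is a sketch under stated assumptions: the directory layout and file names are made up for the illustration, and the chgrp step is left out because changing the group to 0 normally needs root.

```shell
#!/bin/sh
# Scratch directory with one subdirectory and one file (hypothetical names).
workdir=$(mktemp -d)
mkdir -p "$workdir/logs"
touch "$workdir/logs/server.log"

# Start from owner-only permissions.
chmod 700 "$workdir/logs"
chmod 600 "$workdir/logs/server.log"

# The same two chmod passes as in fix-permissions.
find "$workdir" -exec chmod g+rw {} \;
find "$workdir" -type d -exec chmod g+x {} +

# Keep only the ten permission characters from the ls output.
file_mode=$(ls -l "$workdir/logs/server.log" | cut -c1-10)
dir_mode=$(ls -ld "$workdir/logs" | cut -c1-10)
echo "file: $file_mode"
echo "dir:  $dir_mode"

rm -rf "$workdir"
```

The file ends up as -rw-rw---- and the directory as drwxrwx---, so a process that belongs to group 0 can read and write the files and enter the directories, whatever its user ID.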