Building Container Image inside Container using Buildah
This post explains how to build a container image inside a container, keeping all dependent packages isolated in that container.
The quotation below shows why this is needed.
Lots of people would like to build OCI/container images within a system like Kubernetes. Imagine you have a CI/CD system that is constantly building container images, a tool like Red Hat OpenShift/Kubernetes would be useful for distributing the load of builds. Until recently, most people were leaking the Docker socket into the container and then allowing the containers to do docker build. As I pointed out years ago, this is one of the most dangerous things you can do.
Best practices for running Buildah in a container, Daniel Walsh
I totally agree with the idea of building a container image inside a container; mounting the Docker socket (/var/run/docker.sock) into the container to use the docker build command is not a good idea.
Instead, I will use buildah.
Two Red Hat Developer blog posts already cover this topic 1 2; this post summarizes them and gives step-by-step instructions.
1. Creating a Dockerfile and Building a Container Image
You can simply use the existing buildah container image and skip this subsection:
$ docker pull quay.io/buildah/stable
To build your own image instead, use the following Dockerfile:
FROM centos:8
# Remove directories used by dnf that are just taking up space.
RUN dnf -y install buildah fuse-overlayfs; rm -rf /var/cache /var/log/dnf* /var/log/yum.*
# Adjust storage.conf to enable Fuse storage.
RUN sed -i -e 's|^#mount_program|mount_program|g' -e '/additionalimage.*/a "/var/lib/shared",' /etc/containers/storage.conf
Here, we cannot use nested overlayfs (running an overlayfs-based container inside an overlayfs-based container is not possible), so we use fuse-overlayfs in the inner buildah container. According to 1, fuse-overlayfs still gives better performance than the VFS storage driver.
We use the fuse-overlay program inside of the container rather than using the host kernel overlay. The reason is that, currently, kernel overlay mounts require the SYS_ADMIN capability, and we want to be able to run our Buildah containers without any additional privileges than a normal root container for image construction. Fuse-overlay works quite well and gives us better performance than using the VFS storage driver. Note that using Fuse requires people running the Buildah container to provide the /dev/fuse device.
Best practices for running Buildah in a container, Daniel Walsh
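To see what the sed line in the Dockerfile does, the sketch below applies the same two expressions to a small sample. The sample text is an assumed, simplified excerpt and not the real /etc/containers/storage.conf; the one-line form of the a command also assumes GNU sed, as shipped in CentOS.

```shell
#!/bin/sh
# Assumed, simplified excerpt of storage.conf (for illustration only).
sample='[storage.options]
additionalimagestores = [
]
#mount_program = "/usr/bin/fuse-overlayfs"'

# The same two sed expressions as in the Dockerfile, applied to the sample:
#  1. uncomment the mount_program line so fuse-overlayfs is used,
#  2. append "/var/lib/shared", after the additionalimagestores line.
result=$(printf '%s\n' "$sample" | sed \
  -e 's|^#mount_program|mount_program|g' \
  -e '/additionalimage.*/a "/var/lib/shared",')

printf '%s\n' "$result"
```

In the printed result, the mount_program line should no longer be commented out, and "/var/lib/shared", should appear directly below the additionalimagestores line.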
Note that 1 explicitly says to exclude the container-selinux package when installing buildah; however, doing so returns an error saying that buildah requires container-selinux:
$ docker build -t buildahimage -f ./Dockerfile .
[+] Building 5.0s (5/6)
=> [internal] load .dockerignore
=> => transferring context: 2B
=> [internal] load build definition from Dockerfile
=> transferring dockerfile: 330B
=> [internal] load metadata for docker.io/library/centos:8
=> CACHED [1/3] FROM docker.io/library/centos:8
=> ERROR [2/3] RUN dnf -y install buildah fuse-overlayfs --exclude container-selinux
------
> [2/3] RUN dnf -y install buildah fuse-overlayfs --exclude container-selinux:
#5 1.429 CentOS-8 - AppStream 5.6 MB/s | 5.8 MB 00:01
#5 3.392 CentOS-8 - Base 2.5 MB/s | 2.2 MB 00:00
#5 4.416 CentOS-8 - Extras 14 kB/s | 8.1 kB 00:00
#5 4.888 Error:
#5 4.888 Problem: package buildah-1.11.6-7.module_el8.2.0+305+5e198a41.x86_64 requires container-selinux, but none of the providers can be installed
#5 4.888 - conflicting requests
#5 4.888 - package container-selinux-2:2.124.0-1.gitf958d0c.module_el8.2.0+303+1105185b.noarch is filtered out by exclude filtering
#5 4.888 - package container-selinux-2:2.124.0-1.module_el8.2.0+304+65a3c2ac.noarch is filtered out by exclude filtering
#5 4.888 - package container-selinux-2:2.124.0-1.module_el8.2.0+305+5e198a41.noarch is filtered out by exclude filtering
#5 4.888 (try to add '--skip-broken' to skip uninstallable packages or '--nobest' to use not only best candidate packages)
------
executor failed running [/bin/sh -c dnf -y install buildah fuse-overlayfs --exclude container-selinux]: runc did not terminate successfully
Now you can build an image with the Dockerfile:
$ docker build -t buildahimage -f ./Dockerfile . # or: podman build with the same arguments
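Since either docker or podman can run the build, a script can pick whichever is installed. The following is a minimal sketch; pick_engine is a hypothetical helper name, not part of either tool.

```shell
#!/bin/sh
# pick_engine CANDIDATE...: print the first candidate found on PATH.
pick_engine() {
  for candidate in "$@"; do
    if command -v "$candidate" >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  # None of the candidates is installed.
  return 1
}

# Usage (not executed here):
#   engine=$(pick_engine podman docker) || { echo "no container engine found" >&2; exit 1; }
#   "$engine" build -t buildahimage -f ./Dockerfile .
```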
2. Running a Buildah Container
Now we can run the built image. The blog posts 1 2 use several arguments:
- docker (or podman) run: runs a new container based on the given image.
- --device /dev/fuse:rw: mounts the /dev/fuse device into the container so that buildah in the container can use it to run an inner container.
- --security-opt seccomp=unconfined: by default, Docker restricts the unshare system call inside a container, so creating new namespaces is prohibited in all containers. This flag allows the unshare system call in the container so that buildah works properly.
- --security-opt apparmor=unconfined: thanks to Timothy Wolff-Piggott, who left a comment that this flag is required for proper operation. Check this issue for details.
- --security-opt label=disable: it is given in 1, but buildah in an inner container seems to work properly without this option. I do not yet know what it does and need to study the Docker documentation further.
- -v /var/lib/mycontainer:/var/lib/containers:Z: this mount option also comes from 1. I think its purpose is to store built images in /var/lib/mycontainer on the host. However, if I use my private repository and push the generated images directly to it, this mount is not necessary. I will not use this mount option in the instructions below.
$ docker run -it --device /dev/fuse:rw --security-opt seccomp=unconfined --security-opt apparmor=unconfined buildahimage bash # or: podman run with the same arguments
[root@b84150497970 /]# buildah
A tool that facilitates building OCI images
Usage:
buildah [flags]
buildah [command]
...
You can also use vfs instead of fuse-overlayfs by running buildah --storage-driver=vfs in the container. In this case, you do not have to mount the /dev/fuse device into the container when creating it. Neither approach requires extra privileges (CAP_SYS_ADMIN). However, according to comments in an issue, vfs has a drawback: each layer copies the entire contents of the layer below it, which takes a huge amount of space and is slow.
Now let’s make a container image!
3. Building a Container Image using Buildah
$ buildah from centos:8
Getting image source signatures
Copying blob 3c72a8ed6814 done
Copying config 0d120b6cca done
Writing manifest to image destination
Storing signatures
centos-working-container # this is the container name.
[root@b84150497970 /]# buildah containers
CONTAINER ID BUILDER IMAGE ID IMAGE NAME CONTAINER NAME
6e73844da6b6 * 0d120b6ccaa8 docker.io/library/centos:8 centos-working-container
The command above creates a temporary working container based on the centos:8 image.
There are two ways to customize the generated container:
- Mount the container rootfs and customize it with buildah mount
- Run commands inside the container and customize it with buildah run
Using buildah mount
This approach is described in the buildah blog post 3. That post was written two years ago, so it may no longer work as written. Since I consider it the older way, I only summarize the commands here; refer to the post for detailed explanations.
$ containerid=$(buildah from scratch)
$ scratchmnt=$(buildah mount $containerid)
$ dnf install --installroot "$scratchmnt" --releasever 23 buildah -y # This will install buildah in the container rootfs.
$ buildah config ...
$ buildah commit $containerid buildah
Using buildah run
buildah run runs a command inside the working container.
Red Hat says:
For one last piece of fun, let’s see if we can run a Buildah container within this Podman container using our modified Buildah code.
However, I could run a container without modifying any buildah code, which suggests that the functionality has since been merged into buildah.
$ buildah run --isolation=chroot centos-working-container ls /
bin dev etc home lib lib64 lost+found media mnt opt proc root run sbin srv sys tmp usr var
After editing the container, you can use buildah commit to make an image.
For example, I would like to add an empty file /a to centos:8.
$ buildah from centos:8
$ buildah containers
CONTAINER ID BUILDER IMAGE ID IMAGE NAME CONTAINER NAME
6e73844da6b6 * 0d120b6ccaa8 docker.io/library/centos:8 centos-working-container
$ buildah run --isolation=chroot centos-working-container bash
bash-4.4# touch /a; exit
$ buildah commit centos-working-container custom-centos-image
Getting image source signatures
Copying blob 291f6e44771a skipped: already exists
Copying blob d10629969f66 done
Copying config e4f6ddcbae done
Writing manifest to image destination
Storing signatures
e4f6ddcbae44f65ea9055dcd43b53e2d7334c7437a81a8cb01b60a0ad99dd420
$ buildah images
REPOSITORY TAG IMAGE ID CREATED SIZE
localhost/custom-centos-image latest e4f6ddcbae44 2 minutes ago 222 MB
docker.io/library/centos 8 0d120b6ccaa8 3 months ago 222 MB
$ buildah inspect custom-centos-image
...
"History": [
{
"created": "2020-08-10T18:19:49.200589992Z",
"created_by": "/bin/sh -c #(nop) ADD file:538afc0c5c964ce0dde0141953a4dcf03c2d993c5989c92e7fee418e9305e2a3 in / "
},
{
"created": "2020-08-10T18:19:49.654025965Z",
"created_by": "/bin/sh -c #(nop) LABEL org.label-schema.schema-version=1.0 org.label-schema.name=CentOS Base Image org.label-schema.vendor=CentOS org.label-schema.license=GPLv2 org.label-schema.build-date=20200809",
"empty_layer": true
},
...
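The interactive steps above can also be collected into a small script. This is a minimal sketch, not a definitive build script: build_custom_image and the BUILDAH variable are hypothetical names introduced here, and the function runs touch directly instead of opening an interactive bash.

```shell
#!/bin/sh
# BUILDAH can be overridden (for example with a stub) to try the flow
# without a working buildah installation.
BUILDAH="${BUILDAH:-buildah}"

# build_custom_image BASE NAME: create a working container from BASE,
# add the empty file /a, and commit the result as image NAME.
build_custom_image() {
  base="$1"
  name="$2"
  # "buildah from" prints the working container name on stdout.
  ctr=$("$BUILDAH" from "$base") || return 1
  # Same isolation mode as in the commands above.
  "$BUILDAH" run --isolation=chroot "$ctr" touch /a || return 1
  "$BUILDAH" commit "$ctr" "$name" || return 1
  # Remove the working container once the image is committed.
  "$BUILDAH" rm "$ctr"
}

# Usage (not executed here):
#   build_custom_image centos:8 custom-centos-image
```

Capturing the container name in a variable avoids hard-coding centos-working-container, which may differ when a container with that name already exists.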
4. Pushing the Generated Image
I am running a private Docker registry server on my local machine. Refer to the Docker documentation to deploy your own registry server.
$ buildah push --tls-verify=false localhost/custom-centos-image 127.0.0.1:5000/custom-centos-image:8
The command above pushes the image as custom-centos-image:8. To check whether it was actually pushed to the registry server, run:
$ curl -X GET http://127.0.0.1:5000/v2/_catalog
{"repositories":["custom-centos-image"]}
$ curl -X GET http://127.0.0.1:5000/v2/custom-centos-image/tags/list
{"name":"custom-centos-image","tags":["8"]}
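The same check can be automated. The sketch below uses a hypothetical helper, has_tag, that looks for a tag in the JSON body returned by the tags/list endpoint. It is a naive grep match that assumes the compact output format shown above; a real JSON parser would be more robust.

```shell
#!/bin/sh
# has_tag JSON TAG: succeed if TAG appears inside the "tags" array of JSON.
# Naive pattern match; TAG is used as part of a regular expression.
has_tag() {
  printf '%s' "$1" | grep -q "\"tags\":\[[^]]*\"$2\""
}

# Usage (not executed here):
#   body=$(curl -s http://127.0.0.1:5000/v2/custom-centos-image/tags/list)
#   has_tag "$body" 8 && echo "pushed" || echo "tag not found"
```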