Gear Building Tutorial Part 6: The Dockerfile

Introduction

Now that we have our run.py and manifest.json files, we will move on to creating the Dockerfile for our simple gear. A Dockerfile is used to create a self-contained environment in which to run our gear. Using a self-contained environment or Docker image makes it so that anyone can run your gear regardless of the operating system of their local machine. This section is designed only to familiarize you with the components of a Dockerfile and how to use it within the context of a Flywheel gear. The Dockerfile that we will create in this section will provide an overview of the basic code needed to create a simple, functioning Docker image. It does not cover many of the more advanced options needed to build more complicated Docker images.

This part of the Gear Building Tutorial assumes a basic knowledge of Docker and Docker Hub, including why containerization is important for creating portable, reusable code. If you are unfamiliar with Docker, we recommend you check out the official Docker User Guides before getting started.

Instruction Steps

Step 1. The Dockerfile
Step 2. Dockerfile Anatomy
Step 3. Selecting a base image or operating system
Step 4. Drafting a Dockerfile
Step 5. Drafting the Dockerfile for our example gear
Step 6. Building a Docker image from a Dockerfile
Step 7. Linking the Docker image to the gear manifest
OPTIONAL: Step 8. Uploading your Docker image to Docker Hub
Finishing up with Dockerfiles
Additional Resources
Navigation

Step 1. The Dockerfile

As mentioned above, the Dockerfile is needed to define a self-contained environment in which to run our gear. The documentation for writing Dockerfiles will be helpful when developing more advanced gears. Generally, a Docker image should play nicely with a Flywheel gear, but there are some specifics to how Flywheel expects certain things to be set in the Docker image. Below, we'll provide abbreviated "Docker for Flywheel" instructions, which should get us on our feet with Docker in the context of a simple Flywheel gear.

Step 2. Dockerfile Anatomy

A typical Dockerfile will include some (or all) of the following sections:

A base image or operating system (OS)
Additional packages not included in the base image
Environment variables
Copies of any necessary files/folders for the program to run
An entrypoint

When a Dockerfile is built (i.e. the commands in the Dockerfile are run to create a self-contained environment), it generates an image. This image can be run with Docker, almost like a virtual machine, with the self-contained environment we made.

Step 3. Selecting a base image or operating system

Don't worry, we don't have to install a fresh OS on our local machine or install software packages (beyond Docker) ourselves. Docker and the Docker community have already generated many base images and OS's we can start from. Publicly available Docker images are stored at Docker Hub.

Official Docker Images

Docker Hub hosts a list of official Docker image releases. It is not necessary to use an official Docker image as a base image for your own Dockerfile; however, it is helpful to know these official images exist. Scrolling through the list, you may see some familiar operating systems like Ubuntu and CentOS, as well as common programming languages like Python and Go.

To determine which base image or OS to use, consider the following:

Is any of the software we need to install incompatible with any operating systems?
Is there a specific version of a package we need to use (e.g., my script only works with python < 3.12)?
Is our gear using a specific software package (e.g., Freesurfer, FSL, Matlab, etc)?
How can we make the image as lightweight as possible (different base images and OS's have different sizes)?

For the simple gear we are building, we're running a single Python script. We could choose a base OS like Ubuntu or Debian. However, these OS's come with a lot of additional software that we don't need, so we should consider a base image that is more lightweight or better suited to our specific needs. Instead of starting from a full OS like Ubuntu or Debian, we can start with a lightweight Python base image, such as python:3.12.3-slim-bookworm.

Step 4. Drafting a Dockerfile

Now that we have selected an appropriate base image for our Docker image, we can start creating our Dockerfile. There are a few main commands we can use in Dockerfiles to set up the image:

FROM: Set the base image on which to build the rest of our Docker image. Typically, this will be an existing Docker image name and version tag from Docker Hub.
RUN: Execute a command as if in a regular shell (by default /bin/sh on Linux images) while the image is being built. Typically, in Docker, this command is used when downloading, installing, and configuring software. It is also used when setting up the directory structure.
ENV: Set an environment variable in the resulting Docker image.
COPY: Copy files from your local machine to the Docker image, so that those files will be available within the image when run.
ENTRYPOINT: Set a fixed command to run when the Docker image is launched.

Before moving on to drafting the Dockerfile for our simple gear, let's walk through the basic steps for creating a Docker image.

To build our Docker image on top of the python:3.12.3-slim-bookworm base image we found above, we can use:

FROM python:3.12.3-slim-bookworm

Since we want our Docker image to be compatible with Flywheel's gear environment, we can next use the following commands to create the needed directory structure and environment:

ENV FLYWHEEL=/flywheel/v0
RUN mkdir -p ${FLYWHEEL}

This will tell Docker to create an environment variable, FLYWHEEL, and then create the directory structure, /flywheel/v0.

Now that we have set up the Flywheel directory structure inside our Docker image, we can use this command to set the working directory:

WORKDIR ${FLYWHEEL}

The above command, WORKDIR, sets up the /flywheel/v0 directory as the location inside the Docker image where we want to run our gear.
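
Because FLYWHEEL is set with ENV, it is also visible to any process started in the container, including a Python script. As a minimal sketch, a script could look up the gear directory as shown below, falling back to /flywheel/v0 when the variable is unset. The helper name gear_root and its environ parameter are hypothetical and are not part of this tutorial's run.py.

```python
import os
from pathlib import PurePosixPath

# Default gear directory used in this tutorial's Dockerfile.
DEFAULT_FLYWHEEL_DIR = "/flywheel/v0"


def gear_root(environ=None):
    """Return the gear directory named by FLYWHEEL, or the default.

    `environ` is a hypothetical parameter that makes the function easy to
    test; when omitted, the real process environment is used.
    """
    env = os.environ if environ is None else environ
    return PurePosixPath(env.get("FLYWHEEL", DEFAULT_FLYWHEEL_DIR))


if __name__ == "__main__":
    # Inside the image built from our Dockerfile, FLYWHEEL is /flywheel/v0.
    print(gear_root())
```

Reading the path from the environment rather than hard-coding it keeps the script and the Dockerfile in agreement if the directory ever changes.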

Next, to install a couple of packages, we can add these lines to our Dockerfile:

RUN apt-get update -qq && \
    apt-get install --no-install-recommends -y git && \
    pip3 install flywheel-sdk

This will tell Docker to first update the list of packages available for installation using apt-get update, then to execute the command apt-get install --no-install-recommends -y git to download and install git. The -qq flag suppresses all output except errors while apt-get updates the package list. The --no-install-recommends option tells apt-get not to install the recommended (but optional) packages for git, making the installation smaller. The -y flag automatically answers "yes" to any prompts during the installation of git. Because the commands are chained with &&, each one runs only if the previous one succeeded. If the apt-get commands are successful, Docker will then execute pip3 install flywheel-sdk to install the flywheel-sdk package from the Python Package Index (PyPI).
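
One way to confirm that a RUN step like this did what we expected is to check for the installed tools from inside the container. The sketch below is an illustration only and is not part of the tutorial's gear: missing_dependencies is a hypothetical helper, and the example call assumes that the flywheel-sdk package is imported under the module name flywheel.

```python
import importlib.util
import shutil


def missing_dependencies(executables, modules):
    """Return the names of executables and top-level modules not found.

    Executables are looked up on PATH; modules are looked up without
    importing them. Only top-level module names are supported.
    """
    missing = [name for name in executables if shutil.which(name) is None]
    missing += [name for name in modules if importlib.util.find_spec(name) is None]
    return missing


if __name__ == "__main__":
    # Run inside the container: an empty list means both installs worked.
    print(missing_dependencies(["git"], ["flywheel"]))
```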

Then, to copy a file into the FLYWHEEL directory we created above on our Docker image and make sure it can be executed, we can add the following:

COPY run.py ${FLYWHEEL}/run.py
RUN chmod a+x ${FLYWHEEL}/run.py

This will copy our run.py script from our gear directory to /flywheel/v0. Every time we build an image from this Dockerfile, run.py will be added to the Docker image in the same location. Once our script is copied inside our Docker image, we make sure the correct permissions are set for the script to be run as an executable.

Finally, we need to specify in the Dockerfile what command we want run when the Docker image is launched. We can do that by setting the entrypoint:

ENTRYPOINT ["python3", "run.py"]

This tells Docker that when the Docker image is launched, run.py should be automatically run.

The manifest file command and Dockerfile ENTRYPOINT

When the gear job is executed in Flywheel, the ENTRYPOINT set in the Dockerfile is overridden by the "command" defined in the gear's manifest.json. Still, it is good practice to set the Dockerfile ENTRYPOINT to the same command we want to run to execute the algorithm.
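
To keep the two in step, the exec-form ENTRYPOINT can be read out of the Dockerfile text and compared with the manifest's "command". The following is a rough sketch under stated assumptions: entrypoint_command is a hypothetical helper, it only understands the JSON-array (exec) form used in this tutorial, and the manifest value in the example is made up for illustration.

```python
import json
import shlex


def entrypoint_command(dockerfile_text):
    """Return the exec-form ENTRYPOINT of a Dockerfile as one command string.

    Returns None when no exec-form ENTRYPOINT line is present.
    """
    command = None
    for line in dockerfile_text.splitlines():
        stripped = line.strip()
        if stripped.upper().startswith("ENTRYPOINT"):
            argument = stripped[len("ENTRYPOINT"):].strip()
            if argument.startswith("["):
                # Exec form is a JSON array; a later ENTRYPOINT replaces an earlier one.
                command = shlex.join(json.loads(argument))
    return command


if __name__ == "__main__":
    dockerfile = 'WORKDIR /flywheel/v0\nENTRYPOINT ["python3", "run.py"]\n'
    # Hypothetical manifest fragment; the real "command" lives in manifest.json.
    manifest = {"command": "python3 run.py"}
    print(entrypoint_command(dockerfile) == manifest["command"])
```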

Step 5. Drafting the Dockerfile for our example gear

Putting all of this together, we can create a Dockerfile (named Dockerfile) in our gear directory and enter the following lines:

# Set base image to python:3.12.3-slim-bookworm using index digest hash to fix version
# This version of python:3.12.3-slim-bookworm has OS: Debian 12 (bookworm) and python: 3.12.3
FROM --platform=linux/amd64 python@sha256:2be8daddbb82756f7d1f2c7ece706aadcb284bf6ab6d769ea695cc3ed6016743

# Flywheel spec (v0)
ENV FLYWHEEL=/flywheel/v0
RUN mkdir -p ${FLYWHEEL}

# Set the working directory
# Note: This is the directory where the gear is run
WORKDIR ${FLYWHEEL}

# 1. Update package list for apt-get
# 2. Use apt-get to install git package, 
#    skipping installation of recommended packages
#    (to keep the image size small)
# 3. Use pip3 to install the flywheel-sdk package
# Note: git is not required for this gear, but it is included
# as an example of how to install an additional package using
# apt-get
RUN apt-get update -qq && \
    apt-get install --no-install-recommends -y git && \
    pip3 install flywheel-sdk

# Copy run.py script to Flywheel spec path
COPY run.py ${FLYWHEEL}/run.py
# Change permissions to make it executable
RUN chmod a+x ${FLYWHEEL}/run.py

# Configure entrypoint
ENTRYPOINT ["python3", "/flywheel/v0/run.py"]

In the above Dockerfile, we first build our image starting with python:3.12.3-slim-bookworm. Instead of just specifying this base image in the <image>:<tag> form, we are using its digest hash (sha256:2be8dad...). Using the hash forces Docker to use this specific base image, effectively fixing both the underlying OS and Python version of the python:3.12.3-slim-bookworm base image. In some cases, using the hash is preferable to using the tag, as Docker image maintainers can sometimes re-use tags or push updates to an already published tag on Docker Hub. The hash ensures that we (and anyone else using our gear) will always be running the exact same version.
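
The difference between a tag reference and a digest reference is visible in the image string itself. As a small sketch (is_pinned_by_digest is a hypothetical helper, and it only recognizes sha256 digests), a check like this could flag base images that are not pinned:

```python
import re

# A sha256 digest is "@sha256:" followed by exactly 64 lowercase hex characters.
DIGEST_PATTERN = re.compile(r"@sha256:[0-9a-f]{64}$")


def is_pinned_by_digest(image_reference):
    """Return True if the image reference ends in a sha256 digest."""
    return DIGEST_PATTERN.search(image_reference) is not None


if __name__ == "__main__":
    for reference in (
        "python:3.12.3-slim-bookworm",
        "python@sha256:2be8daddbb82756f7d1f2c7ece706aadcb284bf6ab6d769ea695cc3ed6016743",
    ):
        print(reference, is_pinned_by_digest(reference))
```

A tag such as 3.12.3-slim-bookworm can later point at a different image, while a digest identifies one specific image.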

We also added the --platform=linux/amd64 option to the base image FROM statement. This ensures that we build our container for the Linux environment that Flywheel expects. If you are completing this tutorial on a local Linux machine with an x86-64 (amd64) processor, you can omit this option. However, if you are completing this tutorial on a machine with a different architecture, such as a Mac with Apple silicon (arm64), you will need to include this option to ensure that your gear builds correctly and can be run on a Flywheel instance.

Next, we specify the Flywheel v0 environment variables and directory structure, so that our gear will be organized the way that the Flywheel engines expect. We also specify a working directory (WORKDIR), where the gear will be run.

Then, we use apt-get to install the git package. Strictly speaking, we do not need git to run our example gear, but we include it in our Dockerfile as an example of using apt-get to install a package. As part of this RUN block, we also use pip3 to install the flywheel-sdk package. We specify pip3 instead of just pip to ensure that flywheel-sdk is installed and linked correctly to python3.

Then, we copy our run.py script from our local gear directory into the Docker image and use the RUN command to make the script executable.

Finally, we set our entrypoint to use python3 to run our run.py script.

Step 6. Building a Docker image from a Dockerfile

At this point we are ready to build our Docker image using the Dockerfile we created in the previous section.

To build our Docker image locally, we can use the docker build command:

docker build -t <Docker_Hub_Account>/<gear_name>:<gear_tag> <path/to/gear/directory>

First, navigate to the directory where our gear is stored and open a terminal window there. We will then build this image with a name and a tag, using the -t option. Since we've versioned our gear 0.1.0, we should give our docker image the tag 0.1.0 as well.

For our example gear, we can use the following command to build a local copy of our Docker image.

docker build -t homer/hello-world:0.1.0 ./

Since we are not planning to upload this image to Docker Hub during this tutorial series, we are using a dummy Docker Hub account name of homer. Feel free to use your own Docker Hub account name if you wish.

Docker image naming and tagging in Flywheel

While the image name and tag can be anything we want, for Flywheel gears, it is highly recommended to set the name to the gear name and the tag to the version number. Using these naming and tagging conventions will make it a lot easier to keep track of which Docker image belongs with which gear version. The version number in the manifest and for the Docker image should always match. This may mean building a new Docker image with a new version number to match the manifest even if the Dockerfile did not change, and vice versa.

The image we just built now exists on your computer, labeled homer/hello-world:0.1.0, so it can be run locally on your machine.

Every time you make changes to either the Dockerfile or your algorithm, you will need to re-run the docker build command above. Typically, if the gear has not been uploaded to your Flywheel instance, you can keep using the same version. If you need to increment the version of the Docker image, you should also change the version in the manifest.json, both under "version" and under "custom.gear-builder.image".
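
Since the image tag should follow the manifest version, one option is to derive the docker build arguments from the manifest rather than typing them by hand. The sketch below assumes the manifest has "name" and "version" keys; image_reference and docker_build_command are hypothetical helpers, and the example only prints the command instead of running it.

```python
import shlex


def image_reference(manifest, account):
    """Build '<account>/<name>:<version>' from a gear manifest dictionary."""
    return f"{account}/{manifest['name']}:{manifest['version']}"


def docker_build_command(manifest, account, context="."):
    """Return the docker build argument list for the gear in `context`."""
    return ["docker", "build", "-t", image_reference(manifest, account), context]


if __name__ == "__main__":
    # Hypothetical manifest values mirroring this tutorial's example gear.
    example = {"name": "hello-world", "version": "0.1.0"}
    # Pass the list to subprocess.run(..., check=True) to actually build.
    print(shlex.join(docker_build_command(example, "homer", "./")))
```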

Step 7. Linking the Docker image to the gear manifest

Now that we have a working Docker image, there is one more thing we need to do. Remember that "image" key under the "custom" tag in the manifest? We now need to set it to the image we're using.

Open up the manifest.json file and set the "custom" -> "gear-builder" -> "image" tag to the name of the Docker image we just built:

"custom": {  
   "gear-builder": {  
      "category": "analysis",  
      "image": "homer/hello-world:0.1.0"
      ...

While we are checking tags in our manifest file, we should double-check that the "version" tag matches the version tag we set when we built our local Docker image in the previous step.

{
  ...
  "description": "My very first gear, developed along with the flywheel gear building tutorial.",
  "version": "0.1.0",
  "author": "Flywheel User",
  ...
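
This double-check can also be automated. As a minimal sketch (image_matches_version is a hypothetical helper, and it assumes the image string ends in ":<tag>" as in this tutorial), the tag of "custom" -> "gear-builder" -> "image" can be compared with "version":

```python
import json


def image_matches_version(manifest):
    """Return True if the image tag equals the manifest version."""
    image = manifest["custom"]["gear-builder"]["image"]
    # Everything after the last colon is treated as the tag.
    _, _, tag = image.rpartition(":")
    return tag == manifest["version"]


if __name__ == "__main__":
    # Assumes manifest.json is in the current directory.
    with open("manifest.json") as handle:
        print(image_matches_version(json.load(handle)))
```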

We are now almost ready to test the gear. Our current gear directory structure should look like this:

GearTutorial  
|- run.py  
|- message.txt  
|- manifest.json  
|- Dockerfile

OPTIONAL: Step 8. Uploading your Docker image to Docker Hub

For this tutorial series, we will skip uploading our new hello-world:0.1.0 Docker image to Docker Hub. Within the context of Flywheel gears, we only really need to upload our Docker image to Docker Hub if we intend to share our gear with others. However, if we want to upload our new Docker image to Docker Hub anyway, we can use the following command:

docker push <Docker_Hub_Account>/<gear_name>:<gear_tag>

For our example, and assuming we had a Docker Hub account set up under the name "homer", the docker push command would look like this (again, we do NOT need to run this step):

docker push homer/hello-world:0.1.0

To learn more about pushing images to Docker Hub, check out the official Docker documentation.

Finishing up with Dockerfiles

In this part, we have learned about the basic anatomy of a Dockerfile, how to create a simple one for our example gear, and how to use docker build to build a local copy of our Docker image. We also learned the important step of updating the manifest.json file with our Docker image name and tag, so that our example gear knows to use our Docker image to run our run.py script. In Part 7: Running a Gear Locally, we will walk through how to run our gear locally using Flywheel's new CLI fw-beta, as well as some debugging techniques.

Additional Resources

Previous: Part 5: The Manifest

Next: Part 7: Running a Gear Locally