Gear Building Tutorial Part 9: Debugging
Introduction
In the previous parts of this Gear Building Tutorial, we have covered the minimum needed to build a functioning Flywheel gear that will run on our Flywheel instance. We also introduced the concept of logging. In this last part, we will cover the basics of how to get started with debugging our gear. Rather than continually re-running the gear each time we push an update, we can debug in a more focused and efficient way by leveraging IDE tools. We will focus our examples on using the IDE VSCode, but the general workflow should be similar for other IDEs.
Instruction Steps
- Overview: Context and Rationale
- Step 1. Pulling a gear job from a Flywheel instance
- Step 2. Add API key to gear config file
- Step 3. Launch gear in interactive mode
- Step 4. Attach debugger to running Docker container
- Step 5. Add breakpoints
- Step 6. Incorporate fixes into new gear version
Context and Rationale
If you are just starting out with developing code and gears, the idea of formally debugging your code can seem daunting. Initially it may seem simple and appealing to iterate on your gear development by looping as follows: 1) build a gear, 2) upload and run the gear, 3) fix any resulting errors, 4) repeat. There are several drawbacks to this method. Each change you make to any of the files in a gear requires building and uploading a new version of the gear to your Flywheel instance. Not only is this time consuming, it also results in many versions of the gear being stored because of how Flywheel handles provenance. In Flywheel, it is not possible to overwrite a version of a gear; you can only increment the version.
The better way to debug gears is to run the gear locally as if you were running on the Flywheel instance. This enables the use of common debugging tools and allows for re-running as many times as needed to fix any bugs or finish development without the need to rebuild your gear and upload to your Flywheel instance. The rest of this tutorial describes how to debug your gear efficiently in a local environment.
As we walk through the next steps for debugging a gear, we will assume that we are developing our gear in an IDE. Using an IDE can make debugging more user-friendly than, for example, using pdb directly from the command prompt.
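For comparison, here is a minimal sketch of what using pdb directly looks like. The `greet` function and its `debug` flag are hypothetical and only serve as an illustration; they are not part of the tutorial's gear code.

```python
import pdb


def greet(name, debug=False):
    """Build a greeting; optionally pause in pdb before returning."""
    message = f"Hello, {name}!"
    if debug:
        # Execution pauses here and a (Pdb) prompt opens in the terminal.
        # Type `p message` to print a variable, `n` to step to the next
        # line, and `c` to continue running.
        pdb.set_trace()
    return message


# Runs straight through because debug is False.
print(greet("world"))
```

Calling `greet("world", debug=True)` from a terminal would drop into the text-only pdb prompt, which is the workflow that the IDE debugger replaces with clickable breakpoints and variable panes.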
Specific examples follow VSCode workflow
For the examples of the debugging workflow inside a Docker container, we will be using VSCode. While the specifics may be different for other IDEs, the general workflow should be similar.
Step 1. Pulling a gear job from a Flywheel instance
While it is possible to do a fair bit of the initial gear debugging locally before we even upload the first version of our gear to our Flywheel instance, ultimately we do need to test our gear in a Flywheel instance. Let's say that our gear either failed or did not produce the desired behavior (e.g., missing or incorrect output). Instead of trying to recreate this failed gear run on our local machine from scratch, we can use a fw-beta command to pull the failed job and all of the associated files directly from our Flywheel instance. This way, we will have the exact same gear Docker container, input(s), and config options that were used to run the gear on our Flywheel instance.
Before we can pull our failed gear job, we need to know its job ID. There are a few places in the Flywheel interface where we can find the job ID for a gear run, but the most user-friendly way is to navigate to the Jobs Log menu from the left-hand sidebar. Then we can search for the name of our gear and select our gear run from the filtered list. A box should pop up on the right-hand side of the main interface window. If we click on the Log tab, our job ID will be listed in the header above the log dump.
Now that we know the job ID for the gear run we want to pull, we can use the following command to download the zip file containing the gear run:
This command will download a directory with the following filename pattern: <gear-name>-<version>-<job_ID>
It is usually most convenient to download this job directory inside our gear directory. However, you can download and store this folder wherever it suits you best.
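If you keep several pulled jobs side by side, it can help to recover the gear name, version, and job ID from the directory name. The sketch below assumes the `<gear-name>-<version>-<job_ID>` pattern described above and further assumes that neither the version nor the job ID contains a hyphen; `parse_job_dirname` is a hypothetical helper, not part of any Flywheel tool.

```python
def parse_job_dirname(dirname):
    """Split '<gear-name>-<version>-<job_ID>' into its three parts.

    Splitting from the right keeps hyphens inside the gear name intact
    (assumption: the version and job ID themselves contain no hyphens).
    """
    name = dirname.rstrip("/")
    parts = name.rsplit("-", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"unexpected job directory name: {dirname!r}")
    gear_name, version, job_id = parts
    return gear_name, version, job_id


gear_name, version, job_id = parse_job_dirname(
    "hello-world-0.1.0-12345e01928387293d0/"
)
print(gear_name, version, job_id)
```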
If we check the contents of the downloaded job directory, we should see a tree structure similar to below:
hello-world-0.1.0-12345e01928387293d0/
|- config.json
|- input/
|- manifest.json
|- output/
|- work/
Recalling from Part 7, the input folder will contain individual subfolders containing each input file and the output folder will contain any output produced during our gear run.
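As a quick sanity check, the layout shown above can be verified from Python before you start debugging. This is only a sketch: the expected entries are taken from the example tree in this tutorial, and `missing_entries` is a hypothetical helper name.

```python
from pathlib import Path

# Entries taken from the example tree shown in this tutorial.
EXPECTED_FILES = ("config.json", "manifest.json")
EXPECTED_DIRS = ("input", "output", "work")


def missing_entries(job_dir):
    """Return the expected files and folders that are absent from job_dir."""
    job_dir = Path(job_dir)
    missing = [f for f in EXPECTED_FILES if not (job_dir / f).is_file()]
    missing += [d + "/" for d in EXPECTED_DIRS if not (job_dir / d).is_dir()]
    return missing
```

For example, `missing_entries("hello-world-0.1.0-12345e01928387293d0")` returns an empty list when the pulled job directory is complete, and otherwise lists the entries that are missing.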
Step 2. Add API key to gear config file
Before we can start debugging our failed gear job on our local machine, we need to update the config.json file to add our API key. Remember from Part 7 that the config.json file contains all of the inputs and options specified in the manifest.json file, as well as additional information Flywheel needs to run our gear.
Assuming we have our API key saved as an environment variable called $MY_API_KEY, we can run from a terminal prompt:
You can open up the config.json file to verify that the api_key field has been correctly populated.
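The same check can be scripted so that the key itself never has to be displayed on screen. The sketch below makes one loud assumption: because the exact position of the api_key field inside config.json is not described here, it searches the whole file for any key with that name. It only checks that the field exists and is not empty; it does not validate the key. Both function names are hypothetical.

```python
import json


def find_fields(node, name):
    """Yield every value stored under a key called `name`, at any depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == name:
                yield value
            yield from find_fields(value, name)
    elif isinstance(node, list):
        for item in node:
            yield from find_fields(item, name)


def api_key_is_set(config_path, field="api_key"):
    """Return True if config.json holds a non-empty `field` somewhere.

    Assumption: the field's location in the file is unknown, so the
    whole document is searched. The key's value is never printed.
    """
    with open(config_path) as handle:
        config = json.load(handle)
    return any(bool(value) for value in find_fields(config, field))
```

Running `api_key_is_set("hello-world-0.1.0-12345e01928387293d0/config.json")` should return `True` once the key has been added.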
At this point we could try to re-run the failed gear job directly using fw-beta gear run. However, to take advantage of our debugging tools, we will need to develop from inside the gear itself.
Step 3. Launch gear in interactive mode
In order to start the debugging process and be able to set breakpoints and step through our code in order to find the source of the error, we need to run our failed gear job interactively.
From a terminal prompt inside VSCode, run:
At this point, we are now inside the running Docker container for our gear.
Mounting local folders and files inside Docker container for a gear
Similar to Docker, we can use the -v flag with fw-beta gear run to mount a folder or file on our local machine inside the Docker container for our gear. See Step 6 for details.
Step 4. Attach debugger to running Docker container
Installing Remote Development extension
If this is your first time working with containers in VSCode, you will want to install the "Remote Development" extension package before proceeding.
Within VSCode, attach to the running Docker container by clicking on the blue double-arrow (><) icon in the lower left-hand corner of the window and then selecting "Attach to Running Container..." from the pull-down menu that appears.
Then, select the container associated with your gear from the list of running containers. A new VSCode window should open.
Next, to access the gear-related files in the container, you need to set the starting directory to /flywheel/v0/.
Finally, open the run.py file and either install the Python extension from the VSCode extension pane or accept the VSCode prompt to install the Python extension.
Using PyCharm?
Check out the PyCharm documentation and this video walk-through.
Step 5. Add breakpoints
Once you are inside the /flywheel/v0 directory, you can start adding breakpoints to run.py. Breakpoints act as stop signs, pausing the gear run when execution reaches one. While the gear is paused, you can inspect the contents and values of variables and run Python commands on them directly. Depending on the IDE you are using, you can also set breakpoints in functions called from run.py by navigating to the declaration of these functions.
Where should I set breakpoints?
Information in the log file (e.g., function where error occurred) can help you decide where to set breakpoints.
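If the job log contains a Python traceback, the file, line number, and function of each frame can be pulled out with a few lines of Python. This sketch relies only on the standard traceback line format (`File "...", line N, in name`); the sample log text and the `traceback_frames` helper are hypothetical.

```python
import re

# Matches the standard Python traceback frame line, even when the log
# adds a prefix (such as a timestamp) in front of it.
FRAME_RE = re.compile(
    r'File "(?P<path>.+?)", line (?P<line>\d+), in (?P<func>.+)$',
    re.MULTILINE,
)


def traceback_frames(log_text):
    """Return (path, line, function) for every traceback frame in the log."""
    return [
        (m["path"], int(m["line"]), m["func"].strip())
        for m in FRAME_RE.finditer(log_text)
    ]


# Hypothetical log excerpt, for illustration only.
sample_log = '''Traceback (most recent call last):
  File "/flywheel/v0/run.py", line 12, in <module>
    main()
  File "/flywheel/v0/run.py", line 8, in main
    print(greeting[0])
IndexError: string index out of range
'''

for path, line, func in traceback_frames(sample_log):
    print(f"{path}:{line} in {func}")
```

The last frame listed is where the exception was raised, so a breakpoint on that line or just before it is usually a good starting point.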
Once you've set at least one breakpoint, navigate to the "Run and Debug" menu and click on the blue "Run and Debug" button. From here, VSCode will prompt you to select a debugger (choose "Python Debugger" from the pull-down menu) and a debug configuration (choose "Python File" from the pull-down menu).
To start debugging, make sure that run.py is the active Python file and then click on the green play-button arrow. Your gear code will run until it reaches the first breakpoint. Once the gear code stops at the breakpoint, you can use the VSCode tools to step through and over functions and check on local and global variables. You can also test out Python commands and further query variables from the "Debug Console" tab.
Step 6. Incorporate fixes into new gear version
It is essential, especially if you are using VSCode, to copy any changes you make inside the running Docker container back to your gear code outside the container. Otherwise, the fixes exist only inside the container.
You can also add one more option to the command that interactively launches the failed job inside the Docker image. In the case of our example hello-world gear, we can use the -v option to link the run.py script in our local gear directory to the one inside the gear Docker container.
This -v (or --volume) option can be used to link a file or a folder located on your local machine to the equivalent file or folder inside the gear Docker container. You can also include multiple -v options to link multiple files and/or folders. The key with using this option is to make sure that you are linking to the correct location inside the /flywheel/v0 directory hierarchy.
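Since the easiest mistake here is pointing at the wrong location inside the container, a small helper can build the mapping and reject targets outside /flywheel/v0. This sketch assumes the option takes Docker's `local_path:container_path` form, which the text above only implies by saying the flag works similarly to Docker; `volume_spec` is a hypothetical helper and does not call fw-beta.

```python
from pathlib import Path, PurePosixPath

# Root of the gear code inside the container, as used in this tutorial.
GEAR_ROOT = PurePosixPath("/flywheel/v0")


def volume_spec(local_path, container_relpath):
    """Build a Docker-style 'local:container' mapping under /flywheel/v0.

    Assumption: the -v option accepts the same source:target syntax as
    Docker. The local path is made absolute; the container path must be
    relative to /flywheel/v0 and may not climb out of it.
    """
    rel = PurePosixPath(container_relpath)
    if rel.is_absolute() or ".." in rel.parts:
        raise ValueError(f"target must stay inside {GEAR_ROOT}: {container_relpath!r}")
    return f"{Path(local_path).resolve()}:{GEAR_ROOT / rel}"


print(volume_spec("run.py", "run.py"))
```

The printed string is what would follow each -v option, one mapping per file or folder you want to link.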
Once you are sure that you have made all of the necessary changes to your gear code, you can exit the running container by clicking on the blue arrow menu in the lower left-hand corner of the debugging session and selecting "Close Remote Connection" from the pull-down menu.
Now that you have fixed the error in your gear code, you can repeat the process we walked through in Part 7 and rebuild and re-run your gear. Remember to update the version number in the manifest file for both the gear and Docker image.
Wrapping up
In this part, we have learned the basics of debugging, focusing on the specifics of the workflow to attach to and debug inside a running Docker container in VSCode. You should now have the introductory information you need to get started with building your own gears, adding logging, and debugging errors.