Gear Building Tutorial Part 8: Logging

Introduction

In the previous parts of this Gear Building Tutorial, we have covered the minimum needed to build a functioning Flywheel gear that will run on our Flywheel instance. In this part, we will cover adding logging to our gear. Logging is simply a way to keep track of events that occur during a gear run, such as loading in or writing out a file, running a command with specific options, errors, warnings, etc. Logs are especially handy for tracking progress and key in troubleshooting errors.

Logging in Flywheel gears

When a Flywheel gear is run, a gear job and its associated log are created. If a gear fails, the error traceback is typically (sometimes niche errors prevent the error from being logged) reported in the log automatically.

Instead of using print statements, the built-in python module logging is recommended. A minimum logging should be set; that level and all levels above will be reported (e.g., if INFO is set, DEBUG will not be reported). The table below shows all logging levels.

Level name	Numeric value
`CRITICAL`	50
`ERROR`	40
`WARNING`	30
`INFO`	20
`DEBUG`	10
`NOTSET`	0

The numeric values can used to set the logging level when initializing the python logging module:

import logging

logging.basicConfig(level=20)

However it is preferable to use the level names to make your code easier to read and maintain:

import logging

logging.basicConfig(level=logging.INFO)

In the context of Flywheel gears, the default logging level is set to INFO, and the lowest level of logging that can be used is DEBUG. It is not recommended to use NOTSET to avoid creating excessively large log files.

Initialize logging using Flywheel SDK

Instead of initializing logging through the python module directly, we can use the Flywheel SDK to initialize the standard logger for our gear.

import flywheel
import logging  # <-- Add import statement for python logging module

# Get the gear context
context = flywheel.GearContext()

# Initialize logging and create log  
context.init_logging() 
log = logging.getLogger()  

If you have other modules/files that get called from the run script, the initialized logger can be imported in the following way:

import logging  

log = logging.getLogger(__name__)

Logging Information

In our current run.py file, we use print statements to pass information to the terminal window. With logging, we can pass these same statements on the progress of our gear to the INFO or other levels of our log instead.

For example in the while loop that prints out our "Hello" statement, we could add logging to keep track of what iteration we are on, as well as pass our "Hello" statement to the log instead of the terminal:

# While the num_rep variable is greater than zero:
while (num_rep > 0):
    log.info("Starting loop {}".format(num_rep))                               
    with open('hello.txt', 'a') as f:               # Open the file hello.txt with the intent to append             
        f.write("Hello, {}!\n".format(my_name))     # Write "Hello Name!" to the file every loop      
    log.info("Hello, {}!".format(my_name))          # Print "Hello Name!" to terminal every loop           
    num_rep -= 1                                    # Decrease the num_rep variable by one     

By using log.info(), we are telling the python logging module that we want these statements to be logged at the INFO level and to be visible if the logging level is set to INFO or DEBUG.

If we were to update our existing run.py file and then rebuild and re-run our gear locally, we would see that the output to our terminal window now looks like:

Logging information output

INFO:root:Starting loop 5
INFO:root:Hello, Maggie Simpson!
INFO:root:Starting loop 4
INFO:root:Hello, Maggie Simpson!
INFO:root:Starting loop 3
INFO:root:Hello, Maggie Simpson!
INFO:root:Starting loop 2
INFO:root:Hello, Maggie Simpson!
INFO:root:Starting loop 1
INFO:root:Hello, Maggie Simpson!

Logging Warnings

In addition to logging information statements, we can also log gear behavior that is unexpected but will not cause our algorithm to fail. In this case, we can simply change the logging level from info to warning. As an example, let's consider the case where the message.txt file supplied as input to our gear is empty. Our gear will still run, but we might want to let the user know that the input file they provided was empty.

In this case, we can update the section of our run.py that prints out the contents of the input message file to first check if the input file is empty and log a warning if it is:

# Now read the custom message:
message_file = open(message_file,'r')       # Open the file with the intent to read
if message_file.read(1):                    # Check if message_file is empty
    message_file.seek(0)                    # Move the cursor back to the beginning of the file                
    print('\n')                             # Print a blank line to separate the message from the "hello's"                                         
    print(message_file.read())              # Read and print the file contents
else:
    log.warning("Input message file is empty")      # Log a warning message if the message file is empty

If we were again to re-build and re-run our gear locally with our updated run.py and then supply an empty message.txt file as input, our logging should now read:

Logging warning output

INFO:root:Starting loop 5
INFO:root:Hello, Maggie Simpson!
INFO:root:Starting loop 4
INFO:root:Hello, Maggie Simpson!
INFO:root:Starting loop 3
INFO:root:Hello, Maggie Simpson!
INFO:root:Starting loop 2
INFO:root:Hello, Maggie Simpson!
INFO:root:Starting loop 1
INFO:root:Hello, Maggie Simpson!
WARNING:root:Input message file is empty

Logging Errors

The last level of logging we will cover is logging errors. An error can be considered any scenario we know will cause our script to not work. In these instances, we can also have python raise an error and stop our script. Let's expand on our previous example of logging a warning, but this time force our script to stop with an error for an empty input message.txt file.

# Now read the custom message:
message_file = open(message_file,'r')   # Open the file with the intent to read
if message_file.read(1):                # Check if message_file is empty
    message_file.seek(0)                # Move the cursor back to the beginning of the file               
    print('\n')                         # Print a blank line to separate the message from the "hello's"                                          
    print(message_file.read())          # Read and print the file contents
else:
    log.error("Input message file is empty")            # Log an error message the input file is empty
    raise ValueError("Input message file is empty")     # Terminate script

Again, if we re-build and re-run our gear locally with the error logging updates to run.py, then our log now looks like:

Logging error ouput

INFO:root:Starting loop 5
INFO:root:Hello, Maggie Simpson!
INFO:root:Starting loop 4
INFO:root:Hello, Maggie Simpson!
INFO:root:Starting loop 3
INFO:root:Hello, Maggie Simpson!
INFO:root:Starting loop 2
INFO:root:Hello, Maggie Simpson!
INFO:root:Starting loop 1
INFO:root:Hello, Maggie Simpson!
ERROR:root:Input message file is empty
Traceback (most recent call last):
  File "/flywheel/v0/run.py", line 55, in <module>
    raise ValueError("Input message file is empty")
ValueError: Input message file is empty
...

Adding logging config option to manifest file

If you have run any gears on a Flywheel interface, you have probably noticed that nearly all gears have a config option to change the logging level from the default INFO to DEBUG. We can update our gear's manifest.json file to include this option:

In the "config" block, we can add a new boolean (True|False) config option called "debug" and set the default value to False:

...
"config": {
    "my_name": {
      "description": "Your name, for the program to say 'hello' to",
      "type": "string"
    },
    "num_rep": {
      "description": "The number of times to say 'hello'.  Min = 0, Max = 10",
      "default": 5,
      "minimum": 0,
      "maximum": 10,
      "type": "integer"
    },
    "debug": {
      "default": false,
      "description": "Log debug messages",
      "type": "boolean"
    },
  },
...

Too much logging

Like many things in life, moderation is key. Too much logging can actually overwhelm part of the Flywheel system, and that system can cause other gears to fail if it's not working properly. In addition, logging too much can also make it much more difficult to troubleshoot issues, since one now has to mine for data instead of quickly scanning through the log files. Finally, depending on the version of your Flywheel instance, long log files run the risk of being automatically truncated. The goal is to log enough information to be useful but not so much as to be overwhelming to either the user or the Flywheel system.

How do I know if I'm logging too much?

Log files are recommended to stay below 1MB (and ideally in the < 10KB range). As a reference, a full text-only copy of Don Quixote is approximately 2.5MB, so 1MB gives a lot of head room to adequately log a typical gear.

If we look back through the examples of different logging levels we just walked through, we can see that maybe we didn't need to log the start of every "Hello..." loop. For the relatively low number of repetitions we chose, logging every loop isn't too burdensome. However, you can envisage an iterative process that may repeat a loop tens to hundreds of times before converging. In this case, logging every loop could make it difficult to read through and parse the log. So, it might be more prudent to log just the start of the looping process or only log every N number of loops.

As you start creating and debugging your own gears, it hopefully should become apparent what is important to log for your script/algorithm and what ends up just being unnecessary noise.

Wrapping up

In this part, we have learned the basics of logging, including how to use the Flywheel SDK to initiate a python logger and how to log information at different levels. In the final part, we will cover the basics of how to get started debugging a failed gear run.

Previous Part 7: Running a Gear Locally

Next Part 9: Debugging