Flywheel Notebooks User Guide

Introduction

Flywheel Notebooks provide an interactive compute option that allows researchers to easily apply Python data science tools within the Flywheel platform.

Flywheel integrates with the familiar JupyterLab/Notebooks to combine Python code, reports, and documentation in a single document.

The notebook servers are managed within Flywheel Projects and controlled via User permissions. The compute resources are configurable and managed by Flywheel.

Flywheel Notebooks have a wide array of applications, including the following:

Data and Image Curation
ML and AI Training and Inference
Analytics and Reporting
Automate Administrative Functions

Note

Jupyter Notebooks is an optional Module provided by Flywheel for Machine Learning in Medical Imaging. If you would like more information about this feature, contact your Flywheel Account Executive either directly or through Flywheel support by submitting a ticket or emailing us at support@flywheel.io.

JupyterLab Workspace

Flywheel Notebooks are enabled in the Project settings under 'Workspaces'. Once enabled, project users with the correct permissions can use Flywheel Notebooks.

The Workspace is used to manage the JupyterLab servers and also maintain the home directories of JupyterLab servers after the server is de-provisioned. Each Flywheel project can have one or more JupyterLab Servers. All users with access to the project and with the permission to Manage Workspaces can create, modify, start, or stop any of the Servers. Team members can edit notebooks and other files and then publish those to Flywheel to share work.

Enabling the JupyterLab Workspace

As an administrator of a project, you control whether the JupyterLab Workspace is available for that project. To enable it, go to your Project level menu and select “Settings.”

Once in project settings, you will see a "Workspaces" section. Within this section you will turn on the toggle to enable the JupyterLab Workspace for the project.

When this Workspace is enabled for the project:

The list of created JupyterLab servers is shown (or a splash page if none exist).
Users with appropriate permissions can create, start, stop, edit, download, and delete the JupyterLab server records in the list.

When this Workspace is disabled for the project:

All JupyterLab Server VMs are stopped.
Users can no longer create, start, stop, edit, download, or delete the JupyterLab server records in the list, even with permissions to do so.
Existing server data and file storage are retained in Flywheel. Re-enabling the Workspace restores access.

Note

Modifying this setting requires the permission: Manage Project Settings.

Quick Guide for Using Flywheel Notebooks

From the Flywheel project, select the "Workspaces" Tab.

Next, select Create JupyterLab Server on the far right of the screen.

Screenshot 2023-08-22 at 7.56.17 AM.png

Now, name your new server. Optionally, an external storage can be selected. For more information, see the section Using Flywheel Exports with a Flywheel Notebook.

Screenshot 2023-08-22 at 7.59.50 AM.png

From the server, Launch JupyterLab. There is the option of selecting one of the configured compute resources.

Screenshot 2023-08-22 at 8.05.35 AM.png

After the server has been launched, the following message will appear, as shown in the image below.

Screenshot 2023-08-22 at 8.07.23 AM.png

Now click Open on the created server.

Screenshot 2023-08-25 at 11.51.06 AM.png

Once a selection is made, a new compute resource will be started. A new tab will appear for JupyterLab.

Screenshot 2023-08-22 at 9.29.20 AM.png

One can now use JupyterLab to create a new notebook.

Screenshot 2023-08-22 at 1.53.23 PM.png

Once a Notebook has been created/named, you may save the notebooks and other files as required.

Screenshot 2023-08-22 at 1.55.34 PM.png

JupyterLab may prompt you to rename the file, as shown in the image below.

Screenshot 2023-08-22 at 1.56.54 PM.png

Note

When you are finished working with your session, select Publish to Flywheel to preserve your work and allow future collaboration on the Jupyter home directory. For more information, see the section Publishing Back to Flywheel.

Screenshot 2023-08-22 at 1.53.23 PM.png

The notebook you created now appears in the file list on the far left of the screen, as shown in the image below. Your JupyterLab session can now be closed.

Screenshot 2023-08-22 at 1.59.24 PM.png

Back on the JupyterLab Workspace, you can click "Stop" to shut down the server and free up the compute and storage resources.

The JupyterLab Workspace

JupyterLab Servers are managed within the JupyterLab Workspace in Flywheel. This provides a consistent process for each project to enable, create, and manage JupyterLab servers, storage of notebooks and other files, and manage user access to the Workspace. The records in the Workspace are used to manage the JupyterLab servers and also maintain the home directories of JupyterLab servers after the server is de-provisioned.

JupyterLab Workspaces are an optional Module provided by Flywheel, so they may not be available for your site.

For each project, the JupyterLab Workspace must first be enabled by a user who has the Manage Project Settings permission.

Each Flywheel project can have one or more JupyterLab Servers. All users with access to the project and with the permission to Manage Workspaces can create, modify, start, or stop any of the Servers. This approach allows a project team to collaborate on a set of Servers containing Jupyter Notebooks.

Introduction to JupyterLab Servers in Flywheel

Users have shared access over the JupyterLab servers. They allow team collaboration on the server contents by providing project team members with appropriate access to any server. Team members can edit notebooks and other files and then publish those to Flywheel to share work.

The JupyterLab workspace, once enabled, will be activated for a project. From the Flywheel project, select the "Workspaces" Tab.

The JupyterLab workspace is available on this tab.

Creating a JupyterLab Server

A new JupyterLab Server can be created by clicking the "Create JupyterLab Server" button.

Screenshot 2023-08-22 at 7.56.17 AM.png

Screenshot 2023-08-22 at 7.59.50 AM.png

Name: For each server, you can provide a name to recognize the server in the list. This can be edited in the future. In addition, you can provide an external storage that will be made accessible from within the server's Jupyter Notebook.

External Storage: Flywheel's Project Exports can also be used to make data in Flywheel accessible to the notebook via cloud storage. If one or more storage locations exist for this project, one can optionally be selected now or in the future. In the notebook, this storage will be accessible for read and write (if so configured). This requires setting up an external storage provider at the group or project level. For more information on using this storage in the notebook, see the section Using Flywheel Exports with a Flywheel Notebook.

Then click the "Create" button to add a new row showing this new JupyterLab server.

Info

JupyterLab servers only use compute resources when they are started or running. The Status column indicates the state of the server.

JupyterLab servers list the user who initially created the server, however any user with appropriate permissions in their role can act on and use a server they did not create.

Editing a JupyterLab Server

You can edit the JupyterLab server's name and storage settings by using the ellipsis menu:

edit

Stopping a Server

Stopping a server shuts down the server and deletes the server storage. The Flywheel storage remains, so you can start a new server and continue where you left off. Before stopping a server, use the JupyterLab File menu item "Publish to Flywheel" to ensure your work is saved to the Flywheel storage.

edit

Tip

Before stopping or deleting another user’s Server, the best practice is to contact that user so that they do not lose their work.

Deleting a Server

Users with appropriate permissions can delete a Server to avoid wasting Flywheel storage resources. To do this, follow the steps below:

Go to one of the servers in the list and confirm the server is stopped. Click on the ellipsis menu for the server.

edit

Choose the delete menu option. This removes the server from the list and deletes its Flywheel storage. Only users with the ‘Delete’ permission can delete servers.

edit

JupyterLab Server Status

The server list shows the status of each JupyterLab server, indicating which servers are running and who last published the Server contents back to Flywheel.

Status

Stopped - The JupyterLab server is de-provisioned and is not consuming any VM resources (compute, RAM, or server storage). The server can be started, or deleted.

Starting - The JupyterLab server VM is being requested and being provisioned as specified by the user's compute selections. Depending on the compute selections and VM availability, this may take 10-15 minutes. To stop or delete, users need to wait until the server is either in a Running or Stopped state.

Running - The JupyterLab server is provisioned as specified by the user's compute selections and is running. You can open it for use, or stop it. To delete the server, you must stop it first. The status also indicates the user who started the server.
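The status rules above can be read as a small lookup table. The following is an illustrative sketch only: the names `ALLOWED_ACTIONS` and `can_perform` are hypothetical and are not part of any Flywheel API. It simply restates which actions the guide lists for each status.

```python
# Hypothetical summary of the status rules described above; not a Flywheel API.
ALLOWED_ACTIONS = {
    "Stopped": {"start", "delete"},
    "Starting": set(),            # wait until the server is Running or Stopped
    "Running": {"open", "stop"},  # a running server must be stopped before deletion
}


def can_perform(status: str, action: str) -> bool:
    """Return True if the guide lists `action` as available in `status`."""
    return action in ALLOWED_ACTIONS.get(status, set())


print(can_perform("Running", "delete"))  # False
print(can_perform("Stopped", "delete"))  # True
```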

status

Downloading and importing Server Contents

JupyterLab Servers allow users to download a zip archive of the Server home directory contents. This file can be used to share the contents with other researchers, or transfer to another project.

To download and import server content, please follow the workflow instructions below:

Download the selected workspace zip file to your local computer.

Go to another project, open the Workspaces tab, and create a new JupyterLab Server.
After creating the new server, launch JupyterLab.

Upload the zip file to the new server using the JupyterLab Upload Files function.

Screenshot 2023-08-24 at 10.28.19 AM-1.png

In the JupyterLab Terminal, use the Linux command unzip *.zip to restore the files in the new workspace. If you are asked to overwrite hidden files, select no.
When extraction is complete, use the Linux command rm *.zip to remove the zip archive file, since it is no longer needed.
Publish to Flywheel to save the workspace into Flywheel, where it can then be loaded again.
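The unzip step above can also be done from a notebook cell with Python's standard `zipfile` module. This is a minimal sketch, not a Flywheel feature: the function name `restore_archive` and the file name `workspace.zip` are placeholders, and it skips every file that already exists (not only hidden ones), which has the same effect as answering "no" to each overwrite prompt.

```python
import zipfile
from pathlib import Path


def restore_archive(zip_path, dest_dir):
    """Extract zip_path into dest_dir, keeping any file that already exists.

    Returns the list of archive member names that were skipped.
    """
    dest = Path(dest_dir)
    skipped = []
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.infolist():
            target = dest / member.filename
            if not member.is_dir() and target.exists():
                skipped.append(member.filename)  # same effect as answering "no"
                continue
            archive.extract(member, dest)
    return skipped


# Example (placeholder file name):
# skipped = restore_archive("workspace.zip", Path.home())
# Path("workspace.zip").unlink()  # remove the archive once it is no longer needed
```

Afterwards, Publish to Flywheel as described in the last step so the restored files are saved.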

Starting JupyterLab from the Jupyter Workspace

Anyone with appropriate permissions can start any server on the project. Starting the server allows you to specify the compute resources and then use JupyterLab in a virtual instance. The server will be populated with the home directory files that were previously published to Flywheel.

Selecting the Server to Launch

Go to the Flywheel project, select a server from the JupyterLab Workspace, and click the Launch JupyterLab button.

launch JupyterLab

Select the Size/Type of Server

Users can select a specific size/type of Server to create when they launch JupyterLab. The specifications of the server are provided in terms of Central Processing Unit (CPU) count, Random-Access Memory (RAM) size, and whether it has a Graphics Processing Unit (GPU).

Options will vary depending on the cloud provider and how the site is configured (contact Flywheel to learn more). See image below for specifications of the four example compute resources.

Screenshot 2023-08-22 at 8.05.35 AM.png

Server Startup

After you start a server, the screen shows a progress indicator for server startup. The timing depends on the currently provisioned VMs as well as the provisioning time required by the cloud provider.

Sites are configured to have a certain number of warm servers ready to use. The default is 1. If a warm server is available, the user will get their server more or less immediately. If none are available, a new server will be requested; in this case, provisioning may take two to ten minutes, depending on the size of the server and the cloud vendor's provisioning time.

Servers with GPUs are never kept warm due to cost, so these servers may take more time to be provisioned; the amount of time depends on the cloud provider’s provisioning time. While the server is starting, it cannot be stopped. Once it is running, it can be stopped. The status column will show the status of the server.

Opening JupyterLab

Once a Server is running, click the "Open" button to access the JupyterLab server via a separate browser window. If there were any files published to Flywheel from a past server session of JupyterLab, they will appear in the JupyterLab instance.

open JupyterLab

Tip

If you happen to close the new browser window for JupyterLab, you can reopen it. To do so, go to the associated server and click Open. The instance should open again with the file contents loaded.

Info

In the JupyterLab terminal view, your Linux user will be jovyan. JupyterLab is containerized when running, with a single user named jovyan. This user is isolated to the container. Although other users will have the same user name in their own container instances, these are independent users who happen to share a name across containers.

Logging Out

When a user logs out of Flywheel, they will also log out of the Notebook. Using the log out option in JupyterLab to log out will log you out of JupyterLab, but not Flywheel. The notebook server will stay running and not be stopped by this action.

Note

There may be some delay in the JupyterLab logout action due to the inherent design of the Jupyter application.

Using the JupyterLab Server

The JupyterLab Server contains the open source JupyterLab application and Flywheel features to streamline working with Flywheel data, files, and the SDK.

Since JupyterLab is an open source tool, there are many useful sources of material found online. Check the official documentation for JupyterLab.

You can read more about getting started with Notebooks here.

Currently Flywheel uses JupyterLab v3.6.3. Flywheel’s JupyterLab image contains the Jupyter/scipy-notebook image along with some additional packages, such as the current version of the Flywheel SDK and CLI to streamline access to Flywheel.

JupyterLab Files and Folders

The files and folders stored in the Flywheel archive are automatically uploaded into the JupyterLab server instance. This allows users to pick up where they left off using JupyterLab.

This also allows a different user to start a JupyterLab instance in a workspace using another user’s files and collaborate on notebooks and other work.

If there are no files or folders in the server, it is because the server was newly created or nothing was published to Flywheel. In that case, the user will see just an empty folder named "Data".

Any files put in the "Data" folder are assumed to be Flywheel data or temporary files. Importantly, files in the "Data" folder will not be brought back to Flywheel when the user Publishes to Flywheel.

Screenshot 2023-08-24 at 10.44.09 AM.png

Advanced Feature: Control over Published Files and Directories

Publishing recognizes a hidden file called .publish-ignore.txt that you can edit to limit what will be published to Flywheel. Any file or directory paths listed in this file will be excluded from the publishing process. This file can be edited from the terminal using the vi or nano editor.

Publishing Back to Flywheel

Publishing the home directory back to Flywheel allows you to pick up where you left off and allows other users to collaborate on notebooks with you. Before stopping the JupyterLab server, users need to publish their work back to Flywheel in order to preserve their changes.

Under the JupyterLab File menu there is a menu item called Publish to Flywheel to save your work to Flywheel.

This will first perform a local save-all so that the user's latest edits are saved into the local files. This will also publish all the folders and files back to Flywheel, except the folder named Data.

Anything in the Data folder will be excluded. Users can use the Data folder for holding data files on a temporary basis. Since these data files are also in the Flywheel database, it may not make sense to also publish these back to Flywheel. Any paths defined in .publish-ignore.txt are also excluded. If users want to intentionally publish data files, they can create another folder with a different name, and it will be published back to Flywheel. Empty folders and hidden files/folders with names starting with a . will also be published to the workspace.
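To get a rough idea of what a publish would include, the rules above can be approximated in a notebook cell. This is only a sketch under stated assumptions: the guide does not specify the exact format of .publish-ignore.txt, so the code assumes one path per line, relative to the home directory, matched exactly or as a directory prefix. It also treats only the top-level Data folder as excluded and lists files only (not empty folders). The function name `preview_publish` is hypothetical.

```python
from pathlib import Path


def preview_publish(home, ignore_file=".publish-ignore.txt"):
    """List files under `home` that this sketch expects to be published."""
    home = Path(home)
    ignored = ["Data"]  # the Data folder is always excluded
    ignore_path = home / ignore_file
    if ignore_path.exists():
        # Assumed format: one relative path per line.
        for line in ignore_path.read_text().splitlines():
            entry = line.strip().strip("/")
            if entry:
                ignored.append(entry)

    published = []
    for path in sorted(home.rglob("*")):  # rglob also visits hidden files
        if not path.is_file():
            continue
        rel = path.relative_to(home).as_posix()
        if any(rel == entry or rel.startswith(entry + "/") for entry in ignored):
            continue
        published.append(rel)
    return published


# Example: print(preview_publish(Path.home()))
```

The actual publish behavior is determined by Flywheel; treat the output as an estimate to sanity-check an ignore file, not as a guarantee.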

Note

JupyterLab has an existing File menu item to Save Workspace, which saves the JupyterLab UI configuration and is not related to the Flywheel Workspace or Publish to Flywheel.

Warning

Publishing over 200MB may lead to delays. Publishing creates a compressed zip file of the contents of the user's home directory. If this turns out to be a large file, expect some delay in publishing. It's recommended to rely on the Flywheel database as the source of files and metadata for notebook sessions, rather than duplicating Flywheel data in the zip file.
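Before publishing, it can help to check roughly how much data is in the home directory. The sketch below sums file sizes while skipping the top-level Data folder. It measures uncompressed size, so it only approximates the size of the published zip file, and reading 200MB as 200 × 1024² bytes is an assumption. The names `home_size_bytes` and `LIMIT_BYTES` are hypothetical.

```python
import os


def home_size_bytes(home, excluded_top_level=("Data",)):
    """Sum file sizes under `home`, skipping the named top-level folders."""
    total = 0
    home = os.path.abspath(home)
    for root, dirs, files in os.walk(home):
        if os.path.abspath(root) == home:
            # Prune excluded folders so os.walk never descends into them.
            dirs[:] = [d for d in dirs if d not in excluded_top_level]
        for name in files:
            path = os.path.join(root, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


LIMIT_BYTES = 200 * 1024 * 1024  # the 200MB figure from the warning above

# Example:
# size = home_size_bytes(os.path.expanduser("~"))
# print("publishing may be slow" if size > LIMIT_BYTES else "size looks fine")
```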

Notebook Convenience Features

The Flywheel Notebook has a number of convenience features for working with Flywheel data, files, the CLI and software libraries.

Using the Python SDK and CLI within a Flywheel Notebook

The Flywheel Notebook has a few convenience features when working with the Python Software Development Kit (SDK) and Command Line Interface (CLI). A compatible version of the Flywheel Python SDK and CLI is automatically installed into the JupyterLab server for users to use.

Using the CLI in JupyterLab

Legacy CLI

In JupyterLab open the terminal from the Launcher page.
Use the following command to log in to the Flywheel instance with your Application Programming Interface (API) key:
```
fw login $FW_HOSTNAME:$FW_WS_API_KEY
```

This should return:

"You are now logged in as: (Username)!"

You can now use the CLI as usual.

Beta CLI

In JupyterLab open the terminal from the Launcher page.
Use the following command to show your API key being used for the JupyterLab session:
```
echo $FW_HOSTNAME:$FW_WS_API_KEY
```
Use the following command to start the log in to the Flywheel instance:
```
fw-beta login
```
This will prompt you for the API key.
At the prompt, enter the API key exactly as printed by the echo command above (copy and paste).
This should return:
```
Logged in to <site.url> as <user>
```
You can now use the CLI as usual
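The same `hostname:key` string that the echo command prints can be assembled in a notebook cell from the two environment variables named above. This is a minimal sketch; the function name `workspace_api_key` is hypothetical, and the only assumption is that `FW_HOSTNAME` and `FW_WS_API_KEY` are set in the notebook's environment as they are in the terminal.

```python
import os


def workspace_api_key(env=None):
    """Build the "<hostname>:<key>" string from the two session variables."""
    env = os.environ if env is None else env
    host = env.get("FW_HOSTNAME")
    key = env.get("FW_WS_API_KEY")
    if not host or not key:
        raise RuntimeError("FW_HOSTNAME and FW_WS_API_KEY must both be set")
    return f"{host}:{key}"


# Example: api_key = workspace_api_key()
```

The resulting string is a credential, so avoid printing it in a notebook that will be published or shared.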

Benefits of using the CLI in a JupyterLab server

Since the CLI runs on the JupyterLab server, it is in the same cloud account as the Flywheel application, which maximizes performance.
Running CLI operations on the server rather than on your local machine means your local machine is not tied up during long operations, and users do not have to maintain the CLI locally.
Flywheel maintains the compatibility of the CLI version and the Flywheel application version so that users do not have to.

Using the Flywheel Python SDK in Notebooks

When users are in a Notebook they can use the Flywheel SDK client without having to specify the API key. Also, Git and Conda tools are installed in the server to allow working with source code repositories and installing Conda compatible packages. There are also convenience features for accessing and using Flywheel Exports as a data source for the notebook.

Imported Flywheel SDK and Preconfigured Flywheel Client

In the notebooks, the Flywheel Python SDK is automatically imported and the variable fw is already initialized with user credentials, so they will not need to be provided.

Example usage:

fw.get_current_user().id

This will return your user id.
Also in the notebook, the Flywheel project is already initialized. This can be referenced by using either fw.workspace_project or fw_project in the notebook.
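As a small illustration of using the predefined objects together, the helper below collects the current user id and the project's label. It is a sketch rather than an official pattern: `describe_workspace` is a hypothetical name, and the `label` attribute is assumed to hold the project's display name. Passing the client and project as arguments keeps the function usable outside a notebook as well.

```python
def describe_workspace(client, project):
    """Summarize who is logged in and which project the notebook belongs to."""
    user = client.get_current_user()
    return {
        "user_id": user.id,
        # `label` is assumed to be the project's display name.
        "project_label": project.label,
    }


# In a Flywheel Notebook, where `fw` and `fw_project` are predefined:
# describe_workspace(fw, fw_project)
```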

Using Flywheel Exports with a Flywheel Notebook

Flywheel's Project Exports can also be used to make data in Flywheel accessible to the notebook via cloud storage. This requires setting up an external storage provider at the group or project level.

In the notebook there is a simplified way to access the cloud storage using the Flywheel Storage Client, which is already installed in the notebook Server.

The high level steps are as follows:

Define an External Storage at the group or project level for the project(s) you wish to export. It must be at the group or project level, not the site level. If more than one is selected, the first one is used.
Run an Export in the project to the external storage with the desired data following the instructions for Project Exports.
Access the External storage using the Flywheel storage client in the notebook. Access keys and authentication will be managed by Flywheel behind the scenes. See details in the section below.

Using the Flywheel Storage Client in the Notebook

The Flywheel storage client technical documentation can be found at flywheel-io / tools / lib / fw-storage · GitLab.

In the Notebook, the Flywheel storage client is already initialized as fs and has access to the defined Storage Destination.

Example usage to write a local file to the storage bucket
```
fs.set("test/image.jpg", "./image.jpg")
```
Example usage to read a file from the storage bucket
```
file = fs.get("test/image.jpg")
```
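To write several files at once, the `fs.set` call shown above can be wrapped in a loop. The helper below is a sketch that relies only on that one call, used with the same two arguments (storage key, local path) as in the example. The function name `upload_folder`, the folder `./results`, and the `test` prefix are placeholders.

```python
from pathlib import Path


def upload_folder(storage, local_dir, prefix):
    """Upload every file in local_dir to `<prefix>/<relative path>`.

    `storage` is any object with a `set(key, local_path)` method, called the
    same way as `fs.set(...)` in the example above.
    """
    local_dir = Path(local_dir)
    uploaded = []
    for path in sorted(local_dir.rglob("*")):
        if path.is_file():
            key = f"{prefix}/{path.relative_to(local_dir).as_posix()}"
            storage.set(key, str(path))
            uploaded.append(key)
    return uploaded


# In a Flywheel Notebook, where `fs` is predefined:
# upload_folder(fs, "./results", "test")
```

Writing requires that the external storage is configured for write access, as noted earlier in this guide.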

Using Open Source Tools Git and Conda From the JupyterLab Terminal

The Python package installer pip is available in the JupyterLab server.
Git is installed in the JupyterLab server
Git is available from the JupyterLab terminal using the command line: $ git
Conda is installed in the JupyterLab server
Conda is available from the JupyterLab terminal using the command line: $ conda
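To confirm from a notebook cell that these tools are on the path, Python's standard `shutil.which` can be used. This is a minimal sketch; `find_tools` is a hypothetical helper name.

```python
import shutil


def find_tools(names=("pip", "git", "conda")):
    """Map each tool name to its executable path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


for tool, location in find_tools().items():
    print(f"{tool}: {location or 'not found'}")
```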

Example Notebooks

Some notebooks that show basic examples of the above features can be found at: gitlab:flywheel-notebooks-getting-started.

Relevant Permissions

The following are the current relevant permissions:

JupyterLab Permissions

Read

Allows users to view the list of JupyterLab Servers in the Workspace and to download the home directory stored in Flywheel. This is a default permission that cannot be disabled.

Launch and Publish

Allows users to Launch their own and others' JupyterLab Servers, creating a virtual compute instance with the JupyterLab App in a separate browser window. Also allows users in JupyterLab to Publish home directory files back to Flywheel and to stop the server.

Create

Allows users to Create a new JupyterLab Server, specifying its settings.

Modify

Allows users to Edit any existing JupyterLab Server settings.

Delete

Allows users to permanently Delete any JupyterLab Server, including the home directory files in Flywheel for that server.

Warning

Deleting a JupyterLab Server is permanent. It's recommended to download the contents (home directory files) and archive them if there is a need to retain this information.

Default Roles

Default Role	JupyterLab Permissions
read-only	Read
read-write	Read, Launch and Publish, Create, Modify
admin	Read, Launch and Publish, Create, Modify, Delete
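The default role table above can be expressed as a simple mapping, which may be useful when documenting or reasoning about who can do what. This is an illustrative sketch only; `DEFAULT_ROLE_PERMISSIONS` and `role_allows` are hypothetical names and do not query Flywheel. Custom roles at a site may differ.

```python
# Restates the Default Roles table above; not read from Flywheel.
DEFAULT_ROLE_PERMISSIONS = {
    "read-only": {"Read"},
    "read-write": {"Read", "Launch and Publish", "Create", "Modify"},
    "admin": {"Read", "Launch and Publish", "Create", "Modify", "Delete"},
}


def role_allows(role: str, permission: str) -> bool:
    """Return True if the default role includes the given JupyterLab permission."""
    return permission in DEFAULT_ROLE_PERMISSIONS.get(role, set())


print(role_allows("read-write", "Delete"))  # False
print(role_allows("admin", "Delete"))       # True
```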

Technical Information

Flywheel Notebooks are supported on the following providers:

Google Cloud Platform
Amazon Web Services
Microsoft Azure

Sites have the flexibility to configure, at install time, the 4 types of Jupyter server instances shown to the user so that they can leverage different cloud vendors' Virtual Machine (VM) families more effectively. A site can be configured with a maximum of 100GB RAM for a server. A single GPU per server is supported.

Compute Instances

Supports 4 levels of server (General Purpose, Compute, Compute-GPU, Larger-Compute-GPU)
Specs are dependent on site configuration, cloud provider and account SKU availability.
Single GPU
Single Jupyter ‘kernel’ - jupyter-scipy-notebook

Tip

Compute instances can be configured by Flywheel. Contact Flywheel support by submitting a ticket or emailing us at support@flywheel.io.

Compute Resources

Separate Kubernetes cluster from Flywheel Enterprise - no resource overlap
One warm cloud VM (w/o GPU) is ready for users to reduce wait time (site configurable)
Wait time for VM w/ GPU and additional servers is cloud dependent (~10min)
Idle JupyterLab servers are shut down after 2 hrs (site configurable)

Terminology

JupyterHub Terminology	Definition
JupyterHub (Hub)	Service that spawns, manages, and proxies multiple JupyterLab server instances. One per Flywheel Platform, managed by Flywheel admins, not accessible by users.
JupyterLab Server Instance (Server)	A singular instance of file storage combined with a compute resource. Defined at the Project level and accessible by all Project users.
Kernel	A virtual environment with installed Python packages and the Python interactive session.
Jupyter Notebook (Notebook)	Structured data that represents code, metadata, content, and outputs. Stored in a .ipynb file as structured JSON. A JupyterLab server instance can contain multiple Notebook definitions.
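To make the "structured JSON" description of a Notebook concrete, the sketch below builds a minimal notebook document in the nbformat 4 layout and counts its cells by type. The cell contents and ids are made-up examples, and `count_cells` is a hypothetical helper.

```python
import json

# A minimal notebook document in the nbformat 4 layout (example content only).
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "id": "intro", "metadata": {},
         "source": ["# Example"]},
        {"cell_type": "code", "id": "cell-1", "metadata": {},
         "execution_count": None, "outputs": [],
         "source": ["print('hello')"]},
    ],
}


def count_cells(nb):
    """Count the cells in a notebook dictionary by cell type."""
    counts = {}
    for cell in nb["cells"]:
        counts[cell["cell_type"]] = counts.get(cell["cell_type"], 0) + 1
    return counts


text = json.dumps(notebook, indent=1)  # the text that a .ipynb file would hold
print(count_cells(json.loads(text)))   # {'markdown': 1, 'code': 1}
```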