Flywheel Notebooks provide an interactive compute option that allows researchers to easily apply Python data science tools within the Flywheel platform. Flywheel integrates with the familiar Jupyter Lab/Notebooks to combine Python code, reports, and documentation in a single document. The notebook servers are managed within Flywheel Projects and controlled via User permissions. The compute resources are configurable and managed by Flywheel.
Flywheel Notebooks have a wide array of applications, including the following:
- Data and Image Curation
- ML and AI Training and Inference
- Analytics and Reporting
- Automate Administrative Functions
How To Use Flywheel Notebooks
From the Flywheel project, use the kebab menu to select "Workspaces".
Next, select Create Workspace on the far right of the screen.
Name your new workspace.
From the workspace, launch JupyterLab. You can select one of the configured compute resources.
After the workspace has been launched, a message will appear, as shown in the image below.
Now click Open on the created workspace.
Once a selection is made, a new compute resource will be started. A new tab will appear for JupyterLab.
You can now use JupyterLab to create a new notebook.
Once a Notebook has been created and named, you can save the notebook and other files as required.
JupyterLab may prompt you to rename the file, as shown in the image below.
IMPORTANT: When you are finished working with your session, select Publish to Flywheel to allow future collaboration on the Jupyter home directory.
The notebook you created will now be listed on the far left-hand side of the screen, as shown in the image below. Your JupyterLab session can now be closed.
Back on the workspaces list, you can click "Stop" to shut down the workspace and free up the compute and storage resources.
Each Flywheel project can have one or more workspaces. Workspaces have file storage and allow creation and usage of JupyterLab servers. All users with access to the project and with the permission to Manage Workspaces can create, modify, start, or stop any of the workspaces. This approach allows a project team to collaborate on a set of workspaces containing Jupyter Notebooks.
Create a Workspace
Workspaces in Flywheel are used to launch JupyterLab servers and also maintain the home directories of JupyterHub servers. They support team collaboration on the workspace contents by providing project team members with appropriate access to any workspace. Team members can edit notebooks and other files and then publish those to Flywheel to share work.
Workspaces show the status of the JupyterLab servers, indicating which are running and who last published the workspace.
Workspaces allow users to download a zip archive of the workspace contents. This file can be used to share the contents with other researchers.
Downloading and importing Workspaces
To download and import a workspace, please follow the workflow instructions below:
- Download the selected workspace zip file to your local computer.
- Go to another project, open the Workspaces tab, and create a new workspace.
- After creating the new workspace, launch JupyterLab.
- Upload the zip file to JupyterLab.
- In the JupyterLab Terminal, use the Linux command unzip to restore the files in the new workspace.
- Publish to Flywheel to save the workspace into Flywheel, where it can then be loaded again.
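The unzip step above can also be run from a notebook cell instead of the terminal. The following is a minimal sketch using Python's standard zipfile module; the helper name restore_workspace and the archive name workspace.zip are hypothetical, and the target is assumed to be the JupyterLab home directory.

```python
import zipfile
from pathlib import Path


def restore_workspace(archive_path, target_dir="."):
    """Extract an uploaded workspace archive into target_dir.

    Returns the list of member names found in the archive.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
        return archive.namelist()


# Hypothetical usage after uploading "workspace.zip" through the JupyterLab file browser:
# restore_workspace("workspace.zip", Path.home())
```

After extracting, Publish to Flywheel as described in the last step so the restored files are saved to the new workspace.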
Starting JupyterLab from a Workspace
Selecting the Workspace to Launch JupyterLab in
Go to the Flywheel project and select a workspace and launch JupyterLab. A JupyterLab instance should start in a separate tab after some startup time. The file contents of the Workspace should appear in the JupyterLab instance.
If you close the browser window for JupyterLab, you can reopen it. To do so, go to the workspace associated with the server and launch JupyterLab. An instance should start again with the workspace loaded.
Select the Size/Type of Server
Users can select a specific size/type of Server to create when they launch JupyterLab. The specifications of the server are provided in terms of Central Processing Unit (CPU) count, Random-Access Memory (RAM) size, and whether it has a Graphics Processing Unit (GPU). Options will vary depending on the cloud provider and how the site is configured (contact Flywheel to learn more). See image below for specifications of the four example compute resources.
While the server starts, a progress indicator is shown on the screen. Sites are configured to keep a certain number of warm servers ready to use; the default is 1. If a warm server is available, the user gets a server almost immediately. If none is available, a new server is requested, and provisioning may take two to ten minutes, depending on the size of the server.
Servers with GPUs are never kept warm because of cost, so they may take longer to provision. The amount of time depends on the cloud provider's provisioning time.
When the server is starting, it can’t be stopped. Once it is running, it can be stopped. The status column will show the status of the server.
- Stopped - no server is running. A server can be started.
- Is starting… no action is possible.
- Is running… a server is running and the Open button will appear.
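The status rules above can be summarized as a small lookup table. This is only an illustrative sketch: the status keys and action names are labels chosen for this example, not values from a Flywheel API.

```python
# Actions the Workspaces list permits for each server status, as described above.
# Keys and action names are illustrative labels, not Flywheel API values.
ALLOWED_ACTIONS = {
    "stopped": {"start"},
    "starting": set(),
    "running": {"open", "stop"},
}


def can_perform(status, action):
    """Return True if the action is available for a server in the given status."""
    return action in ALLOWED_ACTIONS.get(status, set())
```

For example, can_perform("starting", "stop") is False, matching the rule that a server cannot be stopped while it is starting.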
Once a workspace has a running server, click the "Open" button to launch JupyterLab in a separate browser window. If any files were published to the workspace from a past JupyterLab server session, they will appear in the JupyterLab instance.
Stopping a Workspace Server
Stopping a workspace stops the server and deletes the server storage. The workspace storage remains, so you can start a new server and continue where you left off. Remember to use the JupyterLab File menu item "Publish to Flywheel" before stopping a server to ensure your work is saved to the workspace storage.
Before stopping another user's workspace, the best practice is to contact that user so that they do not lose their work.
Deleting a Workspace
Users with appropriate permissions can delete a Workspace and Server to avoid wasting compute resources.
To do this, follow the steps below:
- Go to the list of Workspaces on the far left-hand side of the screen and select the workspace you want to delete.
- Stop the workspace if it is running and then delete it. This will remove the workspace from the list. Deleting a workspace also deletes the workspace storage. Only users with the 'Delete Workspaces' permission can delete workspaces.
JupyterLab is an open source tool, and many useful resources are available online, including the official JupyterLab documentation and introductory material on getting started with Notebooks. Currently Flywheel uses version 3.6.3 of JupyterLab. Flywheel's JupyterLab kernel contains the Jupyter/scipy-notebook image along with some additional packages, such as the current version of the Flywheel SDK, to streamline access to Flywheel.
When a user logs out of Flywheel, they are also logged out of the Notebook. Using the log out option in JupyterLab logs you out of JupyterLab, but not Flywheel. The notebook server keeps running and is not stopped by this action.
Note: There may be some delay in this action due to the inherent design of the Jupyter applications.
Using Workspaces within JupyterLab
The files and folders stored in the workspace archive are automatically uploaded into the JupyterLab server instance. This allows users to pick up where they left off using JupyterLab. This also allows a different user to start a JupyterLab instance in a workspace using another user’s files and collaborate on notebooks and other work. If there are no files/folders in the workspace it is because it was newly created, so the user will see an empty folder named "Data".
Any files put in the Data folder are assumed to be Flywheel data or temporary files. Importantly, files in the "Data" folder are not brought back to the workspace when the user publishes the workspace to Flywheel.
Publishing Workspace Back to Flywheel
Publishing a workspace back to Flywheel allows you to pick up where you left off and allows other users to collaborate with you on its contents. Before stopping the JupyterLab server, users need to publish their work back to the Flywheel workspace in order to preserve their changes.
> Under the JupyterLab File menu there is a menu item called Publish to Flywheel to save your work to Flywheel.
This will first perform a local save-all so that the user's latest edits are saved into the local files. This will also publish all the folders and files into the Flywheel workspace except the folder named Data.
Anything in the Data folder will be excluded. Users can use the Data folder for holding data files on a temporary basis. Since these data files are also in the Flywheel database, it may not make sense to also keep copies in the workspace. If users want to intentionally put data files back into the workspace, then they can create another folder with a different name and it will be published back to Flywheel. Empty folders and hidden files/folders with names starting with a . will also be published to the workspace.
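To check what a publish would include before running it, the rules above can be approximated locally. The sketch below is not Flywheel's implementation; it simply walks a directory and applies the stated rules, assuming that only a top-level folder named exactly Data is excluded. The function name preview_publish is hypothetical.

```python
import os

EXCLUDED_TOP_LEVEL = "Data"  # per the text, this folder is never published


def preview_publish(home_dir):
    """List relative paths that the rules above suggest would be published.

    Assumption: only a top-level folder named exactly "Data" is excluded.
    Hidden files and empty folders are included, as the text describes.
    """
    published = []
    for root, dirs, files in os.walk(home_dir):
        rel_root = os.path.relpath(root, home_dir)
        if rel_root == ".":
            # Prune the top-level Data folder so os.walk never descends into it.
            dirs[:] = [d for d in dirs if d != EXCLUDED_TOP_LEVEL]
        for name in dirs + files:
            rel_path = os.path.normpath(os.path.join(rel_root, name))
            published.append(rel_path.replace(os.sep, "/"))
    return sorted(published)
```

Running preview_publish on the JupyterLab home directory before selecting Publish to Flywheel gives a quick way to confirm that nothing important sits only in the Data folder.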
The following are the current relevant permissions:
- Users with Workspace Server Edit permission are able to edit/create/start/stop workspace servers. The list of Workspaces is not visible if you do not have this permission.
- Users with Workspace Delete permission are able to delete Workspaces.
- Users with the default Admin Role have the following two permissions for this feature:
  - Manage Workspaces - Create/Modify/Start/Stop Workspaces
  - Delete Workspaces
Using the Python SDK and CLI within a Flywheel Notebook
The Flywheel Notebook has a few convenience features when working with the Python Software Development Kit (SDK) and Command Line Interface (CLI). A compatible version of the Flywheel Python SDK and CLI is automatically installed into the JupyterLab server for users to use.
Using the CLI in JupyterLab
- In JupyterLab open the terminal from the Launcher page.
Use the following command to log in to the Flywheel instance with your Application Programming Interface (API) key:
fw login $FW_HOSTNAME":"$FW_WS_API_KEY
This should return -
"You are now logged in as: (Username)!"
- You can now use the CLI as usual
- Since the CLI is running on the JupyterLab server, it is in the same cloud account as the Flywheel application so this maximizes performance.
- Using this approach for CLI operations vs your local machine means your local machine is not tied up for long operations. Users do not have to maintain the CLI on their local machine.
- Flywheel maintains the compatibility of the CLI version and the Flywheel application version so that users do not have to.
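The same login can be scripted from a notebook cell. This is a minimal sketch that assumes the FW_HOSTNAME and FW_WS_API_KEY environment variables used in the command above are set in the JupyterLab server and that the fw executable is on the PATH; the helper names are hypothetical.

```python
import os
import subprocess


def build_login_command(env=None):
    """Build the `fw login` argument list from the workspace environment variables."""
    env = os.environ if env is None else env
    host = env["FW_HOSTNAME"]
    api_key = env["FW_WS_API_KEY"]
    # The CLI takes the hostname and key joined by a colon, as in the command above.
    return ["fw", "login", f"{host}:{api_key}"]


def login():
    """Run `fw login`; raises CalledProcessError if the CLI exits with an error."""
    return subprocess.run(build_login_command(), check=True)
```

Passing the arguments as a list avoids shell quoting issues and keeps the API key out of any shell history.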
Using the Flywheel Python SDK in Notebooks
When users are in a Notebook they can use the Flywheel SDK client without having to specify the API key. Also, Git and Conda tools are installed in the server to allow working with source code repositories and installing Conda compatible packages. There are also convenience features for accessing and using Flywheel Exports as a data source for the notebook.
In the notebooks, the Flywheel SDK is automatically imported and the variable fw is already initialized with user credentials, so they will not need to be provided.
Imported Flywheel SDK and Preconfigured Flywheel Client
This will return your user id.
Also in the notebook, the Flywheel project is already initialized. It can be referenced as either fw.workspace_project or fw_project.
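A short sketch of how the preconfigured names might be used together. It assumes that fw is a Flywheel SDK client exposing get_current_user() and that the project object has a label attribute; the helper describe_session is hypothetical. Outside a Flywheel Notebook the names fw and fw_project are not defined, so the guard skips the call.

```python
def describe_session(client, project):
    """Summarize who the client is authenticated as and which project is active."""
    user = client.get_current_user()
    return {"user_id": user.id, "project_label": project.label}


# Inside a Flywheel Notebook, fw and fw_project already exist, so this runs as-is.
# Elsewhere they are undefined and the guard skips the call.
if "fw" in globals() and "fw_project" in globals():
    print(describe_session(fw, fw_project))
```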
Using Open Source Tools Git and Conda From the JupyterLab Terminal
- Git is installed in the JupyterLab server
- Git is available from the JupyterLab terminal using the command line: $ git
- Conda is installed in the JupyterLab server
- Conda is available from the JupyterLab terminal using the command line: $ conda
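To confirm from a notebook cell that both tools are available, one option is to look them up on the PATH with Python's standard shutil module. The helper name locate_tools is hypothetical.

```python
import shutil


def locate_tools(names=("git", "conda")):
    """Map each tool name to its executable path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


print(locate_tools())
```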
Using Flywheel Exports with a Flywheel Notebook
Flywheel's Project Exports can also be used to make data in Flywheel accessible to the notebook via cloud storage. This requires setting up an external storage provider at the group or project level. In the notebook there is a simplified way to access the cloud storage using the Flywheel Storage Client, which is already installed in the notebook Server.
The high level steps are as follows:
- Define an External Storage at the group or project level for the project(s) you wish to export.
It must be at the group or project level, not the site level. If more than one is selected, the first one is used.
- Run an Export in the project to the external storage with the desired data following the instructions for Project Exports.
- Access the External storage using the Flywheel storage client in the notebook. Access keys and authentication will be managed by Flywheel behind the scenes. See details in the section below.
Using the Flywheel Storage Client in the Notebook
The Flywheel storage client technical documentation can be found in the fw-storage repository on GitLab (flywheel-io / tools / lib / fw-storage). In the Notebook, the Flywheel storage client is already initialized as fs and has access to the defined Storage Destination.
- Example usage to write a local file to the storage bucket
- Example usage to read a file from the storage bucket
file = fs.get("test/image.jpg")
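A write followed by a read might look like the sketch below. It rests on assumptions: that the storage client offers a set(path, file) method accepting a local file path, and that get(path) returns a readable binary file object usable as a context manager. Check the fw-storage documentation before relying on either. The helper name and the file names are hypothetical.

```python
def upload_and_read_back(storage, local_path, remote_path):
    """Write a local file to the bucket, then read it back as bytes."""
    # Assumption: set(path, file) accepts a local file path as the source.
    storage.set(remote_path, local_path)
    # Assumption: get(path) returns a readable binary file object.
    with storage.get(remote_path) as remote_file:
        return remote_file.read()


# Hypothetical usage inside a Flywheel Notebook, where fs is preinitialized:
# data = upload_and_read_back(fs, "results.csv", "test/results.csv")
```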
Some notebooks that show basic examples of the above features can be found at: gitlab:flywheel-notebooks-getting-started.
Flywheel Notebooks are supported on the following providers:
- Google Cloud Platform
- Amazon Web Services
- Microsoft Azure
Sites have the flexibility to configure, at install time, the 4 types of Jupyter server instances shown to the user so that they can leverage different cloud vendors' Virtual Machine (VM) families more effectively.
- Supports 4 levels of server (General Purpose, Compute, Compute-GPU, Larger-Compute-GPU)
- Specs are dependent on cloud provider and account
- Single GPU
- Single Jupyter ‘kernel’ - jupyter-scipy-notebook
- Kubernetes cluster separate from Flywheel Enterprise - no resource overlap
- One warm cloud VM (w/o GPU) is ready for users to reduce wait time (site configurable)
- Wait time for VM w/ GPU and additional servers is cloud dependent (~10min)
- Idle JupyterLab workspaces are shut down after 2 hrs (site configurable)
This section provides JupyterHub terminology.
|Jupyter Hub (Hub)||
|Jupyter Lab Server Instance (Server)||
|Jupyter Notebook (Notebook)||