How to Run a Bulk Import
Instructions
If you plan to use any non-default mapping rules, you will need the new (BETA) CLI to start the import.
The options for specifying user-defined mappings are extensive and are described in more detail in the new (BETA) CLI documentation for the import run command.
Tip
If the source data is stored on your local machine, the easiest and quickest option to get the data into Flywheel is to use the new uploader in the Flywheel Core Web App to upload the data.
The option for uploading data from your local machine is not currently available via the new CLI.
Refer to the documentation on How to import data using the web app for more information.
0. Prerequisite -- CLI Installation
If the new (BETA) CLI is not already installed, follow the new (BETA) CLI installation instructions to install it.
An API key will be needed to sign in to the new (BETA) CLI and run commands. This can be the exact same API key used for the Legacy CLI.
If a new API key is needed, follow the documentation for creating a user API key.
1. Prepare source data
The first step is to decide what information will be used to determine how the source data will be mapped to the Flywheel hierarchy.
- If your data consists of only DICOM files, the easiest option is to use the default DICOM header-based mapping rules and let Flywheel do the rest of the work. In this case:
  - Flywheel will derive the destination container labels from the DICOM headers,
  - Each source folder must contain exactly 1 DICOM Series, and
  - Each DICOM Series must be fully contained in exactly 1 source folder.
- If your data consists of files other than DICOM, then you must carefully organize your source dataset into a folder structure that matches the desired Flywheel Hierarchy. In this case:
  - Flywheel will derive the destination container labels from the source file paths (folder names).
Refer to the documentation on Mapping to the Flywheel Hierarchy for more details on the various options for preparing your source data for mapping to the Flywheel Hierarchy.
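The two folder rules for the default DICOM mapping can be checked before starting an import. The following is a minimal sketch, not part of Flywheel: the function name `check_series_layout` and its input format are assumptions made for illustration. It expects a dictionary that maps each file path to that file's series identifier (for example, a SeriesInstanceUID that you have already read from the DICOM headers with a tool of your choice).

```python
from collections import defaultdict
from pathlib import PurePosixPath


def check_series_layout(series_by_file):
    """Check the two folder rules for default DICOM header-based mapping.

    series_by_file: dict mapping a file path (str, "/"-separated) to the
    series identifier of that file. Reading the identifier from the DICOM
    header is assumed to happen elsewhere.

    Returns a list of problem descriptions; an empty list means both
    rules hold for the files that were supplied.
    """
    series_in_folder = defaultdict(set)   # folder -> series found in it
    folders_of_series = defaultdict(set)  # series -> folders it appears in

    for path, series_uid in series_by_file.items():
        folder = str(PurePosixPath(path).parent)
        series_in_folder[folder].add(series_uid)
        folders_of_series[series_uid].add(folder)

    problems = []
    # Rule 1: a folder that holds DICOM files must hold exactly one series.
    for folder, series in sorted(series_in_folder.items()):
        if len(series) > 1:
            problems.append(f"folder {folder} contains {len(series)} series")
    # Rule 2: a series must not be spread over several folders.
    for series_uid, folders in sorted(folders_of_series.items()):
        if len(folders) > 1:
            problems.append(
                f"series {series_uid} is split across {len(folders)} folders"
            )
    return problems


if __name__ == "__main__":
    example = {
        "study/t1/001.dcm": "UID-A",
        "study/t1/002.dcm": "UID-B",  # second series in the same folder
        "study/t2/001.dcm": "UID-B",  # same series in another folder
    }
    for problem in check_series_layout(example):
        print(problem)
```

Folders that contain no DICOM files (such as parent folders) never appear in the input, so the sketch does not flag them.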
2. Register external storage
Before an Import can be started, Flywheel must first be configured with information about where to find the source data and how to access it. This is done by creating a new "External Storage" within Flywheel for the bucket location where the source data is stored.
Follow the documentation explaining how to create a new "External Storage" registration in Flywheel.
3. Start the import
- Log in using the new (BETA) CLI by running the following command:
- The CLI will then prompt for the API key. Enter your API key when prompted.
- Locate the ID of the External Storage containing the source data by running the following command:
- The CLI will then print out a list of the most recently created storages. If your storage is not listed, it may be older and on a later page.
- To view more storages, take the ID of the last storage in the list, then run the following command:
Where <id> is replaced with the ID of the last storage in the list. This will display the next set of storages.
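Paging by "the ID of the last storage in the list" is a form of cursor-based pagination. The sketch below illustrates the idea on an in-memory list only. It does not call Flywheel or the CLI, and the names `list_page` and `find_storage`, the `label` field, and the page size are all hypothetical.

```python
def list_page(storages, after_id=None, limit=3):
    """Return up to `limit` storages that come after the one with `after_id`.

    storages: list of dicts with "id" and "label" keys, already in the
    order the listing would show them. With after_id=None the first page
    is returned.
    """
    start = 0
    if after_id is not None:
        ids = [s["id"] for s in storages]
        start = ids.index(after_id) + 1
    return storages[start:start + limit]


def find_storage(storages, label, limit=3):
    """Walk the pages one by one until a storage with `label` is found."""
    after_id = None
    while True:
        page = list_page(storages, after_id, limit)
        if not page:
            return None  # ran out of pages without a match
        for storage in page:
            if storage["label"] == label:
                return storage["id"]
        # The last ID on this page becomes the cursor for the next page.
        after_id = page[-1]["id"]


if __name__ == "__main__":
    data = [{"id": f"id{i}", "label": f"bucket-{i}"} for i in range(7)]
    print(find_storage(data, "bucket-6"))
```

Each request only needs the last ID that was shown, which is why the step above asks you to copy it from the end of the previous list.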
Tip
There are also other options for filtering, sorting, and changing the length of the list to help locate a particular External Storage. These options can be listed using the following command:
- Once you have the ID of the External Storage containing the source data, start the import using the following command:
Where:
- <project> is replaced with the Flywheel Hierarchy of the destination project (e.g., fw://demo/Alzheimers)
  - Make sure to wrap this value in double-quotation marks if it contains any spaces (e.g., "fw://flywheel/Brain Tumor Progression")
- <storage> is replaced with the ID of the External Storage containing the source data
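The reason for the double-quotation marks is that a shell splits a command line on spaces, so an unquoted project path with spaces arrives as several separate arguments. The sketch below uses Python's standard `shlex` module, which follows POSIX shell splitting rules, to show the difference. `some-command` is a placeholder and not a real CLI invocation.

```python
import shlex

PROJECT = "fw://flywheel/Brain Tumor Progression"


def split_like_shell(command_line):
    """Split a command line into arguments the way a POSIX shell would."""
    return shlex.split(command_line)


# Without quotes, the path is broken apart at each space.
unquoted = split_like_shell("some-command " + PROJECT)

# With double quotes, the whole path stays one argument.
quoted = split_like_shell('some-command "' + PROJECT + '"')

if __name__ == "__main__":
    print(unquoted)
    print(quoted)
```

In the unquoted case the program receives three path fragments instead of one project path, so the destination cannot be resolved as intended.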
- Once the import begins, the progress of the import will be displayed in the CLI output.
  - You can exit out of this progress display by pressing Ctrl+c or by simply closing the terminal window altogether. This has no effect on the import job itself -- the import will still continue to run.