Ingest a Structured Folder of Imaging Data using the CLI
Use the ingest folder command to import data that you previously downloaded from Flywheel or that was shared from another Flywheel site.
The folder import command uses the folder structure to map your data to the Flywheel hierarchy. Learn how to run the ingest folder command.
Below is the basic Flywheel hierarchy:
Usage
fw ingest folder [optional arguments] [SRC]
Required Arguments
Required Argument | Description |
---|---|
SRC | The parent folder for the directory you want to import |
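As a hedged illustration of how SRC relates to the hierarchy, the sketch below builds a throwaway folder tree and prints the matching command. The layout (group, project, subject, session, acquisition, files) is an assumption based on the standard Flywheel hierarchy, and every label such as `neuro_lab` or `StudyA` is hypothetical. The command is printed rather than executed, so the script runs without the `fw` CLI installed.

```shell
#!/bin/sh
set -eu

# Throwaway root folder; this is what would be passed as SRC.
SRC="$(mktemp -d)"

# Assumed layout: SRC/<group>/<project>/<subject>/<session>/<acquisition>/<files>
# All labels below are hypothetical.
ACQ="$SRC/neuro_lab/StudyA/sub-01/ses-01/T1w"
mkdir -p "$ACQ/dicom"
touch "$ACQ/dicom/0001.dcm" "$ACQ/dicom/0002.dcm"

# List the files that were created, relative to SRC.
(cd "$SRC" && find . -type f | sort)

# Print, rather than run, the ingest command for this tree.
INGEST_CMD="fw ingest folder $SRC"
echo "$INGEST_CMD"
```

Because the group and project appear in the folder names here, the sketch passes no `-g` or `-p` flags; those are only needed when the levels are missing from the folder structure.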
Optional Arguments
Tip
If you are using multiple optional arguments in your command, consider creating a config file. See our article for more information on creating and using a config file.
Ingest Folder
Optional Argument | Description |
---|---|
--dicom NAME | The name of DICOM subfolders to be zipped prior to upload (default: dicom ). |
--force-scan | Try to parse all files as DICOM files regardless of the DICM prefix (default: false ). Consider using --include "*" . |
--group-override ID | Force using this group ID. |
--no-sessions | No session level (create a session for every subject) (default: false ). Use this flag if you have subject data but no data from which to create sessions in Flywheel. Flywheel uses the corresponding subject label to create the session label. |
--no-subjects | No subject level (create a subject for every session) (default: false ). Use this flag if you have session data but no data from which to create subjects in Flywheel. Flywheel uses the corresponding session label to create the subject label. |
--pack-acquisitions TYPE | Acquisition folders only contain acquisitions of TYPE and are zipped prior to upload. |
-p LABEL , --project LABEL | The label of the project, if not in folder structure. |
--project-override LABEL | Force using this project label. |
--root-dirs ROOT_DIRS | The number of directories to discard before matching (default: 0). |
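The effect of --root-dirs can be pictured with a short sketch. This is only a model of the description above, not the CLI's actual implementation: it assumes the option drops that many leading directories from each path under SRC before the remaining levels are mapped to the hierarchy. The path and labels are hypothetical.

```shell
#!/bin/sh
set -eu

# Hypothetical path of one file, relative to SRC. The first two
# directories (export/2024) are extra levels above the group folder.
REL_PATH="export/2024/neuro_lab/StudyA/sub-01/ses-01/T1w/0001.dcm"
ROOT_DIRS=2

# Drop the first ROOT_DIRS path components, mimicking --root-dirs 2.
MATCHED="$(printf '%s\n' "$REL_PATH" | cut -d/ -f"$((ROOT_DIRS + 1))"-)"
echo "$MATCHED"
```

Under this assumption, the remaining path starts at the group folder (`neuro_lab/StudyA/...`), which is what the folder mapping would then see.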
Ingest General
Optional Argument | Description |
---|---|
--compression-level | The compression level to use for packfiles (default: -1). Use 0 to store files without compression. A higher number means more compression. |
--de-identify | De-identify DICOM files, e-files, and p-files prior to upload. Applies custom de-identification when used with a deid-profile set in a configuration file. By default, it applies minimal de-identification to DICOM files: * Remove PatientID , PatientName , and PatientBirthDate * Convert PatientAge to months |
--deid-profile NAME | Use the de-identification profile with this name. Use this flag if you have multiple de-ID profiles in a single config. |
--detect-duplicates | Flywheel scans both the source dataset and any data already in the Flywheel project. Flywheel looks for the following: * File path conflict in the source dataset (file path in the upload dataset is not unique) * File path conflict in Flywheel (file already exists) * Duplicate StudyInstanceUID in the source dataset * Duplicate StudyInstanceUID in Flywheel (UID already exists) * Duplicate SeriesInstanceUID in the source dataset * Duplicate SeriesInstanceUID in Flywheel (UID already exists) * Duplicate SOPInstanceUID in series |
--detect-duplicates-project DETECT_DUPLICATES_PROJECT | Specify one or multiple project paths to use for detecting duplicates. |
--copy-duplicates | Upload duplicate files to a new project when using --detect-duplicates . The project created for the duplicates is named [target project label]_[TimestampOfIngest] and has the same permissions as the original destination project. No other project artifacts (gear rules, data views, description, etc.) are copied. |
--enable-project-files | Enable file uploads to a project container. |
--encodings ENCODINGS | Set character encoding aliases. For example: win_1251=cp1251 . |
--exclude PATTERN | Patterns of filenames to exclude (default: none). For example: * Exclude a single file: --exclude="ReadMe.md" * Exclude all files of a specific filetype: --exclude="*.md" |
--exclude-dirs PATTERN | Patterns of directories to exclude (default: none). For example: --exclude-dirs="Sub-01" excludes all files and folders within the Sub-01 folder. This means the following directories would not be uploaded: * Sub-01/Sess01 * Sub-01/Sess2/Acq02 |
-g ID , --group ID | The ID of the group if not in the folder structure. |
--ignore-unknown-tags | Ignore unknown DICOM tags when parsing DICOM files (default: false ). |
--include PATTERN [PATTERN ...] | Patterns of filenames to include (default: none). For example: * Include a single file: --include="participants.csv" * Include all files of a filetype: --include="*.dcm" |
--include-dirs PATTERN [PATTERN ...] | Patterns of directories to include (default: none). For example: --include-dirs="OHM/101-10*" . The wildcard means that this would include directories like: * OHM/101-105 * OHM/101-106 * OHM/101-1011 . Note: When an S3 bucket is configured as the source, this flag does not support wildcard matching, only "starts with" matching. |
--load-subject PATH | Load subjects from the specified file. |
--no-audit-log | Skip uploading audit log to the target projects. |
--require-project | Proceed with the ingest process only if the resolved group and project exist (default: false ). By default, Flywheel creates a new group or project if an existing group ID or project label does not match what you designated in your command or template. This means that if you mistype the group ID or project label, your data will be uploaded to the wrong location. Use this flag to make sure your data is only uploaded to existing groups and projects. |
--skip-existing | Looks at the filenames of files already in your project and skips the import of files with the same filename (default: false ). |
--symlinks | Follow symbolic links that resolve to directories (default: false ). |
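To preview which filenames a pattern such as `*.md` would select, the sketch below uses the shell's own glob matching. It assumes the CLI's matcher behaves like shell globbing for simple cases, which the documentation above does not confirm. The filenames and the `matches` helper are hypothetical.

```shell
#!/bin/sh
set -eu

# Return 0 when the name ($2) matches the glob pattern ($1).
# Assumption: simple patterns behave the same way in the CLI.
matches() {
  case "$2" in
    $1) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical filenames checked against the pattern from --exclude="*.md".
KEPT=""
for name in ReadMe.md notes.md scan_001.dcm participants.csv; do
  if matches "*.md" "$name"; then
    echo "excluded: $name"
  else
    echo "kept:     $name"
    KEPT="$KEPT $name"
  fi
done
```

Quote patterns on the command line, as in `--exclude="*.md"`, so that the shell passes the pattern through unchanged instead of expanding it against files in the current directory.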
De-identifying Data
If you are using project, group, or site de-ID profiles to de-identify data, verify that your DICOMs are not compressed in a zip file, because the ingest folder command cannot de-identify zipped DICOMs. Use the --dicom flag to name the folder whose images should be zipped before upload.
For example, let's say you have the following file structure:
The command for de-identifying those DICOMs using a de-ID profile looks like:
fw ingest folder -g groupid -p <ProjectName> --dicom dcm001 ~/Desktop/MyData
Reporting Config
Applicability
These config options are only available when using --cluster mode with the --follow argument or when using a local worker.
Optional Argument | Description |
---|---|
--save-audit-log PATH | Save audit log to the specified path on the current machine (default: none) |
--save-deid-log PATH | Save de-id log to the specified path on the current machine (default: none) |
--save-subjects PATH | Save subjects to the specified file (default: none) |
Cluster Config
Applicability
These config options apply when using a cluster to ingest data.
Optional Argument | Description |
---|---|
--cluster CLUSTER | Ingest cluster url (default: none) |
-f , --follow | Follow the progress of the ingest (default: false ) |
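As a rough sketch of how the cluster and reporting options combine, the script below assembles a command that follows a cluster ingest and saves the audit log locally. The cluster URL, source folder, and log path are hypothetical placeholders. The command is printed for review rather than executed.

```shell
#!/bin/sh
set -eu

# Hypothetical values; substitute your own cluster URL and paths.
CLUSTER_URL="https://ingest.example.com"
SRC="/data/MyData"
AUDIT_LOG="/tmp/ingest-audit.log"

# Load the arguments into the positional parameters so each stays one word.
set -- ingest folder --cluster "$CLUSTER_URL" --follow \
  --save-audit-log "$AUDIT_LOG" "$SRC"

# Print the command for review. To run it instead: fw "$@"
CMD="fw $*"
echo "$CMD"
```

The --follow flag is included because the reporting options are only available in cluster mode when the ingest is followed.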
Worker Config
Applicability
These config options are only available when using a local worker (--cluster is not defined).
Optional Argument | Description |
---|---|
--jobs JOBS | Number of concurrent jobs to run (e.g. scan jobs), ignored when using cluster (default: 4) |
--sleep-time SECONDS | Number of seconds to wait before trying to get a task (default: 1) |
--max-tempfile MAX_TEMPFILE | Maximum in-memory tempfile size, in MB, or 0 to always use disk (default: 50) |
General
Optional Argument | Description |
---|---|
-h , --help | Show help message and exit. |
-C PATH , --config-file PATH | Specify configuration options via config file. |
--no-config | Do NOT load the default configuration file. |
-y , --yes | Assume the answer is yes to all prompts. |
--ca-certs CA_CERTS | Path to a local Certificate Authority certificate bundle file. This option may be required when using a private Certificate Authority. |
--timezone TIMEZONE | Set the effective local timezone to use when uploading data. |
-q , --quiet | Squelch log messages to the console. |
-d , --debug | Turn on debug logging. |
-v , --verbose | Get more detailed output. |