Using the fw ingest dicom Command

Introduction

The ingest dicom command reads through directories of DICOM files and uses the DICOM header information to group them into the correct Acquisitions and Sessions in Flywheel. The ingest dicom command requires Flywheel CLI version 11.0 or later.

This article explains the optional arguments for the ingest dicom command. See this article for directions on how to run the ingest dicom command.

Usage

fw ingest dicom [path to dicom data] [group id] [project label] [optional arguments]

Instruction Steps

Required Arguments

Required argument Description
SRC The path to the DICOM data you wish to import
GROUP_ID Enter an existing Flywheel group id or create a new group id
PROJECT_LABEL Enter an existing Flywheel project label or create a new project

By default, Flywheel uses the following DICOM tags to label data. If your scans are missing these tags, the --subject and --session flags are also required to upload your data successfully.

Required DICOM tag Flywheel label CLI flag to use if missing
PatientID Subject label --subject <subjectLabel>
StudyDescription Session label --session <sessionLabel>

If your scans do have values for the PatientID and StudyDescription fields, you do not need to use these flags. However, you can use them to change the default labels for the subjects and sessions.

fw ingest dicom ~/Documents/MyData/Study001 mygroup "A project" --subject 01 --session 01

Optional Arguments

If you are using multiple optional arguments in your command, consider creating a config file. See the article for more information on creating and using a config file.

Optional arguments Description
-h, --help show this help message and exit
--compression-level The compression level to use for packfiles (default: -1). Use 0 to store files without compression. A higher number means more compression.
--copy-duplicates Upload duplicate files to a new project when using --detect-duplicates

The project created for the duplicates will be named [target project label] _TimestampOfIngest. The duplicates project will have the same permissions as the original destination project. No additional project artifacts need to be copied (gear rules, data views, description, etc).
--de-identify De-identify DICOM files, e-files and p-files prior to upload. Applies custom de-identification when used with a deid-profile set in a configuration file. By default, it will apply the minimal de-identification to DICOM files:

* Remove Patient ID, Patient Name, and Patient Birthdate
* Convert Patient Age to months
--deid-profile NAME Use the de-identification profile with this name. Use this flag if you have multiple de-id profiles in a single config.
--detect-duplicates Flywheel scans both the source dataset and any data already in the Flywheel project. Flywheel looks for the following:

* File path conflict in the source dataset: a file path in the upload dataset is not unique
* File path conflict in Flywheel: the file already exists
* Duplicate StudyInstanceUID in the source dataset
* Duplicate StudyInstanceUID in Flywheel: the UID already exists
* Duplicate SeriesInstanceUID in the source dataset
* Duplicate SeriesInstanceUID in Flywheel: the UID already exists
* Duplicate SOPInstanceUID in a series
--detect-duplicates-project DETECT_DUPLICATES_PROJECT Specify one or multiple project paths to use for detecting duplicates
--enable-project-files Enable file uploads to a project container
--encodings ENCODINGS Set character encoding aliases. E.g. win_1251=cp1251
--exclude PATTERN Patterns of filenames to exclude. For example:

* Exclude a single file:
--exclude="ReadMe.md"
* Exclude all files of a specific filetype
--exclude="*.md"
--exclude-dirs PATTERN Patterns of directories to exclude (default: [])

For example, --exclude-dirs="Sub-01" excludes all files and folders within the Sub-01 folder. This means the following directories would not be uploaded:
* Sub-01/Sess01
* Sub-01/Sess2/Acq02
--force-scan Try to parse all files as DICOM files regardless of the DICM prefix. (default: False)
You may also want to use --include "*" with this flag.
-g ID, --group ID The id of the group if not in the folder structure
--group-override ID Force using this group id
--ignore-unknown-tags Ignore unknown DICOM tags when parsing DICOM files (default: False)
By default, ingest can fail if a tag is not private, has an implicit VR, and is not in the pydicom dictionary, which Flywheel uses to determine known tags. Use this option to ignore such tags.
--include PATTERN Patterns of filenames to include (default: [])

For example:
* Include a single file:
--include="participants.csv"
* Include all files of a filetype:
--include="*.dcm"
--include-dirs PATTERN Patterns of directories to include (default: [])

For example:
--include-dirs="OHM/101-10*"
The wildcard means that this would include directories like:
* OHM/101-105
* OHM/101-106
* OHM/101-1011
Note: When an S3 bucket is configured as the source, this flag does not support wildcard matching, only “starts with.”
--load-subject PATH Load subjects from the specified file
--no-audit-log Skip uploading audit log to the target projects
--require-project Proceed with the ingest process only if the resolved group and project exist (default: False)

By default, Flywheel creates a new group or project if an existing group ID or project label does not match what you designated in your command or template. This means that if you mistype the group ID or project label, your data will be uploaded to the wrong location. Use this flag to make sure your data is only uploaded to existing groups and projects.
--session LABEL Override value for the session label
--skip-existing Skip the import of files whose filenames match files already in your project. (default: False)
--subject LABEL Override value for the subject label
--symlinks Follow symbolic links that resolve to directories (default: False)
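
Because a mistyped group id or project label can send data to the wrong place, it can help to review a long command before running it. The following is a minimal sketch, not an official Flywheel workflow: it assembles a command from flags described in the table above and prints it instead of running it. The source path, group id, and project label are placeholders taken from the earlier example.

```shell
#!/bin/sh
# Sketch: assemble an ingest command and print it for review instead of running it.
# SRC, GROUP_ID, and PROJECT_LABEL are placeholder values; substitute your own.
SRC="$HOME/Documents/MyData/Study001"
GROUP_ID="mygroup"
PROJECT_LABEL="A project"

# Store the full command in the positional parameters so that quoting is preserved.
set -- fw ingest dicom "$SRC" "$GROUP_ID" "$PROJECT_LABEL" \
    --require-project \
    --detect-duplicates \
    --copy-duplicates \
    --exclude="*.md" \
    --exclude-dirs="Sub-01"

# Print one argument per line so that quoting problems are easy to spot.
printf '%s\n' "$@"

# To run the command for real, replace the printf line with: "$@"
```

Printing one argument per line makes it clear that "A project" is passed as a single argument and that the *.md pattern reaches the command unexpanded.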

General

Optional argument Description
-C PATH, --config-file Specify configuration options via config file. Learn more about how to create this file.
--no-config Do NOT load the default configuration file
-y, --yes Assume the answer is yes to all prompts
--ca-certs CA_CERTS The file to use for SSL certificate validation
--timezone TIMEZONE Set the effective local timezone for imports
-q, --quiet Squelch log messages to the console
-d, --debug Turn on debug logging
-v, --verbose Get more detailed output

Reporter

These config options are only available when using cluster mode with the --follow argument or when using a local worker.

Optional argument Description
--save-audit-log PATH Save audit log to the specified path on the current machine (default: None)
--save-deid-log PATH Save de-id log to the specified path on the current machine (default: None)
--save-subjects PATH Save subjects to the specified file (default: None)

Cluster Config

These config options apply when using a cluster to ingest data.

Optional argument Description
--cluster CLUSTER Ingest cluster url (default: None)
-f, --follow Follow the progress of the ingest (default: False)

Worker Config

These config options are only available when using a local worker (--cluster is not defined).

Optional argument Description
--jobs JOBS The number of concurrent jobs to run (e.g. scan jobs), ignored when using cluster (default: 4)
--sleep-time SECONDS Number of seconds to wait before trying to get a task (default: 1)
--max-tempfile MAX_TEMPFILE The max in-memory tempfile size, in MB, or 0 to always use disk (default: 50)
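
The Reporter, Cluster Config, and Worker Config sections describe which flags apply in which mode. The sketch below illustrates one way to choose between them in a script; it is not part of the Flywheel CLI. The function name build_ingest_args, the cluster URL, the job count, and the audit log path are all hypothetical.

```shell
#!/bin/sh
# Sketch: choose cluster or local-worker flags for an ingest run.
# The function name, URL, job count, and log path are hypothetical placeholders.

build_ingest_args() {
    # $1 = cluster URL; an empty string means "use the local worker".
    cluster_url=$1
    if [ -n "$cluster_url" ]; then
        # Cluster mode: reporter options need --follow to be available.
        printf '%s\n' --cluster "$cluster_url" --follow
    else
        # Local worker: --jobs is ignored when a cluster is used.
        printf '%s\n' --jobs 8
    fi
    # Reporter option, available in both branches above.
    printf '%s\n' --save-audit-log ./audit_log.csv
}

# Print the flags for each mode, one per line.
build_ingest_args ""
build_ingest_args "https://cluster.example.org"
```

The printed flags can then be appended to a fw ingest dicom command after the source path, group id, and project label.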