Using the fw ingest template Command
Introduction
This article gives an overview of the options for the fw ingest template command. For a detailed explanation of how to create a template for uploading data, see the guide.
Usage
Instruction Steps
Required Arguments
Required argument | Description |
---|---|
TEMPLATE | Path to the template file |
SRC | Path to the folder containing data for ingest |
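As a minimal sketch of how the two required arguments fit together, the Python helper below assembles the command as an argument list and prints it without running it. The helper name `build_ingest_command` and the example paths are hypothetical, and placing optional flags after the positional arguments is an assumption made for illustration.

```python
import shlex


def build_ingest_command(template, src, *options):
    """Return the argument list for `fw ingest template`.

    TEMPLATE and SRC are the two required positional arguments.
    Optional flags are appended unchanged (placement is an assumption).
    """
    return ["fw", "ingest", "template", template, src, *options]


# Hypothetical paths, for illustration only.
cmd = build_ingest_command("template.yaml", "/data/study01")

# Print a shell-quoted version of the command instead of executing it.
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run` once the paths point at a real template file and source folder.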
Optional Arguments
Tip - If you are using multiple optional arguments in your command, consider creating a config file. See the article for more information on creating and using a config file.
Optional Arguments | Description |
---|---|
-h, --help | show this help message and exit |
--compression-level | The compression level to use for packfiles (default: -1). Use 0 to store files without compression. A higher number means more compression. |
--copy-duplicates | Upload duplicate files to a new project when using --detect-duplicates. The project created for the duplicates will be named [target project label]_TimestampOfIngest and will have the same permissions as the original destination project. No additional project artifacts need to be copied (gear rules, data views, description, etc). |
--de-identify | De-identify DICOM files, e-files, and p-files prior to upload. Applies custom de-identification when used with a deid-profile set in a configuration file. By default, it applies minimal de-identification to DICOM files: it removes Patient ID, Patient Name, and Patient Birthdate, and converts Patient Age to months. |
--deid-profile NAME | Use the de-identification profile with this name. Use this flag if you have multiple de-id profiles in a single config. |
--detect-duplicates | Flywheel scans both the source dataset and any data already in the Flywheel project, and looks for the following: file path conflict in the source dataset (the file path in the upload dataset is not unique); file path conflict in Flywheel (the file already exists); duplicate StudyInstanceUID in the source dataset; duplicate StudyInstanceUID in Flywheel (the UID already exists); duplicate SeriesInstanceUID in the source dataset; duplicate SeriesInstanceUID in Flywheel (the UID already exists); duplicate SOPInstanceUID in a series. |
--detect-duplicates-project DETECT_DUPLICATES_PROJECT | Specify one or multiple project paths to use for detecting duplicates |
--enable-project-files | Enable file uploads to a project container |
--encodings ENCODINGS | Set character encoding aliases. E.g. win_1251=cp1251 |
--exclude PATTERN | Patterns of filenames to exclude. For example, to exclude a single file: --exclude="ReadMe.md". To exclude all files of a specific filetype: --exclude="*.md". |
--exclude-dirs PATTERN | Patterns of directories to exclude (default: []). For example, --exclude-dirs="Sub-01" excludes all files and folders within the Sub-01 folder, so directories such as Sub-01/Sess01 and Sub-01/Sess2/Acq02 would not be uploaded. |
--force-scan | Try to parse all files as DICOM files regardless of the DICM prefix (default: False). You might want to use it together with --include "*". |
-g ID, --group ID | The id of the group if not in the folder structure |
--group-override ID | Force using this group id |
--ignore-unknown-tags | Ignore unknown DICOM tags when parsing DICOM files (default: False). By default, ingest can fail if a tag is not private, has implicit VR, and is not a known tag. These tags can be ignored with this CLI option. Flywheel uses the pydicom dictionary to determine known tags. |
--include PATTERN | Patterns of filenames to include (default: []). For example, to include a single file: --include="participants.csv". To include all files of a filetype: --include="*.dcm". |
--include-dirs PATTERN | Patterns of directories to include (default: []). For example: --include-dirs="OHM/101-10*". The regex wildcard means that this would include directories like OHM/101-105, OHM/101-106, and OHM/101-1011. Note: when an S3 bucket is configured as the source, this flag does not support regex wildcard matching, only "starts with." |
--load-subject PATH | Load subjects from the specified file |
--no-audit-log | Skip uploading audit log to the target projects |
--no-sessions | No session level (create a session for every subject) (default: False). Use this flag if you do not have data to create sessions in Flywheel, but you do have subject data. Flywheel uses the corresponding subject label to create the session label. |
--no-subjects | No subject level (create a subject for every session) (default: False). Use this flag if you do not have data to create subjects in Flywheel, but you do have session data. Flywheel uses the corresponding session label to create the subject label. |
-p LABEL, --project LABEL | The label of the project, if not in the folder structure |
--project-override LABEL | Force using this project label |
--require-project | Proceed with the ingest process only if the resolved group and project exist (default: False). By default, Flywheel creates a new group or project if an existing group ID or project label does not match what you designated in your command or template. This means that if you mistype the group ID or project label, your data will be uploaded to the wrong location. Use this flag to make sure your data is only uploaded to existing groups and projects. |
--set-var KEY=VALUE | Set arbitrary key-value pairs. This allows you to explicitly set simple string metadata in Flywheel, for example: session.label=Test01, session.uid=001, subject.firstname=Tracy. You can also use it to set custom info fields, like session.info.myfield=myvalue, and to set multiple fields at the same time: --set-var session.label=foo session.info.myinfo=foo. Currently --set-var can only handle simple string values, so you cannot use it to set non-simple values such as session.tags. |
--skip-existing | Looks at the filenames of files already in your project and skips the import of files with the same filename (default: False). |
--symlinks | Follow symbolic links that resolve to directories (default: False) |
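The restriction on --set-var (simple string values only) can be checked before a command is run. The sketch below is a hypothetical Python helper, not part of the Flywheel CLI: it turns a mapping of field names to strings into --set-var arguments and rejects anything that is not a plain string. Passing several pairs after a single flag follows the multiple-fields example in the table above.

```python
def set_var_args(values):
    """Turn a mapping of field names to simple strings into --set-var arguments."""
    if not values:
        return []
    pairs = []
    for key, value in values.items():
        # Only simple string values are supported, so reject lists
        # (for example session.tags), numbers, and nested mappings.
        if not isinstance(value, str):
            raise TypeError(f"{key}: --set-var accepts simple string values only")
        pairs.append(f"{key}={value}")
    return ["--set-var", *pairs]


args = set_var_args({"session.label": "Test01", "session.info.myfield": "myvalue"})
print(args)
# ['--set-var', 'session.label=Test01', 'session.info.myfield=myvalue']
```

Calling the helper with a list value, such as `{"session.tags": ["a", "b"]}`, raises a `TypeError` instead of producing an argument the command cannot handle.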
General
Optional Argument | Description |
---|---|
-y, --yes | Assume the answer is yes to all prompts |
--ca-certs CA_CERTS | The file to use for SSL certificate validation |
-C PATH, --config-file PATH | Specify configuration options via config file. Learn more about how to create this file. |
-d, --debug | Turn on debug logging |
--no-config | Do NOT load the default configuration file |
-q, --quiet | Squelch log messages to the console |
--timezone TIMEZONE | Set the effective local timezone when importing DICOM data. This flag only sets the timezone if you use the dicom scanner in your template. Here is a list with the accepted timezones. Warning: the timezone is not updated on existing containers. The timezone is set when Flywheel creates the group, project, session, or acquisition, and the timezone field is not updated if you upload your data into an existing Flywheel hierarchy. |
-v, --verbose | Get more detailed output |
Reporter
These config options are only available when using cluster mode with the --follow argument or when using a local worker.
Optional Argument | Description |
---|---|
--save-audit-logs PATH | Save audit log to the specified path on the current machine |
--save-deid-logs PATH | Save deid log to the specified path on the current machine |
--save-subjects PATH | Save subjects to the specified file |
Cluster Config
These config options apply when using a cluster to ingest data.
Optional Argument | Description |
---|---|
--cluster CLUSTER | Ingest cluster URL (default: None) |
-f, --follow | Follow the progress of the ingest (default: False) |
Worker Config
These config options are only available when using a local worker (--cluster is not defined).
Optional Argument | Description |
---|---|
--jobs JOBS | The number of concurrent jobs to run (e.g. scan jobs), ignored when using cluster (default: 4) |
--sleep-time SECONDS | Number of seconds to wait before trying to get a task (default: 1) |
--max-tempfile MAX_TEMPFILE | The max in-memory tempfile size, in MB, or 0 to always use disk (default: 50) |
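The Reporter, Cluster Config, and Worker Config sections above each state when their options apply. The Python sketch below encodes those rules as a pre-flight check over an argument list. It is an illustration only: the function name `check_mode_flags` and the example cluster URL are hypothetical, and how the CLI itself reacts to a misplaced flag is not described in this article beyond --jobs being ignored in cluster mode.

```python
# Options that apply only to the local worker (--cluster not defined).
WORKER_ONLY = ("--jobs", "--sleep-time", "--max-tempfile")
# Options that need a local worker, or cluster mode with --follow.
REPORTER = ("--save-audit-logs", "--save-deid-logs", "--save-subjects")


def check_mode_flags(args):
    """Return warnings for flags that do not apply to the selected mode."""
    # Collect flag names, tolerating the --flag=value spelling.
    flags = {a.split("=", 1)[0] for a in args if a.startswith("-")}
    uses_cluster = "--cluster" in flags
    follows = bool(flags & {"-f", "--follow"})
    warnings = []
    if uses_cluster:
        for flag in WORKER_ONLY:
            if flag in flags:
                warnings.append(f"{flag} applies to the local worker only")
        if not follows:
            for flag in REPORTER:
                if flag in flags:
                    warnings.append(f"{flag} needs --follow in cluster mode")
    return warnings


# Hypothetical cluster URL and file name, for illustration only.
print(check_mode_flags([
    "--cluster", "https://cluster.example.com",
    "--jobs", "8",
    "--save-audit-logs", "audit.csv",
]))
# ['--jobs applies to the local worker only', '--save-audit-logs needs --follow in cluster mode']
```

An empty list means none of the flags conflict with the chosen mode under the rules stated in this article.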