Ingest Template Reference Guide

Below is a reference guide for all possible template options as well as example templates.

File Selector Options

Pattern

The - pattern step specifies what Flywheel should do at one level of a directory. The first - pattern: field in your template corresponds to the top-level (parent) folder in your dataset directory. Each subsequent - pattern: field walks down one level of folders within that top-level folder.

In general, you need to have a - pattern: field for each folder in your directory. This is because the template needs instructions for what to do at each folder in the directory.

Valid values for the - pattern:

Use a variable to set Flywheel labels for group, project, subject, session, acquisition based on the folder name
Skip that level of the directory by using regex: .*
Use select to set different upload instructions if there are multiple folders at the same level.
Use scan to pull out Flywheel labels from a filename instead of folder name or to validate DICOM files

- pattern: "{group}"
- pattern: "Anxiety Study"
# Sets the project label to Anxiety Study no matter what the folder 
# name is in your dataset 
- pattern: "{subject}"
- pattern: "anx_{session}" 
- pattern: "{acquisition}"
  packfile_type: zip
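
To make the level-by-level matching concrete, here is a minimal Python sketch of how a list of patterns could be walked against a folder path. It is an illustration only, not Flywheel's implementation: the helpers pattern_to_regex and labels_for_path, the sample path, and the simplified template (variables only, with no fixed labels such as "Anxiety Study") are all assumptions made for this example.

```python
import re
from pathlib import PurePosixPath


def pattern_to_regex(pattern):
    """Turn a pattern such as "anx_{session}" into a regex with named groups.

    Hypothetical helper for illustration; Flywheel's own parsing may differ.
    """
    # re.split with one capture group alternates literal text and variable names.
    pieces = re.split(r"\{(\w+)\}", pattern)
    regex = "".join(
        re.escape(piece) if index % 2 == 0 else f"(?P<{piece}>.+)"
        for index, piece in enumerate(pieces)
    )
    return re.compile(f"^{regex}$")


def labels_for_path(template, relative_path):
    """Apply one pattern per folder level and collect the captured labels."""
    labels = {}
    folders = PurePosixPath(relative_path).parts
    for pattern, folder in zip(template, folders):
        match = pattern_to_regex(pattern).match(folder)
        if match is None:
            raise ValueError(f"{folder!r} does not match {pattern!r}")
        labels.update(match.groupdict())
    return labels


# Simplified template: one pattern per folder level, variables only.
template = ["{group}", "{project}", "{subject}", "anx_{session}", "{acquisition}"]
labels = labels_for_path(template, "psych/anxiety/sub-01/anx_baseline/T1w")
print(labels)
```

The point of the sketch is the one-to-one pairing of patterns and folder levels: the fourth pattern strips the anx_ prefix, so the session label becomes baseline rather than anx_baseline.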

Select

Used to start an expression where you set parameters or logical operators for two folders at the same level of the directory.

You cannot nest a select statement underneath a select statement.

- pattern: "{group}"
- pattern: "{subject}"
- pattern: "{session}"
- pattern: "{acquisition}"
- select:
  - pattern: "*.dcm"
    packfile_type: dicom
  - pattern: .*

The above example packs all files with the .dcm extension into a single zip file. The zip file is uploaded as an acquisition and named {acquisition}.dicom.zip, where {acquisition} is the acquisition label. All other files are ignored and not uploaded.

Scan

Scans can either be filename or dicom. Using scan is optional, but should be used if you are uploading DICOM data or if you want to parse a filename to use as a metadata label in Flywheel.

Define the specific scan type in the name field, described below. The following is an example of a complete scan step in a template:

- pattern: "{subject}"
- pattern: "{session}"
  scan:
    name: filename
    pattern: "{acquisition}.dcm"

Name

The name field configures the type of scan.

Dicom

When the scan step is set to dicom, Flywheel reads through all the files within that step of the hierarchy. Flywheel then parses all files with the .dcm extension. If the file is not a valid DICOM file, the file is not uploaded, and the import stops by default. To determine if a file is valid DICOM, we look for a DICM string at byte 128.

However, you can use the -force-scan flag in your CLI command to parse all files as DICOM regardless of the DICM prefix and upload them to Flywheel.
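
The validity check described above can be sketched in a few lines of Python. This is a minimal illustration of the "DICM at byte 128" rule, not Flywheel's own code; the function name has_dicm_marker and the two temporary test files are made up for the example.

```python
import os
import tempfile

DICM_OFFSET = 128  # the marker follows a 128-byte preamble


def has_dicm_marker(path):
    """Return True if the 4 bytes at offset 128 are b"DICM"."""
    with open(path, "rb") as handle:
        handle.seek(DICM_OFFSET)
        return handle.read(4) == b"DICM"


# Build one file with the marker and one without, then check both.
with tempfile.TemporaryDirectory() as folder:
    good = os.path.join(folder, "good.dcm")
    bad = os.path.join(folder, "bad.dcm")
    with open(good, "wb") as handle:
        handle.write(b"\x00" * DICM_OFFSET + b"DICM" + b"rest of file")
    with open(bad, "wb") as handle:
        handle.write(b"not a DICOM file")
    good_result = has_dicm_marker(good)
    bad_result = has_dicm_marker(bad)

print(good_result, bad_result)  # True False
```

Running a check like this over a folder before an import is one way to find the files that would otherwise stop it.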

- pattern: "wimrpetct{subject}_{session}"
- pattern: ".*"
- pattern: "{acquisition}"
  scan: 
    name: dicom

Filename

Use the filename scan to parse the file names within that step of the directory. This allows you to pull out relevant parts of a filename to create labels and add metadata.

When you combine the filename scan with regex, you can loop through all files and extract the same piece of each file name. For example, suppose that all of your image files follow this naming scheme:

[date]_[study ID]_[subject number]_[acquisition number]. The file names would look something like this:

20120215_2340_SUBJ1_acq2.dcm
20120215_2340_SUBJ1_acq3.dcm
20120215_2340_SUBJ1_acq4.dcm
20120215_2340_SUBJ1_acq5.dcm
etc.

Use the piece of the filename representing the acquisition number (acq2, acq3, acq4, and so on) to set the acquisition label in Flywheel. To do this for all files in the folder, add a regex named group whose name is the Flywheel field, written in angle brackets <>, as in (?P<acquisition>...).

- pattern: "{project}" 
- pattern: "{subject}"
- pattern: "{session}"
  scan:
    name: filename
    pattern: "^(?:[^_]*_){3}(?P<acquisition>[^.]*)"

Regex can quickly become complex. Try out your regex before adding it to your template, for example with regex101.
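
As one way to try the pattern out locally, the following short Python sketch runs the regex from the template above against the sample file names. It uses only Python's standard re module, which supports the same (?P<name>...) named-group syntax; the variable names are arbitrary.

```python
import re

# The same expression used in the scan pattern above.
ACQUISITION_RE = re.compile(r"^(?:[^_]*_){3}(?P<acquisition>[^.]*)")

filenames = [
    "20120215_2340_SUBJ1_acq2.dcm",
    "20120215_2340_SUBJ1_acq3.dcm",
    "20120215_2340_SUBJ1_acq4.dcm",
    "20120215_2340_SUBJ1_acq5.dcm",
]

# Skip the first three underscore-delimited pieces, then capture up to the dot.
acquisition_labels = [
    ACQUISITION_RE.match(name).group("acquisition") for name in filenames
]
print(acquisition_labels)  # ['acq2', 'acq3', 'acq4', 'acq5']
```

A file name with fewer than three underscores would not match at all, so it is worth testing a few irregular names from your dataset as well.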

Grouping & Zipping Options

packfile_type

Groups all files within that level of the directory, compresses them into a single zip file, and uploads them as an acquisition. You can specify a packfile type as the value. Note, however, that this setting does not validate the file type before adding files to the zip. The packfile type is added to your acquisition label and becomes the type in the acquisition metadata.

- pattern: "development"
- pattern: "Example"
- pattern: "{subject}" 
- pattern: "{session}" 
- pattern: "{acquisition}"
  packfile_type: png

This would result in all files being uploaded as {acquisition}.png.zip.

packfile_name

Overrides the default packfile name. Do not include quotes around the name.

- pattern: "development"
- pattern: "Emily Example"
- pattern: "{subject}" 
- pattern: "{session}" 
- pattern: "{acquisition}"
  packfile_type: dicom
  packfile_name: Historical_data

The example above would change the name of the packfile from {acquisition}.dicom.zip to Historical_data.dicom.zip.
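
The naming rule implied by the two packfile examples can be summarized in a small Python sketch. The function packfile_filename and the sample acquisition label are hypothetical; the sketch only restates the [name].[packfile_type].zip convention shown above and is not taken from Flywheel's code.

```python
def packfile_filename(acquisition_label, packfile_type, packfile_name=None):
    """Build the zip name: packfile_name wins over the acquisition label."""
    base = packfile_name if packfile_name is not None else acquisition_label
    return f"{base}.{packfile_type}.zip"


default_name = packfile_filename("T1w", "png")
overridden_name = packfile_filename("T1w", "dicom", packfile_name="Historical_data")
print(default_name)     # T1w.png.zip
print(overridden_name)  # Historical_data.dicom.zip
```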

Include/Exclude Options

Use the include/exclude settings to filter out files based on file type or based on directory name. Any additional upload settings can also be configured in this section.

...

# Use regex to indicate file types to exclude from upload
exclude:  
  - "*.txt"
  - "*.xml"

# Use regex to indicate directories to include in upload
include-dirs:  
  - "*-DCM"
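
To preview which files a filter like this would keep, here is a rough Python sketch. It rests on assumptions the guide does not confirm: that the patterns behave like shell-style wildcards (which is how "*.txt" and "*-DCM" read), that exclude is tested against file names, and that a file qualifies when any of its parent folders matches an include-dirs pattern. The function should_upload and the sample paths are hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

EXCLUDE = ["*.txt", "*.xml"]
INCLUDE_DIRS = ["*-DCM"]


def should_upload(relative_path):
    """Apply the exclude list first, then require a matching parent folder."""
    path = PurePosixPath(relative_path)
    # Assumption: exclude patterns are matched against the file name only.
    if any(fnmatchcase(path.name, pattern) for pattern in EXCLUDE):
        return False
    # Assumption: one matching parent directory is enough to include the file.
    return any(
        fnmatchcase(folder, pattern)
        for folder in path.parts[:-1]
        for pattern in INCLUDE_DIRS
    )


kept = should_upload("sub-01/ses-1/T1-DCM/0001.dcm")
excluded = should_upload("sub-01/ses-1/T1-DCM/notes.txt")
outside = should_upload("sub-01/ses-1/logs/0001.dcm")
print(kept, excluded, outside)  # True False False
```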

Full Example

This example ingest config file detects duplicate files, excludes TXT and XML filetypes from upload, and includes directories that end in -DCM.

#####
# Template Settings
#####

template:
  - pattern: "{subject}"
  - pattern: "{session}"
  - pattern: "{acquisition}"
    scan:
      name: dicom

####
# De-Identification Settings
####
name: Profile1
dicom:

  # Date-increment controls how many days to offset each date field
  # where the increment-date (shown below) is configured.
  # Positive values will result in later dates, negative
  # values will result in earlier dates.
  date-increment: -17

  # patient-age-from-birthdate sets the DICOM header as a 3-digit value with a suffix.
  # For example, an age of 91 days would be 091D, and that same age in months
  # would be 003M. By default, if the age fits in days, then days will be used,
  # otherwise if it fits in months, then months will be used,
  # otherwise years will be used.
  patient-age-from-birthdate: true

  # All data elements not defined in the fields section of the profile will be removed.
  # If any field references a nested element in a sequence, the whole sequence element
  # will be kept.
  remove-undefined: true

  # Set patient age units as Years. Other options include months (M) and days (D)
  patient-age-units: Y

  # The following are field transformations.
  # remove, replace-with, increment-date, hash, and hashuid can be used with any DICOM
  # field. Set name to the field's keyword as defined by the DICOM standard.
  fields:

    # Remove a DICOM field entirely.
    # If removal is not supported, then this field will be blank.
    # This example removes PatientID.
    - name: PatientID
      remove: true

    # Replace a dicom field with the value provided.
    # This example replaces "StationName" with "XXXX" in Flywheel
    - name: StationName
      replace-with: XXXX

    # Offsets the date by the number of days defined in
    # the date-increment setting above, preserving the time
    # and timezone. In this example, StudyDate appears as 17 days earlier
    - name: StudyDate
      increment-date: true

    # You can refer to fields by their DICOM tag or keyword
    # Applies one-way hash to a unique string
    - name: (0008,0050)
      hash: true

    # Replaces a UID field with a hashed version of that
    # field. The first four nodes (prefix) and last node
    # (suffix) will be preserved, with the middle being
    # replaced by the hashed value
    - name: ConcatenationUID
      hashuid: true

    # The fields below are listed so that they are not removed as part of the 
    # remove-undefined setting above. 
    - name: SeriesInstanceUID
    - name: Modality
    - name: SeriesNumber
    - name: ScheduledProcedureStepID
    - name: RequestedProcedureID
    - name: StudyTime
    - name: StudyID
    - name: StudyInstanceUID
    - name: ProtocolName
    - name: AcquisitionDate
      increment-date: true
    - name: AcquisitionDateTime
    - name: AcquisitionTime
    - name: SeriesDate
      increment-date: true
    - name: SeriesTime

####
# Include/Exclude Settings 
####

# Flywheel uploads only the new data and does not upload duplicates.
# All duplicates are noted in the ingest audit log attached to the project.

detect-duplicates: true

# Use regex to indicate file types to exclude from upload
exclude:  
  - "*.txt"
  - "*.xml"

# Use regex to indicate directories to include in upload
include-dirs:  
  - "*-DCM"
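
The effect of date-increment: -17 in the de-identification settings above can be checked with a small Python sketch. It assumes the date is stored in the DICOM DA format (YYYYMMDD) and handles only the date, not the time or timezone; the function increment_dicom_date is a hypothetical helper, not part of Flywheel.

```python
from datetime import datetime, timedelta


def increment_dicom_date(value, days):
    """Shift a YYYYMMDD date string by a number of days (negative = earlier)."""
    shifted = datetime.strptime(value, "%Y%m%d") + timedelta(days=days)
    return shifted.strftime("%Y%m%d")


shifted_study_date = increment_dicom_date("20120215", -17)
print(shifted_study_date)  # 20120129
```

Because every field marked increment-date: true is shifted by the same offset, the intervals between StudyDate, SeriesDate, and AcquisitionDate are preserved.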

Metadata Options

Variables for Configuring Flywheel Metadata

The following are the variables used in the template file for Flywheel labels. Use the template variable to map all or part of a file or folder name to the equivalent Flywheel metadata field:

Template variable	Flywheel field
{group}*	group._id
{project}*	project.label
{subject}	subject.label
{session}	session.label
{acquisition}	acquisition.label

* While you can use the {group} and {project} variables in your template, the group and project you specify in your command override whatever is in the template.

Setting additional Flywheel Metadata

Groups: group.label
Projects: project.id
Subjects: subject._id
Sessions: session._id, session.uid, session.timestamp
Acquisitions: acquisition._id, acquisition.uid, acquisition.timestamp

Use the following format to assign these fields if you are not using regex:

- pattern: "{subject._id}"

Setting Custom Metadata

You can also set custom metadata in the template. Custom metadata can help you create data views or run analyses. Custom metadata fields follow this naming convention: [container].info.[fieldName]

For example, if a custom metadata field called RedCapID applies to subjects, the field name would be subject.info.RedCapID. To assign this custom metadata:

- pattern: "{subject}_(?P<subject.info.RedCapID>.*)"
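
The [container].info.[fieldName] convention reads as a path into nested metadata. The Python sketch below shows that reading by expanding a dotted field name into nested dictionaries. It is only an illustration of the naming convention: the helper set_dotted and the sample value RC-0042 are invented, and the way Flywheel stores the value internally is not described in this guide.

```python
def set_dotted(metadata, dotted_name, value):
    """Store value under a dotted name such as "subject.info.RedCapID"."""
    *parents, leaf = dotted_name.split(".")
    node = metadata
    for key in parents:
        # Create each intermediate level on demand.
        node = node.setdefault(key, {})
    node[leaf] = value
    return metadata


metadata = {}
set_dotted(metadata, "subject.label", "sub-01")
set_dotted(metadata, "subject.info.RedCapID", "RC-0042")
print(metadata)
```

Note that Python's own re module does not accept dots in group names, so a pattern like the one above cannot be tested there unchanged; renaming the group for the test (for example to RedCapID) is one workaround.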