Step 1: Create a Template for File Upload

Introduction

The template is a set of instructions for how to map your dataset to a Flywheel project. 

All projects in Flywheel are made up of subjects, sessions, and acquisitions. The template in the config file uses the folder and file structure on your local machine to map your dataset to subjects, sessions, and acquisitions. The template also allows you to use folder names or filenames from your dataset as additional metadata fields.

Note: This is different from the ingest dicom command, which uses the DICOM file's metadata to organize data.

Instruction Steps

Mapping Dataset Folder Structure to Flywheel

To successfully upload your dataset, you must define the labels for subjects, sessions, and acquisitions. Below is an example of the most basic template for uploading DICOM files.

The steps below explain how to build your own template for labeling subject, sessions, and acquisitions.

# Template and Group/Project Settings
#####

template:
  - pattern: "{subject}"
  - pattern: "{session}"
  - pattern: "{acquisition}"
    scan: dicom
  1. To begin, open a plain text editor such as TextEdit, Notepad, Sublime, etc.
  2. Copy and paste the first step of your template:
template:
  - pattern:

This first pattern field corresponds to the top folder of your dataset.

  3. Configure how your first folder and any data inside it will map to the Flywheel project. Below is an overview of the available options. See the Reference section at the end of this article for more details.

  • Use the {subject}, {session}, or {acquisition} variables to create labels: To successfully upload your dataset, you must define the labels for subjects, sessions, and acquisitions in your dataset. You can use all or part of the folder name for the label. To use all of the folder name for the subject label:
template:
  - pattern: "{subject}"

For example, a folder with the following name would become a subject label:

Example folder name | Subject label
sub-01 | sub-01

To use only part of the folder name, replace a portion of the folder with the "{subject}" variable. For example:

template:
  - pattern: "{subject}-subject-Nov-2021"
Example folder name | Subject label
001-subject-Nov-2021 | 001
002-subject-Nov-2021 | 002
  • Add metadata: Metadata has many useful applications in Flywheel including being used to properly categorize data, create collections, or curate a dataview. The template allows you to pull folder or file names to add as metadata. For example:
template:
  - pattern: "001-subject-{session.info.dataset}"

This produces the following metadata:

Example folder name | Metadata
001-subject-Nov-2021 | dataset: Nov-2021

Note: This metadata field is added to the sessions of your project.
  • Skip this folder: Use regex to skip this folder level and move on to the next:
template:
  - pattern: .*
  • Use filenames for labels in Flywheel: Oftentimes the filename has useful information. To leverage the filename for a label, add a scan step for filenames.

The filename scan allows for regex pattern matching, so you can pull out only the relevant information in each filename. For example:

template:
  - pattern: "{subject}"
    scan:
      name: filename
      pattern: "(?P<acquisition>[^.]*)"

The above template would take the filename without the extension and make it the acquisition label. For example:

Example filename | Acquisition label
SAG T1.dicom | SAG T1

Tip: Use angle brackets <> instead of curly brackets {} when using regex to assign variables. Python regex requires the (?P<name>...) syntax for named groups.
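Before running an ingest, it can help to try a filename pattern in plain Python. The sketch below assumes the named group wraps the character class, as in (?P<acquisition>[^.]*), and that the pattern is applied from the start of the filename. How Flywheel applies the pattern internally is not documented here, so treat the helper name and the use of re.match as illustrative assumptions.

```python
import re

# Assumption: the named group wraps [^.]* so it captures everything
# before the first dot in the filename.
FILENAME_PATTERN = re.compile(r"(?P<acquisition>[^.]*)")


def acquisition_label(filename: str) -> str:
    """Hypothetical helper: return the text captured by the acquisition group."""
    match = FILENAME_PATTERN.match(filename)
    return match.group("acquisition") if match else ""


print(acquisition_label("SAG T1.dicom"))  # SAG T1
```

Note that [^.]* stops at the first dot, so a filename such as scan.v2.dcm would yield scan rather than scan.v2.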

  4. Add a second - pattern: to the template. This corresponds to the first subfolder of your dataset. You can use any of the above mapping methods. For example:
template:
  - pattern: "{subject}"
  - pattern: .*
  5. Continue for each new subfolder level in your dataset until you reach the last folder you wish to upload. To package files for upload, use packfile_type or scan:

  • Compress all files into a packfile: If you don't have DICOM data, you should still compress all of your images for upload using packfile_type: <filetype>. This creates a single zip file for upload. For example:

template:
  - pattern: "{subject}"
  - pattern: "{session}"
  - pattern: "{acquisition}"
    packfile_type: jpg 

or

template:
  - pattern: "{subject}-subject-Nov-2021"
  - pattern: .*
  - pattern: "amyg_s*_amyg_{session}_{subject.info.pcol}"
  - pattern: "{acquisition}"
    packfile_type: dicom
  • Validate and package DICOM data: If you have DICOM data and you would like to organize it using the SeriesInstanceUID tag, use the dicom scan step:
template:
  - pattern: "{acquisition}"
    scan: dicom

When dicom scan is enabled, Flywheel parses all files with the .dcm extension in that folder. Flywheel pulls out relevant metadata from the DICOM files to use as Flywheel metadata and compresses all data into a zip file for upload.

To determine if a file is valid DICOM, we look for a DICM string at byte 128. If the file is not a valid DICOM file, the file is not uploaded and the import stops by default. However, you can use the --ignore-scan flag in your CLI command so that Flywheel skips the invalid DICOM file and continues to upload valid files.

  • "Select" folders for upload: When there are multiple folders at the same level, you can use select: to set different rules for each folder:
template:
  - pattern: "{subject}-{session}-{acquisition}"
  - select:
    - pattern: "*.dcm"
      packfile_type: dicom
    - pattern: .*

This example only scans files that end in the common DICOM file extension, .dcm.

  6. Review your template. Verify that you have added each of these variables once:

  • {subject}
  • {session}
  • {acquisition}

  • Verify that your template is valid YAML using an online tool such as YAML lint.
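The review step can also be roughed out as a quick text check. This is a minimal sketch, not part of the Flywheel CLI: the function name is hypothetical, it only counts the literal forms {name} and (?P<name>, and it does not parse the YAML or understand select branches, where a variable may legitimately appear more than once.

```python
import re

REQUIRED_VARIABLES = ("subject", "session", "acquisition")


def count_label_variables(template_text: str) -> dict:
    """Hypothetical helper: count how often each label variable appears."""
    counts = {}
    for name in REQUIRED_VARIABLES:
        # Matches {name} or the regex form (?P<name>, but not {name.info.x}.
        pattern = r"\{" + name + r"\}|\(\?P<" + name + r">"
        counts[name] = len(re.findall(pattern, template_text))
    return counts


template_text = """
template:
  - pattern: "{subject}"
  - pattern: "{session}"
  - pattern: "{acquisition}"
    scan: dicom
"""

counts = count_label_variables(template_text)
problems = [name for name, count in counts.items() if count != 1]
print(counts)
print(problems)
```

An empty problems list means each of the three variables appears exactly once in the text.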

Reference

Below is a reference guide for all possible template options as well as example templates.

Pattern

The - pattern step specifies what Flywheel should do with one folder level of a directory. The first - pattern: field in your template corresponds to the parent folder in your dataset's directory. Each subsequent - pattern field in your template walks down each level of folders within that top-level folder.

In general, you need to have a - pattern: field for each folder in your directory. This is because the template needs instructions for what to do at each folder in the directory.

Valid values for - pattern:

  • Use a variable to set Flywheel labels for group, project, subject, session, acquisition based on the folder name
  • Skip that level of the directory by using regex: .*
  • Use select to set different upload instructions if there are multiple folders at the same level.
  • Use scan to pull out Flywheel labels from a filename instead of folder name or to validate DICOM files
- pattern: "{group}"
- pattern: "Anxiety Study"
# Sets the project label to Anxiety Study no matter what the folder 
# name is in your dataset 
- pattern: "{subject}"
- pattern: "anx_{session}" 
- pattern: "{acquisition}"
  packfile_type: zip

Select

Used to start an expression where you set parameters or logical operators for two folders at the same level of the directory.

You cannot nest a select statement underneath a select statement.

- pattern: "{group}"
- pattern: "{subject}"
- pattern: "{session}"
- pattern: "{acquisition}"
- select:
  - pattern: "*.dcm"
    packfile_type: dicom
  - pattern: .*

The above example packs up all files with the extension .dcm and compresses them into a zip file. The zip file is uploaded as an acquisition, with the file name {acquisition}.dicom.zip. All other files are ignored and not uploaded.

Scan

A scan can be either filename or dicom. Using scan is optional, but you should use it if you are uploading DICOM data or if you want to parse a filename to use as a metadata label in Flywheel.

You define the specific scan type with the name field. Below is an example of a complete scan step in the template:

- pattern: "{subject}"
- pattern: "{session}"
  scan:
    name: filename
    pattern: "{acquisition}.dcm"

Name

The name field configures the type of scan.

Dicom

When the scan step is set to dicom, Flywheel reads through all the files within that step of the hierarchy. Flywheel then parses all files with the .dcm extension. If the file is not a valid DICOM file, the file is not uploaded, and the import stops by default. To determine if a file is valid DICOM, we look for a DICM string at byte 128.

However, you can use the -force-scan flag in your CLI command to parse all files as DICOM regardless of the DICM prefix and upload them to Flywheel.

- pattern: "wimrpetct{subject}_{session}"
- pattern: ".*"
- pattern: "{acquisition}"
  scan: 
    name: dicom
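The DICM check described above can be reproduced locally to see which files would pass before starting an import. This is a minimal sketch that assumes the check is exactly "the four bytes at offset 128 equal DICM". The function name and the demo files are hypothetical, and Flywheel's own validation may inspect more than this.

```python
import os
import tempfile

PREAMBLE_LENGTH = 128  # bytes that precede the marker
DICM_MARKER = b"DICM"


def has_dicm_marker(path: str) -> bool:
    """Return True if the file has the 4-byte DICM marker at offset 128."""
    with open(path, "rb") as handle:
        handle.seek(PREAMBLE_LENGTH)
        return handle.read(len(DICM_MARKER)) == DICM_MARKER


# Demo with two throwaway files (hypothetical names and contents).
workdir = tempfile.mkdtemp()
good_path = os.path.join(workdir, "good.dcm")
bad_path = os.path.join(workdir, "bad.dcm")
with open(good_path, "wb") as handle:
    handle.write(b"\x00" * PREAMBLE_LENGTH + DICM_MARKER)
with open(bad_path, "wb") as handle:
    handle.write(b"not a dicom file")

print(has_dicm_marker(good_path))  # True
print(has_dicm_marker(bad_path))   # False
```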

Filename

Use the filename scan to parse the file names within that step of the directory. This allows you to pull out relevant parts of a filename to create labels and add metadata.

When used in combination with regex, you can loop through all files and use the same piece of the file name string from each file. For example, let's say that all of your image files have been named using the following naming scheme:

[date]_[study ID]_[subject number]_[acquisition number]. The file names would look something like this:

  • 20120215_2340_SUBJ1_acq2.dcm
  • 20120215_2340_SUBJ1_acq3.dcm
  • 20120215_2340_SUBJ1_acq4.dcm
  • 20120215_2340_SUBJ1_acq5.dcm
  • etc.

Use the piece of the filename representing the acquisition number (acq2, acq3, acq4, and so on) to set the acquisition label in Flywheel. To do this for all files in the folder, add regex pattern matching along with the Flywheel field name in angle brackets <>.

- pattern: "{project}" 
- pattern: "{subject}"
- pattern: "{session}"
  scan:
    name: filename
    pattern: "^(?:[^_]*_){3}(?P<acquisition>[^.]*)"

Regex can quickly become complex. You should try out your regex before adding it to your template. Use a tool such as regex101 to test your regex.
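One way to try the pattern locally is a few lines of Python, since the template uses Python regex syntax. The sketch below runs the pattern from the example above against the sample filenames. Matching from the start of the name with re.match is an assumption about how the scan applies the pattern.

```python
import re

# Skip three underscore-terminated fields, then capture up to the first dot.
ACQUISITION_PATTERN = re.compile(r"^(?:[^_]*_){3}(?P<acquisition>[^.]*)")

filenames = [
    "20120215_2340_SUBJ1_acq2.dcm",
    "20120215_2340_SUBJ1_acq3.dcm",
    "20120215_2340_SUBJ1_acq4.dcm",
    "20120215_2340_SUBJ1_acq5.dcm",
]

labels = []
for name in filenames:
    match = ACQUISITION_PATTERN.match(name)
    # A filename that does not follow the naming scheme yields None.
    labels.append(match.group("acquisition") if match else None)

print(labels)  # ['acq2', 'acq3', 'acq4', 'acq5']
```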

packfile_type

Groups all files within that level of the directory, compresses them as a single zip file, and uploads them as an acquisition. You can specify a packfile type for the value. However, it is important to note that this setting does not validate the type of file before adding it to the zip. The packfile type is appended to your acquisition label in the packfile name and becomes the type in the acquisition metadata.

- pattern: "development"
- pattern: "Example"
- pattern: "{subject}" 
- pattern: "{session}" 
- pattern: "{acquisition}"
  packfile_type: png

This results in all files being uploaded as {acquisition}.png.zip.

packfile_name

Overrides the default packfile name. Do not include quotes around the name.

- pattern: "development"
- pattern: "Emily Example"
- pattern: "{subject}" 
- pattern: "{session}" 
- pattern: "{acquisition}"
  packfile_type: dicom
  packfile_name: Historical_data

The example above would change the name of the packfile from {acquisition}.dicom.zip to Historical_data.dicom.zip.

Variables for Configuring Flywheel Metadata

The following are the variables used in the template file for Flywheel labels. Use the template variable to map all or part of a file or folder name to the equivalent Flywheel metadata field:

Template variable | Flywheel field
{group}* | group._id
{project}* | project.label
{subject} | subject.label
{session} | session.label
{acquisition} | acquisition.label

*While you can use the {group} and {project} variables in your template, whatever you use for the group and project in your command will override whatever is in the template.

Additional Flywheel Metadata that Can Be Configured

Groups: group.label

Projects: project.id

Subjects: subject._id

Sessions: session._id, session.uid, session.timestamp

Acquisitions: acquisition._id, acquisition.uid, acquisition.timestamp

Use the following format to assign these fields if you are not using regex:

- pattern: "{subject._id}"

Set Custom Metadata

You can also set custom metadata in the template. Custom metadata can help you create data views or run analysis. Custom metadata fields follow this naming convention: [container].info.[fieldName]

For example, if a custom metadata field called RedCapID applies to subjects, the field name would look like subject.info.RedCapID. To assign this custom metadata:

- pattern: "{subject}_(?P<subject.info.RedCapID>.*)"