Bulk Import - Mapping to the Flywheel Hierarchy

The way the source data is organized within the storage bucket affects how the data is imported into Flywheel (a.k.a. "mapping").

There are several options for mapping rules:

  1. Default Mapping Rules
    1. DICOM header-based
    2. File path-based
  2. User-defined rules using file paths and/or DICOM headers

Default Mapping Rules

DICOM header-based mappings

If the source data consists only of DICOM files, then a simple approach is to use the default "DICOM header"-based mapping rules, which allow for a simpler way of organizing files:

  • The "root" folder represents a single Project
  • Each "leaf" (lowest-level) folder represents a single Acquisition
    • Each Acquisition is contained in a single "leaf" folder
    • Each "leaf" folder contains a single Acquisition

Note that there are no specific rules around the intermediate-level folders -- just that each Acquisition needs to be in its own folder.

For example, consider the following source data structure:

- s3://myDataBucket
    - /Patient123
        - /Study20220423
            - /Series1
                - file1.dcm
                - file2.dcm
        - /Study20230122
    - /Patient456
        - file3.dcm
        - file4.dcm

With the default DICOM header-based mapping rules, this source data would be imported into Flywheel as:

  • (Destination Project)
    • Subject
      • Session
        • Acquisition
          • File: Acq1.dicom.zip
    • Subject
      • Session
        • Acquisition
          • File: Acq2.dicom.zip

Where the ZIP file contents are:

  • Acq1.dicom.zip
    • file1.dcm
    • file2.dcm
  • Acq2.dicom.zip
    • file3.dcm
    • file4.dcm

Note a few things:

  • Container Labels: The container labels are derived from the DICOM header information, not from the folder names
  • Zipping: The files in each Acquisition folder are grouped together and stored in Flywheel as a ZIP file
    • ZIP File Name: The ZIP file names are derived from the DICOM header information, not from the folder names
  • Only DICOM files: Only DICOM files are imported -- all other types of files are ignored
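The folder-to-Acquisition grouping described above can be illustrated with a minimal Python sketch. It is not Flywheel's importer: the function name `group_by_parent_folder` and the list of object keys are hypothetical, and the sketch only shows how files that share a folder end up in the same group. The real import additionally reads DICOM headers to derive the container labels and ZIP file names, which this sketch does not attempt.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_parent_folder(keys):
    """Group object keys by the folder that directly contains them.

    Illustration only: each resulting group stands for the files that
    would be zipped together into one Acquisition.
    """
    groups = defaultdict(list)
    for key in keys:
        path = PurePosixPath(key)
        # The parent folder identifies the group; intermediate folder
        # levels play no role in the grouping.
        groups[str(path.parent)].append(path.name)
    return dict(groups)


# Hypothetical object keys matching the example structure above.
keys = [
    "Patient123/Study20220423/Series1/file1.dcm",
    "Patient123/Study20220423/Series1/file2.dcm",
    "Patient456/file3.dcm",
    "Patient456/file4.dcm",
]

acquisitions = group_by_parent_folder(keys)
for folder, files in acquisitions.items():
    print(folder, files)
```

Note that the two groups sit at different folder depths, which is consistent with the rule that only the folder directly containing the files matters.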

File path-based mappings

If the source data contains any other types of files beyond just DICOM, then the DICOM header-based mappings cannot be used.

In this case, the simplest approach is to use the default "file path"-based mapping rules, which require the data to be organized as follows:

  • The "root" folder represents a single Project
  • Each first-level folder (directly inside the Project root) represents a single Subject (i.e., Patient)
  • Each second-level folder (directly inside the Subject) represents a single Session (i.e., Study)
  • Each "leaf" (lowest-level) folder (directly inside a Session) represents a single Acquisition
    • Each Acquisition is contained in a single "leaf" folder
    • Each "leaf" folder contains a single Acquisition

For example, consider the following source data structure:

- s3://myDataBucket
    - /Patient123
        - /Study20220423
            - /Series1
                - formA.pdf
                - report09.csv
                - ...
        - /Study20230122
        - ...
    - /Patient456
        - /Study20221103
    - /...

With the default file path-based mapping rules, this source data would be imported into Flywheel as:

  • (Destination Project)
    • Subject: "Patient123"
      • Session: "Study20220423"
        • Acquisition: "Series1"
          • File: formA.pdf
          • File: report09.csv
          • ...
      • Session: "Study20230122"
      • ...
    • Subject: "Patient456"
      • Session: "Study20221103"
    • ...

Note a few nuances:

  • Container Labels: The source folder names are used as the container labels (e.g., "Patient123" is the Subject label).
  • No Zipping: The files are not grouped together and are stored in Flywheel individually as-is.
  • Arbitrary File Types: Any type of file can be imported (not only DICOM) so long as the file type is supported by Flywheel Core.
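As a rough illustration of the file path-based rules, the Python sketch below splits a path (relative to the Project root) into Subject, Session, Acquisition, and file name. The function name `map_path_to_hierarchy` and the returned dictionary keys are hypothetical and are not part of any Flywheel API. The sketch assumes exactly three folder levels below the root, as described above.

```python
from pathlib import PurePosixPath


def map_path_to_hierarchy(key):
    """Map a path relative to the Project root onto container labels.

    Illustration only: expects <Subject>/<Session>/<Acquisition>/<file>.
    """
    parts = PurePosixPath(key).parts
    if len(parts) != 4:
        raise ValueError(
            f"expected Subject/Session/Acquisition/file, got: {key!r}"
        )
    subject, session, acquisition, filename = parts
    # Folder names are used directly as the container labels.
    return {
        "subject": subject,
        "session": session,
        "acquisition": acquisition,
        "file": filename,
    }


# Hypothetical path taken from the example structure above.
mapped = map_path_to_hierarchy("Patient123/Study20220423/Series1/formA.pdf")
print(mapped)
```

A path with fewer or more levels (for example, a file placed directly inside a Subject folder) raises a `ValueError` in this sketch. How the actual import handles such paths is not covered here.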

User-defined Mapping Rules

More information about user-defined mapping options can be found in the new (BETA) CLI docs for the import run command.