
Pattern Syntax Reference

This reference guide documents the custom pattern syntax used in Flywheel's Bulk Import and Export filtering and mapping rules. This syntax provides a simpler, more intuitive alternative to full regular expressions for most common use cases.

Overview

Flywheel's pattern syntax is designed to be familiar to users of shell glob patterns while providing powerful filtering capabilities. The system supports multiple types of patterns:

  • Simple patterns: Basic string matching with wildcards
  • Field-based patterns: Filter specific metadata fields
  • Glob-like patterns: File path matching with wildcard support
  • Regular expressions: Full regex support when needed

Understanding Metadata

Before using metadata fields in patterns, it's helpful to understand how metadata works in Flywheel. See Understanding Metadata in Flywheel for a comprehensive overview of metadata types, structure, and usage.

Complex Configurations: Rule Files

For complex import/export scenarios with multiple sets of rules, see Rule Files to learn how to define rules in a reusable YAML format.

Basic Pattern Structure

All filter expressions follow this structure:

field_name operator pattern_value

For example:

path=~*.dcm
name!=temp
size>1MB

Operators

The following operators are supported:

Operator   Description                      Example
=          Exact match (case-insensitive)   name=patient01
!=         Does not match                   name!=temp
<          Less than                        size<10MB
>          Greater than                     size>1GB
<=         Less than or equal               depth<=3
>=         Greater than or equal            depth>=2
=~         Pattern match (glob-like)        path=~*.dcm
!~         Does not match pattern           path!~*temp*
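
To illustrate how an expression splits into its three parts, here is a minimal Python sketch. It is not Flywheel's parser: the function name parse_filter and the regular expression are assumptions made for this example. Two-character operators are tried before one-character ones so that =~ is not read as =.

```python
import re

# Hypothetical helper, not part of Flywheel: split "field operator value".
# Two-character operators are listed first so "=~" is not parsed as "=".
_FILTER_RE = re.compile(r"^\s*([\w.]+)\s*(=~|!~|<=|>=|!=|=|<|>)\s*(.*?)\s*$")


def parse_filter(expression: str) -> tuple[str, str, str]:
    """Return (field, operator, value) or raise ValueError."""
    match = _FILTER_RE.match(expression)
    if match is None:
        raise ValueError(f"not a valid filter expression: {expression!r}")
    return match.group(1), match.group(2), match.group(3)
```

For example, parse_filter("depth<=3") yields the field depth, the operator <= and the value 3.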

Wildcard Patterns

The pattern syntax supports several wildcard characters that are simpler than full regular expressions:

Single Asterisk (*)

Matches any sequence of characters except forward slashes (/). This is ideal for matching filenames or single directory levels.

Examples:

  • *.dcm - Matches any file ending in .dcm
  • patient* - Matches any string starting with patient
  • *001* - Matches any string containing 001

Double Asterisk (**/)

Matches any sequence of characters including forward slashes. This allows matching across multiple directory levels.

Examples:

  • **/*.dcm - Matches .dcm files at any depth
  • data/**/images - Matches images directory anywhere under data

Literal Dot (.)

In patterns, dots are treated as literal characters (not regex wildcards).

Examples:

  • file.txt - Matches exactly file.txt
  • *.dcm - Matches files ending in .dcm
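
The wildcard rules described so far can be approximated by translating a pattern into a regular expression. The sketch below is an illustration only, not Flywheel's implementation. The name glob_to_regex is hypothetical, and the sketch assumes that a pattern must match the whole value and that matching ignores case. Optional sections and escapes are not handled here.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate '*', '**' and literal text into a regex (illustrative sketch)."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")       # '**' may cross '/' boundaries
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")    # '*' stays within one path segment
            i += 1
        else:
            parts.append(re.escape(pattern[i]))  # dots and other text are literal
            i += 1
    # Assumption: case-insensitive, as string matching is by default
    return re.compile("".join(parts), re.IGNORECASE)


# Use fullmatch so the pattern has to cover the entire value (an assumption).
print(bool(glob_to_regex("*.dcm").fullmatch("scan.dcm")))
print(bool(glob_to_regex("**/*.dcm").fullmatch("a/b/scan.dcm")))
```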

Optional Sections ([...])

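Text enclosed in square brackets is optional: the pattern matches with or without it. Note that this differs from shell globs, where square brackets denote a character class.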

Examples:

  • patient[_001] - Matches patient or patient_001
  • scan[/series]* - Matches scan* or scan/series*
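
One way to picture optional sections is to expand a pattern into every concrete variant it stands for. The following Python sketch does that for non-nested brackets. The function name expand_optional is hypothetical, escaped brackets are ignored, and Flywheel may well handle this differently internally.

```python
def expand_optional(pattern: str) -> list[str]:
    """Expand non-nested '[...]' groups into every concrete variant."""
    start = pattern.find("[")
    if start == -1:
        return [pattern]
    end = pattern.find("]", start)
    if end == -1:
        raise ValueError("unclosed '[' in pattern")
    head, body, tail = pattern[:start], pattern[start + 1:end], pattern[end + 1:]
    variants = []
    for rest in expand_optional(tail):
        variants.append(head + rest)         # optional part omitted
        variants.append(head + body + rest)  # optional part included
    return variants


print(expand_optional("patient[_001]"))
print(expand_optional("scan[/series]*"))
```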

DICOM UID Matching (\uid)

Special pattern for matching DICOM UIDs with optional modality prefixes.

Examples:

  • \uid - Matches valid DICOM UIDs like 1.3.6.1.4.1.14519.5
  • MR.\uid - Matches modality-prefixed UIDs like MR.1.3.6.1.4.1.14519.5
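
For readers who want to check values outside Flywheel, the sketch below tests whether a string has the shape of a DICOM UID: dot-separated numeric components without leading zeros, at most 64 characters in total. How Flywheel expands \uid internally is not documented here, so the regular expression and both function names are assumptions.

```python
import re

# One UID component: "0", or digits without a leading zero.
_COMPONENT = r"(?:0|[1-9][0-9]*)"
UID_RE = re.compile(rf"{_COMPONENT}(?:\.{_COMPONENT})+")


def is_dicom_uid(text: str) -> bool:
    """True if text looks like a DICOM UID (dotted numeric, at most 64 chars)."""
    return len(text) <= 64 and UID_RE.fullmatch(text) is not None


def matches_prefixed_uid(text: str, prefix: str = "MR.") -> bool:
    """True if text is the literal prefix followed by a UID."""
    return text.startswith(prefix) and is_dicom_uid(text[len(prefix):])
```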

Field Types

Different field types support different operators and value formats:

String Fields

Supported operators: =, !=, <, >, <=, >=, =~, !~

String matching is case-insensitive by default.

Examples:

path=~patient*/study*/series*/*.dcm
name!=temp
ext=dcm

Numeric Fields

Supported operators: =, !=, <, >, <=, >=

Examples:

size>10MB
depth=4
age>=18

Boolean Fields

Supported operators: =, !=

Accepts true/false, 1/0 (case-insensitive).

Examples:

processed=true
valid!=false

Size Fields

Accepts human-readable size formats with units.

Supported units: B, KB, MB, GB, TB, PB

Examples:

size>1MB
size<=500KB
size=1.5GB
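
As a rough illustration of how such values relate to byte counts, the following Python sketch converts a size string into bytes. It assumes binary multiples (1 KB = 1024 bytes) and case-insensitive units. The source does not state which base Flywheel uses, so the base is a parameter, and parse_size is a hypothetical name.

```python
import re

# Assumption: binary multiples (1 KB = 1024 B). Pass base=1000 for decimal units.
_UNITS = ["B", "KB", "MB", "GB", "TB", "PB"]
_SIZE_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*(B|KB|MB|GB|TB|PB)?\s*$", re.IGNORECASE)


def parse_size(text: str, base: int = 1024) -> int:
    """Convert a value such as '1.5GB' into a whole number of bytes."""
    match = _SIZE_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number = float(match.group(1))
    unit = (match.group(2) or "B").upper()   # a bare number is taken as bytes
    return int(number * base ** _UNITS.index(unit))
```

With the default base, parse_size("500KB") returns 512000; with base=1000 it returns 500000.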

Date/Time Fields

Supports ISO date/time formats and partial dates.

Examples:

created>=2023-01-01
modified>2023-06-15T10:30:00
date=2023-12
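
One plausible reading of a partial date is that it stands for the whole period it names, so date=2023-12 would cover all of December 2023. The source does not confirm this, so treat the sketch below as an assumption. It turns a date-only value into a half-open interval; date_range is a hypothetical name and full timestamps are not handled.

```python
from datetime import datetime, timedelta


def date_range(value: str) -> tuple[datetime, datetime]:
    """Return the half-open interval [start, end) named by YYYY, YYYY-MM or YYYY-MM-DD."""
    parts = [int(p) for p in value.split("-")]
    if len(parts) == 1:                      # "2023" -> whole year
        year = parts[0]
        return datetime(year, 1, 1), datetime(year + 1, 1, 1)
    if len(parts) == 2:                      # "2023-12" -> whole month
        year, month = parts
        nxt = (year + 1, 1) if month == 12 else (year, month + 1)
        return datetime(year, month, 1), datetime(*nxt, 1)
    if len(parts) == 3:                      # "2023-12-05" -> whole day
        start = datetime(*parts)
        return start, start + timedelta(days=1)
    raise ValueError(f"unsupported date value: {value!r}")
```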

Set/Array Fields

For fields containing multiple values (arrays), operators check if ANY value matches.

Examples:

tags=research
modalities=~MR
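
The "ANY value" rule for the = operator can be sketched in a few lines of Python. The function name is hypothetical and the case-insensitive comparison is carried over from the String Fields section. How negated operators behave on arrays is not stated in this reference, so it is not modeled here.

```python
def any_value_equals(values: list[str], expected: str) -> bool:
    """True if at least one element equals expected, ignoring case."""
    return any(v.casefold() == expected.casefold() for v in values)


# tags=research matches as soon as one tag equals "research"
print(any_value_equals(["pilot", "Research"], "research"))
```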

Regular Expression Mode

For advanced users, full regular expression support is available by appending !r to the pattern:

path=~^/data/[0-9]{4}/.*\.dcm$!r

Case-insensitive regex matching can be enabled with !i:

name=~^patient.*!i

These can be combined:

path=~^/DATA/.*\.DCM$!ri
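
To make the suffix convention concrete, this Python sketch separates the trailing flags from the pattern body and compiles the body with Python's re module. It is an assumption-laden illustration: compile_flagged is a hypothetical name, Flywheel's regex dialect may differ from Python's, and the sketch assumes regex matching is case-sensitive unless i is present.

```python
import re


def compile_flagged(value: str) -> re.Pattern:
    """Compile a pattern value that ends in '!r', '!i' or '!ri' (sketch)."""
    match = re.fullmatch(r"(.*)!([ri]{1,2})", value, re.DOTALL)
    if match is None:
        raise ValueError("no '!r' / '!i' suffix found")
    body, flags = match.groups()
    # Only the 'i' flag changes behavior here; 'r' just marks regex mode.
    return re.compile(body, re.IGNORECASE if "i" in flags else 0)
```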

Field Names

These are fields available for use in import and export rules:

File System Fields

  • path - Full file path
  • name - File name only
  • ext - File extension (e.g., png, dcm)
  • dir - Parent directory name
  • depth - Directory depth at which the file is located
  • size - File size in bytes
  • ctime - File created timestamp
  • mtime - File modified timestamp

Flywheel Metadata Fields

Applicability:

  • Import Operations:
    • Filters: Cannot use Flywheel metadata fields (only file system fields available)
    • Mappings: Cannot use Flywheel metadata fields directly, but can use:
      • File system information (path, name, size, etc.)
      • DICOM headers (read directly from files during import)
      • Previously-derived values from earlier mapping rules (see Recursive Mappings)
  • Export Operations:
    • Filters: Can use all Flywheel metadata fields listed below
    • Mappings: Can use all Flywheel metadata fields for constructing destination paths

Important: During imports, files are not yet in Flywheel, so Flywheel metadata does not exist. During exports, file contents are not read, so only metadata already stored in Flywheel can be used.

For a comprehensive understanding of how metadata works in Flywheel, see Understanding Metadata in Flywheel.

Project Fields

  • project._id - Project unique identifier
  • project.label - Project name

Subject Fields

  • subject._id - Subject unique identifier
  • subject.label - Subject identifier
  • subject.firstname - Subject first name
  • subject.lastname - Subject last name
  • subject.sex - Subject sex
  • subject.mlset - Machine learning dataset designation
  • subject.info.* - Custom metadata (use specific keys, e.g., subject.info.study_group)
  • subject.tags - Subject tags

Session Fields

  • session._id - Session unique identifier
  • session.uid - Session UID (e.g., Study Instance UID)
  • session.label - Session identifier
  • session.age - Subject age at time of session (in seconds)
  • session.weight - Subject weight
  • session.operator - Session operator name
  • session.timestamp - Session timestamp
  • session.info.* - Custom metadata (use specific keys, e.g., session.info.scanner_model)
  • session.tags - Session tags

Acquisition Fields

  • acquisition._id - Acquisition unique identifier
  • acquisition.uid - Acquisition UID (e.g., Series Instance UID)
  • acquisition.label - Acquisition name
  • acquisition.timestamp - Acquisition timestamp
  • acquisition.info.* - Custom metadata (use specific keys, e.g., acquisition.info.protocol)
  • acquisition.tags - Acquisition tags

File Fields

  • file.name - File name
  • file.type - File type (e.g., dicom, nifti)
  • file.modality - File modality
  • file.size - File size in bytes
  • file.info.* - Custom metadata (use specific keys, e.g., file.info.quality_check)
  • file.tags - File tags
  • file.classification - File classification object
  • file.classification.* - Specific classification fields (use specific keys)

DICOM Metadata

For Import Operations:

During imports, DICOM headers can be read directly from DICOM files as they are being processed. Import mapping rules can reference DICOM tags using their standard names (e.g., PatientID, StudyDescription, Modality).

For Export Operations:

Exports cannot read file contents directly. To use DICOM metadata in export filters, the DICOM headers must first be extracted into Flywheel metadata by running the File Metadata Importer gear.

  • Extraction Location: The File Metadata Importer gear stores DICOM headers at file.info.header.dicom.*
  • Gear-Dependent: The exact metadata keys depend on the gear's logic and may vary across gear versions
  • Not Native: Flywheel does not natively extract DICOM headers; this must be done via gears

Common DICOM Fields Available After Extraction:

The following list shows commonly-used DICOM fields as they appear in Flywheel after extraction by the File Metadata Importer gear:

  • file.info.header.dicom.PatientID - Patient ID
  • file.info.header.dicom.PatientName - Patient Name
  • file.info.header.dicom.StudyInstanceUID - Study Instance UID
  • file.info.header.dicom.SeriesInstanceUID - Series Instance UID
  • file.info.header.dicom.Modality - Imaging modality (MR, CT, etc.)
  • file.info.header.dicom.SeriesDescription - Series description
  • file.info.header.dicom.StudyDescription - Study description
  • file.info.header.dicom.AcquisitionDate - Scan date
  • file.info.header.dicom.ImageComments - Image comments

Learn more about DICOM metadata extraction

Flywheel Hierarchy Fields

Flywheel hierarchy fields can be used in filters for exports only, not for imports. In imports, they can be used in mappings.

  • project.label - Project name
  • subject.label - Subject identifier
  • session.label - Session identifier
  • acquisition.label - Acquisition name

Escaping Special Characters

To match literal special characters, escape them with backslashes:

Examples:

path=~file\[1\].txt    # Matches literal "file[1].txt"
name=~test\*file       # Matches literal "test*file"
path=~dir\.old         # Matches literal "dir.old"
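
When a literal value comes from elsewhere (a filename, for instance), escaping can be automated. The helper below is a hypothetical sketch that prefixes the backslash, asterisk and square brackets with a backslash. Dots are left alone because the Literal Dot section describes them as already literal. The exact set of special characters is an assumption.

```python
def escape_literal(text: str, special: str = "\\*[]") -> str:
    """Prefix each pattern-special character in text with a backslash."""
    return "".join("\\" + ch if ch in special else ch for ch in text)


print(escape_literal("file[1].txt"))
print(escape_literal("test*file"))
```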

Recursive Mappings (Previously-Derived Values)

In import mappings, you can create intermediate values that can then be referenced in subsequent mapping rules. This allows you to build complex metadata values step by step.

How It Works:

  1. Create a custom key by assigning a value in one mapping rule
  2. Reference that custom key in later mapping rules using the same {keyname} syntax

Example:

# First, create a composite identifier
{myCompositeKey}="{Modality}_{StudyDate}_{name}"

# Then use it to set the acquisition label
acquisition.label={myCompositeKey}_{StudyDescription}

# And store it as custom metadata
file.info.myCompositeKey={myCompositeKey}

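In this example, assuming a file named scan001.dcm whose DICOM headers give Modality MR, StudyDate 20240315, and StudyDescription Brain Protocol: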

  1. {myCompositeKey} is calculated as MR_20240315_scan001.dcm
  2. acquisition.label becomes MR_20240315_scan001.dcm_Brain Protocol
  3. file.info.myCompositeKey stores the intermediate value for future reference

Use Cases:

  • Building complex labels from multiple DICOM fields
  • Normalizing data across different naming conventions
  • Creating reusable intermediate values for cleaner mapping rules
  • Storing derived metadata alongside source metadata

Execution Order

Mapping rules are evaluated in the order they are specified. Ensure you define a custom key before referencing it in subsequent rules.
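
The ordering requirement can be illustrated with a small Python sketch that evaluates rules one after another, making each result available to the rules that follow. This is not Flywheel's mapping engine: apply_mappings and the (target, template) rule representation are assumptions made for the example.

```python
import re

_PLACEHOLDER = re.compile(r"\{(\w+)\}")


def apply_mappings(rules: list[tuple[str, str]], values: dict[str, str]) -> dict[str, str]:
    """Evaluate (target, template) rules in order; later rules see earlier results."""
    known = dict(values)          # DICOM tags and file fields available up front
    results = {}
    for target, template in rules:
        # A KeyError here means a key was referenced before it was defined.
        rendered = _PLACEHOLDER.sub(lambda m: known[m.group(1)], template)
        key = target.strip("{}")  # "{myCompositeKey}" defines a custom key
        known[key] = rendered
        results[key] = rendered
    return results


rules = [
    ("{myCompositeKey}", "{Modality}_{StudyDate}_{name}"),
    ("acquisition.label", "{myCompositeKey}_{StudyDescription}"),
]
values = {"Modality": "MR", "StudyDate": "20240315",
          "name": "scan001.dcm", "StudyDescription": "Brain Protocol"}
print(apply_mappings(rules, values)["acquisition.label"])
```

Swapping the two rules raises a KeyError in this sketch, because myCompositeKey is referenced before it has been defined.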

Pattern Examples

Basic File Filtering

# Include only DICOM files
path=~*.dcm

# Exclude temporary files
name!~*tmp*

# Files larger than 1MB
size>1MB

# Files at a specific depth
depth=3

Directory Structure Patterns

# Match files in patient directories
path=~patient*/*

# Match at any depth under data folder
path=~data/**/*

# Specific pattern with optional parts
path=~study[_pilot]/patient*/scan*.dcm

DICOM-Specific Patterns

For Export Operations (using extracted DICOM metadata):

# Files with DICOM UIDs in name
name=~\uid

# MRI DICOM files (must extract metadata first)
file.info.header.dicom.Modality=MR

# Specific study (must extract metadata first)
file.info.header.dicom.StudyInstanceUID=1.3.6.1.4.1.14519.5.2.1

For Import Mappings (reading DICOM headers directly):

# Build the acquisition label from the Modality and SeriesDescription tags
acquisition.label={Modality}_{SeriesDescription}

# Set the subject and session labels from the PatientID and StudyDate tags
subject.label={PatientID}
session.label={StudyDate}

Complex Combinations

Export Filter Examples:

# Large MRI files created on or after 2023-01-01 (requires extracted DICOM metadata)
size>10MB AND file.info.header.dicom.Modality=MR AND created>=2023-01-01

# Non-localizer series (requires extracted DICOM metadata)
file.info.header.dicom.SeriesDescription!~*localizer*

# Research data excluding test subjects (using Flywheel metadata)
project.label=~research* AND subject.label!~test*
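
Conceptually, an expression joined with AND matches only when every clause matches. The Python sketch below shows that logic with a caller-supplied clause evaluator. The function names are hypothetical, the uppercase AND keyword is taken from the examples above, and nothing here reflects how Flywheel evaluates filters internally.

```python
import re
from typing import Callable


def split_conjunction(expression: str) -> list[str]:
    """Split 'a AND b AND c' into its individual clauses."""
    return [part.strip() for part in re.split(r"\s+AND\s+", expression) if part.strip()]


def matches_all(expression: str, record: dict,
                evaluate: Callable[[str, dict], bool]) -> bool:
    """True only if every clause of the expression holds for the record."""
    return all(evaluate(clause, record) for clause in split_conjunction(expression))


print(split_conjunction("project.label=~research* AND subject.label!~test*"))
```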

Best Practices

  1. Start simple: Begin with basic patterns and add complexity as needed
  2. Test patterns: Use the test or dry-run modes to verify patterns match expected files
  3. Use specific patterns: More specific patterns improve performance and accuracy
  4. Escape when needed: Remember to escape literal special characters
  5. Consider case sensitivity: String matching is case-insensitive by default
  6. Use appropriate operators: Choose the right operator for your data type

Pattern Testing

Before running imports or exports, test your patterns using the test or dry-run modes.

For imports, the import test command shows exactly how the patterns will be applied and what metadata will be extracted.

fw-beta import test /path/to/file.dcm --include 'path=~*.dcm' --exclude 'name=~*temp*'

For both imports and exports, the --dry-run option performs a simulated run without making any actual changes, then presents a summary report of how the data would be processed. Dry-run mode processes only a small subset of the data, not the full dataset.