Pattern Syntax Reference

This reference guide documents the custom pattern syntax used in Flywheel's Bulk Import and Export filtering and mapping rules. This syntax provides a simpler, more intuitive alternative to full regular expressions for most common use cases.

Overview

Flywheel's pattern syntax is designed to be familiar to users of shell glob patterns while providing powerful filtering capabilities. The system supports multiple types of patterns:

Simple patterns: Basic string matching with wildcards
Field-based patterns: Filter specific metadata fields
Glob-like patterns: File path matching with wildcard support
Regular expressions: Full regex support when needed

Understanding Metadata

Before using metadata fields in patterns, it's helpful to understand how metadata works in Flywheel. See Understanding Metadata in Flywheel for a comprehensive overview of metadata types, structure, and usage.

Complex Configurations: Rule Files

For complex import/export scenarios with multiple sets of rules, see Rule Files to learn how to define rules in a reusable YAML format.

Basic Pattern Structure

All filter expressions follow this structure:

field_name operator pattern_value

For example:

path=~*.dcm
name!=temp
size>1MB
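
As a rough illustration of this structure, the following Python sketch splits an expression into its three parts. The `parse_filter` helper and its regular expression are hypothetical, written for this guide rather than taken from Flywheel's implementation, and the sketch assumes that field names consist of letters, digits, underscores, and dots.

```python
import re

# Two-character operators are listed first so that "=~" is not read as "=".
_FILTER_RE = re.compile(r"^\s*([\w.]+)\s*(=~|!~|!=|<=|>=|=|<|>)\s*(.*?)\s*$")


def parse_filter(expression: str) -> tuple[str, str, str]:
    """Split 'field operator value' into its three parts (illustrative only)."""
    match = _FILTER_RE.match(expression)
    if match is None:
        raise ValueError(f"not a valid filter expression: {expression!r}")
    field, operator, value = match.groups()
    return field, operator, value


for example in ("path=~*.dcm", "name!=temp", "size>1MB"):
    print(parse_filter(example))
```

Listing the longer operators first matters: with `=` ahead of `=~`, the expression `path=~*.dcm` would be split into the value `~*.dcm`.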

Operators

The following operators are supported:

Operator	Description	Example
`=`	Exact match (case-insensitive)	`name=patient01`
`!=`	Does not match	`name!=temp`
`<`	Less than	`size<10MB`
`>`	Greater than	`size>1GB`
`<=`	Less than or equal	`depth<=3`
`>=`	Greater than or equal	`depth>=2`
`=~`	Pattern match (glob-like)	`path=~*.dcm`
`!~`	Does not match pattern	`path!~temp`

Wildcard Patterns

The pattern syntax supports several wildcard characters that are simpler than full regular expressions:

Single Asterisk (`*`)

Matches any sequence of characters except forward slashes (/). This is ideal for matching filenames or single directory levels.

Examples:

*.dcm - Matches any file ending in .dcm
patient* - Matches any string starting with patient
*001* - Matches any string containing 001

Double Asterisk (`**/`)

Matches any sequence of characters including forward slashes. This allows matching across multiple directory levels.

Examples:

**/*.dcm - Matches .dcm files at any depth
data/**/images - Matches images directory anywhere under data
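
To make the difference between `*` and `**/` concrete, here is a minimal Python sketch that translates the two wildcards into a regular expression. The `glob_to_regex` function is hypothetical and not part of Flywheel. It assumes that the pattern must match the whole string, that `**/` may also match zero directory levels, and that matching is case-insensitive.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate '*' and '**/' into a regex; all other characters are literal."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append("(?:.*/)?")   # any number of directory levels
            i += 3
        elif pattern[i] == "*":
            parts.append("[^/]*")      # anything except a forward slash
            i += 1
        else:
            parts.append(re.escape(pattern[i]))  # dots and other characters stay literal
            i += 1
    return re.compile("".join(parts), re.IGNORECASE)


print(bool(glob_to_regex("*.dcm").fullmatch("scan.dcm")))
print(bool(glob_to_regex("*.dcm").fullmatch("series/scan.dcm")))
print(bool(glob_to_regex("**/*.dcm").fullmatch("study/series/scan.dcm")))
```

Only the `**/` form described above is handled. A `**` without a trailing slash is treated as two single asterisks in this sketch.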

Literal Dot (`.`)

In patterns, dots are treated as literal characters (not regex wildcards).

Examples:

file.txt - Matches exactly file.txt
*.dcm - Matches files ending in .dcm

Optional Sections (`[...]`)

Square brackets make parts of the pattern optional.

Examples:

patient[_001] - Matches patient or patient_001
scan[/series]* - Matches scan* or scan/series*
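
One way to reason about optional sections is to expand a pattern into every variant it stands for. The Python sketch below does that with a hypothetical `expand_optional` helper. It is an illustration of the idea, not Flywheel's implementation, and it assumes that brackets are not nested and that `\[` and `\]` denote literal brackets.

```python
import re

# An unescaped "[", the shortest possible body, then an unescaped "]".
_OPTIONAL = re.compile(r"(?<!\\)\[(.*?)(?<!\\)\]")


def expand_optional(pattern: str) -> list[str]:
    """Return every pattern obtained by omitting or keeping each [...] section."""
    match = _OPTIONAL.search(pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[:match.start()], pattern[match.end():]
    variants = []
    for middle in ("", match.group(1)):   # first without, then with the section
        variants.extend(expand_optional(head + middle + tail))
    return variants


print(expand_optional("patient[_001]"))
print(expand_optional("study[_pilot]/patient*/scan*.dcm"))
```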

DICOM UID Matching (`\uid`)

A special pattern that matches DICOM UIDs, optionally preceded by a modality prefix.

Examples:

\uid - Matches valid DICOM UIDs like 1.3.6.1.4.1.14519.5
MR.\uid - Matches modality-prefixed UIDs like MR.1.3.6.1.4.1.14519.5
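
The sketch below shows one plausible way to expand `\uid` into a regular expression in Python. The UID expression follows the DICOM rule that a UID is a series of dot-separated numeric components without leading zeros. The function names are hypothetical, and the exact expression Flywheel uses may differ.

```python
import re

# Numeric components separated by dots, no leading zeros, at least two components.
# The 64-character limit on DICOM UIDs is not enforced in this sketch.
_UID = r"(?:0|[1-9][0-9]*)(?:\.(?:0|[1-9][0-9]*))+"


def uid_pattern_to_regex(pattern: str) -> re.Pattern:
    # Everything around a \uid token is treated as literal text.
    literal_parts = pattern.split(r"\uid")
    return re.compile(_UID.join(re.escape(part) for part in literal_parts))


def matches(pattern: str, text: str) -> bool:
    return uid_pattern_to_regex(pattern).fullmatch(text) is not None


print(matches(r"\uid", "1.3.6.1.4.1.14519.5"))
print(matches(r"MR.\uid", "MR.1.3.6.1.4.1.14519.5"))
print(matches(r"MR.\uid", "MR.scan001"))
```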

Field Types

Different field types support different operators and value formats:

String Fields

Supported operators: =, !=, <, >, <=, >=, =~, !~

String matching is case-insensitive by default.

Examples:

path=~patient*/study*/series*/*.dcm
name!=temp
ext=dcm

Numeric Fields

Supported operators: =, !=, <, >, <=, >=

Examples:

size>10MB
depth=4
age>=18
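
The comparison operators map naturally onto ordinary numeric comparisons. The following Python sketch shows that mapping with a hypothetical `compare_numeric` helper. It assumes the value on the right-hand side is a plain number without units.

```python
import operator

# The six operators supported by numeric fields.
_COMPARATORS = {
    "=": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    ">": operator.gt,
    "<=": operator.le,
    ">=": operator.ge,
}


def compare_numeric(actual: float, op: str, expected: str) -> bool:
    """Evaluate 'actual op expected' for a numeric field (illustrative only)."""
    if op not in _COMPARATORS:
        raise ValueError(f"operator {op!r} is not supported for numeric fields")
    return _COMPARATORS[op](actual, float(expected))


print(compare_numeric(4, "=", "4"))
print(compare_numeric(21, ">=", "18"))
```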

Boolean Fields

Supported operators: =, !=

Accepts true/false or 1/0. Values are case-insensitive.

Examples:

processed=true
valid!=false
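
A minimal Python sketch of how the accepted boolean spellings could be normalized is shown here. The `parse_bool` name is hypothetical, and rejecting every other spelling is an assumption of this sketch.

```python
_TRUE_VALUES = {"true", "1"}
_FALSE_VALUES = {"false", "0"}


def parse_bool(value: str) -> bool:
    """Interpret true/false and 1/0, ignoring case and surrounding spaces."""
    normalized = value.strip().lower()
    if normalized in _TRUE_VALUES:
        return True
    if normalized in _FALSE_VALUES:
        return False
    raise ValueError(f"not a boolean value: {value!r}")


print(parse_bool("TRUE"), parse_bool("0"))
```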

Size Fields

Accepts human-readable size formats with units.

Supported units: B, KB, MB, GB, TB, PB

Examples:

size>1MB
size<=500KB
size=1.5GB
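
Because the `size` field is stored in bytes, a human-readable value has to be converted before it can be compared. The Python sketch below illustrates the conversion with a hypothetical `parse_size` helper. This guide does not state whether units are binary (1 KB = 1024 bytes) or decimal (1 KB = 1000 bytes), so the multiplier is exposed as a `base` parameter, and its default of 1024 is an assumption.

```python
import re

_UNITS = ["B", "KB", "MB", "GB", "TB", "PB"]
_SIZE_RE = re.compile(r"^(\d+(?:\.\d+)?)\s*(B|KB|MB|GB|TB|PB)$", re.IGNORECASE)


def parse_size(text: str, base: int = 1024) -> int:
    """Convert a value such as '1.5GB' to bytes; `base` is an assumed multiplier."""
    match = _SIZE_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a valid size: {text!r}")
    number, unit = match.groups()
    exponent = _UNITS.index(unit.upper())   # B -> 0, KB -> 1, MB -> 2, ...
    return int(float(number) * base ** exponent)


print(parse_size("1MB"))
print(parse_size("1.5GB", base=1000))
```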

Date/Time Fields

Supports ISO date/time formats and partial dates.

Examples:

created>=2023-01-01
modified>2023-06-15T10:30:00
date=2023-12
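
A partial date such as `2023-12` can be read as a range rather than a single instant. The Python sketch below turns a partial date into a half-open interval. It assumes that `date=2023-12` is meant to match any timestamp in December 2023, which this guide does not state explicitly, and the `partial_date_range` helper is hypothetical.

```python
from datetime import datetime, timedelta


def partial_date_range(value: str) -> tuple[datetime, datetime]:
    """Return the interval [start, end) covered by YYYY, YYYY-MM, or YYYY-MM-DD."""
    parts = [int(part) for part in value.split("-")]
    if len(parts) == 1:                       # whole year
        year = parts[0]
        return datetime(year, 1, 1), datetime(year + 1, 1, 1)
    if len(parts) == 2:                       # whole month
        year, month = parts
        if month == 12:
            return datetime(year, 12, 1), datetime(year + 1, 1, 1)
        return datetime(year, month, 1), datetime(year, month + 1, 1)
    if len(parts) == 3:                       # single day
        start = datetime(*parts)
        return start, start + timedelta(days=1)
    raise ValueError(f"not a supported partial date: {value!r}")


start, end = partial_date_range("2023-12")
print(start <= datetime(2023, 12, 25, 8, 0) < end)
```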

Set/Array Fields

For fields containing multiple values (arrays), operators check if ANY value matches.

Examples:

tags=research
modalities=~MR
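
The "ANY value" rule can be sketched in Python as follows. The `any_value_matches` helper is hypothetical. The per-value test is passed in as a function so that the same rule applies to exact matches and pattern matches alike.

```python
from typing import Callable, Iterable


def any_value_matches(values: Iterable[str], predicate: Callable[[str], bool]) -> bool:
    """Return True as soon as one element of the field satisfies the test."""
    return any(predicate(value) for value in values)


tags = ["pilot", "Research"]
# tags=research: exact, case-insensitive comparison against each tag
print(any_value_matches(tags, lambda tag: tag.lower() == "research"))
```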

Regular Expression Mode

For advanced users, full regular expression support is available by appending !r to the pattern:

path=~^/data/[0-9]{4}/.*\.dcm$!r

Case-insensitive regex matching can be enabled with !i:

name=~^patient.*!i

These can be combined:

path=~^/DATA/.*\.DCM$!ri
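
The trailing `!r`, `!i`, and `!ri` suffixes can be separated from the pattern body before it is interpreted. The Python sketch below does this with a hypothetical `split_flags` helper and recognizes only the three suffix spellings shown above. How Flywheel treats other combinations is not covered here.

```python
import re

# A lazily matched body followed by an optional "!ri", "!r", or "!i" suffix.
_FLAGS_RE = re.compile(r"^(.*?)(?:!(ri|r|i))?$", re.DOTALL)


def split_flags(value: str) -> tuple[str, set[str]]:
    """Separate a pattern value from its trailing mode flags, if any."""
    body, flags = _FLAGS_RE.match(value).groups()
    return body, set(flags or "")


print(split_flags(r"^/data/[0-9]{4}/.*\.dcm$!r"))
print(split_flags("*.dcm"))
```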

Field Names

These are fields available for use in import and export rules:

File System Fields

path - Full file path
name - File name only
ext - File extension (e.g., png, dcm)
dir - Parent directory name
depth - Directory depth at which the file is located
size - File size in bytes
ctime - File created timestamp
mtime - File modified timestamp

Flywheel Metadata Fields

Applicability:

Import Operations:
- Filters: Cannot use Flywheel metadata fields (only file system fields available)
- Mappings: Cannot use Flywheel metadata fields directly, but can use:
  - File system information (path, name, size, etc.)
  - DICOM headers (read directly from files during import)
  - Previously-derived values from earlier mapping rules (see Recursive Mappings)
Export Operations:
- Filters: Can use all Flywheel metadata fields listed below
- Mappings: Can use all Flywheel metadata fields for constructing destination paths

Important: During imports, files are not yet in Flywheel, so Flywheel metadata does not exist. During exports, file contents are not read, so only metadata already stored in Flywheel can be used.

For a comprehensive understanding of how metadata works in Flywheel, see Understanding Metadata in Flywheel.

Project Fields

project._id - Project unique identifier
project.label - Project name

Subject Fields

subject._id - Subject unique identifier
subject.label - Subject identifier
subject.firstname - Subject first name
subject.lastname - Subject last name
subject.sex - Subject sex
subject.mlset - Machine learning dataset designation
subject.info.* - Custom metadata (use specific keys, e.g., subject.info.study_group)
subject.tags - Subject tags

Session Fields

session._id - Session unique identifier
session.uid - Session UID (e.g., Study Instance UID)
session.label - Session identifier
session.age - Subject age at time of session (in seconds)
session.weight - Subject weight
session.operator - Session operator name
session.timestamp - Session timestamp
session.info.* - Custom metadata (use specific keys, e.g., session.info.scanner_model)
session.tags - Session tags

Acquisition Fields

acquisition._id - Acquisition unique identifier
acquisition.uid - Acquisition UID (e.g., Series Instance UID)
acquisition.label - Acquisition name
acquisition.timestamp - Acquisition timestamp
acquisition.info.* - Custom metadata (use specific keys, e.g., acquisition.info.protocol)
acquisition.tags - Acquisition tags

File Fields

file.name - File name
file.type - File type (e.g., dicom, nifti)
file.modality - File modality
file.size - File size in bytes
file.info.* - Custom metadata (use specific keys, e.g., file.info.quality_check)
file.tags - File tags
file.classification - File classification object
file.classification.* - Specific classification fields (use specific keys)

DICOM Metadata

For Import Operations:

During imports, DICOM headers can be read directly from DICOM files as they are being processed. Import mapping rules can reference DICOM tags using their standard names (e.g., PatientID, StudyDescription, Modality).

For Export Operations:

Exports cannot read file contents directly. To use DICOM metadata in export filters, the DICOM headers must first be extracted into Flywheel metadata by running the File Metadata Importer gear.

Extraction Location: The File Metadata Importer gear stores DICOM headers at file.info.header.dicom.*
Gear-Dependent: The exact metadata keys depend on the gear's logic and may vary across gear versions
Not Native: Flywheel does not natively extract DICOM headers; this must be done via gears

Common DICOM Fields Available After Extraction:

The following list shows commonly-used DICOM fields as they appear in Flywheel after extraction by the File Metadata Importer gear:

file.info.header.dicom.PatientID - Patient ID
file.info.header.dicom.PatientName - Patient Name
file.info.header.dicom.StudyInstanceUID - Study Instance UID
file.info.header.dicom.SeriesInstanceUID - Series Instance UID
file.info.header.dicom.Modality - Imaging modality (MR, CT, etc.)
file.info.header.dicom.SeriesDescription - Series description
file.info.header.dicom.StudyDescription - Study description
file.info.header.dicom.AcquisitionDate - Scan date
file.info.header.dicom.ImageComments - Image comments

Learn more about DICOM metadata extraction

Flywheel Hierarchy Fields

Flywheel hierarchy fields can be used in filters for exports only, not for imports. For imports, these fields are available in mappings.

project.label - Project name
subject.label - Subject identifier
session.label - Session identifier
acquisition.label - Acquisition name

Escaping Special Characters

To match literal special characters, escape them with backslashes:

Examples:

1
2
3

path=~file\[1\].txt    # Matches literal "file[1].txt"
name=~test\*file       # Matches literal "test*file"
path=~dir\.old         # Matches literal "dir.old"
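
The effect of escaping can be pictured as a tokenization step in which a backslash turns the next character into a literal. The Python sketch below illustrates this with a hypothetical `tokenize` function. It is not Flywheel's parser, and it assumes `\uid` is recognized before any other escape.

```python
def tokenize(pattern: str) -> list[tuple[str, str]]:
    """Label each character as a literal, a special character, or the \\uid token."""
    tokens = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("\\uid", i):
            tokens.append(("uid", "\\uid"))
            i += 4
        elif pattern[i] == "\\" and i + 1 < len(pattern):
            tokens.append(("literal", pattern[i + 1]))   # escaped character
            i += 2
        elif pattern[i] in "*[]":
            tokens.append(("special", pattern[i]))       # wildcard or optional section
            i += 1
        else:
            tokens.append(("literal", pattern[i]))
            i += 1
    return tokens


print(tokenize(r"test\*file"))
print(tokenize("test*file"))
```

In the first call every token is a literal, while in the second the asterisk is reported as a special character.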

Recursive Mappings (Previously-Derived Values)

In import mappings, you can create intermediate values that can then be referenced in subsequent mapping rules. This allows you to build complex metadata values step by step.

How It Works:

Create a custom key by assigning a value in one mapping rule
Reference that custom key in later mapping rules using the same {keyname} syntax

Example:

# First, create a composite identifier
{myCompositeKey}="{Modality}_{StudyDate}_{name}"

# Then use it to set the acquisition label
acquisition.label={myCompositeKey}_{StudyDescription}

# And store it as custom metadata
file.info.myCompositeKey={myCompositeKey}

In this example:

{myCompositeKey} is calculated as MR_20240315_scan001.dcm
acquisition.label becomes MR_20240315_scan001.dcm_Brain Protocol
file.info.myCompositeKey stores the intermediate value for future reference

Use Cases:

Building complex labels from multiple DICOM fields
Normalizing data across different naming conventions
Creating reusable intermediate values for cleaner mapping rules
Storing derived metadata alongside source metadata

Execution Order

Mapping rules are evaluated in the order they are specified. Ensure you define a custom key before referencing it in subsequent rules.
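
The ordering requirement can be demonstrated with a small Python simulation of the example above. The `apply_mappings` function is a hypothetical stand-in for the import engine. It substitutes `{key}` placeholders from the values known so far and raises `KeyError` when a rule references a key that has not been defined yet. The source values are the ones used in the example.

```python
import re

_PLACEHOLDER = re.compile(r"\{([\w.]+)\}")


def apply_mappings(rules, source_values):
    """Evaluate (target, template) rules in order, reusing earlier results."""
    values = dict(source_values)
    for target, template in rules:
        key = target.strip("{}")   # "{myCompositeKey}" -> "myCompositeKey"
        values[key] = _PLACEHOLDER.sub(lambda m: str(values[m.group(1)]), template)
    return values


source = {
    "Modality": "MR",
    "StudyDate": "20240315",
    "name": "scan001.dcm",
    "StudyDescription": "Brain Protocol",
}
rules = [
    ("{myCompositeKey}", "{Modality}_{StudyDate}_{name}"),
    ("acquisition.label", "{myCompositeKey}_{StudyDescription}"),
    ("file.info.myCompositeKey", "{myCompositeKey}"),
]
result = apply_mappings(rules, source)
print(result["acquisition.label"])
```

Moving the first rule to the end of the list makes the second rule fail, because `{myCompositeKey}` does not exist when it is evaluated.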

Pattern Examples

Basic File Filtering

# Include only DICOM files
path=~*.dcm

# Exclude temporary files
name!~*tmp*

# Files larger than 1MB
size>1MB

# Files at a specific depth
depth=3

Directory Structure Patterns

# Match files in patient directories
path=~patient*/*

# Match at any depth under data folder
path=~data/**/*

# Specific pattern with optional parts
path=~study[_pilot]/patient*/scan*.dcm

DICOM-Specific Patterns

For Export Operations (using extracted DICOM metadata):

# Files with DICOM UIDs in name
name=~\uid

# MRI DICOM files (must extract metadata first)
file.info.header.dicom.Modality=MR

# Specific study (must extract metadata first)
file.info.header.dicom.StudyInstanceUID=1.3.6.1.4.1.14519.5.2.1

For Import Mappings (reading DICOM headers directly):

# Map MRI files using modality tag
acquisition.label={Modality}_{SeriesDescription}

# Set subject and session labels from DICOM tags
subject.label={PatientID}
session.label={StudyDate}

Complex Combinations

Export Filter Examples:

# Large MRI files from 2023 (requires extracted DICOM metadata)
size>10MB AND file.info.header.dicom.Modality=MR AND created>=2023-01-01

# Non-localizer series (requires extracted DICOM metadata)
file.info.header.dicom.SeriesDescription!~*localizer*

# Research data excluding test subjects (using Flywheel metadata)
project.label=~research* AND subject.label!~test*

Best Practices

Start simple: Begin with basic patterns and add complexity as needed
Test patterns: Use the test or dry-run modes to verify patterns match expected files
Use specific patterns: More specific patterns improve performance and accuracy
Escape when needed: Remember to escape literal special characters
Consider case sensitivity: String matching is case-insensitive by default
Use appropriate operators: Choose the right operator for your data type

Pattern Testing

Before running imports or exports, test your patterns using the test or dry-run modes.

For imports, the import test command shows exactly how the patterns will be applied and what metadata will be extracted.

fw-beta import test /path/to/file.dcm --include 'path=~*.dcm' --exclude 'name=~*temp*'

For both imports and exports, the --dry-run option performs a simulated import or export without making any actual changes, then presents a summary report of how the data would be processed. Dry-run mode processes only a small subset of the data, not the full dataset.
