Compute
Overview
Flywheel provides two complementary approaches for running computations on your imaging data. Each approach serves different use cases and offers distinct advantages for reproducibility, automation, and flexibility.
Compute Options
Gears - Containerized Batch Processing
Gears are containerized applications that run automated, reproducible data processing pipelines. They package analysis tools, dependencies, and execution logic into self-contained units that can run on-demand or automatically via rules.
Key Characteristics:
- Fully reproducible - Containers ensure identical execution environments across runs
- Automated workflows - Trigger automatically via gear rules when data arrives
- Batch processing - Run the same analysis on multiple datasets simultaneously
- Version controlled - Track which gear version produced each result
- Curated collection - Pre-built gears available via the Gear Exchange
- No interactive access - Execute and complete without user intervention
Common Use Cases:
- DICOM to NIfTI conversion and file preprocessing
- Quality control checks (MRIQC) on incoming data
- Standard analysis pipelines (FreeSurfer, fMRIPrep, dcm2niix)
- Automated classification and metadata extraction
- Reproducible scientific analyses for publications
Reproducibility Benefits:
- Container images lock all dependencies to specific versions
- Gear configuration is captured with each job execution
- Input files and outputs are tracked in the Flywheel hierarchy
- Rerunning the same gear version with the same inputs produces identical results
- Audit trail shows exactly what was run, when, and by whom
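The guarantee above (same gear version + same configuration + same inputs, identical results) can be made concrete with a provenance fingerprint: hash the three together and identical reruns hash to the same value. This is an illustrative sketch in plain Python, not part of the Flywheel API; the field names are hypothetical.

```python
import hashlib
import json

def job_fingerprint(gear_version: str, config: dict, input_file_ids: list) -> str:
    """Derive a stable fingerprint for a job: the same gear version,
    configuration, and inputs always hash to the same value."""
    record = {
        "gear_version": gear_version,
        "config": config,
        "inputs": sorted(input_file_ids),  # order-independent
    }
    # sort_keys makes the JSON serialization canonical across runs
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

# Identical version + config + inputs -> identical fingerprint
a = job_fingerprint("1.2.0", {"smoothing": 4}, ["file-01", "file-02"])
b = job_fingerprint("1.2.0", {"smoothing": 4}, ["file-02", "file-01"])
assert a == b
```

A fingerprint like this changes whenever any element of the provenance changes, which is the same property the gear audit trail relies on.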
Notebooks - Interactive Computing Environment
Flywheel Notebooks provide interactive JupyterLab workspaces integrated with the Flywheel platform. Notebooks combine Python code, visualizations, and documentation in a single interactive environment.
Key Characteristics:
- Interactive exploration - Write and execute code iteratively with immediate feedback
- Flexible workflows - Combine data access, analysis, visualization, and reporting
- Python ecosystem - Access to popular data science libraries (pandas, numpy, scikit-learn)
- SDK integration - Direct access to Flywheel data via the Python SDK
- Managed compute - Configurable resources (CPU, memory, GPU) managed by Flywheel
- Workspace persistence - Files and notebooks saved across sessions
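SDK integration means data in the Flywheel hierarchy is reachable directly from notebook code. A minimal sketch using the Flywheel Python SDK (assuming the `flywheel-sdk` package is installed and an API key is available; `my_group/my_project` is a placeholder lookup path):

```python
def list_session_labels(api_key: str, project_path: str = "my_group/my_project"):
    """List the session labels in one project via the Flywheel SDK.

    Assumes the flywheel-sdk package is installed; project_path is a
    placeholder 'group/project' lookup path.
    """
    import flywheel  # deferred so the sketch can be defined without the SDK

    fw = flywheel.Client(api_key)           # authenticate with an API key
    project = fw.lookup(project_path)       # resolve group/project by path
    return [session.label for session in project.sessions()]
```

From here the same client object can read and write metadata, download files, and upload analysis results, all from within the notebook session.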
Common Use Cases:
- Data exploration and quality assessment
- Custom data curation and metadata management
- Machine learning model development and training
- Interactive visualization and reporting
- Prototype development before converting to gears
- Administrative automation tasks
Reproducibility Considerations:
- Less deterministic than gears - Out-of-order cell execution and hidden kernel state can produce different results between runs
- Manual dependency management - Python packages must be explicitly tracked
- Environment drift - Workspace environments can change over time
- Code versioning required - Use git or similar tools to track notebook changes
- Best practice: Export key workflows as gears for production reproducibility
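One way to mitigate environment drift is to snapshot the installed package versions at the end of a working session, so the environment can be diffed or rebuilt later. A stdlib-only sketch using `importlib.metadata` (Python 3.8+):

```python
from importlib.metadata import distributions

def freeze_environment() -> str:
    """Return a requirements.txt-style snapshot of installed packages,
    pinned to exact versions and sorted for stable diffs."""
    pins = {
        dist.metadata["Name"]: dist.version
        for dist in distributions()
        if dist.metadata["Name"]  # skip entries with missing metadata
    }
    return "\n".join(f"{name}=={version}" for name, version in sorted(pins.items()))

# Write the snapshot alongside the notebook so it is versioned with the code:
# pathlib.Path("requirements.txt").write_text(freeze_environment())
```

Committing this snapshot with the notebook addresses the manual dependency tracking noted above.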
Optional Module
Flywheel Notebooks is an optional module provided by Flywheel for Machine Learning in Medical Imaging. Contact your Flywheel Account Executive or Flywheel support for more information.
Choosing the Right Compute Approach
| Consideration | Gears | Notebooks |
|---|---|---|
| Reproducibility | High - Fully containerized and versioned | Moderate - Requires manual tracking |
| Automation | Yes - Via gear rules and batch processing | No - Manual execution |
| Flexibility | Low - Fixed workflow defined in gear | High - Interactive code development |
| Learning Curve | Low - Select and run pre-built tools | Moderate - Python programming required |
| Version Control | Built-in gear versioning | Manual git/SDK tracking needed |
| Use Case | Production pipelines and standard analyses | Exploratory analysis and development |
Recommended Workflow
1. Start with Notebooks for exploratory analysis and prototype development
2. Validate findings and refine code interactively in the notebook environment
3. Convert to Gears for production workflows that need automation and reproducibility
4. Use Gear Rules to automatically apply validated workflows to new data
5. Return to Notebooks when new analyses or visualizations are needed
Best Practices
For Gears
- Test thoroughly before enabling gear rules on production data
- Document configurations in project notes or gear job tags
- Use gear versioning to maintain consistent results across studies
- Monitor the Jobs Log for failures and adjust configurations as needed
- Apply batch processing when reprocessing multiple datasets
- Leverage automation with gear rules for routine preprocessing tasks
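Batch reprocessing from the list above can also be scripted with the SDK. A sketch, assuming the `flywheel-sdk` package is installed and a `dcm2niix` gear exists on the site; the gear name, project path, and `dicom` input key are placeholders for a real site configuration:

```python
def rerun_gear_on_project(api_key: str, gear_name: str = "dcm2niix"):
    """Queue one gear job per DICOM file in a project.

    Assumes the flywheel-sdk package; the gear name, project path, and
    'dicom' input key are placeholders for a real site configuration.
    """
    import flywheel  # deferred so the sketch can be defined without the SDK

    fw = flywheel.Client(api_key)
    gear = fw.lookup(f"gears/{gear_name}")
    project = fw.lookup("my_group/my_project")

    job_ids = []
    for session in project.sessions():
        for acq in session.acquisitions():
            # Pick the first DICOM file on the acquisition, if any
            dicom = next((f for f in acq.files if f.type == "dicom"), None)
            if dicom is not None:
                job_ids.append(gear.run(inputs={"dicom": dicom}, destination=acq))
    return job_ids
```

Queued jobs then appear in the Jobs Log, where failures can be monitored as recommended above.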
For Notebooks
- Track dependencies explicitly (e.g., via requirements.txt or conda environments)
- Version control notebooks using git or the Flywheel SDK
- Document assumptions and parameters within notebook markdown cells
- Export critical code as Python modules or gears for reuse
- Test reproducibility by restarting kernel and running all cells
- Save intermediate results to Flywheel using the SDK for traceability
Reproducibility Guidelines
For Published Research:
- Use gears for all analyses included in publications
- Document exact gear versions and configurations
- Preserve input data and gear outputs in Flywheel projects
- Consider creating custom gears for novel analysis methods
For Internal Workflows:
- Start with notebooks for initial development
- Transition to gears once workflows stabilize
- Use notebooks for one-off analyses and reporting
- Apply gears for recurring automated tasks
For Machine Learning:
- Develop models interactively in notebooks
- Track training data, parameters, and results
- Package trained models as gears for inference
- Version both notebooks and gear implementations
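The last two points above can be made concrete: serialize the trained model and record a content hash as its version, so the inference gear can state exactly which model it loads. An illustrative stdlib-only sketch (the dictionary stands in for real trained parameters):

```python
import hashlib
import pickle
import tempfile
from pathlib import Path

def package_model(model, path: Path) -> str:
    """Serialize a trained model to disk and return a content hash to
    record as its version in the gear configuration."""
    blob = pickle.dumps(model)
    path.write_bytes(blob)
    return hashlib.sha256(blob).hexdigest()

# A stand-in for real trained parameters
model = {"weights": [0.1, 0.5, -0.3], "threshold": 0.5}
out_path = Path(tempfile.gettempdir()) / "model.pkl"
version = package_model(model, out_path)
print(f"model version: {version[:12]}")
```

Storing the hash alongside the gear version ties each inference result to one specific model artifact, mirroring the gear-versioning guarantees described earlier.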