Bulk Import
Flywheel's Bulk Import feature was designed from the ground up for scale, flexibility, and ease of use.
Bulk Import supports Terabyte-scale datasets containing large files, provides powerful tools for handling data duplication, and presents simple interfaces for monitoring and auditing import jobs.
Tip
Bulk Import is planned to replace both Bulk Ingest and DICOM Upload.
Flywheel recommends switching to Bulk Import where possible for a better overall experience.
Limitations
Flywheel is eager to make Bulk Import available for broad use and is building out its capabilities iteratively to address more use cases with each release.
Bulk Import will eventually replace both Bulk Ingest and DICOM Upload, but these features will continue to be supported during the transition.
Bulk Import does not currently support the following features, although they are planned for future releases:
- De-identification (anonymization)
- Parsing data formatted using BIDS
If any of these features are required for your use case, consider using Bulk Ingest instead.
Maximum File Size with Bulk Import
The maximum individual file size for Bulk Import is designed to be virtually unlimited; in practice, it depends on the resources available on your Flywheel site.
The limit can be raised as far as necessary by adding disk space to your Flywheel site.
By default, Flywheel sites are sized to allow individual files of up to approximately 50 GB to be transferred using Bulk Import, which avoids unnecessary hosting costs.
Contact Flywheel Support if you would like this limit increased for your site.
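Before starting a large import, it can help to check whether any source files exceed the default limit. The following is a minimal sketch, not part of Flywheel: the function names are hypothetical, and the threshold assumes the approximate 50 GB default described above, counted in decimal gigabytes. Your site's actual limit may differ.

```python
from pathlib import Path

# Assumption: the approximate 50 GB default mentioned above, in decimal bytes.
# Your site's real limit may be different; adjust as needed.
DEFAULT_LIMIT_BYTES = 50 * 10**9


def scan_sizes(root):
    """Yield (path, size_in_bytes) for every regular file under root."""
    for path in Path(root).rglob("*"):
        if path.is_file():
            yield str(path), path.stat().st_size


def find_oversized(sizes, limit_bytes=DEFAULT_LIMIT_BYTES):
    """Return the (path, size) pairs larger than limit_bytes, largest first."""
    oversized = [(path, size) for path, size in sizes if size > limit_bytes]
    return sorted(oversized, key=lambda item: item[1], reverse=True)
```

For a local directory, `find_oversized(scan_sizes("/path/to/source"))` lists the files worth raising with Flywheel Support before the import is started.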
Quick Start
The Bulk Import feature is accessed via the Imports tab of the Project page.
Follow the How to Run a Bulk Import guide to learn how to start, monitor, and troubleshoot a Bulk Import.
Data Sources
Flywheel supports importing data from any of the following sources:
Data Source | Availability |
---|---|
Your local computer | Web Browser |
Cloud object storage (AWS, Azure, GCP) | Web Browser, New CLI |
A networked file system | Web Browser, New CLI |
Cloud object storage is supported from the three leading cloud service providers:
- Amazon Simple Storage Service (S3)
- Google Cloud Storage
- Microsoft Azure Blob Storage
Importing data from a networked file system is the least preferred option, because it requires additional infrastructure-level configuration and maintenance. The Networked File System option is typically used when the source data is already stored on a networked file system (NFS).
Tip
If importing from a networked file system is required and cloud storage cannot be used, contact Flywheel Support to get started.
The file system containing the source data will need to be attached to the Flywheel server by Flywheel Operations staff.
The new (BETA) CLI provides extensive filter options for controlling exactly which files will be imported from the source data location.
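To illustrate the general idea of include and exclude filtering, here is a small sketch that previews which paths a pair of glob-style pattern lists would select. It is an assumption-laden illustration only: `select_files` and its parameters are hypothetical, and the pattern syntax is Python's `fnmatch`, not the CLI's actual filter options, which this page does not describe.

```python
from fnmatch import fnmatchcase


def select_files(paths, include=("*",), exclude=()):
    """Keep paths that match any include pattern and no exclude pattern.

    Hypothetical helper: this mimics the concept of import filters and is
    not the Flywheel CLI's filter syntax.
    """
    selected = []
    for path in paths:
        included = any(fnmatchcase(path, pattern) for pattern in include)
        excluded = any(fnmatchcase(path, pattern) for pattern in exclude)
        if included and not excluded:
            selected.append(path)
    return selected
```

For example, `select_files(paths, include=["*.dcm"], exclude=["tmp/*"])` keeps DICOM files while skipping anything under a `tmp/` prefix.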
Mapping Rules
How the source data is organized within the storage bucket determines how it is placed into Flywheel, a process known as "mapping".
For more details about the various options available for mapping source data to the Flywheel hierarchy, see the Bulk Import - Mapping to the Flywheel Hierarchy guide.
The new (BETA) CLI provides extensive options for controlling how the source data will be mapped to the Flywheel Hierarchy.
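As a rough illustration of mapping, the sketch below turns a relative source path into hierarchy labels. It assumes a hypothetical source layout of `<subject>/<session>/<acquisition>/<file>`; the level names, the layout, and the `map_path` function are assumptions made for this example and do not describe the actual mapping rules, which are covered in the mapping guide.

```python
from pathlib import PurePosixPath

# Assumption: one folder per hierarchy level, in this order.
LEVELS = ("subject", "session", "acquisition")


def map_path(relative_path, levels=LEVELS):
    """Map '<subject>/<session>/<acquisition>/<file>' to a dict of labels."""
    parts = PurePosixPath(relative_path).parts
    if len(parts) != len(levels) + 1:
        raise ValueError(
            f"expected {len(levels)} folders plus a file name, got: {relative_path!r}"
        )
    mapping = dict(zip(levels, parts))  # pair each level with its folder name
    mapping["file"] = parts[-1]
    return mapping
```

Paths that do not fit the assumed layout raise a `ValueError` rather than being guessed at, which makes layout problems visible before any data is moved.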
Duplicate Handling
The Bulk Import system performs duplicate detection automatically and surfaces duplicate scenarios to a Data Manager for review and resolution before placing the affected data into Flywheel Core.
For more details about this feature, see the Duplicate Handling documentation.
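The criteria Bulk Import uses to detect duplicates are not described on this page. As a conceptual sketch only, one common approach is to group files by a content hash and flag any group with more than one member for review. The `find_duplicates` function below is hypothetical and uses SHA-256 as an assumed choice of hash.

```python
import hashlib
from collections import defaultdict


def find_duplicates(files):
    """Group paths whose contents are byte-for-byte identical.

    files: mapping of path -> file contents as bytes.
    Returns a list of sorted path lists, one per duplicated content.
    Assumption: content hashing stands in for whatever criteria
    Bulk Import actually applies.
    """
    groups = defaultdict(list)
    for path, data in files.items():
        groups[hashlib.sha256(data).hexdigest()].append(path)
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]
```

Each returned group corresponds to one scenario that a reviewer would need to resolve, for instance by keeping a single copy.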
Reference-in-Place Import
To minimize cloud costs, files may be imported into Flywheel by reference only ("reference-in-place import"). When this feature is used, files stay in the source location and are not copied into Flywheel for storage. Flywheel only reads the source files to extract metadata and stores a reference, treating the source location as primary storage.
For more details about this feature, see the Import by Reference documentation.
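The idea of importing by reference can be sketched as follows: the source is read once to derive metadata, and only a small record pointing back to the source is kept. Everything here is hypothetical, including the `FileReference` fields, the `import_by_reference` function, and the choice of size and SHA-256 as the extracted metadata; it is not Flywheel's data model.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class FileReference:
    """Hypothetical record stored instead of a copy of the file."""
    source_uri: str
    size_bytes: int
    sha256: str


def import_by_reference(source_uri, read_bytes):
    """Read the source once to extract metadata; keep only a reference.

    read_bytes is a caller-supplied function that returns the file
    contents for a URI (assumed here so the sketch stays storage-agnostic).
    """
    data = read_bytes(source_uri)
    return FileReference(
        source_uri=source_uri,
        size_bytes=len(data),
        sha256=hashlib.sha256(data).hexdigest(),
    )
```

Because the record holds only the URI and derived metadata, the bytes continue to live in the source location, which is why that location has to remain available after the import.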