Data Views Available Fields and Technical Specifications
Data Views allows you to create tabular data sets from image file metadata in the Flywheel database. It's important to understand what fields are available and the technical capabilities of the data view feature to use it appropriately. This document provides a detailed reference of the fields and technical specifications of data views.
Data Fields Available using the SDK
Fields are available from the Project, Subject, Session, Acquisition and File containers in the Flywheel data hierarchy. In addition, fields are available from the Analysis containers. Using the SDK you can access all the fields in the hierarchy. From the UI it is possible to access a subset of the fields that are also available from the Search functionality.
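As a rough sketch of SDK access, the snippet below checks a list of field names and passes it to a data view. It assumes the flywheel-sdk Python package and its Client, View, and read_view_dataframe calls. The API key and project id are placeholders supplied by the caller, and the select_columns helper is hypothetical: it only validates names locally against the containers named in this document.

```python
# Containers named in this document; used only for local validation.
KNOWN_CONTAINERS = {"project", "subject", "session", "acquisition", "file", "analysis"}


def select_columns(*fields):
    """Return the fields as a list after checking each starts with a known container."""
    columns = []
    for field in fields:
        container = field.split(".", 1)[0]
        if "." not in field or container not in KNOWN_CONTAINERS:
            raise ValueError(f"unexpected field name: {field!r}")
        columns.append(field)
    return columns


def read_view(api_key, project_id, columns):
    """Run a data view over one project (assumes flywheel-sdk is installed)."""
    import flywheel  # imported lazily so select_columns works without the SDK

    fw = flywheel.Client(api_key)
    view = fw.View(columns=columns)
    return fw.read_view_dataframe(view, project_id)


columns = select_columns("subject.label", "session.label", "acquisition.label")
```

Calling read_view with a real API key and project id would then return the selected columns as a table, one row per acquisition in this case.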
Specific Container Data Fields Available using the UI
The fields available for selection at a particular site in the UI match the ones available on the Search page for Metadata Fields. Below is a list of the most commonly used fields. The best way to check if a field is available is to search for it.
Common Fields Across All Containers
For all the containers listed above, the following fields are available:
container.id
container.created
container.modified
container.info
In the above, replace the word container with the actual container, for example acquisition.id, to identify the field.
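The substitution described above can be sketched in a few lines. The common_fields helper is hypothetical and only builds field names; it does not check that a field exists at your site.

```python
# The four fields this document lists as common to all containers.
COMMON_FIELDS = ("id", "created", "modified", "info")


def common_fields(container):
    """Replace the word 'container' with a concrete container name."""
    return [f"{container}.{name}" for name in COMMON_FIELDS]


print(common_fields("acquisition"))
# ['acquisition.id', 'acquisition.created', 'acquisition.modified', 'acquisition.info']
```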
Additional Commonly used fields by Container in the Data Hierarchy
For each container in the hierarchy, the available fields are as follows:
- Project Container
project.label
project.tags
project.description
- Subject Container
subject.label
subject.tags
subject.mlset
subject.sex
subject.code
subject.cohort
subject.strain
subject.ethnicity
subject.race
subject.firstname
subject.lastname
subject.species
subject.type
subject.notes
- Session Container
session.label
session.tags
session.weight
session.operator
session.timezone
session.timestamp
session.age_days
session.age_weeks
session.age_months
session.age_years
session.url
session.notes
- Acquisition Container
acquisition.label
acquisition.tags
acquisition.timestamp
acquisition.timezone
- File Container
file.name
file.file_id
file.size
file.type
file.version
file.mimetype
file.classification
file.classification_list
file.classification.Custom
file.classification.Intent
file.classification.Features
file.classification.Measurement
file.gear_info
Analysis Container Fields
Analysis containers can occur in the Project, Subject, Session, and Acquisition containers. The following fields are available:
analysis.label
analysis.description
analysis.job
analysis.gear_info
analysis.notes
analysis.parents.acquisition
analysis.parents.session
analysis.parents.subject
Custom Fields (.info subfields)
The .info subfields are custom defined and may be returned as structured arrays in the data view results. If a custom field called xyz is defined for a container, it can be queried in a data view as:
container.info.xyz
where the word 'container' is one of the containers that have a .info field available in the lists above.
If the data is represented as a structured array, the array is returned in the data view result. For example:
{"AcquisitionDate":"20170906", "AcquisitionMatrix":[0,160,160,0], "AcquisitionNumber":1, "AcquisitionTime":"081330.532500"}
This also applies to the viewer annotations data stored in the session.info field.
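When a structured .info value comes back in a result, it is often convenient to spread it across dotted column names. The sketch below does this for the example value shown above. It assumes the value arrives as a JSON string (it may already be parsed, depending on the output format), and the flatten_info helper and the acquisition.info prefix are illustrative choices, not part of Flywheel.

```python
import json

# The example value from this document, as a JSON string.
raw = (
    '{"AcquisitionDate":"20170906", "AcquisitionMatrix":[0,160,160,0], '
    '"AcquisitionNumber":1, "AcquisitionTime":"081330.532500"}'
)


def flatten_info(prefix, value):
    """Flatten nested dicts into dotted names; lists and scalars stay as values."""
    flat = {}
    if isinstance(value, dict):
        for key, sub in value.items():
            flat.update(flatten_info(f"{prefix}.{key}", sub))
    else:
        flat[prefix] = value
    return flat


row = flatten_info("acquisition.info", json.loads(raw))
for name, value in row.items():
    print(name, "=", value)
```

Lists such as AcquisitionMatrix are kept whole here; splitting them into one column per element is a separate design choice.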
Gear Info Subfields
The following subfields are available for the file.gear_info and analysis.gear_info fields:
gear_info.name
gear_info.version
gear_info.id
gear_info.category
Notes Subfields
Containers that have the .notes field have the following subfields:
notes.text
notes.user
notes.created
notes.modified
For these fields, the result is an array with one entry per note, because a container can have multiple notes.
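A minimal sketch of working with such an array follows. It assumes each note is a record with the four subfields listed above and that timestamps are ISO 8601 strings in a single format, so they sort correctly as text. The sample records and the note_texts helper are made up for illustration.

```python
def note_texts(notes):
    """Return 'user: text' strings from a list of note records, oldest first."""
    ordered = sorted(notes, key=lambda note: note["created"])
    return [f'{note["user"]}: {note["text"]}' for note in ordered]


# Hypothetical notes for one session, in the order they might be returned.
sample_notes = [
    {
        "text": "Repeat scan",
        "user": "b@example.com",
        "created": "2021-03-02T10:00:00Z",
        "modified": "2021-03-02T10:00:00Z",
    },
    {
        "text": "Motion artifact",
        "user": "a@example.com",
        "created": "2021-03-01T09:00:00Z",
        "modified": "2021-03-01T09:30:00Z",
    },
]

print(note_texts(sample_notes))
# ['a@example.com: Motion artifact', 'b@example.com: Repeat scan']
```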
Tip
Use the Flywheel Search page for Metadata Fields to identify whether a field is available for a data view.
Data View Row Details
The default order of rows returned by a data view is based on the id of the lowest level container. The data view is guaranteed to return all rows that match the criteria in the data view or return an error. So if the data view executes successfully, you can count on the result having all the data in Flywheel that matches the criteria.
Data Views Technical Specifications
The Data Views query capabilities are:
- Return tabular data based on simple filtering and column selection criteria. It is not meant to be a generic database query tool. For example, it does not support a query language with complex queries.
- Query results are atomic at the row level, meaning a row of data represents a collection of data that exists (or did exist) in the database together at the moment of the query. The complete query result consisting of multiple rows is not atomic, meaning that if the data changes in the database while the query is running, the data view result could contain rows reflecting data before the change as well as rows reflecting data after the change.
- Query performance (both the time required to run the query and the ability to complete it successfully) depends on infrastructure and has limitations. Data views is tested to return 200K rows in under 15 minutes for a limited set of test queries with adequate infrastructure resources. Data views in the range of 1M to 6M rows are possible, given adequate time (multiple hours) and adequate resources.
- Data views is designed with both an SDK interface and a graphical UI. The SDK provides additional data view capabilities that are not available in the UI. These additional features are meant for technical users.
- A limited set of columns from the database is available for querying using the SDK. In scope: containers and files at the Acquisition, Session, and Subject levels; Analysis containers (excluding files); and Project containers. Out of scope: Project files, Analysis files (Browser), and any other data not mentioned here. In the UI you can use the fields mentioned here, as well as any that come up in the Flywheel Search. The SDK allows selecting additional fields of these containers.
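To get a feel for the performance figures above, the sketch below extrapolates from the tested case of 200K rows in 15 minutes. It assumes run time scales linearly with row count, which this document does not state; real run times depend on the query and the infrastructure, so treat the numbers as rough planning estimates only.

```python
# Tested reference point from this document: 200K rows in under 15 minutes.
TESTED_ROWS = 200_000
TESTED_SECONDS = 15 * 60


def estimated_minutes(rows):
    """Estimate run time in minutes, assuming linear scaling (an assumption)."""
    rows_per_second = TESTED_ROWS / TESTED_SECONDS  # about 222 rows per second
    return rows / rows_per_second / 60


for rows in (1_000_000, 6_000_000):
    print(f"{rows:>9,} rows: about {estimated_minutes(rows):.0f} minutes")
```

Under this assumption, 1M rows would take roughly 75 minutes and 6M rows roughly 450 minutes (7.5 hours), which is in line with the "multiple hours" stated above.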