Data Views
Flywheel captures data and metadata from imported data sets, processing results, and information added by users. Much of this information is indexed and can be used in a Data View report. For example, you can create a demographics report for all of the subjects in your study, or aggregate data from exams in a longitudinal study to assess changes over time.
Execute a Shared Data View to Create a Report
Site admins can create site-level Shared Data Views. Shared Data Views automatically appear in each project. Users in a project can then run the Shared Data View against their own project data to generate a report. This allows you to standardize reporting throughout the site.
- Go to a project.
- Click Data Views from the top menu.
- Select the Shared Data Views tab.
- Click the name of the Shared Data View. The Data View pulls data from this project to create the report.
- Click Export to download the report in CSV, TSV, JSON, JSON-Flat, or JSON row-column format.
Create a Data View for a Project
A Data View can be defined at the project level. This allows you to standardize reporting within a project.
- Go to a project.
- Click Data Views from the top menu.
- Click Create New Data View.
- Enter a name and a short description so other users know what type of report they are generating.
- Select fields from the Available Columns list to add to the Data View. By default, this list shows Flywheel's default fields. Additional fields, such as custom metadata, can be found with the 'Filter Available Columns' search. Note that project-level data cannot be included in a Data View. A full description of what is available can be found here.
- Select the arrow to add the field to Selected Columns.
- Click on a field in Selected Columns to edit the alias. The alias will be the column header in the report.
- If you like, you can preview the tabular report by clicking Preview.
- Finally, press Save to save the Data View.
Tip
If you are using the 'Filter Available Columns' search and cannot find the field you are looking for in the results, you may have reached the limit of 25 results. Type a few more characters to narrow the search.
Generating a Tabular Report
Once a Data View is defined, it can be used to create tabular reports for viewing or export. This report will include the latest project data at the time of execution.
- To view data in the report, go to a project and click the name of the Data View from the Data Views tab.
- This starts the report generation, and the report job goes into the Queue. Smaller reports are displayed within a couple of seconds.
- For larger reports, you may have to wait, but you can leave the page and check back on the Queue tab to find the latest status of the report.
- Reports are saved in the queue for 30 days. For each report, you can view the report, save the report to the project, download the report, or delete the report. If you want the report to remain in long-term storage, save the report to the project or download it. Download supports CSV, TSV, JSON, JSON-Flat, and JSON row-column formats.
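Once a report has been downloaded, it can be processed with ordinary tools. The following is a minimal sketch, using only the Python standard library, of loading a CSV or TSV export into a list of rows. The sample text and its column headers (`subject.label`, `subject.sex`, `session.label`) are hypothetical; a real export uses the aliases chosen in the Data View as headers.

```python
import csv
import io

# Hypothetical contents of a downloaded CSV report. In a real export the
# header row contains the aliases set in Selected Columns.
EXPORTED_CSV = """subject.label,subject.sex,session.label
sub-01,female,baseline
sub-02,male,baseline
"""


def load_report(text, delimiter=","):
    """Parse delimited report text into a list of row dictionaries.

    Pass delimiter="\t" for a TSV export.
    """
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


rows = load_report(EXPORTED_CSV)
for row in rows:
    print(row["subject.label"], row["subject.sex"])
```

To read a file on disk instead, pass the result of `open(path, newline="").read()` to `load_report`.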
Data View Filtering
Data Views can be defined with a filtering step that limits the data in the report to reflect specific criteria. Press the Edit Filters button to define the inclusion criteria.
Filtering on containers and metadata is straightforward. Filtering on file attributes can lead to unexpected results or errors, because files can exist at all levels of the hierarchy. File filtering therefore depends on setting the container level of the file appropriately. See the next section for details.
File Level Settings and Filename Filtering
When columns beginning with 'file' are selected in the Data View, two additional settings apply: the container level of the file (subject, session, or acquisition) and filename filtering. Because files can exist at multiple levels of the hierarchy, you need to specify the level. The level is also constrained by the other columns you selected in the Data View. For example, if you included a session-level column, the container level you select can be either session or acquisition, but not subject. Put the other way around, if you want to select files from the session level of your project, you first select a file column; you may then select session-, subject-, or project-level columns, but you must not select acquisition-level columns.
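The constraint above can be read as: the file container level must be at, or below, the deepest container level among the other selected columns. The sketch below encodes that reading in plain Python. It is an interpretation of the rule as described here, not Flywheel's implementation, and the names `HIERARCHY`, `FILE_LEVELS`, and `allowed_file_levels` are hypothetical.

```python
# Container hierarchy from highest to lowest level.
HIERARCHY = ["project", "subject", "session", "acquisition"]
# Levels that can be chosen as the file container level.
FILE_LEVELS = ["subject", "session", "acquisition"]


def allowed_file_levels(selected_column_levels):
    """Return the file container levels compatible with the selected columns.

    selected_column_levels holds the container level of each non-file column
    in the Data View, for example ["subject", "session"].
    """
    deepest = max(
        (HIERARCHY.index(level) for level in selected_column_levels),
        default=0,  # no other columns selected: no constraint
    )
    return [lvl for lvl in FILE_LEVELS if HIERARCHY.index(lvl) >= deepest]


print(allowed_file_levels(["session"]))
print(allowed_file_levels(["subject", "acquisition"]))
```

With a session-level column selected, the function returns session and acquisition but not subject, matching the example in the text.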
Filename filtering can be done via direct match (with * as wildcard) or as a regular expression.
Note
A regular expression with incorrect syntax causes the Data View to return a status of Failed in the queue.
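Because a malformed regular expression only surfaces as a Failed report, it can help to try a pattern locally first. The sketch below uses Python's `fnmatch` and `re` modules to illustrate the two matching styles and to catch syntax errors early. This is an approximation: Flywheel's wildcard and regular expression handling may differ from Python's (for example, `fnmatch` also treats `?` and `[...]` as special), and the filenames are made up.

```python
import fnmatch
import re


def match_wildcard(filenames, pattern):
    """Direct match where * stands for any run of characters."""
    return [name for name in filenames if fnmatch.fnmatchcase(name, pattern)]


def match_regex(filenames, pattern):
    """Regular expression match; raises ValueError on invalid syntax."""
    try:
        compiled = re.compile(pattern)
    except re.error as exc:
        raise ValueError(f"invalid regular expression {pattern!r}: {exc}") from exc
    return [name for name in filenames if compiled.search(name)]


# Hypothetical filenames for illustration only.
files = ["T1w.nii.gz", "T2w.nii.gz", "scan.dicom.zip"]

print(match_wildcard(files, "*.nii.gz"))
print(match_regex(files, r"^T1w\.nii\.gz$"))

try:
    match_regex(files, "T1w(")  # unbalanced parenthesis
except ValueError as err:
    print(err)
```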
Data Views - Grouping and Aggregation
In some cases you may want to group data and provide a numerical summary of the groups, for example the count of subjects by subject.sex. In this case you first select the grouping variable(s), which must contain text or integers, and then the aggregation method (for example 'count') and the variables to aggregate (numeric columns). This results in a summarized data set.
- To select a column for grouping, click on the Group by this column checkbox.
- To select a column for aggregation, pick the Aggregation Method to apply to the column.
Aggregation methods include Count, Min, Max, Sum, Mean, and Standard Deviation (population). The Count aggregation method can work on any type of data (String, Float, Integer, Boolean). The other aggregation methods only operate on Float or Integer.
- Optional: Changing the data type for Grouping or Aggregation
Flywheel allows flexible data typing when creating metadata. For example, subject.info.status can be stored with a different data type for each of two subjects. This can be a challenge when creating tabular data from mixed-type data. The Coerce Type setting lets you change how the Data View interprets the data. Coercion only works if the data can be retyped. For example, a String of '1.0' can be coerced to a Float of 1.0, but a String of 'one' cannot be coerced to a Float; in that case, the row shows an error in the error column to indicate the issue. When data of type Float is coerced to an Integer, the fractional part is truncated.
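The coercion behavior described above can be mimicked in a few lines of Python. This is only an illustrative sketch of the rules stated in this section (retype when possible, record an error otherwise, truncate Float to Integer); the function name `coerce`, the target names, and the error wording are hypothetical and do not reflect Flywheel's actual output.

```python
def coerce(value, target):
    """Try to retype value; return a (coerced_value, error_message) pair."""
    converters = {
        "float": float,
        # Go through float so that '2.9' and 2.9 both truncate to 2.
        "integer": lambda v: int(float(v)),
        "string": str,
    }
    if target not in converters:
        raise ValueError(f"unknown target type: {target}")
    try:
        return converters[target](value), None
    except (TypeError, ValueError, OverflowError):
        return None, f"cannot coerce {value!r} to {target}"


print(coerce("1.0", "float"))
print(coerce("one", "float"))
print(coerce(2.9, "integer"))
```

Returning the error alongside the value, rather than raising, mirrors the way a row keeps its place in the report and carries the problem in an error column.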
Tip
If you want to apply different aggregation methods to the same data column (for example Min, Max, and Mean), add the same column to the Data View multiple times and use the Alias field on the General tab to give each copy a different alias. You can then set the aggregation method for each of these aliased columns. Likewise, if you want to aggregate the same column that is used for grouping, you need to create an alias for the aggregation.
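To make the grouping, aggregation, and alias ideas concrete, here is a small standard-library sketch that groups rows by one column and applies several aggregation methods to the same source column under different aliases. The data, the aliases, and the function `group_and_aggregate` are hypothetical. Skipping missing (`None`) values before aggregating is an assumption of this sketch, not documented Flywheel behavior.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Aggregation methods named in this section; pstdev is the population
# standard deviation.
AGGREGATIONS = {
    "count": len,
    "min": min,
    "max": max,
    "sum": sum,
    "mean": mean,
    "std": pstdev,
}


def group_and_aggregate(rows, group_by, aggregations):
    """Summarize rows per group.

    aggregations maps an alias to a (source column, method name) pair, so the
    same column can appear several times under different aliases.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_by]].append(row)

    summary = []
    for key, members in groups.items():
        out = {group_by: key}
        for alias, (column, method) in aggregations.items():
            # Assumption: missing values are dropped before aggregating.
            values = [m[column] for m in members if m[column] is not None]
            out[alias] = AGGREGATIONS[method](values) if values else None
        summary.append(out)
    return summary


# Hypothetical subject rows.
rows = [
    {"subject.sex": "female", "subject.age": 30},
    {"subject.sex": "female", "subject.age": 40},
    {"subject.sex": "male", "subject.age": 50},
]

summary = group_and_aggregate(
    rows,
    group_by="subject.sex",
    aggregations={
        "n": ("subject.age", "count"),
        "age_min": ("subject.age", "min"),
        "age_max": ("subject.age", "max"),
        "age_mean": ("subject.age", "mean"),
        "age_std": ("subject.age", "std"),
    },
)
for line in summary:
    print(line)
```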
Limitations
When using Grouping, the fields used for Filtering must either be higher in the hierarchy than the lowest level aggregated or be exactly the fields aggregated; otherwise an error results.
Advanced Options
Under the Advanced Options button, there are options for handling missing data and for reporting numerical warnings for aggregation.
By default, rows with missing data are shown in the report. If that is not desired, you can choose to hide those rows. Also by default, an error column is added to the report. This column shows any numerical warnings or errors that occurred as part of the aggregation. For example, if missing values are encountered in the data, the aggregation reports those cases in the error column. The error column can also be disabled.
The Flywheel SDK provides some additional features for Data Views, including incorporating data files and analysis results files into the Data View alongside the Flywheel metadata.
Info
More technical information about Data Views, including available columns and technical specifications can be found here.