BIDS Curation Tutorial Part 4: Interpreting the Curation Report

Introduction

The last thing the BIDS Curation gear does is produce five .csv files that report the state of the BIDS curation. This article walks through these spreadsheets in detail and explains how to troubleshoot any issues so you can move on to the next step of data processing.

Instruction Steps

Why Do I Need to Make These Updates?

Even though the BIDS Curation gear finished successfully, there is no guarantee that the project has been curated correctly. The gear cannot know whether the project has been properly curated because it cannot know why a particular scan was acquired or what purpose it serves in the subsequent analyses. It is important to examine the curation reports in detail to make sure that every subject has all of the expected acquisitions and that they end up with the proper BIDS paths and names. Additionally, some errors may only be found in post-curation processing steps because BIDS Apps that process BIDS-formatted data may have additional requirements that are not examined by the BIDS Curation gear or even the BIDS Validator. The information in the spreadsheets will help flush out issues now instead of later, when you are running a BIDS App.

The Results

Assuming you ran the BIDS Curation gear at the Project level, you can access the reports from the Analyses tab. Click the curate-bids gear run, then open the Results tab.

To view the .csv files in Flywheel, click on the three dots under Tasks to the right of the desired csv file and select Launch in Tabular Data. You can also choose Download from this same menu, to save a copy to your local machine.

The spreadsheets produced are:

{group}_{project}_niftis.csv: maps the original information for each file (acquisition name, file name, series number, etc.) to its final BIDS path and filename. A column indicates whether the path/filename is duplicated, which results in a BIDS Validator error. BIDS Apps run the BIDS Validator and won't work if the BIDS paths and filenames are incorrect. Check this spreadsheet to confirm that every file has been properly recognized or ignored.
{group}_{project}_acquisitions.csv: lists warnings if there is an unexpected number of a specific acquisition or if some subjects do not have the expected number of the usual acquisitions. This is useful when there are multiple subjects because it shows the “usual count” of each acquisition across all subjects and flags the subjects that do not have the usual count.
{group}_{project}_acquisitions_details_1.csv (_2.csv): lists all of the unique acquisition labels along with the number of times they have been seen. It also provides additional details that should help determine which subjects have missing or additional acquisitions.
{group}_{project}_intendedfors.csv: lists the field maps and the paths to the files that each field map will be used to correct. If IntendedFor regular expression pairs are provided, it lists the mapping produced by the project curation template as the “before” results and the mapping that remains after the regexes trim those results down as the “after” results.
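
As a small illustration of the naming pattern in the list above, the following Python sketch builds the five expected report file names from a group and a project label. The function name report_filenames is hypothetical; only the file name pattern itself comes from the list.

```python
def report_filenames(group, project):
    """Return the five curation report file names for a group/project pair."""
    suffixes = [
        "niftis",
        "acquisitions",
        "acquisitions_details_1",
        "acquisitions_details_2",
        "intendedfors",
    ]
    return [f"{group}_{project}_{suffix}.csv" for suffix in suffixes]


# Group and project labels taken from Dataset 1 in this tutorial series.
names = report_filenames("suzanne", "BIDS-tutorial-test-no-precuration-2")
for name in names:
    print(name)
```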

We will go over the first four spreadsheets in this tutorial. We'll leave the _intendedfors.csv for the next tutorial.

The NIfTI Spreadsheet

Here is an example of a {group}_{project}_niftis.csv spreadsheet. Revisiting Dataset 1 from the BIDS Tutorial Parts 1-3, suzanne is the group and BIDS-tutorial-test-no-precuration-2 is the project, so the file is named suzanne_BIDS-tutorial-test-no-precuration-2_niftis.csv:

Note that if it is difficult to see the above image, you can right-click on the image, and open it in a new tab.

This spreadsheet provides all the information necessary to understand how the Curated BIDS Path was determined for every file in the project. The Curated BIDS Path shows the BIDS folder (anat, dwi, fmap, or func) and the filename. Remember, the goal of BIDS Curation is to get the BIDS path correct so when you run a gear or export data in BIDS format the proper names are assigned to the NIfTI and JSON sidecar files and they are placed in the proper folder.

You can also have data in the ignored folder. Data in the ignored folder is not included when you turn on BIDS View in Flywheel, use a BIDS App, or export data in BIDS format. You can add the "ignore" metadata flag to a particular file, entire sessions, or acquisitions. The default project curation template sets this flag on the acquisition if the acquisition name ends in _ignore-bids or _ignore-BIDS. The Ignored column includes either an "S", "A", or "F" to indicate that a particular file was ignored at the session, acquisition, or file level. For Dataset 1, no acquisitions were marked to be ignored, so this folder is empty and is not listed in the csv file.
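
The suffix convention described above can be sketched in a few lines of Python. This only mirrors the naming rule as stated (a label ending in _ignore-bids or _ignore-BIDS); it is not the template's actual implementation, and the function name is hypothetical.

```python
# The two suffixes named in the text; other capitalizations are not assumed
# to be recognized.
IGNORE_SUFFIXES = ("_ignore-bids", "_ignore-BIDS")


def ignored_by_label(acquisition_label):
    """Return True if the acquisition label ends with an ignore suffix."""
    return acquisition_label.endswith(IGNORE_SUFFIXES)


print(ignored_by_label("anat-T1w_ignore-bids"))   # label marked to be ignored
print(ignored_by_label("func-bold_task-rest"))    # ordinary label
```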

After the ignored list, the next folder in the Curated BIDS Path is sourcedata. This BIDS folder holds DICOM scans from which the NIfTIs were created. Some BIDS App algorithms require source data, but many do not.

The last folder is listed in the spreadsheet as unrecognized. Data in the unrecognized folder appears in the nonBids section in BIDS View and has its BIDS metadata field set to NA. Unrecognized data means that no rule in the project curation template recognized those files.

Below is the same niftis.csv spreadsheet scrolled down to show the sourcedata and unrecognized values in the Curated BIDS path column:

The other columns in this spreadsheet are included to help identify the files. A project curation template can use data in any of these columns to recognize a scan or initialize a BIDS field. The recommended reproin template is focused almost exclusively on the acquisition label, which is usually determined by the SeriesDescription DICOM tag. The reproin template uses regular expressions to match and extract strings from the acquisition label.

The Rule ID column indicates which rule in the template matched each file in the spreadsheet. If it is blank, no rule matched, and the file is added to the unrecognized folder. The rule refers to the project curation template, which is composed of two main sections called definitions and rules. The definitions set what information is required for a particular BIDS entity (anatomical scan, functional scan, etc.) and how that information is arranged in the BIDS file name. The rules determine how a particular file is recognized by the template in the "where" clause and also set some necessary BIDS fields in the "initialization" clause.

Finally, the Unique? column indicates whether a BIDS filename is unique or duplicate. BIDS requires all filepaths to be unique, so two files with the same BIDS-compliant path and filename cannot co-exist.
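
If you download the niftis.csv file, the same uniqueness check can be reproduced locally. The sketch below uses only the Python standard library and a small made-up excerpt; the column header "Curated BIDS path" and the sample rows are assumptions, so adjust them to match the header row of your own file.

```python
import csv
import io
from collections import Counter

# Hypothetical excerpt of a niftis.csv report. The column header
# "Curated BIDS path" is assumed and may differ in your file.
SAMPLE = """\
Acquisition label,Curated BIDS path
func-bold_task-rest,func/sub-001_ses-01_task-rest_bold.nii.gz
func-bold_task-rest,func/sub-001_ses-01_task-rest_bold.nii.gz
anat-T1w,anat/sub-001_ses-01_T1w.nii.gz
"""


def find_duplicate_paths(csv_file, column="Curated BIDS path"):
    """Return the curated paths that occur more than once, sorted."""
    counts = Counter(row[column] for row in csv.DictReader(csv_file))
    return sorted(path for path, n in counts.items() if n > 1)


duplicates = find_duplicate_paths(io.StringIO(SAMPLE))
print(duplicates)
```

For a downloaded report, pass an open file handle instead of the in-memory sample, for example one created with open(path, newline="").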

Common Errors

Duplicate BIDS Path/Name

When a duplicate BIDS filepath is detected, the Unique? column shows duplicate. Duplicates can arise for many reasons. Generally, they can be avoided through careful use of ReproIn-compliant names at the scanner or careful relabeling of acquisition labels to ReproIn-compliant names. Below are some common reasons for duplicate Curated BIDS Path instances:

A scan is repeated because the participant moved or needed to leave the magnet for a while, so the same scan is restarted. To fix this, append either _ignore-bids or _ignore-BIDS to the label of the unwanted acquisition.
You can also check the ignore box in the acquisition's BIDS metadata.
Different scans are acquired, but there is no information in the acquisition label to differentiate them. An example of this would be running multiple T1w scans using different sequences (e.g., MPRAGE vs MP2RAGE). Or, for fieldmaps or diffusion scans, scans are acquired using reverse phase-encoding. The BIDS standard provides multiple optional descriptors (e.g., acq-<label>, dir-<label>) to help differentiate between similar acquisitions.
Multi-echo scans are acquired, but there is no information in the acquisition label to differentiate between the echoes. The BIDS standard includes an optional echo-<index> to label different echoes from the same multi-echo sequence.
For functional acquisitions, multiple identical runs may be purposefully acquired. In this case, adding the run-<index> BIDS descriptor to the acquisition label allows for multiple, identical acquisitions to co-exist in the same BIDS filepath.

Missing BIDS Field

When you see curly braces in the Curated BIDS Path, it means that some BIDS field was not properly detected.

For example, if the file name is sub-001_ses-01_task-{file.info.BIDS.Task}_bold.nii.gz, it means that the Task label was not detected.
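
One way to scan many curated names for this problem is to search for leftover curly-brace placeholders. The following Python sketch is an illustration only; the function name is hypothetical and the input is the example file name from this section.

```python
import re

# Matches any {...} placeholder that was left unfilled in a curated name.
PLACEHOLDER = re.compile(r"\{([^{}]+)\}")


def unresolved_fields(curated_path):
    """Return the names of any unfilled placeholders in a curated path."""
    return PLACEHOLDER.findall(curated_path)


fields = unresolved_fields("sub-001_ses-01_task-{file.info.BIDS.Task}_bold.nii.gz")
print(fields)
```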

In the ReproIn BIDS Curation template for the reproin_func_file rule, the Task is found by this regular expression:

"Task": {
  "acquisition.label": {
    "$regex": "(^|_)task-(?P<value>.*?)(_(acq|ce|dir|echo|mod|proc|rec|recording|run|task)-|$|_)"
  }
},

That is, the value of the Task field is set by whatever follows task- in the acquisition label (which can be followed by an underscore _ and then by various other possible descriptors). If the task label is missing or doesn't follow what this regular expression expects, it will be left blank and the curly braces will appear. This is not a valid BIDS name and will cause an error. Missing BIDS field errors are best fixed by using the Relabel Container gear to update the acquisition label(s). Remember that the BIDS standard does not allow special characters or blank spaces in any <label> field.
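
To test an acquisition label before relabeling, you can try the pattern locally. The sketch below assumes the pattern behaves like a Python regular expression and writes the capture group with Python's named-group syntax, (?P<value>...). The function name and the sample labels are hypothetical.

```python
import re

# Task pattern from the reproin_func_file rule, written with a Python
# named group called "value".
TASK_PATTERN = re.compile(
    r"(^|_)task-(?P<value>.*?)"
    r"(_(acq|ce|dir|echo|mod|proc|rec|recording|run|task)-|$|_)"
)


def extract_task(acquisition_label):
    """Return the task label found in an acquisition label, or None."""
    match = TASK_PATTERN.search(acquisition_label)
    return match.group("value") if match else None


for label in ("func-bold_task-rest_run-01", "func-bold_task-nback", "func-bold"):
    print(label, "->", extract_task(label))
```

A result of None corresponds to the case where the curly braces remain in the Curated BIDS Path.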

Missing Acquisitions

If a scan is not properly named, does not have the correct Intent, or is not matched by any rule in the project curation template, it is listed in the spreadsheet as unrecognized and appears under the nonBids section in BIDS View.

Depending on the reason why the scan is unrecognized, there are a few ways to fix this error:

If the acquisition label is not ReproIn-compliant, then manually fixing the acquisition label or re-running the relabel-container gear is the way to go.
If the acquisition was mis-classified by the File Classifier gear with the wrong Intent in the metadata, then manually fixing this should correct the error. If mis-classification is occurring on a larger scale, then looking into setting up a study-specific File Classifier template would be more efficient than manually fixing multiple acquisitions across multiple subjects and sessions.
If your study is using its own custom BIDS curation template, then editing an existing or adding a new rule will fix the issue of the acquisition not being recognized by any template rule. If you are using the default reproin template, it may be that your acquisition was recently added to the BIDS standard and is not yet included in Flywheel's default reproin template. For more experimental acquisitions, it may be that the acquisition is not yet in the BIDS standard but is covered by a BIDS Extension Proposal (BEP).
Flywheel should be able to help you get a study-specific extension to the default template for either of these two latter scenarios.

The Acquisition Spreadsheets

There are three spreadsheets that describe acquisitions in the project:

{group}_{project}_acquisitions.csv
{group}_{project}_acquisitions_details_1.csv
{group}_{project}_acquisitions_details_2.csv

The acquisitions.csv file lists the common acquisition labels along with the Usual Count across all subjects. The Usual Count is the number of times a specific acquisition appears for most subjects. This is calculated by counting the number of times an acquisition with a specific name appears for all subjects (a histogram) and using the most common. This is a way to figure out which scans are important. If a particular acquisition label was not acquired for most subjects, the most common number will be zero, and it won't appear in this list. For properly named acquisitions, the important ones will likely have a number of 1, which means this scan should be acquired for all subjects. But it can also be greater than 1 if some other information is used to disambiguate the Curated BIDS name (such as the echo number or something that sets the run-<index> BIDS descriptor).
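
The Usual Count calculation described above can be sketched as follows. This is a minimal illustration of the idea (a per-label histogram of per-subject counts, then the most common value), not the gear's actual code. The function name, the input structure, and the sample labels are assumptions, and ties are resolved by whichever count is encountered first.

```python
from collections import Counter


def usual_counts(acquisitions_by_subject):
    """Map each acquisition label to its most common per-subject count."""
    labels = {
        label
        for subject_labels in acquisitions_by_subject.values()
        for label in subject_labels
    }
    result = {}
    for label in sorted(labels):
        # Histogram of how often this label appears per subject,
        # counting 0 for subjects that do not have it at all.
        histogram = Counter(
            subject_labels.count(label)
            for subject_labels in acquisitions_by_subject.values()
        )
        usual, _ = histogram.most_common(1)[0]
        if usual > 0:  # labels whose usual count is zero are left out
            result[label] = usual
    return result


# Hypothetical project with three subjects.
subjects = {
    "sub-001": ["anat-T1w", "func-bold_task-rest", "func-bold_task-rest"],
    "sub-002": ["anat-T1w", "func-bold_task-rest", "func-bold_task-rest"],
    "sub-003": ["anat-T1w", "func-bold_task-rest", "localizer"],
}
counts = usual_counts(subjects)
print(counts)
```

In this made-up project the localizer label appears for only one of three subjects, so its most common count is zero and it is omitted.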

This spreadsheet lists every subject and prints warnings about extra or missing scans. First it lists Subjects that have all of the Typical Acquisitions and includes warnings about extra scans. Then it lists Subjects that don't have Typical Acquisitions and provides warnings indicating that the usual scans are missing. Usually there are valid reasons for missing scans. However, if you know that a scan should be there and it is showing up as missing, that is a good indicator that something may be wrong with the DICOM file or that something went wrong during classification, relabeling, or curation.

In general for BIDS, missing or extra scans should not cause any errors with the BIDS validator per se. However, many BIDS App gears (and even non-BIDS App gears) have specific input file requirements, where missing scans could cause an error. For example, fMRIPrep requires at minimum a single T1w scan as input, so any subject missing a T1w scan will throw an immediate error when trying to run through fMRIPrep.
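
A quick pre-flight check for this kind of requirement might look like the sketch below. It assumes you have already collected the curated BIDS paths for each subject into a dictionary; that structure, the function name, and the sample paths are all hypothetical. The _T1w suffix is the standard BIDS suffix for T1-weighted images.

```python
def subjects_missing_t1w(paths_by_subject):
    """Return the subjects with no curated path ending in a T1w NIfTI name."""
    missing = []
    for subject, paths in sorted(paths_by_subject.items()):
        has_t1w = any(
            path.endswith(("_T1w.nii.gz", "_T1w.nii")) for path in paths
        )
        if not has_t1w:
            missing.append(subject)
    return missing


# Hypothetical curated paths for two subjects.
curated = {
    "sub-001": [
        "anat/sub-001_ses-01_T1w.nii.gz",
        "func/sub-001_ses-01_task-rest_bold.nii.gz",
    ],
    "sub-002": ["func/sub-002_ses-01_task-rest_bold.nii.gz"],
}
missing = subjects_missing_t1w(curated)
print(missing)
```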

The acquisitions_details_1.csv file gives the total number of subjects and sessions and provides a list of all of the unique acquisition labels in the project along with the number of times that label was found. For the usual acquisitions, there should be as many of a particular label as there are subjects and sessions.

The acquisitions_details_2.csv file lists acquisition labels for each subject but only if the number found for that subject is not equal to the expected number (the number that most subjects have). These two spreadsheets are designed to find subject or acquisition labels that are outliers while the acquisitions.csv spreadsheet shows what is common for most subjects.
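
The outlier listing can be pictured as a comparison of each subject's per-label count against the expected count. The sketch below is an illustration under assumed inputs, not the gear's implementation: the function name, the data structures, and the sample labels are hypothetical.

```python
def count_outliers(acquisitions_by_subject, expected):
    """Return (subject, label, found, expected) rows where the counts differ."""
    outliers = []
    for subject in sorted(acquisitions_by_subject):
        labels = acquisitions_by_subject[subject]
        # Check labels the subject has as well as labels it is expected to have.
        for label in sorted(set(labels) | set(expected)):
            found = labels.count(label)
            want = expected.get(label, 0)
            if found != want:
                outliers.append((subject, label, found, want))
    return outliers


# Hypothetical project: sub-003 has one rest run too few and an extra localizer.
subjects = {
    "sub-001": ["anat-T1w", "func-bold_task-rest", "func-bold_task-rest"],
    "sub-002": ["anat-T1w", "func-bold_task-rest", "func-bold_task-rest"],
    "sub-003": ["anat-T1w", "func-bold_task-rest", "localizer"],
}
expected = {"anat-T1w": 1, "func-bold_task-rest": 2}
outliers = count_outliers(subjects, expected)
for row in outliers:
    print(row)
```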

Finishing up with Part 4

Now that you have a basic understanding of what the curation reports contain and how to use them to track down missing or extra acquisitions and errors in BIDS filenames and filepaths, we can move on to the last step, which covers how to handle fieldmaps and make sure that they are used to correct the relevant functional and/or diffusion acquisitions.

If your study is collecting fieldmaps, continue on to Part 5 Field Maps and IntendedFors to learn how to set the BIDS IntendedFors field during BIDS curation.