It is important to remove personal health information (PHI) or de-identify data before importing it to Flywheel. The Flywheel de-identification profile allows you to configure de-identification settings for each DICOM field. Use the profile when you import existing DICOM files via the Flywheel CLI.
This article explains how to configure the de-identification profile as well as how to test the de-identification before importing your data to Flywheel.
Note
Other ways to de-identify data
Flywheel offers other methods for de-identifying data depending on the type of data you are uploading:
- Flywheel connector: Your Flywheel connector can be configured to de-identify data once it leaves the scanner and before it is sent to Flywheel. This configuration is customized for your institution and is often different for each research center. Contact your site administrator for details.
- Uploading DICOMs via your web browser: When de-identification is enabled, this option removes the following DICOM fields: Patient ID, Patient Name, and Patient Date of Birth.
- --de-identify option in the Flywheel CLI: This option also removes the following DICOM fields: Patient ID, Patient Name, and Patient Date of Birth.
- Private DICOM tags: You can define your own DICOM tag names for private tags, and then add them to the de-identification profile.
Before you begin
- Follow these instructions to download and install the Flywheel CLI. If you cannot download the Flywheel CLI to your computer, you can upload smaller batches of files using your web browser.
Follow these steps to create a YAML or JSON file that will be your de-identification profile when uploading DICOMs via the Flywheel CLI.
- Open a plain text editor (for example, Sublime, TextEdit, or Notepad).
- Below is an example de-identification profile that uses the .yaml format. To get started, copy and paste the example into the text editor:
---
# You can give your de-identification profile a name
profile: Config1

# Indicates where you want to place the de-id log. You will use this log file
# to preview the de-id updates before uploading
deid-log: ~/Documents/deid_log.csv

# Sets the filetype to DICOM
dicom:
  # date-increment controls how many days to offset each date field
  # where increment-date (shown below) is configured.
  # Positive values will result in later dates, negative
  # values will result in earlier dates.
  date-increment: -17

  # patient-age-from-birthdate sets the PatientAge DICOM header as a 3-digit
  # value with a suffix indicating units. For example, an age in days would
  # be 091D, and that same age in months would be 003M. By default, if
  # the age fits in days, then days will be used,
  # otherwise if it fits in months, then months
  # will be used, otherwise years will be used
  patient-age-from-birthdate: true

  # Set patient age units as Years. Other options include months (M) and days (D)
  patient-age-units: Y

  # The following are field transformations.
  # remove, replace-with, increment-date, hash, and hashuid can be used with any
  # DICOM field. Set name to the field keyword defined by the DICOM standard
  fields:
    # Removes the field from the DICOM entirely.
    # If removal is not supported then this field will be blank.
    # This example removes PatientID.
    - name: PatientID
      remove: true

    # Replaces a DICOM field with the value provided.
    # This example replaces StationName with "XXXX" in Flywheel
    - name: StationName
      replace-with: XXXX

    # Offsets the date by the number of days defined in
    # the date-increment setting above.
    # In this example, StudyDate appears as 17 days earlier
    - name: StudyDate
      increment-date: true

    # One-way hashes a DICOM field to a unique string
    - name: AccessionNumber
      hash: true

    # Replaces a UID field with a hashed version of that
    # field. The first four nodes (prefix) and last node
    # (suffix) will be preserved, with the middle being
    # replaced by the hashed value
    - name: ConcatenationUID
      hashuid: true
Tip
What is YAML? YAML, which stands for YAML Ain't Markup Language, is a data serialization language. The Rollout Blog has a good explanation of the basic structure of a .yaml file. You can verify that your YAML is valid by using the online tool YAMLlint.
- To convert the text file into YAML, save the file with the .yaml extension.
- Update the settings as needed for your data.
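Because the profile can also be a JSON file, the same settings can be written out from a short script. The sketch below is a minimal illustration, not part of the Flywheel CLI: it mirrors the example profile as a Python dictionary, runs a small sanity check, and serializes the result to JSON. The helper name check_profile and the rules it applies are assumptions drawn from the settings and transformations described in this article, and the JSON layout is assumed to mirror the YAML nesting one-to-one.

```python
import json

# Transformations described in this article (see Table 2).
KNOWN_TRANSFORMATIONS = {
    "remove", "replace-with", "hash", "hashuid",
    "increment-date", "increment-datetime",
}

# Same settings as the YAML example, assuming JSON mirrors the YAML nesting.
profile = {
    "profile": "Config1",
    "deid-log": "~/Documents/deid_log.csv",
    "dicom": {
        "date-increment": -17,
        "patient-age-from-birthdate": True,
        "patient-age-units": "Y",
        "fields": [
            {"name": "PatientID", "remove": True},
            {"name": "StationName", "replace-with": "XXXX"},
            {"name": "StudyDate", "increment-date": True},
            {"name": "AccessionNumber", "hash": True},
            {"name": "ConcatenationUID", "hashuid": True},
        ],
    },
}


def check_profile(profile):
    """Hypothetical helper: return a list of problems (empty if none found)."""
    problems = []
    dicom = profile.get("dicom", {})
    for field in dicom.get("fields", []):
        name = field.get("name")
        if not name:
            problems.append(f"field entry without a name: {field!r}")
            continue
        actions = set(field) - {"name"}
        unknown = actions - KNOWN_TRANSFORMATIONS
        if not actions:
            problems.append(f"{name}: no transformation set")
        if unknown:
            problems.append(f"{name}: unknown transformation(s) {sorted(unknown)}")
        shifts_date = actions & {"increment-date", "increment-datetime"}
        if shifts_date and "date-increment" not in dicom:
            problems.append(f"{name}: date shift requested but date-increment is not set")
    return problems


problems = check_profile(profile)
profile_json = json.dumps(profile, indent=2)
print(problems or "no problems found")
print(profile_json)
```

Saving the printed JSON to a file gives a profile with the same content as the YAML example. The check only catches obvious mistakes such as a misspelled transformation name; previewing the de-identified output is still the way to confirm the result.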
We recommend testing your de-identification before uploading the data to Flywheel. This ensures that personal health information (PHI) is not uploaded to Flywheel. Follow these steps to preview how your data is de-identified.
- To create your import command, start with the basic DICOM import command:
fw import dicom [flags] <DICOM_folder_location> <group_id> <project_label>
- Add the --profile and --output-folder flags to configure where to place the preview of your de-identified data:
fw import dicom --profile <location/of/profile> --output-folder <location/to/preview> <DICOM_folder_location> <group_id> <project_label>
For example:
fw import dicom --profile ~/Documents/DeIDFileExample --output-folder ~/Documents/TestDeID ~/Desktop/flywheel psychology "Anxiety Study"
- The Flywheel CLI displays a hierarchy of the DICOM data it found.
- When prompted with Confirm upload(yes/no), enter yes. The Flywheel CLI displays its progress. The Flywheel CLI does not upload the data to Flywheel when you use the --output-folder flag.
- Once complete, go to your designated output folder to preview how your data will look when you upload it to Flywheel.
- To verify that your de-identification is working as expected, go to the location of the de-id log that you set in the profile. The log file shows the value of each field before and after de-identification.
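If you script the preview step, the command from the steps above can be assembled programmatically. The following is a minimal sketch, assuming only the --profile and --output-folder flags shown in this article; build_preview_command is a hypothetical helper name, and nothing is executed. Paths beginning with ~ are expanded in Python because a quoted ~ is not expanded by the shell.

```python
import os
import shlex


def build_preview_command(profile, output_folder, dicom_folder, group_id, project_label):
    """Assemble the preview command as an argument list (nothing is executed)."""
    return [
        "fw", "import", "dicom",
        "--profile", os.path.expanduser(profile),
        "--output-folder", os.path.expanduser(output_folder),
        os.path.expanduser(dicom_folder),
        group_id,
        project_label,
    ]


command = build_preview_command(
    "~/Documents/DeIDFileExample",
    "~/Documents/TestDeID",
    "~/Desktop/flywheel",
    "psychology",
    "Anxiety Study",
)

# shlex.join quotes arguments that contain spaces, such as the project label.
print(shlex.join(command))
```

Keeping the arguments in a list means a project label with spaces, such as Anxiety Study, stays a single argument. The printed line can be pasted into a terminal, where the confirmation prompt still appears as described above.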
Warning
The de-identification log includes PHI because it shows the DICOM fields before and after the de-identification. Depending on your environment, you may need to delete the log file after you review it.
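Reviewing and then removing the log can also be scripted. The sketch below assumes only that the log is a CSV file with a header row, as the .csv extension in the example profile suggests; the column layout is not documented here, so the code makes no assumptions about column names. The function names read_deid_log and delete_deid_log are hypothetical.

```python
import csv
from pathlib import Path


def read_deid_log(log_path):
    """Return (header, rows) from the de-id log without assuming column names."""
    path = Path(log_path).expanduser()
    with path.open(newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        rows = list(reader)
    return header, rows


def delete_deid_log(log_path):
    """Delete the log after review; return True if a file was removed."""
    path = Path(log_path).expanduser()
    if path.exists():
        path.unlink()
        return True
    return False
```

For example, calling read_deid_log("~/Documents/deid_log.csv") returns the header and rows for inspection, and delete_deid_log with the same path removes the file afterwards. Note that this is an ordinary file deletion, which may not meet your institution's requirements for disposing of PHI.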
Tip
See How to import DICOM Files Using the Flywheel CLI for more information on how to import files using the Flywheel CLI.
Table 1. Global de-identification settings
Setting | Input | Description |
---|---|---|
salt | string | This optional salt string is used for all hash-based field transformations. Using a different salt value will result in different (but consistent) values for hashed fields. This value can be any string. |
date-increment | integer | When set, this controls how many days to offset each date field where the increment-date action is chosen. Positive values will result in later dates, negative values will result in earlier dates. Incrementing by a multiple of 7 will keep the weekday consistent for shifted dates. |
patient-age-from-birthdate | true/false | When set to true, this sets the PatientAge DICOM header as a 3-digit value with a suffix indicating units. For example, an age in days would be 091D, and that same age in months would be 003M. By default, the age will be set using a best-fit approach. This means if the age fits in days, then days will be used, otherwise if it fits in months, then months will be used, otherwise years will be used. |
patient-age-units | D, M, Y | When set in conjunction with patient-age-from-birthdate, this will act as a preference for which units to use. If the value does not fit into the desired unit, the next level of units will be used. The most common use for this field would be to always use years as the patient age. Valid values are D, M, and Y for days, months, and years. |
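To make the age and date settings in Table 1 concrete, here is a minimal sketch of the behavior they describe. It is an illustration under stated assumptions, not Flywheel's implementation: the function names are hypothetical, months are counted as completed calendar months, and "fits" is taken to mean the value is at most 999 so that it can be written with three digits. Dates use the DICOM YYYYMMDD form.

```python
from datetime import date, datetime, timedelta

UNIT_ORDER = ["D", "M", "Y"]  # days first, then months, then years


def format_patient_age(birth, reference, preferred="D"):
    """Format an age as three digits plus a unit, starting at the preferred unit."""
    days = (reference - birth).days
    if days < 0:
        raise ValueError("reference date is before birth date")
    # Assumption: count completed calendar months.
    months = (reference.year - birth.year) * 12 + (reference.month - birth.month)
    if reference.day < birth.day:
        months -= 1
    values = {"D": days, "M": months, "Y": months // 12}
    # Try the preferred unit, then fall through to the next larger unit.
    for unit in UNIT_ORDER[UNIT_ORDER.index(preferred):]:
        if values[unit] <= 999:
            return f"{values[unit]:03d}{unit}"
    raise ValueError("age does not fit in three digits")


def shift_date(value, days):
    """Shift a DICOM date string (YYYYMMDD) by a number of days."""
    shifted = datetime.strptime(value, "%Y%m%d") + timedelta(days=days)
    return shifted.strftime("%Y%m%d")


birth, scan = date(2020, 1, 1), date(2020, 4, 1)
print(format_patient_age(birth, scan))       # 091D
print(format_patient_age(birth, scan, "M"))  # 003M
print(shift_date("20200318", -17))           # 20200301
```

Shifting by a multiple of 7, for example shift_date("20200318", -14), lands on the same day of the week as the original date, which is why Table 1 suggests such values when the weekday matters for your analysis.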
Table 2. DICOM field transformations
Transformation | Input | Description |
---|---|---|
remove | true/false | Removes the field from the DICOM entirely. If removal is not supported, then this will blank the field. |
replace-with | string | Replaces the contents of the field with the value provided. Be aware of the length of the field being replaced (for example, some DICOM fields only support a limited number of characters). |
hash | true/false | Replaces the contents of the field with a one-way cryptographic hash, in hexadecimal form. Only the first 16 characters of the hash will be used, in order to support short strings. |
increment-date | true/false | Offsets the date by the number of days defined in the date-increment setting. |
increment-datetime | true/false | Offsets the date by the number of days defined in the date-increment setting, preserving the time and timezone. |
hashuid | true/false | Replaces a UID field with a hashed version of that field. The first four nodes (prefix) and last node (suffix) will be preserved, with the middle being replaced by the hashed value. For example: 1.2.840.113619.6.283.4.983142589.7316.1300473420.841 becomes 1.2.840.113619.551726.420312.177022.222461.230571.501817.841 |
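The hash and hashuid rows in Table 2 can be illustrated with a short sketch. This article does not state which hash algorithm Flywheel uses or how the salt is combined with the value, so SHA-256 with a prepended salt is an assumption here, as is the way the middle of the UID is built from six numeric groups (chosen to resemble the example in the table). The function names are hypothetical, and the output will not match values produced by the Flywheel CLI.

```python
import hashlib


def hash_value(value, salt=""):
    """One-way hash of a field value, truncated to 16 hexadecimal characters."""
    # Assumption: SHA-256 over salt + value; the real algorithm may differ.
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return digest[:16]


def hash_uid(uid, salt=""):
    """Keep the first four nodes and the last node; replace the middle with hashed digits."""
    nodes = uid.split(".")
    if len(nodes) < 6:
        raise ValueError("UID needs at least four prefix nodes, a middle, and a suffix")
    prefix, suffix = nodes[:4], nodes[-1]
    digest = hashlib.sha256((salt + uid).encode("utf-8")).hexdigest()
    # Convert the hex digest to decimal digits, since UID nodes are numeric.
    digits = str(int(digest, 16)).zfill(78)
    # Assumption: six groups of six digits; int() drops leading zeros,
    # which are not allowed in multi-digit UID components.
    middle = [str(int(digits[i * 6:(i + 1) * 6])) for i in range(6)]
    return ".".join(prefix + middle + [suffix])


uid = "1.2.840.113619.6.283.4.983142589.7316.1300473420.841"
print(hash_value("ACC123", salt="my-study-salt"))
print(hash_uid(uid, salt="my-study-salt"))
```

The same input and salt always produce the same output, so relationships between files are preserved after de-identification, while a different salt yields a different but equally consistent set of values.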