Create a De-ID Profile
In Flywheel, de-identification is configured using a de-id profile. A de-id profile is a set of instructions for what to do with metadata that may include PHI.
This article explains how to create a de-id profile to remove or transform sensitive DICOM data. There is also a reference guide detailing all possible data transformations with examples.
Warning
This feature is not guaranteed to satisfy any specific regulatory or compliance requirements. It is your responsibility to ensure that you set the appropriate configuration parameters and evaluate the end result to determine whether it is acceptable for your use cases and any regulatory or compliance requirements you may have.
What is a de-id profile?
A de-id profile is a set of instructions for what to do with metadata that may include PHI. De-id profiles can de-identify standard DICOM tags such as PatientName, StudyDate, and PatientAge, as well as private tags unique to your institution.
In general, there are two levels of de-identification:
- File settings: At this level, the de-identification settings apply to all DICOM data by default. For example:
  - remove-private-tags removes all non-standard DICOM tags.
  - recurse-sequence cascades de-identification transformations down an entire sequence of nested tags.
- Fields settings: These settings give you finer control over which specific DICOM tags are de-identified and how they are de-identified.
  - These settings take precedence over the file settings, so you can add exceptions to a rule.
  - For example, you can set remove-private-tags to true at the file level, but then keep a specific custom tag by using the field transformation keep.
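As a minimal sketch of how the two levels combine, the fragment below sets remove-private-tags at the file level and then lists one private tag under fields with keep, so that tag should survive. The private tag is only a placeholder, and the value true for recurse-sequence is an assumption; check the reference guide for the exact options your Flywheel version supports.

```yaml
dicom:
  # File settings: apply to all DICOM data by default
  remove-private-tags: true
  # Assumed boolean value, shown for illustration only
  recurse-sequence: true
  fields:
    # Field settings: take precedence over the file settings above.
    # Placeholder private tag; substitute the tag you need to keep.
    - name: (0009, "GEMS_IDEN_01", 1004)
      keep: true
```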
How does it work with Flywheel?
All upload methods offer the option to use a profile to de-identify data at the edge of the Flywheel platform. De-identifying at the edge means only de-identified data is uploaded to Flywheel. Each time you import data into Flywheel, whether by the Connector, CLI, Web Uploader, or SDK, your data is de-identified according to your profile.
See the de-identification overview article for more details.
Instructions
Download the Flywheel CLI
Creating and testing de-id profiles is easiest with the Flywheel CLI. Learn more about how to download the CLI and sign in to your Flywheel account.
Generate a De-id Template
To begin, we will generate a de-id template. In later steps, we will update the template to fit your data.
- Open the Terminal or Windows Command Prompt app on your computer.
- Navigate to the Flywheel CLI.
- Enter the following command:
  - Windows:
    fw deid create C:\Users\[username]\Documents\deid_profile.yaml
  - Mac/Linux:
    fw deid create ~/Documents/deid_profile.yaml
- You will see the following message:
  Sample template successfully created
- Open the deid_profile.yaml file in a plain text editor such as TextEdit, Notepad, or Sublime. You will see the following template:
# You can give your de-identification profile a name
name: custom
# Indicates where you want to place the de-id log. You will use this log file to preview
# the de-id updates before uploading.
# This option is ignored in ingest; you can use --save-deid-logs PATH to save the log.
deid-log: ~/Documents/deid_log.csv
# Sets the filetype to DICOM
dicom:
  # date-increment controls how many days to offset each date field
  # where increment-date (shown below) is configured.
  # Positive values will result in later dates, negative
  # values will result in earlier dates.
  date-increment: -17
  # patient-age-from-birthdate sets the DICOM header as a 3-digit value with a suffix.
  # For example, an age of 91 days would be 091D, and that same age in months
  # would be 003M. By default, if the age fits in days, then days will be used,
  # otherwise if it fits in months, then months will be used,
  # otherwise years will be used.
  patient-age-from-birthdate: true
  # Set patient age units as years (Y). Other options include months (M) and days (D).
  patient-age-units: Y
  # The following are field transformations.
  # remove, replace-with, increment-date, hash, and hashuid can be used with any DICOM
  # field. Replace name with the DICOM field "keyword" as defined by the DICOM standard.
  fields:
    # Remove a DICOM field entirely.
    # If removal is not supported then this field will be blank.
    # This example removes PatientID.
    - name: PatientID
      remove: true
    # Replace a DICOM field with the value provided.
    # This example replaces "StationName" with "XXXX" in Flywheel.
    - name: StationName
      replace-with: XXXX
    # Offsets the date by the number of days defined in
    # the date-increment setting above, preserving the time
    # and timezone. In this example, StudyDate appears as 17 days earlier.
    - name: StudyDate
      increment-date: true
    # One-way hash a DICOM field to a unique string.
    - name: AccessionNumber
      hash: true
    # Replaces a UID field with a hashed version of that
    # field. The first four nodes (prefix) and last node
    # (suffix) will be preserved, with the middle being
    # replaced by the hashed value.
    - name: ConcatenationUID
      hashuid: true
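To make the template's date and UID transformations more concrete, the fragment below repeats two of its fields with worked values in the comments. The input values are invented for illustration, and the hashed portion of the UID is shown only as a placeholder, since the actual digits depend on Flywheel's hashing.

```yaml
dicom:
  date-increment: -17
  fields:
    # Hypothetical input StudyDate: 20200118
    # Shifted 17 days earlier:      20200101
    - name: StudyDate
      increment-date: true
    # Hypothetical input UID: 1.2.840.99999.1.2.3.4.5
    # Pattern after hashuid:  1.2.840.99999.<hashed nodes>.5
    # (first four nodes and last node kept, middle replaced)
    - name: ConcatenationUID
      hashuid: true
```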
Determine how your data needs to be de-identified
You are responsible for ensuring the de-identified data is acceptable for your use cases and meets any regulatory or compliance requirements you have.
One important piece to consider when determining your de-identification needs is that Flywheel uses certain DICOM tags to organize the images in groups, projects, subjects, sessions, and acquisitions during the import process.
The following are the default DICOM tags Flywheel uses to sort DICOM images. Your Flywheel site may use different tags for sorting data, so check with your institution's Flywheel admin for the specific sorting tags.
We do not recommend removing the sorting tags altogether when de-identifying data. Instead, use one of the other transformation methods, such as replace-with. This allows Flywheel to automatically group related DICOM images while still de-identifying data.
Keyword | Tag | Flywheel Field |
---|---|---|
PatientID | (0010,0020) | Group ID (when uploading via Connector); Project Label (when uploading via Connector); Subject ID |
StudyInstanceUID | (0020,000D) | Session UID |
StudyDescription | (0008,1030) | Session Label |
SeriesInstanceUID | (0020,000E) | Acquisition UID |
SeriesDescription | (0008,103E) | Acquisition Label |
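One way to follow this recommendation, sketched under the assumption that your site uses the default sorting tags listed above, is to transform each sorting tag rather than remove it. The replacement values are placeholders, and the choice of transformation for each tag is only an example.

```yaml
dicom:
  fields:
    # Subject ID: replace rather than remove, so images can still be grouped
    - name: PatientID
      replace-with: SUBJ-0001    # placeholder value
    # Session and acquisition UIDs: hash rather than remove
    - name: StudyInstanceUID
      hashuid: true
    - name: SeriesInstanceUID
      hashuid: true
    # Session label: replace free text that may contain PHI
    - name: StudyDescription
      replace-with: REDACTED     # placeholder value
```

SeriesDescription is left untouched in this sketch; add a transformation for it if your acquisition labels can contain PHI.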
Update the YAML File
Once you have determined how you want to de-identify your data, the next step is to update the YAML file.
- Update the YAML file with the appropriate transformations.
- Confirm that your profile is valid YAML. You can use an online tool like YAML Lint.
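Two common reasons a profile fails validation, or validates but behaves unexpectedly, are inconsistent indentation and unquoted values that YAML reads as numbers. The fragment below is a small sketch of both points; the replacement values are placeholders.

```yaml
dicom:
  fields:
    # Each field is a list item: "- name:" starts the item, and the
    # transformation key is indented to line up with "name".
    - name: PatientID
      # Quote values with leading zeros. Many YAML parsers read an
      # unquoted 001 as the number 1 rather than the text "001".
      replace-with: "001"
    - name: StationName
      replace-with: XXXX
```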
Test
Test your de-id profile locally before uploading sensitive data to Flywheel.
Example de-id Profiles
Create a Keeplist
One option for removing many DICOM tags is to create a keeplist using remove-undefined. This means that any DICOM tags not specified under fields are removed. This example also shows a few basic transformations of the fields in the keeplist:
name: My Profile
description: version 1 of an example de-id profile
dicom:
  remove-undefined: true
  fields:
    - name: PatientID
      replace-with: "001"
    - name: PatientBirthDate
      jitter: true
      jitter-range: 10
    - name: PatientSex
    - name: PatientAge
    - name: StudyDate
      jitter: true
    - name: AcquisitionDate
    - name: EthnicGroup
    - name: SOPClassUID
    - name: SeriesDescription
    - name: StudyDescription
Create a Blocklist
Let's say you have the following requirements for your data:
- Remove some fields that have PII and are not needed in Flywheel.
  - Accomplish this by listing the tags under fields and adding remove: true. This creates a blocklist.
- Offset dates by a consistent number of days.
  - Accomplish this by using date-increment.
- Keep the default sorting tags in your environment, but replace the value of some of these fields with "REDACTED" or a hash.
- Keep one private DICOM tag to use in Flywheel, but remove the rest.
  - Accomplish this with remove-private-tags at the file level, plus keep: true on the private tag you want to retain.
Here's an example of a de-id profile that would satisfy the above requirements:
name: My Profile
description: version 1 of an example de-id profile
dicom:
  remove-private-tags: true
  date-increment: 14
  fields:
    - name: PatientID
      replace-with: REDACTED
    - name: StudyInstanceUID
      hashuid: true
    - name: SeriesInstanceUID
      hashuid: true
    - name: SOPInstanceUID
      hashuid: true
    - name: PatientName
      remove: true
    - name: (0009, "GEMS_IDEN_01", 1004)
      keep: true
    - name: AccessionNumber
      remove: true
    - name: InstitutionName
      remove: true
    - name: InstitutionAddress
      remove: true
    - name: ReferringPhysicianName
      remove: true
    - name: ReferringPhysicianAddress
      remove: true
    - name: ReferringPhysicianTelephoneNumbers
      remove: true
    - name: InstitutionalDepartmentName
      remove: true
    - name: PhysiciansOfRecord
      remove: true
    - name: PerformingPhysicianName
      remove: true
    - name: NameOfPhysiciansReadingStudy
      remove: true
    - name: OperatorsName
      remove: true
    - name: AdmittingDiagnosesDescription
      remove: true
    - name: PatientBirthTime
      remove: true
    - name: PatientInsurancePlanCodeSequence
      remove: true
    - name: OtherPatientIDs
      remove: true
    - name: OtherPatientNames
      remove: true
    - name: OtherPatientIDsSequence
      remove: true
    - name: PatientBirthName
      remove: true
    - name: PatientAddress
      remove: true
    - name: PatientMotherBirthName
      remove: true
    - name: MilitaryRank
      remove: true
    - name: MedicalRecordLocator
      remove: true
    - name: PatientTelephoneNumbers
      remove: true
    - name: EthnicGroup
      remove: true
    - name: Occupation
      remove: true
    - name: AdditionalPatientHistory
      remove: true
    - name: ResponsiblePerson
      remove: true
    - name: PatientComments
      remove: true
    - name: ClinicalTrialSponsorName
      remove: true
    - name: ClinicalTrialProtocolID
      remove: true
    - name: ClinicalTrialProtocolName
      remove: true
    - name: ClinicalTrialSiteID
      remove: true
    - name: ClinicalTrialSiteName
      remove: true
    - name: ClinicalTrialSubjectID
      remove: true
    - name: ClinicalTrialTimePointID
      remove: true
    - name: ClinicalTrialTimePointDescription
      remove: true
    - name: ClinicalTrialCoordinatingCenterName
      remove: true
    - name: ProtocolName
      remove: true
    - name: ImageComments
      remove: true
    - name: StudyComments
      remove: true
    - name: RequestingPhysician
      remove: true
    - name: RequestAttributesSequence
      remove: true
    - name: NamesOfIntendedRecipientsOfResults
      remove: true
    - name: PersonIdentificationCodeSequence
      remove: true
    - name: PersonAddress
      remove: true
    - name: PersonTelephoneNumbers
      remove: true
    - name: VerifyingObserverName
      remove: true
    - name: PersonName
      remove: true
    - name: ContentSequence
      remove: true
    - name: ContentCreatorName
      remove: true
    - name: ReviewerName
      remove: true
    - name: OriginalAttributesSequence
      remove: true
    - name: StudyDescription
      remove: true
    - name: DerivationDescription
      remove: true
    - name: ClinicalTrialSeriesDescription
      remove: true
    - name: TherapyDescription
      remove: true
    - name: InterventionDescription
      remove: true
    - name: RequestedProcedureDescription
      remove: true
    - name: AcquisitionProtocolDescription
      remove: true
    - name: ScheduledStationAETitle
      remove: true
    - name: ScheduledPerformingPhysicianName
      remove: true
    - name: DeviceDescription
      remove: true
    - name: DischargeDiagnosisDescription
      remove: true
    - name: StationName
      remove: true
    - name: ScheduledStationName
      remove: true
    - name: PerformedStationAETitle
      remove: true
    - name: PerformedStationName
      remove: true
    - name: PerformedProcedureStepDescription
      remove: true
    - name: DeviceSerialNumber
      remove: true
    - name: PerformedProcedureStepID
      remove: true
    - name: ClinicalTrialSubjectReadingID
      remove: true
    - name: IssuerOfPatientID
      remove: true
    - name: DigitalSignaturesSequence
      remove: true
    - regex: .*IdentificationSequence.*
      remove: true
    - name: FrameOfReferenceUID
      hashuid: true
Next steps
See our article to learn how to test your de-id profile locally before uploading sensitive data to Flywheel.