
Create a De-ID Profile

In Flywheel, de-identification is configured using a de-id profile. A de-id profile is a set of instructions for what to do with metadata that may include PHI.

This article explains how to create a de-id profile to remove or transform sensitive DICOM data. There is also a reference guide detailing all possible data transformations with examples.

Warning

This feature is not guaranteed to satisfy any specific regulatory or compliance requirements. It is your responsibility to ensure that you set the appropriate configuration parameters and evaluate the end result to determine whether it is acceptable for your use cases and any regulatory or compliance requirements you may have.

What is a de-id profile?

A de-id profile is a set of instructions for what to do with metadata that may include PHI. De-id profiles can de-identify standard DICOM tags such as PatientName, StudyDate, and PatientAge, as well as private tags unique to your institution.

In general, there are two levels of de-identification:

  • File settings: At this level, the de-identification settings apply to all DICOM data by default. For example:
    • remove-private-tags, which removes all non-standard (private) DICOM tags.
    • recurse-sequence, which cascades de-identification transformations down an entire sequence of nested tags.
  • Field settings: These settings give you finer control over which specific DICOM tags are de-identified and how they are de-identified.
    • Field settings take precedence over the file settings, so you can add exceptions to a rule.
    • For example, you can set remove-private-tags to true at the file level, but then keep a specific private tag by using the field transformation keep.
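The precedence rule above can be pictured with a small Python sketch. This is only an illustration of the idea, not Flywheel's implementation: the function name resolve_action and the dictionary layout are hypothetical, and only the remove-private-tags and keep behaviors described above are modeled.

```python
def resolve_action(tag, is_private, file_settings, field_rules):
    """Decide what happens to one tag (hypothetical helper, not a Flywheel API)."""
    # 1. A field-level rule for this tag always wins.
    if tag in field_rules:
        return field_rules[tag]
    # 2. Otherwise the file-level default applies.
    if is_private and file_settings.get("remove-private-tags"):
        return "remove"
    # 3. No rule matches: the tag is left as it is.
    return "keep"


file_settings = {"remove-private-tags": True}
field_rules = {
    '(0009, "GEMS_IDEN_01", 1004)': "keep",  # exception to the file-level rule
    "PatientName": "remove",
}

kept_private = resolve_action('(0009, "GEMS_IDEN_01", 1004)', True, file_settings, field_rules)
other_private = resolve_action('(0009, "GEMS_IDEN_01", 1002)', True, file_settings, field_rules)
standard_tag = resolve_action("StudyDate", False, file_settings, field_rules)
print(kept_private, other_private, standard_tag)
```

The one private tag listed under the field rules is kept, while every other private tag falls through to the file-level default and is removed.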

How does it work with Flywheel?

All upload methods offer the option to use a profile to de-identify data at the edge of the Flywheel platform. De-identifying at the edge means only de-identified data is uploaded to Flywheel. Each time you import data into Flywheel, whether by the Connector, CLI, Web Uploader, or SDK, your data is de-identified according to your profile.

See the de-identification overview article for more details.

Instructions

Download the Flywheel CLI

Creating and testing de-id profiles is easiest with the Flywheel CLI. Learn more about how to download the CLI and sign in to your Flywheel account.

Generate a De-id Template

To begin, generate a de-id template. In later steps, you will update the template to fit your data.

  1. Open the Terminal or Windows Command Prompt app on your computer.
  2. Navigate to the folder that contains the Flywheel CLI.
  3. Enter the following command:

    • Windows: fw deid create C:\Users\[username]\Documents\deid_profile.yaml

    • Mac/Linux: fw deid create ~/Documents/deid_profile.yaml

  4. You will see the following message: Sample template successfully created


  5. Open the deid_profile.yaml file in a plain text editor such as TextEdit, Notepad, or Sublime Text. You will see the following template:

# You can give your de-identification profile a name

name: custom

# Indicates where you want to place the de-id log. You will use this log file to preview
# the de-id updates before uploading.
# This option is ignored during ingest; use --save-deid-logs PATH to save the log instead.

deid-log: ~/Documents/deid_log.csv

# Sets the filetype to DICOM

dicom:

  # date-increment controls how many days to offset each date field
  # where increment-date (shown below) is configured.
  # Positive values will result in later dates, negative
  # values will result in earlier dates.

  date-increment: -17

  # patient-age-from-birthdate sets the patient age in the DICOM header as a
  # 3-digit value with a unit suffix. For example, an age of 91 days would
  # be 091D, and that same age in months would be 003M. By default, if
  # the age fits in days, then days will be used,
  # otherwise if it fits in months, then months
  # will be used, otherwise years will be used.

  patient-age-from-birthdate: true

  # Set patient age units as Years. Other options include months (M) and days (D)

  patient-age-units: Y

  # The following are field transformations.
  # remove, replace-with, increment-date, hash, and hashuid can be used with any DICOM
  # field. Set name to the field's keyword as defined by the DICOM standard.
  fields:

    # Remove a DICOM field. Removes the field from the DICOM entirely.
    # If removal is not supported then this field will be blank.
    # This example removes PatientID.

    - name: PatientID
      remove: true

    # Replace a DICOM field with the value provided.
    # This example replaces "StationName" with "XXXX" in Flywheel.

    - name: StationName
      replace-with: XXXX

    # Offsets the date by the number of days defined in
    # the date-increment setting above, preserving the time
    # and timezone. In this example, StudyDate is shifted 17 days earlier.

    - name: StudyDate
      increment-date: true

    # One-way hash a DICOM field to a unique string

    - name: AccessionNumber
      hash: true

    # Replaces a UID field with a hashed version of that
    # field. The first four nodes (prefix) and last node
    # (suffix) will be preserved, with the middle being
    # replaced by the hashed value.

    - name: ConcatenationUID
      hashuid: true
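To make the date-increment setting in the template concrete, here is a minimal Python sketch of the arithmetic it describes. It assumes the date is stored in the standard DICOM DA format (YYYYMMDD). The function name increment_date is hypothetical, and the sketch does not attempt to reproduce how Flywheel handles times or timezones.

```python
from datetime import datetime, timedelta


def increment_date(da_value, days):
    """Shift a DICOM DA value (YYYYMMDD) by a signed number of days."""
    shifted = datetime.strptime(da_value, "%Y%m%d") + timedelta(days=days)
    return shifted.strftime("%Y%m%d")


# With date-increment: -17, a StudyDate of 10 January 2024 moves 17 days earlier.
print(increment_date("20240110", -17))  # 20231224
```

Because the same offset is applied to every field that sets increment-date: true, the intervals between dates within a study are preserved.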

Determine how your data needs to be de-identified

You are responsible for ensuring the de-identified data is acceptable for your use cases and meets any regulatory or compliance requirements you have.

One important piece to consider when determining your de-identification needs is that Flywheel uses certain DICOM tags to organize the images in groups, projects, subjects, sessions, and acquisitions during the import process.

The following are the default DICOM tags Flywheel uses to sort DICOM images. Your Flywheel site may use different tags for sorting data, so check with your institution's Flywheel admin for the specific sorting tags.

We do not recommend removing the sorting tags altogether when de-identifying data. Instead, use one of the other transformation methods such as replace-with. This allows Flywheel to automatically group related DICOM images while still de-identifying data.

Keyword               Tag            Flywheel Field
Patient ID            (0010,0020)    Group ID (when uploading via Connector),
                                     Project Label (when uploading via Connector),
                                     Subject ID
Study Instance UID    (0020,000D)    Session UID
Study Description     (0008,1030)    Session Label
Series Instance UID   (0020,000E)    Acquisition UID
Series Description    (0008,103E)    Acquisition Label
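As a rough aid for the recommendation above, the following Python sketch flags default sorting tags that a profile removes outright. It assumes the profile has already been parsed into a dictionary with the same shape as the YAML template. The helper name removed_sorting_tags is hypothetical, and the tag set covers only the defaults listed here, which your site may have changed.

```python
# DICOM keywords for the default sorting tags listed in the table.
DEFAULT_SORTING_TAGS = {
    "PatientID",
    "StudyInstanceUID",
    "StudyDescription",
    "SeriesInstanceUID",
    "SeriesDescription",
}


def removed_sorting_tags(profile):
    """Return the default sorting tags that the profile removes entirely."""
    fields = profile.get("dicom", {}).get("fields", [])
    return sorted(
        field["name"]
        for field in fields
        if field.get("remove") is True and field.get("name") in DEFAULT_SORTING_TAGS
    )


profile = {
    "dicom": {
        "fields": [
            {"name": "PatientID", "replace-with": "REDACTED"},  # fine: replaced, not removed
            {"name": "StudyDescription", "remove": True},        # flagged
            {"name": "PatientName", "remove": True},             # not a sorting tag
            {"regex": ".*IdentificationSequence.*", "remove": True},
        ]
    }
}
flagged = removed_sorting_tags(profile)
print(flagged)
```

Any tag this reports is a candidate for replace-with, hash, or hashuid instead of remove.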

Update the YAML File

Once you have determined how you want to de-identify your data, the next step is to update the YAML file.

  1. Update the YAML file with the appropriate transformations.
  2. Confirm that your profile is valid YAML. You can use an online tool like YAML Lint.
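A YAML linter only checks syntax, so a mis-indented profile can pass linting and still have the wrong shape. The sketch below is one possible structural check in Python, run on a profile that has already been parsed into a dictionary. The helper name profile_problems and its messages are hypothetical, and the rules reflect only the layout shown in this article, not a complete Flywheel schema.

```python
def profile_problems(profile):
    """List structural problems in a parsed de-id profile (hypothetical helper)."""
    if not isinstance(profile, dict):
        return ["profile must be a mapping"]
    dicom = profile.get("dicom")
    if not isinstance(dicom, dict):
        return ["'dicom' must be a mapping; check the indentation below it"]
    fields = dicom.get("fields", [])
    if not isinstance(fields, list):
        return ["'dicom.fields' must be a list of '- name: ...' entries"]
    problems = []
    for index, field in enumerate(fields):
        has_selector = isinstance(field, dict) and ("name" in field or "regex" in field)
        if not has_selector:
            problems.append(f"fields[{index}] needs a 'name' or 'regex' key")
    return problems


# What a YAML parser returns when the lines under "dicom:" are not indented:
# the settings become top-level keys and "dicom" itself is empty.
misindented = {"dicom": None, "remove-private-tags": True}
well_formed = {
    "dicom": {
        "remove-private-tags": True,
        "fields": [{"name": "PatientID", "replace-with": "REDACTED"}],
    }
}
print(profile_problems(misindented))
print(profile_problems(well_formed))
```

To run this against a real file, you could load it first with a YAML parser such as the third-party PyYAML package (yaml.safe_load), which is not included here so that the sketch stays dependency-free.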

Test

Test your de-id profile locally before uploading sensitive data to Flywheel.

Example de-id Profiles

Create a Keeplist

One option for removing many DICOM tags is to create a keeplist using remove-undefined. This means that any DICOM tags not specified under fields are removed. This example also shows a few basic transformations of the fields in the keeplist:

name: My Profile
description: version 1 of an example de-id profile

dicom:
  # Remove every DICOM tag that is not listed under fields
  remove-undefined: true
  fields:
    - name: PatientID
      # Quoted so that YAML keeps the leading zeros
      replace-with: "001"
    - name: PatientBirthDate
      jitter: true
      jitter-range: 10
    - name: PatientSex
    - name: PatientAge
    - name: StudyDate
      jitter: true
    - name: AcquisitionDate
    - name: EthnicGroup
    - name: SOPClassUID
    - name: SeriesDescription
    - name: StudyDescription
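The effect of a keeplist can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Flywheel's code: the header is a plain dictionary of keyword to value, the helper name apply_keeplist is hypothetical, and only replace-with is modeled (jitter and the other transformations are left out).

```python
def apply_keeplist(header, fields):
    """Keep only the listed fields, applying replace-with where it is given."""
    result = {}
    for field in fields:
        name = field["name"]
        if name not in header:
            continue  # nothing to keep if the tag is absent from the file
        result[name] = field.get("replace-with", header[name])
    return result


header = {
    "PatientID": "MRN12345",
    "PatientName": "Doe^Jane",
    "PatientSex": "F",
    "StudyDescription": "Brain MRI",
}
fields = [
    {"name": "PatientID", "replace-with": "001"},
    {"name": "PatientSex"},
    {"name": "StudyDescription"},
]
deidentified = apply_keeplist(header, fields)
print(deidentified)
```

PatientName is dropped simply because it is not listed, which is why a keeplist is a conservative choice when you cannot enumerate every tag that might contain PHI.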

Create a Blocklist

Let's say you have the following requirements for your data:

  • Remove some fields that have PII and are not needed in Flywheel.
    • Accomplish this by listing the tags under fields and adding remove: true. This creates a blocklist.
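  • Offset dates by a consistent number of days.
    • Accomplish this by using date-increment together with increment-date: true on each date field.
  • Keep the default sorting tags used in your environment, but replace the values of some of those fields with "REDACTED" or a hash.
    • Accomplish this by using replace-with and hashuid.
  • Keep one private DICOM tag to use in Flywheel, but remove the rest.
    • Accomplish this by setting remove-private-tags to true and adding keep: true to the tag you want to retain.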

Here's an example of a de-id profile that would satisfy the above requirements:

name: My Profile
description: version 1 of an example de-id profile

dicom:
  remove-private-tags: true
  date-increment: 14
  fields:
    # Apply the date-increment offset to StudyDate
    - name: StudyDate
      increment-date: true
    - name: PatientID
      replace-with: REDACTED
    - name: StudyInstanceUID
      hashuid: true
    - name: SeriesInstanceUID
      hashuid: true
    - name: SOPInstanceUID
      hashuid: true
    - name: PatientName
      remove: true
    - name: (0009, "GEMS_IDEN_01", 1004)
      keep: true
    - name: AccessionNumber
      remove: true
    - name: InstitutionName
      remove: true
    - name: InstitutionAddress
      remove: true
    - name: ReferringPhysicianName
      remove: true
    - name: ReferringPhysicianAddress
      remove: true
    - name: ReferringPhysicianTelephoneNumbers
      remove: true
    - name: InstitutionalDepartmentName
      remove: true
    - name: PhysiciansOfRecord
      remove: true
    - name: PerformingPhysicianName
      remove: true
    - name: NameOfPhysiciansReadingStudy
      remove: true
    - name: OperatorsName
      remove: true
    - name: AdmittingDiagnosesDescription
      remove: true
    - name: PatientBirthTime
      remove: true
    - name: PatientInsurancePlanCodeSequence
      remove: true
    - name: OtherPatientIDs
      remove: true
    - name: OtherPatientNames
      remove: true
    - name: OtherPatientIDsSequence
      remove: true
    - name: PatientBirthName
      remove: true
    - name: PatientAddress
      remove: true
    - name: PatientMotherBirthName
      remove: true
    - name: MilitaryRank
      remove: true
    - name: MedicalRecordLocator
      remove: true
    - name: PatientTelephoneNumbers
      remove: true
    - name: EthnicGroup
      remove: true
    - name: Occupation
      remove: true
    - name: AdditionalPatientHistory
      remove: true
    - name: ResponsiblePerson
      remove: true
    - name: PatientComments
      remove: true
    - name: ClinicalTrialSponsorName
      remove: true
    - name: ClinicalTrialProtocolID
      remove: true
    - name: ClinicalTrialProtocolName
      remove: true
    - name: ClinicalTrialSiteID
      remove: true
    - name: ClinicalTrialSiteName
      remove: true
    - name: ClinicalTrialSubjectID
      remove: true
    - name: ClinicalTrialTimePointID
      remove: true
    - name: ClinicalTrialTimePointDescription
      remove: true
    - name: ClinicalTrialCoordinatingCenterName
      remove: true
    - name: ProtocolName
      remove: true
    - name: ImageComments
      remove: true
    - name: StudyComments
      remove: true
    - name: RequestingPhysician
      remove: true
    - name: RequestAttributesSequence
      remove: true
    - name: NamesOfIntendedRecipientsOfResults
      remove: true
    - name: PersonIdentificationCodeSequence
      remove: true
    - name: PersonAddress
      remove: true
    - name: PersonTelephoneNumbers
      remove: true
    - name: VerifyingObserverName
      remove: true
    - name: PersonName
      remove: true
    - name: ContentSequence
      remove: true
    - name: ContentCreatorName
      remove: true
    - name: ReviewerName
      remove: true
    - name: OriginalAttributesSequence
      remove: true
    # StudyDescription is a default sorting tag, so it is replaced rather than removed
    - name: StudyDescription
      replace-with: REDACTED
    - name: DerivationDescription
      remove: true
    - name: ClinicalTrialSeriesDescription
      remove: true
    - name: TherapyDescription
      remove: true
    - name: InterventionDescription
      remove: true
    - name: RequestedProcedureDescription
      remove: true
    - name: AcquisitionProtocolDescription
      remove: true
    - name: ScheduledStationAETitle
      remove: true
    - name: ScheduledPerformingPhysicianName
      remove: true
    - name: DeviceDescription
      remove: true
    - name: DischargeDiagnosisDescription
      remove: true
    - name: StationName
      remove: true
    - name: ScheduledStationName
      remove: true
    - name: PerformedStationAETitle
      remove: true
    - name: PerformedStationName
      remove: true
    - name: PerformedProcedureStepDescription
      remove: true
    - name: DeviceSerialNumber
      remove: true
    - name: PerformedProcedureStepID
      remove: true
    - name: ClinicalTrialSubjectReadingID
      remove: true
    - name: IssuerOfPatientID
      remove: true
    - name: DigitalSignaturesSequence
      remove: true
    - regex: .*IdentificationSequence.*
      remove: true
    - name: FrameOfReferenceUID
      hashuid: true
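The hashuid transformation used for the UID fields above keeps the first four nodes and the last node of a UID and replaces the middle with a hash. The Python sketch below illustrates that shape only. The use of SHA-256 and the truncation to 20 decimal digits are assumptions made for this example, so its output will not match what Flywheel produces, and the function name hash_uid is hypothetical.

```python
import hashlib


def hash_uid(uid, prefix_nodes=4, suffix_nodes=1):
    """Replace the middle nodes of a UID with digits derived from a hash."""
    nodes = uid.split(".")
    if len(nodes) <= prefix_nodes + suffix_nodes:
        raise ValueError("UID has no middle nodes to hash")
    prefix = nodes[:prefix_nodes]
    suffix = nodes[-suffix_nodes:]
    # Assumption: SHA-256 of the whole UID, written as decimal digits so the
    # result still looks like a UID node, truncated to 20 digits.
    digest = hashlib.sha256(uid.encode("ascii")).hexdigest()
    middle = str(int(digest, 16))[:20]
    return ".".join(prefix + [middle] + suffix)


uid = "1.2.840.113619.2.55.3.604688.12345"
hashed = hash_uid(uid)
print(hashed)
```

Because the hash is deterministic, every file that shares the same original StudyInstanceUID or SeriesInstanceUID receives the same hashed value, so related images can still be grouped after de-identification.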

Next steps

See our article to learn how to test your de-id profile locally before uploading sensitive data to Flywheel.

Resources

De-ID Profile Transformation Guide