As of version 6.2.0, the Flywheel CLI supports several configurable options that control how data are de-identified before upload. Most of these options are set in a de-identification profile file, which can be either YAML or JSON.
NOTE: The "name" of a DICOM field in a profile is the field's "keyword" as defined by the DICOM standard (for example, PatientID).
An example config.yaml looks like this:
# Start with the empty profile
# Map subjects according to settings in subjects.csv
# Log de-identification actions that were taken (before/after values)
# Configuration for dicom de-identification
# What date offset to use, in number of days
# Set patient age from date of birth
# Set patient age units as Years
# Remove a dicom field (e.g. remove PatientID)
- name: PatientID
# Replace a dicom field value (e.g. replace "StationName" with "XXXX")
- name: StationName
# Increment a date field by -17 days
- name: StudyDate
# One-Way hash a dicom field to a unique string
- name: AccessionNumber
# One-Way hash the ConcatenationUID,
# keeping the prefix (4 nodes) and suffix (2 nodes)
- name: ConcatenationUID
Most de-identification settings are defined per file type, and dicom is the only supported file type as of this writing. A few global settings are described first, followed by the supported field transformations.
Dicom File Settings
The following global dicom settings are available:
This optional salt string is used for all hash-based field transformations. Using a different salt value will result in different (but consistent) values for hashed fields. This value can be any string.
When set, the date-increment setting controls how many days to offset each date field for which the increment-date action is chosen. Positive values result in later dates; negative values result in earlier dates. Incrementing by a multiple of 7 keeps the day of the week the same for shifted dates.
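The weekday behavior is easy to check by hand. The following is a minimal sketch, not the CLI's own implementation: `shift_dicom_date` and `weekday_index` are hypothetical helpers, and the sketch assumes the field holds a DICOM date in YYYYMMDD form.

```python
from datetime import datetime, timedelta


def shift_dicom_date(value: str, days: int) -> str:
    """Shift a YYYYMMDD date string by a signed number of days."""
    parsed = datetime.strptime(value, "%Y%m%d").date()
    return (parsed + timedelta(days=days)).strftime("%Y%m%d")


def weekday_index(value: str) -> int:
    """Return the day of the week for a YYYYMMDD string (Monday is 0)."""
    return datetime.strptime(value, "%Y%m%d").date().weekday()


original = "20200301"
shifted_by_17 = shift_dicom_date(original, -17)  # not a multiple of 7
shifted_by_14 = shift_dicom_date(original, -14)  # a multiple of 7

# Only the multiple-of-7 shift keeps the same day of the week.
print(shifted_by_17, weekday_index(shifted_by_17) == weekday_index(original))
print(shifted_by_14, weekday_index(shifted_by_14) == weekday_index(original))
```

A shift of -17 days moves a Sunday to a Thursday, whereas -14 days lands on another Sunday, which matters when the day of the week carries meaning for the study.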
When set to true, patient-age-from-birthdate sets the PatientAge dicom header, computed from the patient's date of birth, as a 3-digit value with a suffix indicating units. For example, an age in days would be 091D, and that same age in months would be 003M. By default, the age is set using a best-fit approach: if the age fits in days, days are used; otherwise, if it fits in months, months are used; otherwise, years are used.
When set in conjunction with patient-age-from-birthdate, this will act as a preference for which units to use. If the value does not fit into the desired unit, the next level of units will be used. The most common use for this field would be to always use years as the patient age. Valid values are ‘D’, ‘M’, ‘Y’ for Days, Months and Years respectively.
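The best-fit rule and the unit preference can be sketched as follows. This is an illustration under stated assumptions rather than the CLI's code: `format_patient_age` is a hypothetical helper, "fits" is taken to mean "at most 999" because the value has three digits, and months and years are counted as completed calendar months and years.

```python
from datetime import date


def format_patient_age(birth: date, reference: date, preferred: str = "D") -> str:
    """Format an age as nnnD, nnnM or nnnY, starting from the preferred unit."""
    if reference < birth:
        raise ValueError("reference date is before the birth date")

    # Completed calendar months between the two dates (an assumption).
    months = (reference.year - birth.year) * 12 + (reference.month - birth.month)
    if reference.day < birth.day:
        months -= 1
    ages = {"D": (reference - birth).days, "M": months, "Y": months // 12}

    # Try the preferred unit first, then fall back to each larger unit.
    order = ["D", "M", "Y"]
    for unit in order[order.index(preferred):]:
        if ages[unit] <= 999:
            return f"{ages[unit]:03d}{unit}"
    raise ValueError("age does not fit in three digits")


infant = (date(2020, 1, 1), date(2020, 4, 1))
print(format_patient_age(*infant))                 # 091D
print(format_patient_age(*infant, preferred="M"))  # 003M

adult = (date(1980, 6, 15), date(2020, 6, 14))
print(format_patient_age(*adult))                  # 479M (too many days for nnnD)
print(format_patient_age(*adult, preferred="Y"))   # 039Y
```

The adult case shows why a preference of years is the common choice: without it, the best-fit rule reports the age in months.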
The following field transformations are supported:
Removes the field from the dicom entirely. If removal is not supported then this will blank the field.
Replaces the contents of the field with the value provided. Be aware of the length of the field being replaced; some DICOM fields only support a limited number of characters.
Replaces the contents of the field with a one-way cryptographic hash, in hexadecimal form. Only the first 16 characters of the hash are used, so that the result fits in short string fields.
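A salted, truncated hash of this kind might look like the sketch below. The hash algorithm and the way the salt is combined with the value are not stated here, so SHA-256 and simple prefixing are assumptions, and `hash_field` is a hypothetical name.

```python
import hashlib


def hash_field(value: str, salt: str = "") -> str:
    """Return the first 16 hex characters of a salted one-way hash."""
    # Assumption: SHA-256 over salt + value; the real algorithm may differ.
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return digest[:16]


same_salt_a = hash_field("ACC12345", salt="project-one")
same_salt_b = hash_field("ACC12345", salt="project-one")
other_salt = hash_field("ACC12345", salt="project-two")

print(same_salt_a == same_salt_b)  # the same salt gives a consistent value
print(same_salt_a == other_salt)   # a different salt gives a different value
print(len(same_salt_a))            # truncated to 16 characters
```

Because the output is consistent for a given salt, records that shared a value before de-identification still share one afterwards, without exposing the original.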
Offsets the date by the number of days defined in the date-increment setting.
Offsets the date by the number of days defined in the date-increment setting, preserving the time and timezone.
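To illustrate what preserving the time and timezone means, here is a small sketch. It assumes a DICOM datetime of the form YYYYMMDDHHMMSS followed by a UTC offset such as -0500; `shift_dicom_datetime` is a hypothetical helper, not part of the CLI.

```python
from datetime import datetime, timedelta

DT_FORMAT = "%Y%m%d%H%M%S%z"  # assumed layout: date, time, then UTC offset


def shift_dicom_datetime(value: str, days: int) -> str:
    """Shift only the calendar date; the time of day and offset are unchanged."""
    parsed = datetime.strptime(value, DT_FORMAT)
    return (parsed + timedelta(days=days)).strftime(DT_FORMAT)


print(shift_dicom_datetime("20200301143000-0500", -17))  # 20200213143000-0500
```

Only the leading date portion changes; the `143000` time and the `-0500` offset come through untouched.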
Replaces a UID field with a hashed version of that field. The first four nodes (prefix) and last node (suffix) will be preserved, with the middle being replaced by the hashed value. For example:
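As a rough illustration of the node-preserving scheme, the sketch below keeps the first four nodes and the last node and replaces each middle node with digits derived from a hash. How the CLI actually derives the middle nodes is not described here, so the per-node SHA-256 step, the six-digit limit, and the `hash_uid` name are all assumptions, and the sample UID is made up.

```python
import hashlib


def hash_uid(uid: str, salt: str = "", prefix_nodes: int = 4, suffix_nodes: int = 1) -> str:
    """Replace the middle nodes of a dotted UID with hash-derived numbers."""
    nodes = uid.split(".")
    middle_count = len(nodes) - prefix_nodes - suffix_nodes
    if middle_count <= 0:
        return uid  # nothing between the prefix and suffix to hash

    middle = []
    for index in range(middle_count):
        # Assumption: one salted SHA-256 per middle node, reduced to at most
        # six decimal digits. str(int) never produces a leading zero.
        seed = f"{salt}:{uid}:{index}".encode("utf-8")
        number = int(hashlib.sha256(seed).hexdigest()[:8], 16) % 1_000_000
        middle.append(str(number))

    suffix = nodes[len(nodes) - suffix_nodes:]
    return ".".join(nodes[:prefix_nodes] + middle + suffix)


sample = "1.2.840.10008.111.222.333.444"  # made-up UID for illustration
hashed = hash_uid(sample, salt="project-one")
print(hashed.startswith("1.2.840.10008."))  # prefix (4 nodes) kept
print(hashed.endswith(".444"))              # suffix (last node) kept
```

This sketch does not check the 64-character limit on DICOM UIDs, which a production implementation would need to respect.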
Setting the global deid-log path will result in a log file being created. This CSV file contains a before and after record for each dicom file that has been de-identified, with one column per field that was transformed.