DICOM De-identification Pipeline

Description

This pipeline masks protected health information (PHI) in DICOM files. Masked entities include AGE, BIOID, CITY, COUNTRY, DATE, DEVICE, DOCTOR, EMAIL, FAX, HEALTHPLAN, HOSPITAL, IDNUM, LOCATION, MEDICALRECORD, ORGANIZATION, PATIENT, PHONE, PROFESSION, STATE, STREET, URL, USERNAME, ZIP, ACCOUNT, LICENSE, VIN, SSN, DLN, PLATE, and IPADDR. The output is a DICOM document similar to the input, but with black bounding boxes drawn over the targeted entities in the image and with de-identified metadata.
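To make the two effects on the output concrete, here is a minimal, library-free sketch of what "black bounding boxes" and "de-identified metadata" mean. It is not the pipeline's implementation. The function names, the `(x, y, width, height)` box format, and the small `PHI_TAGS` set are assumptions made for illustration only.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical box format: (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]

# Hypothetical subset of metadata tags treated as PHI in this sketch.
PHI_TAGS: Set[str] = {"PatientName", "PatientID", "PatientBirthDate"}


def draw_black_boxes(pixels: List[List[int]], boxes: List[Box]) -> List[List[int]]:
    """Return a copy of a grayscale image with every box filled with 0 (black)."""
    masked = [row[:] for row in pixels]  # copy so the input stays untouched
    height = len(masked)
    width = len(masked[0]) if masked else 0
    for x, y, w, h in boxes:
        # Clip each box to the image bounds before filling it.
        for r in range(max(y, 0), min(y + h, height)):
            for c in range(max(x, 0), min(x + w, width)):
                masked[r][c] = 0
    return masked


def scrub_metadata(metadata: Dict[str, str], phi_tags: Set[str] = PHI_TAGS) -> Dict[str, str]:
    """Return a copy of the metadata with PHI-bearing tags blanked out."""
    return {tag: ("" if tag in phi_tags else value) for tag, value in metadata.items()}


if __name__ == "__main__":
    image = [[255] * 4 for _ in range(3)]
    print(draw_black_boxes(image, [(1, 0, 2, 2)]))
    print(scrub_metadata({"PatientName": "DOE^JANE", "Modality": "CT"}))
```

In the actual pipeline, the regions to cover come from text detection and entity recognition on the image, and the metadata handling is performed by its own stages. The sketch only shows the final masking step on toy data.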

How to use

from sparknlp.pretrained import PretrainedPipeline

deid_pipeline = PretrainedPipeline("dicom_died_pipeline", "en", "clinical/models")

Model Information

Model Name: dicom_died_pipeline
Type: pipeline
Compatibility: Healthcare NLP 5.3.3+
License: Licensed
Edition: Official
Language: en
Size: 1.8 GB

Included Models

  • DicomToMetadata
  • DicomToImageV3
  • ImageTextDetectorV2
  • ImageToTextV3
  • DicomDeidentifier
  • PipelineModel
  • PositionFinder
  • DicomDrawRegions
  • DicomMetadataDeidentifier