Pipeline to Detect PHI in Medical Text

Description

This pretrained pipeline is built on top of the ner_deid_biobert model. It detects protected health information (PHI) entities in medical text.

Predicted Entities

LOCATION, CONTACT, PROFESSION, NAME, DATE, ID, AGE


How to use

Python

from sparknlp.pretrained import PretrainedPipeline

# Download (or load from cache) the licensed pretrained pipeline
pipeline = PretrainedPipeline("ner_deid_biobert_pipeline", "en", "clinical/models")

# Annotate a single string; the result holds the pipeline's output annotations
pipeline.annotate("""A. Record date : 2093-01-13, David Hale, M.D., Name : Hendrickson, Ora MR. # 7194334 Date : 01/13/93 PCP : Oliveira, 25-year-old, Record date : 1-11-2000. Cocke County Baptist Hospital. 0295 Keats Street. Phone +1 (302) 786-5227. Patient's complaints first surfaced when he started working for Brothers Coal-Mine.""")

Scala

import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

// Download (or load from cache) the licensed pretrained pipeline
val pipeline = new PretrainedPipeline("ner_deid_biobert_pipeline", "en", "clinical/models")

pipeline.annotate("A. Record date : 2093-01-13, David Hale, M.D., Name : Hendrickson, Ora MR. # 7194334 Date : 01/13/93 PCP : Oliveira, 25-year-old, Record date : 1-11-2000. Cocke County Baptist Hospital. 0295 Keats Street. Phone +1 (302) 786-5227. Patient's complaints first surfaced when he started working for Brothers Coal-Mine.")

NLU

import nlu

nlu.load("en.deid.ner_biobert.pipeline").predict("""A. Record date : 2093-01-13, David Hale, M.D., Name : Hendrickson, Ora MR. # 7194334 Date : 01/13/93 PCP : Oliveira, 25-year-old, Record date : 1-11-2000. Cocke County Baptist Hospital. 0295 Keats Street. Phone +1 (302) 786-5227. Patient's complaints first surfaced when he started working for Brothers Coal-Mine.""")

Results

+-----------------------------+--------+
|chunks                       |entities|
+-----------------------------+--------+
|2093-01-13                   |DATE    |
|David Hale                   |NAME    |
|Hendrickson                  |NAME    |
|Ora                          |NAME    |
|7194334                      |ID      |
|01/13/93                     |DATE    |
|Oliveira                     |LOCATION|
|1-11-2000                    |DATE    |
|Cocke County Baptist Hospital|LOCATION|
|Keats Street                 |LOCATION|
|Brothers                     |LOCATION|
+-----------------------------+--------+
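
The pipeline only detects PHI chunks; it does not rewrite the text. As a rough illustration of how the detected chunks might be used downstream, the sketch below masks each chunk with its entity label. The helper mask_chunks is hypothetical and not part of Spark NLP, and the chunk/entity pairs are copied by hand from the first rows of the table above instead of being read from the pipeline output, whose exact structure is not shown here.

```python
import re

def mask_chunks(text, chunks):
    """Replace each (chunk, entity) pair found in text with an <ENTITY> tag.

    Hypothetical helper for illustration only; not a Spark NLP API.
    """
    label_of = dict(chunks)
    # Try longer chunks first so a long chunk wins over any shorter one it contains
    alternatives = sorted(label_of, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(c) for c in alternatives) + r")(?!\w)"
    )
    # Single pass over the text, so inserted tags are never re-scanned
    return pattern.sub(lambda m: "<" + label_of[m.group(0)] + ">", text)

# First part of the example sentence and the matching rows of the results table
text = "Record date : 2093-01-13, David Hale, M.D., Name : Hendrickson, Ora MR. # 7194334"
chunks = [
    ("2093-01-13", "DATE"),
    ("David Hale", "NAME"),
    ("Hendrickson", "NAME"),
    ("Ora", "NAME"),
    ("7194334", "ID"),
]

masked = mask_chunks(text, chunks)
print(masked)
# Record date : <DATE>, <NAME>, M.D., Name : <NAME>, <NAME> MR. # <ID>
```

Matching on whole chunks with non-word boundaries keeps a short chunk such as "Ora" from being replaced inside a longer, unrelated word. Anything the model misses stays in the text unmasked.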

Model Information

Model Name: ner_deid_biobert_pipeline
Type: pipeline
Compatibility: Healthcare NLP 3.4.1+
License: Licensed
Edition: Official
Language: en
Size: 422.0 MB

Included Models

  • DocumentAssembler
  • SentenceDetectorDLModel
  • TokenizerModel
  • BertEmbeddings
  • MedicalNerModel
  • NerConverter