Description
This pipeline can be used to extract protected health information (PHI) entities such as LOCATION, CONTACT, PROFESSION, NAME, DATE, ID, AGE, COUNTRY, SSN, ACCOUNT, DLN, PLATE, VIN, LICENSE, PHONE, ZIP, MEDICALRECORD, EMAIL, and IPADDR.
Predicted Entities
LOCATION, CONTACT, PROFESSION, NAME, DATE, ID, AGE, COUNTRY, SSN, ACCOUNT, DLN, PLATE, VIN, LICENSE, PHONE, ZIP, MEDICALRECORD, EMAIL, IPADDR
How to use
# Python: load the pretrained de-identification NER pipeline
from sparknlp.pretrained import PretrainedPipeline
deid_pipeline = PretrainedPipeline("ner_deid_generic_context_augmented_pipeline", "en", "clinical/models")
text = """Name : Hendrickson, Ora, Record date: 2093-01-13, MR: 719435.
Dr. John Green, IP 203.120.223.13.
He is a 60-year-old male was admitted to the Day Hospital for cystectomy on 01/13/93.
Patient's ID: 764543, VIN : 1HGBH41JXMN109286, SSN #333-44-6666, Driver's license no: A334455B.
Phone (302) 786-5227, 0295 Keats Street, San Francisco, E-MAIL: smith@gmail.com."""
result = deid_pipeline.fullAnnotate(text)
// Scala: load the same pretrained pipeline
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
val deid_pipeline = PretrainedPipeline("ner_deid_generic_context_augmented_pipeline", "en", "clinical/models")
val text = """Name : Hendrickson, Ora, Record date: 2093-01-13, MR: 719435.
Dr. John Green, IP 203.120.223.13.
He is a 60-year-old male was admitted to the Day Hospital for cystectomy on 01/13/93.
Patient's ID: 764543, VIN : 1HGBH41JXMN109286, SSN #333-44-6666, Driver's license no: A334455B.
Phone (302) 786-5227, 0295 Keats Street, San Francisco, E-MAIL: smith@gmail.com."""
val result = deid_pipeline.fullAnnotate(text)
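The annotations returned by `fullAnnotate` can be flattened into rows of chunk text, offsets, and entity label. The following is a minimal Python sketch, not the pipeline's own post-processing. It assumes each chunk annotation exposes `result`, `begin`, `end`, and a `metadata` dictionary with an `entity` key, and that the chunks sit under an output key named `ner_chunk`; both the key name and the helper `chunks_to_rows` are hypothetical, so inspect `result[0].keys()` on a real run. Stand-in objects are used here so the sketch runs without a Spark session.

```python
from types import SimpleNamespace


def chunks_to_rows(chunks):
    """Flatten chunk annotations into (chunk, begin, end, entity) tuples.

    Hypothetical helper: assumes each item has .result, .begin, .end
    and a .metadata dict carrying an "entity" key.
    """
    return [
        (c.result, c.begin, c.end, c.metadata.get("entity"))
        for c in chunks
    ]


# Stand-in objects that mimic the assumed annotation attributes.
sample_chunks = [
    SimpleNamespace(result="Hendrickson, Ora", begin=7, end=22,
                    metadata={"entity": "NAME"}),
    SimpleNamespace(result="2093-01-13", begin=38, end=47,
                    metadata={"entity": "DATE"}),
]

rows = chunks_to_rows(sample_chunks)
for row in rows:
    print(row)

# With real pipeline output the call might look like this
# ("ner_chunk" is an assumed output key, not confirmed by this card):
# rows = chunks_to_rows(result[0]["ner_chunk"])
```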
Results
| | chunk | begin | end | entity |
|---:|:------------------|--------:|------:|:--------------|
| 0 | Hendrickson, Ora | 7 | 22 | NAME |
| 1 | 2093-01-13 | 38 | 47 | DATE |
| 2 | 719435 | 54 | 59 | MEDICALRECORD |
| 3 | John Green | 66 | 75 | NAME |
| 4 | 203.120.223.13 | 81 | 94 | IPADDR |
| 5 | 60 | 105 | 106 | AGE |
| 6 | Day Hospital | 142 | 153 | LOCATION |
| 7 | 01/13/93 | 173 | 180 | DATE |
| 8 | 764543 | 197 | 202 | ID |
| 9 | 1HGBH41JXMN109286 | 211 | 227 | VIN |
| 10 | #333-44-6666 | 234 | 245 | SSN |
| 11 | A334455B | 269 | 276 | DLN |
| 12 | (302) 786-5227 | 285 | 298 | PHONE |
| 13 | 0295 Keats Street | 301 | 317 | LOCATION |
| 14 | San Francisco | 320 | 332 | LOCATION |
| 15 | smith@gmail.com | 343 | 357 | EMAIL |
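The `begin` and `end` values in the table are character offsets into the input text, and `end` is inclusive (for example, `Hendrickson, Ora` spans 7 to 22, which is 16 characters). As an illustration of how these offsets could be used, the sketch below replaces each detected span with its entity label. The function `mask_entities` is a hypothetical helper written for this example and is not part of the pipeline; only the first two lines of the sample text and their five table rows are used.

```python
def mask_entities(text, entities):
    """Replace each (begin, end, label) span with "<label>".

    Offsets are treated as inclusive on both ends, matching the
    results table. Spans are processed right to left so earlier
    offsets stay valid after each replacement.
    """
    masked = text
    for begin, end, label in sorted(entities, key=lambda e: e[0], reverse=True):
        masked = masked[:begin] + f"<{label}>" + masked[end + 1:]
    return masked


# First two lines of the sample text from this card.
text = (
    "Name : Hendrickson, Ora, Record date: 2093-01-13, MR: 719435.\n"
    "Dr. John Green, IP 203.120.223.13."
)

# (begin, end, entity) copied from rows 0 to 4 of the results table.
entities = [
    (7, 22, "NAME"),
    (38, 47, "DATE"),
    (54, 59, "MEDICALRECORD"),
    (66, 75, "NAME"),
    (81, 94, "IPADDR"),
]

masked_text = mask_entities(text, entities)
print(masked_text)
```

This simple approach assumes the spans do not overlap; overlapping spans would need to be resolved before masking.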
Model Information
| Field | Value |
|:--------------|:--------------------------------------------|
| Model Name | ner_deid_generic_context_augmented_pipeline |
| Type | pipeline |
| Compatibility | Healthcare NLP 5.3.2+ |
| License | Licensed |
| Edition | Official |
| Language | en |
| Size | 1.7 GB |
Included Models
- DocumentAssembler
- SentenceDetectorDLModel
- TokenizerModel
- WordEmbeddingsModel
- MedicalNerModel
- NerConverter
- ContextualParserModel
- ContextualParserModel
- ContextualParserModel
- ContextualParserModel
- ContextualParserModel
- ContextualParserModel
- TextMatcherInternalModel
- ContextualParserModel
- RegexMatcherModel
- ContextualParserModel
- ContextualParserModel
- ContextualParserModel
- ContextualParserModel
- RegexMatcherInternalModel
- ChunkMergeModel
- ChunkMergeModel