Description
This pipeline extracts protected health information (PHI) entities such as LOCATION, CONTACT, PROFESSION, NAME, DATE, ID, AGE, MEDICALRECORD, ORGANIZATION, HEALTHPLAN, DOCTOR, USERNAME, LOCATION-OTHER, URL, DEVICE, CITY, ZIP, STATE, PATIENT, COUNTRY, STREET, PHONE, HOSPITAL, EMAIL, IDNUM, BIOID, FAX, SSN, ACCOUNT, DLN, PLATE, VIN, LICENSE, and IPADDR. It combines three NER models (ner_deid_generic_augmented, ner_deid_subentity_augmented, and ner_deid_name_multilingual_clinical) with several ContextualParser, RegexMatcher, and TextMatcher models.
Predicted Entities
LOCATION, CONTACT, PROFESSION, NAME, DATE, ID, AGE, MEDICALRECORD, ORGANIZATION, HEALTHPLAN, DOCTOR, USERNAME, LOCATION-OTHER, URL, DEVICE, CITY, ZIP, STATE, PATIENT, COUNTRY, STREET, PHONE, HOSPITAL, EMAIL, IDNUM, BIOID, FAX, SSN, ACCOUNT, DLN, PLATE, VIN, LICENSE, IPADDR
How to use
# Python
from sparknlp.pretrained import PretrainedPipeline
deid_pipeline = PretrainedPipeline("ner_deid_context_nameAugmented_pipeline", "en", "clinical/models")
text = """Name : Hendrickson, Ora, Record date: 2093-01-13, MR: 719435.
Dr. John Green, IP 203.120.223.13.
He is a 60-year-old male was admitted to the Day Hospital for cystectomy on 01/13/93.
Patient's ID: 3454362, VIN : 1HGBH41JXMN109286, SSN #333-44-6666, Driver's license no: A334455B.
Phone (302) 786-5227, 0295 Keats Street, San Francisco, E-MAIL: smith@gmail.com."""
result = deid_pipeline.fullAnnotate(text)

// Scala
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
val deid_pipeline = PretrainedPipeline("ner_deid_context_nameAugmented_pipeline", "en", "clinical/models")
val text = """Name : Hendrickson, Ora, Record date: 2093-01-13, MR: 719435.
Dr. John Green, IP 203.120.223.13.
He is a 60-year-old male was admitted to the Day Hospital for cystectomy on 01/13/93.
Patient's ID: 3454362, VIN : 1HGBH41JXMN109286, SSN #333-44-6666, Driver's license no: A334455B.
Phone (302) 786-5227, 0295 Keats Street, San Francisco, E-MAIL: smith@gmail.com."""
val result = deid_pipeline.fullAnnotate(text)
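The `result` returned by `fullAnnotate` is a list with one dictionary per input text, mapping output column names to lists of annotations. The sketch below shows one possible way to flatten such output into (chunk, begin, end, entity) rows. It makes two assumptions that are not stated on this page: that the merged chunks live under a column named `ner_chunk` (inspect `result[0].keys()` to confirm), and that each chunk's label sits under the `entity` metadata key. `FakeAnnotation`, `chunks_to_rows`, and `demo_result` are hypothetical names used only so the sketch runs without a Spark session.

```python
from collections import namedtuple

# Stand-in for a Spark NLP annotation so the sketch runs without Spark;
# only the attributes read below are modelled.
FakeAnnotation = namedtuple("FakeAnnotation", ["begin", "end", "result", "metadata"])


def chunks_to_rows(result, chunk_col="ner_chunk"):
    """Flatten fullAnnotate-style output into (chunk, begin, end, entity) rows.

    `chunk_col` is an assumed column name; check the keys of result[0].
    """
    rows = []
    for doc in result:                      # one dict per input text
        for ann in doc.get(chunk_col, []):  # one annotation per detected chunk
            rows.append((ann.result, ann.begin, ann.end, ann.metadata.get("entity")))
    return rows


# Hypothetical output shaped like two of the chunks in the sample text.
demo_result = [{
    "ner_chunk": [
        FakeAnnotation(7, 22, "Hendrickson, Ora", {"entity": "PATIENT"}),
        FakeAnnotation(38, 47, "2093-01-13", {"entity": "DATE"}),
    ]
}]

rows = chunks_to_rows(demo_result)
for row in rows:
    print(row)
```

With the real pipeline, passing `result` in place of `demo_result` should yield one row per detected chunk, provided the assumed column name matches.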
Results
| | chunk | begin | end | entity |
|---:|:------------------|--------:|------:|:--------------|
| 0 | Hendrickson, Ora | 7 | 22 | PATIENT |
| 1 | 2093-01-13 | 38 | 47 | DATE |
| 2 | 719435 | 54 | 59 | MEDICALRECORD |
| 3 | John Green | 66 | 75 | DOCTOR |
| 4 | 203.120.223.13 | 81 | 94 | IPADDR |
| 5 | 60 | 105 | 106 | AGE |
| 6 | Day Hospital | 142 | 153 | HOSPITAL |
| 7 | 01/13/93 | 173 | 180 | DATE |
| 8 | 3454362 | 197 | 203 | IDNUM |
| 9 | 1HGBH41JXMN109286 | 212 | 228 | VIN |
| 10 | #333-44-6666 | 235 | 246 | SSN |
| 11 | A334455B | 270 | 277 | DLN |
| 12 | (302) 786-5227 | 286 | 299 | PHONE |
| 13 | 0295 Keats Street | 302 | 318 | STREET |
| 14 | San Francisco | 321 | 333 | CITY |
| 15 | smith@gmail.com | 344 | 358 | EMAIL |
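The begin and end columns are character offsets into the input text, with an inclusive end. As a rough illustration of how these offsets could be used downstream, the sketch below replaces each detected span with its entity label. This is a hand-written example, not a de-identification stage of the pipeline; the `mask` helper and the `<ENTITY>` placeholder format are assumptions made for this sketch.

```python
text = """Name : Hendrickson, Ora, Record date: 2093-01-13, MR: 719435.
Dr. John Green, IP 203.120.223.13.
He is a 60-year-old male was admitted to the Day Hospital for cystectomy on 01/13/93.
Patient's ID: 3454362, VIN : 1HGBH41JXMN109286, SSN #333-44-6666, Driver's license no: A334455B.
Phone (302) 786-5227, 0295 Keats Street, San Francisco, E-MAIL: smith@gmail.com."""

# (begin, end, entity) copied from the Results table; `end` is inclusive.
spans = [
    (7, 22, "PATIENT"), (38, 47, "DATE"), (54, 59, "MEDICALRECORD"),
    (66, 75, "DOCTOR"), (81, 94, "IPADDR"), (105, 106, "AGE"),
    (142, 153, "HOSPITAL"), (173, 180, "DATE"), (197, 203, "IDNUM"),
    (212, 228, "VIN"), (235, 246, "SSN"), (270, 277, "DLN"),
    (286, 299, "PHONE"), (302, 318, "STREET"), (321, 333, "CITY"),
    (344, 358, "EMAIL"),
]


def mask(text, spans):
    """Replace each span with <ENTITY>, working right to left so that
    earlier offsets stay valid while the string changes length."""
    for begin, end, entity in sorted(spans, reverse=True):
        text = text[:begin] + f"<{entity}>" + text[end + 1:]
    return text


masked = mask(text, spans)
print(masked)
```

Processing spans from the end of the text backwards avoids having to recompute offsets after each replacement.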
Model Information
| Field | Value |
|:--------------|:----------------------------------------|
| Model Name | ner_deid_context_nameAugmented_pipeline |
| Type | pipeline |
| Compatibility | Healthcare NLP 5.3.2+ |
| License | Licensed |
| Edition | Official |
| Language | en |
| Size | 1.7 GB |
Included Models
- DocumentAssembler
- SentenceDetectorDLModel
- TokenizerModel
- WordEmbeddingsModel
- MedicalNerModel
- NerConverter
- MedicalNerModel
- NerConverter
- MedicalNerModel
- NerConverter
- ChunkMergeModel
- ContextualParserModel
- ContextualParserModel
- ContextualParserModel
- ContextualParserModel
- ContextualParserModel
- ContextualParserModel
- TextMatcherInternalModel
- ContextualParserModel
- RegexMatcherModel
- ContextualParserModel
- ContextualParserModel
- ContextualParserModel
- ContextualParserModel
- RegexMatcherInternalModel
- ChunkMergeModel
- ChunkMergeModel
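The stage list shows chunks from three NER models and several rule-based matchers being combined by ChunkMergeModel stages. To give a feel for what merging candidate chunks from several sources involves, the sketch below applies a simple "keep the longest span on overlap" policy. This policy is an assumption chosen for illustration; it is not taken from ChunkMergeModel, whose actual behavior and settings may differ. `merge_chunks` and the candidate lists are hypothetical.

```python
def merge_chunks(*sources):
    """Combine chunk lists, keeping the longest span where candidates overlap.

    Each chunk is (begin, end, entity) with an inclusive end offset.
    Illustrative policy only; it is not taken from ChunkMergeModel.
    """
    candidates = [chunk for source in sources for chunk in source]
    # Longest spans first; ties broken by earliest start.
    candidates.sort(key=lambda c: (-(c[1] - c[0]), c[0]))
    kept = []
    for begin, end, entity in candidates:
        overlaps = any(begin <= k_end and k_begin <= end for k_begin, k_end, _ in kept)
        if not overlaps:
            kept.append((begin, end, entity))
    return sorted(kept)


# Hypothetical candidates for the sample text: one model tags the full name,
# another tags only the surname, and a rule-based matcher finds the IP address.
ner_subentity = [(7, 22, "PATIENT")]
ner_generic = [(7, 17, "NAME")]
rule_based = [(81, 94, "IPADDR")]

merged = merge_chunks(ner_subentity, ner_generic, rule_based)
print(merged)
```

Here the shorter NAME candidate is dropped because it overlaps the longer PATIENT span, while the non-overlapping IPADDR chunk is kept.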