Deidentification NER (Enriched) is a Named Entity Recognition model that annotates text to find protected health information (PHI) that may need to be deidentified. The model was trained with the ‘embeddings_clinical’ word embeddings model, so use the same embeddings in the pipeline.
Predicted entities: Age, City, Country, Date, Doctor, Hospital, Idnum, Medicalrecord, Organization, Patient, Phone, Profession, State, Street, Username, and Zip.
How to use
ner = NerDLModel.pretrained("ner_deid_enriched", "en") \
    .setInputCols(["document", "token", "embeddings"]) \
    .setOutputCol("ner")
val ner = NerDLModel.pretrained("ner_deid_enriched", "en")
  .setInputCols(Array("document", "token", "embeddings"))
  .setOutputCol("ner")
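The `ner` output column holds one tag per token, so a downstream step usually has to merge those tags into entity spans before anything can be masked. In a Spark NLP pipeline this is typically handled by the `NerConverter` annotator; the plain-Python sketch below only illustrates the idea outside Spark. It assumes IOB-style tags such as `B-PATIENT` / `I-PATIENT` / `O`. The exact tag spellings, the sample sentence, and the helper names `group_iob` and `mask_entities` are illustrative assumptions, not part of the model's documented interface.

```python
from typing import List, Optional, Tuple


def group_iob(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge per-token IOB tags into (entity_text, label) chunks.

    Hypothetical helper: assumes tags look like "B-LABEL", "I-LABEL", or "O".
    """
    chunks: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_label: Optional[str] = None

    def flush() -> None:
        # Close the chunk being built, if there is one.
        if current_tokens and current_label is not None:
            chunks.append((" ".join(current_tokens), current_label))

    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "B" or (prefix == "I" and label != current_label):
            # A "B-" tag, or an "I-" tag with a different label, opens a new chunk.
            flush()
            current_tokens, current_label = [token], label
        elif prefix == "I":
            # Continuation of the chunk that is already open.
            current_tokens.append(token)
        else:
            # "O" (or anything unrecognised) ends the open chunk.
            flush()
            current_tokens, current_label = [], None
    flush()
    return chunks


def mask_entities(tokens: List[str], tags: List[str]) -> str:
    """Replace each tagged run of tokens with a single <LABEL> placeholder."""
    out: List[str] = []
    previous_label: Optional[str] = None
    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if prefix not in ("B", "I"):
            out.append(token)
            previous_label = None
            continue
        if prefix == "B" or label != previous_label:
            out.append(f"<{label}>")
        previous_label = label
    return " ".join(out)


# Made-up sample data for illustration only (not real model output).
sample_tokens = ["John", "Smith", "was", "admitted", "to",
                 "Mercy", "Hospital", "on", "2019-03-01", "."]
sample_tags = ["B-PATIENT", "I-PATIENT", "O", "O", "O",
               "B-HOSPITAL", "I-HOSPITAL", "O", "B-DATE", "O"]

print(group_iob(sample_tokens, sample_tags))
print(mask_entities(sample_tokens, sample_tags))
```

Replacing spans with label placeholders is only one possible deidentification policy; the same grouped chunks could instead be dropped or substituted with surrogate values.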
|Compatibility:||Spark NLP for Healthcare 2.4.2+|
|Input Labels:||[sentence, token, embeddings]|
The model was trained on data from https://portal.dbmi.hms.harvard.edu/projects/n2c2-2014/