NER Pipeline Benchmark Medium (Document Wise)

Description

This pipeline detects protected health information (PHI) entities in medical text using Named Entity Recognition (NER). It identifies the following sensitive entity types: ‘CONTACT’, ‘DATE’, ‘ID’, ‘LOCATION’, ‘PROFESSION’, ‘DOCTOR’, ‘EMAIL’, ‘PATIENT’, ‘URL’, ‘USERNAME’, ‘CITY’, ‘COUNTRY’, ‘DLN’, ‘HOSPITAL’, ‘IDNUM’, ‘LOCATION_OTHER’, ‘MEDICALRECORD’, ‘STATE’, ‘STREET’, ‘ZIP’, ‘AGE’, ‘PHONE’, ‘ORGANIZATION’, ‘SSN’, ‘ACCOUNT’, ‘PLATE’, ‘VIN’, ‘LICENSE’, and ‘IP’.


How to use


from sparknlp.pretrained import PretrainedPipeline

deid_pipeline = PretrainedPipeline("ner_docwise_benchmark_medium", "en", "clinical/models")

text = """Dr. John Lee, from Royal Medical Clinic in Chicago, attended to the patient on 11/05/2024.
The patient’s medical record number is 56467890.
The patient, Emma Wilson, is 50 years old, her Contact number: 444-456-7890 ."""

deid_result = deid_pipeline.fullAnnotate(text)



from johnsnowlabs import nlp

deid_pipeline = nlp.PretrainedPipeline("ner_docwise_benchmark_medium", "en", "clinical/models")

text = """Dr. John Lee, from Royal Medical Clinic in Chicago, attended to the patient on 11/05/2024.
The patient’s medical record number is 56467890.
The patient, Emma Wilson, is 50 years old, her Contact number: 444-456-7890 ."""

deid_result = deid_pipeline.fullAnnotate(text)


import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val deid_pipeline = PretrainedPipeline("ner_docwise_benchmark_medium", "en", "clinical/models")

val text = """Dr. John Lee, from Royal Medical Clinic in Chicago, attended to the patient on 11/05/2024.
The patient’s medical record number is 56467890.
The patient, Emma Wilson, is 50 years old, her Contact number: 444-456-7890 ."""

val deid_result = deid_pipeline.fullAnnotate(text)

Results

|    | text                                                                                       | result                                                                                                                   |
|---:|:-------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------|
|  0 | Dr. John Lee, from Royal Medical Clinic in Chicago, attended to the patient on 11/05/2024. | ['John Lee', 'Royal Medical Clinic', 'Chicago', '11/05/2024', '56467890', 'Emma Wilson', '50 years old', '444-456-7890'] |
|    | The patient’s medical record number is 56467890.                                           |                                                                                                                          |
|    | The patient, Emma Wilson, is 50 years old, her Contact number: 444-456-7890 .              |                                                                                                                          |
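The result column above is a flat list of chunk texts. As a minimal sketch of how such a list might be pulled out of the fullAnnotate output, the helper below walks the returned list of dictionaries and collects each chunk's text, label, and character offsets. It assumes the merged chunks are stored under an output column named "ner_chunk" and that each annotation keeps its label under the "entity" metadata key; neither name is stated on this page, so check the keys of deid_result[0] before relying on them. FakeAnnotation, the sample labels, and the sample offsets are hypothetical stand-ins so the sketch runs without a Spark session.

```python
from collections import namedtuple

# Hypothetical stand-in for a Spark NLP annotation object, used only so this
# sketch runs without Spark. Real annotations expose the same four attributes.
FakeAnnotation = namedtuple("FakeAnnotation", ["begin", "end", "result", "metadata"])


def extract_chunks(full_annotate_result, chunk_col="ner_chunk"):
    """Flatten fullAnnotate output into (text, label, begin, end) tuples.

    chunk_col is an assumed output column name; a missing column yields
    no rows instead of raising.
    """
    rows = []
    for doc in full_annotate_result:
        for ann in doc.get(chunk_col, []):
            rows.append((ann.result, ann.metadata.get("entity"), ann.begin, ann.end))
    return rows


# Mocked result with illustrative labels and offsets, not real pipeline output.
sample_result = [
    {
        "ner_chunk": [
            FakeAnnotation(4, 11, "John Lee", {"entity": "DOCTOR"}),
            FakeAnnotation(43, 49, "Chicago", {"entity": "CITY"}),
        ]
    }
]

chunks = extract_chunks(sample_result)
chunk_texts = [text for text, _label, _begin, _end in chunks]
```

With the real pipeline, extract_chunks(deid_result) would be called in place of the mocked sample_result, passing a different chunk_col if the pipeline names its final output column differently.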

Model Information

Model Name: ner_docwise_benchmark_medium
Type: pipeline
Compatibility: Healthcare NLP 6.0.4+
License: Licensed
Edition: Official
Language: en
Size: 2.5 GB

Included Models

  • DocumentAssembler
  • InternalDocumentSplitter
  • TokenizerModel
  • TokenizerModel
  • WordEmbeddingsModel
  • MedicalNerModel
  • NerConverterInternalModel
  • PretrainedZeroShotNER
  • NerConverterInternalModel
  • MedicalNerModel
  • NerConverterInternalModel
  • MedicalNerModel
  • NerConverterInternalModel
  • ChunkMergeModel
  • ContextualParserModel
  • ContextualParserModel
  • ContextualParserModel
  • ContextualParserModel
  • ContextualParserModel
  • ContextualParserModel
  • ContextualParserModel
  • ContextualParserModel
  • RegexMatcherInternalModel
  • ContextualParserModel
  • ContextualParserModel
  • RegexMatcherInternalModel
  • RegexMatcherInternalModel
  • RegexMatcherInternalModel
  • ContextualParserModel
  • TextMatcherInternalModel
  • TextMatcherInternalModel
  • ContextualParserModel
  • ContextualParserModel
  • ChunkMergeModel
  • ChunkMergeModel