Detect symptoms, treatments and other named entities in German

Description

This model detects symptoms, treatments and other named entities in German-language medical text.

Predicted Entities

DIAGLAB_PROCEDURE, MEDICAL_SPECIFICATION, MEDICAL_DEVICE, MEASUREMENT, BIOLOGICAL_CHEMISTRY, BODY_FLUID, TIME_INFORMATION, LOCAL_SPECIFICATION, BIOLOGICAL_PARAMETER, PROCESS, MEDICATION, DOSING, DEGREE, MEDICAL_CONDITION, PERSON, TISSUE, STATE_OF_HEALTH, BODY_PART, TREATMENT


How to use

Use the model as part of an NLP pipeline with the following stages, in this order: DocumentAssembler, SentenceDetector, Tokenizer, WordEmbeddingsModel, NerDLModel. Add a NerConverter at the end of the pipeline to merge token-level entity tags into full entity chunks.
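The stage order matters because each annotator reads columns that an earlier stage has written. The following is a minimal sketch in plain Python (no Spark session needed) that checks this wiring. The column names `document`, `sentence`, `token`, `embeddings` and `ner` follow the input and output labels listed on this card and common Spark NLP convention; `ner_chunk`, the `STAGES` table and the `check_wiring` helper are illustrative assumptions, not part of the Spark NLP API.

```python
# (stage name, input columns, output column).
# Column names are assumptions based on the labels in this card.
STAGES = [
    ("DocumentAssembler", ["text"], "document"),
    ("SentenceDetector", ["document"], "sentence"),
    ("Tokenizer", ["sentence"], "token"),
    ("WordEmbeddingsModel", ["sentence", "token"], "embeddings"),
    ("NerDLModel", ["sentence", "token", "embeddings"], "ner"),
    ("NerConverter", ["sentence", "token", "ner"], "ner_chunk"),
]


def check_wiring(stages, initial_columns=("text",)):
    """Return (stage, missing_columns) pairs for stages whose inputs are not yet available."""
    available = set(initial_columns)
    problems = []
    for name, inputs, output in stages:
        missing = [col for col in inputs if col not in available]
        if missing:
            problems.append((name, missing))
        # The stage's output becomes available to every later stage.
        available.add(output)
    return problems


print(check_wiring(STAGES))
```

An empty list means every stage finds its inputs. Moving the Tokenizer ahead of the SentenceDetector, for example, would be reported as a Tokenizer stage missing the `sentence` column.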


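from pyspark.ml import Pipeline
from sparknlp.base import LightPipeline
from sparknlp.annotator import NerDLModel

# Load the pretrained German model; the language code must be "de" to match the model's language.
clinical_ner = NerDLModel.pretrained("ner_healthcare", "de", "clinical/models") \
  .setInputCols(["sentence", "token", "embeddings"]) \
  .setOutputCol("ner")

# The other stages (document_assembler ... ner_converter) are assumed to be defined beforehand.
nlp_pipeline = Pipeline(stages=[document_assembler, sentence_detector, tokenizer, word_embeddings, clinical_ner, ner_converter])

# Fit on an empty DataFrame to obtain a PipelineModel, then wrap it for in-memory inference.
light_pipeline = LightPipeline(nlp_pipeline.fit(spark.createDataFrame([['']]).toDF("text")))

annotations = light_pipeline.fullAnnotate("Das Kleinzellige Bronchialkarzinom (Kleinzelliger Lungenkrebs, SCLC) ist ein hochmalignes bronchogenes Karzinom")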
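The output of `fullAnnotate` is a list with one dictionary per input text, keyed by output column, whose values are annotation objects exposing `result`, `begin`, `end` and `metadata`. A minimal sketch of flattening that structure into rows like those in the Results table might look as follows. The column name `ner_chunk` is an assumption (this card does not show the NerConverter's output column), and `FakeAnnotation` and `chunks_to_rows` are hypothetical helpers used so the sketch runs without a Spark session.

```python
from collections import namedtuple

# Stand-in with the same attribute names as Spark NLP annotation objects.
FakeAnnotation = namedtuple("FakeAnnotation", "result begin end metadata")


def chunks_to_rows(annotations, chunk_col="ner_chunk"):
    """Flatten fullAnnotate-style output into (chunk, begin, end, entity) tuples."""
    rows = []
    for doc in annotations:  # one dict per input text
        for ann in doc.get(chunk_col, []):
            rows.append((ann.result, ann.begin, ann.end, ann.metadata.get("entity")))
    return rows


# Two chunks taken from the Results table, wrapped in the stand-in type.
sample = [{"ner_chunk": [
    FakeAnnotation("Kleinzellige", 4, 15, {"entity": "MEDICAL_SPECIFICATION"}),
    FakeAnnotation("Bronchialkarzinom", 17, 33, {"entity": "MEDICAL_CONDITION"}),
]}]

rows = chunks_to_rows(sample)
for row in rows:
    print(row)
```

With a real pipeline, passing `annotations` instead of `sample` should work the same way, provided the converter writes to the column named in `chunk_col`.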

Results

+----+-------------------+---------+---------+--------------------------+
|    | chunk             |   begin |     end | entity                   |
+====+===================+=========+=========+==========================+
|  0 | Kleinzellige      |       4 |      15 | MEDICAL_SPECIFICATION    |
+----+-------------------+---------+---------+--------------------------+
|  1 | Bronchialkarzinom |      17 |      33 | MEDICAL_CONDITION        |
+----+-------------------+---------+---------+--------------------------+
|  2 | Kleinzelliger     |      36 |      48 | MEDICAL_SPECIFICATION    |
+----+-------------------+---------+---------+--------------------------+
|  3 | Lungenkrebs       |      50 |      60 | MEDICAL_CONDITION        |
+----+-------------------+---------+---------+--------------------------+
|  4 | SCLC              |      63 |      66 | MEDICAL_CONDITION        |
+----+-------------------+---------+---------+--------------------------+
|  5 | hochmalignes      |      77 |      88 | MEASUREMENT              |
+----+-------------------+---------+---------+--------------------------+
|  6 | bronchogenes      |      90 |     101 | BODY_PART                |
+----+-------------------+---------+---------+--------------------------+
|  7 | Karzinom          |     103 |     110 | MEDICAL_CONDITION        |
+----+-------------------+---------+---------+--------------------------+
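The begin and end columns above read as zero-based, inclusive character offsets into the input sentence. A short sketch that checks this interpretation against the rows of the table, using only the example sentence and values shown on this card (`slice_chunk` is a hypothetical helper):

```python
TEXT = ("Das Kleinzellige Bronchialkarzinom (Kleinzelliger Lungenkrebs, SCLC) "
        "ist ein hochmalignes bronchogenes Karzinom")

# (chunk, begin, end) copied from the Results table.
ROWS = [
    ("Kleinzellige", 4, 15),
    ("Bronchialkarzinom", 17, 33),
    ("Kleinzelliger", 36, 48),
    ("Lungenkrebs", 50, 60),
    ("SCLC", 63, 66),
    ("hochmalignes", 77, 88),
    ("bronchogenes", 90, 101),
    ("Karzinom", 103, 110),
]


def slice_chunk(text, begin, end):
    """Extract a chunk given zero-based offsets where `end` is inclusive."""
    return text[begin:end + 1]


# Collect every row whose offsets do not reproduce the chunk text.
mismatches = [row for row in ROWS if slice_chunk(TEXT, row[1], row[2]) != row[0]]
print(mismatches)
```

Because `end` is inclusive, a Python slice needs `end + 1` as its upper bound.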

Model Information

Model Name: ner_healthcare
Type: ner
Compatibility: Spark NLP for Healthcare 2.6.0+
Edition: Official
License: Licensed
Input Labels: [sentence, token, embeddings]
Output Labels: [ner]
Language: [de]
Case sensitive: false