Classifier for Genders - SBERT

Description

This model classifies the gender of the patient described in a clinical document, based on contextual cues in the text.

Predicted Entities

Female, Male, Unknown


How to use

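from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler, LightPipeline
from sparknlp.annotator import BertSentenceEmbeddings, ClassifierDLModel

# Assumes `spark` is an active Spark session with access to the licensed clinical models.

# Wrap the raw text column into a document annotation.
document_assembler = DocumentAssembler().setInputCol("text").setOutputCol("document")

# Compute one sentence embedding per document.
sbert_embedder = BertSentenceEmbeddings\
     .pretrained("sbiobert_base_cased_mli", 'en', 'clinical/models')\
     .setInputCols(["document"])\
     .setOutputCol("sentence_embeddings")

# Classify the embedding as Female, Male, or Unknown.
gender_classifier = ClassifierDLModel.pretrained('classifierdl_gender_sbert', 'en', 'clinical/models') \
               .setInputCols(["document", "sentence_embeddings"]) \
               .setOutputCol("class")

nlp_pipeline = Pipeline(stages=[document_assembler, sbert_embedder, gender_classifier])

# Fit on an empty DataFrame (all stages are pretrained), then wrap for fast in-memory inference.
light_pipeline = LightPipeline(nlp_pipeline.fit(spark.createDataFrame([['']]).toDF("text")))
annotations = light_pipeline.fullAnnotate("""social history: shows that  does not smoke cigarettes or drink alcohol, lives in a nursing home. family history: shows a family history of breast cancer.""")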
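
The call to fullAnnotate returns one entry per input text. The following is a minimal sketch of reading the predicted label out of that structure. It assumes each entry is a dict keyed by output column name, with values that are lists of annotation objects exposing a `.result` attribute. The helper `predicted_labels` and the `FakeAnnotation` stand-in are hypothetical names introduced here so the snippet runs without a Spark session; they are not part of Spark NLP.

```python
from collections import namedtuple


def predicted_labels(annotations, column="class"):
    """Collect the label strings found under `column` for each annotated text."""
    return [[ann.result for ann in entry.get(column, [])] for entry in annotations]


# Hypothetical stand-in for a Spark NLP annotation object (only `.result` is used).
FakeAnnotation = namedtuple("FakeAnnotation", ["result"])

# Assumed shape of the fullAnnotate output: one dict per input text.
sample = [{"class": [FakeAnnotation(result="Female")]}]

labels = predicted_labels(sample)
print(labels)  # [['Female']]
```

With the real pipeline, passing the `annotations` variable to the same helper should yield the label shown under Results, provided the output has the assumed shape.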

Results

 Female

Model Information

Model Name: classifierdl_gender_sbert
Compatibility: Spark NLP 2.7.1+
License: Licensed
Edition: Official
Input Labels: [sentence_embeddings]
Output Labels: [class]
Language: en
Dependencies: sbiobert_base_cased_mli

Data Source

This model is trained on more than four thousand clinical documents (radiology reports, pathology reports, clinical visits, etc.), annotated internally.

Benchmarking

              precision    recall  f1-score   support

      Female     0.9390    0.9747    0.9565       237
        Male     0.9561    0.8720    0.9121       125
     Unknown     0.8491    0.8824    0.8654        51

    accuracy                         0.9322       413
   macro avg     0.9147    0.9097    0.9113       413
weighted avg     0.9331    0.9322    0.9318       413
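
As a reading aid for the table above, here is a small sketch that recomputes the derived columns from the per-class precision, recall, and support. It assumes the standard definitions: F1 is the harmonic mean of precision and recall, the macro average is the unweighted mean over classes, the weighted average uses support as the weight, and accuracy equals support-weighted recall for single-label classification. Because the inputs are already rounded to four decimals, the results should match the reported figures only to within rounding.

```python
# Per-class (precision, recall, support), copied from the benchmark table.
report = {
    "Female": (0.9390, 0.9747, 237),
    "Male": (0.9561, 0.8720, 125),
    "Unknown": (0.8491, 0.8824, 51),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


f1_scores = {label: f1(p, r) for label, (p, r, _) in report.items()}
total = sum(support for _, _, support in report.values())

# Unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1_scores.values()) / len(f1_scores)

# Mean of the per-class F1 scores, weighted by class support.
weighted_f1 = sum(f1_scores[label] * s for label, (_, _, s) in report.items()) / total

# recall * support is the number of correct predictions for each class.
accuracy = sum(r * s for _, r, s in report.values()) / total

for label, score in f1_scores.items():
    print(f"{label}: f1={score:.3f}")
print(f"macro f1={macro_f1:.3f} weighted f1={weighted_f1:.3f} accuracy={accuracy:.3f} n={total}")
```

The lower recall for Male (0.8720) relative to Female (0.9747) is the main contributor to the gap between the macro and weighted F1 averages, since Female has the largest support.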