Description
This model classifies the gender of the patient in the clinical document.
Predicted Entities
Female, Male, Unknown.
How to use
To classify your text, use this model as part of an NLP pipeline with the following stages, in order: DocumentAssembler, BertSentenceEmbeddings (sbiobert_base_cased_mli), ClassifierDLModel.
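# Python. Assumes an active Spark NLP for Healthcare session bound to `spark`.
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler, LightPipeline
from sparknlp.annotator import BertSentenceEmbeddings, ClassifierDLModel

# Wrap the raw text column in a document annotation.
document_assembler = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

# Embed the whole document with the clinical sentence-BERT model.
sbert_embedder = BertSentenceEmbeddings.pretrained("sbiobert_base_cased_mli", "en", "clinical/models")\
    .setInputCols(["document"])\
    .setOutputCol("sentence_embeddings")\
    .setMaxSentenceLength(512)

# Classify the embedded document as Female, Male, or Unknown.
gender_classifier = ClassifierDLModel.pretrained("classifierdl_gender_sbert", "en", "clinical/models")\
    .setInputCols(["document", "sentence_embeddings"])\
    .setOutputCol("class")

nlp_pipeline = Pipeline(stages=[document_assembler, sbert_embedder, gender_classifier])

# Fit on an empty DataFrame (all stages are pretrained), then wrap in a LightPipeline for single-text inference.
light_pipeline = LightPipeline(nlp_pipeline.fit(spark.createDataFrame([[""]]).toDF("text")))
annotations = light_pipeline.fullAnnotate("""social history: shows that does not smoke cigarettes or drink alcohol, lives in a nursing home. family history: shows a family history of breast cancer.""")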
// Scala. Assumes an active Spark NLP for Healthcare session bound to `spark`.
import com.johnsnowlabs.nlp.DocumentAssembler
import com.johnsnowlabs.nlp.embeddings.BertSentenceEmbeddings
import com.johnsnowlabs.nlp.annotators.classifier.dl.ClassifierDLModel
import org.apache.spark.ml.Pipeline
import spark.implicits._ // needed for toDS()

val document_assembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val sentence_embeddings = BertSentenceEmbeddings.pretrained("sbiobert_base_cased_mli", "en", "clinical/models")
  .setInputCols("document")
  .setOutputCol("sentence_embeddings")
  .setMaxSentenceLength(512)

val gender_classifier = ClassifierDLModel.pretrained("classifierdl_gender_sbert", "en", "clinical/models")
  .setInputCols(Array("document", "sentence_embeddings"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(document_assembler, sentence_embeddings, gender_classifier))

val data = Seq("""social history: shows that does not smoke cigarettes or drink alcohol, lives in a nursing home. family history: shows a family history of breast cancer.""").toDS().toDF("text")
val result = pipeline.fit(data).transform(data)
import nlu
nlu.load("en.classify.gender.sbert").predict("""social history: shows that does not smoke cigarettes or drink alcohol, lives in a nursing home. family history: shows a family history of breast cancer.""")
Results
Female
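The label above can be read from the Python `fullAnnotate` output. The following is a minimal sketch, assuming that `fullAnnotate` returns one dictionary per input text, keyed by output column name, and that each entry holds annotation objects exposing a `result` attribute. The helper `extract_labels` and the stand-in `fake_annotations` are hypothetical names introduced for illustration so the sketch runs without a Spark session.

```python
from types import SimpleNamespace


def extract_labels(annotations, column="class"):
    """Return the predicted label(s) for each annotated text.

    `annotations` is assumed to have the shape of LightPipeline.fullAnnotate
    output: a list of dicts mapping output column names to annotation objects.
    """
    return [[ann.result for ann in doc[column]] for doc in annotations]


# Hypothetical stand-in for the real fullAnnotate output (no Spark needed).
fake_annotations = [{"class": [SimpleNamespace(result="Female", metadata={})]}]

print(extract_labels(fake_annotations))
```

With the real pipeline, passing the `annotations` variable from the Python example in place of `fake_annotations` should yield the same structure, one list of labels per input text.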
Model Information
| Model Name: | classifierdl_gender_sbert |
| Type: | ClassifierDLModel |
| Compatibility: | Healthcare NLP 2.6.5+ |
| Edition: | Official |
| License: | Licensed |
| Input Labels: | [sentence_embeddings] |
| Output Labels: | [class] |
| Language: | [en] |
| Case sensitive: | True |
Data Source
This model is trained on more than four thousand clinical documents (radiology reports, pathology reports, clinical visits, etc.), annotated internally.
Benchmarking
| label | precision | recall | f1-score | support |
|---|---|---|---|---|
| Female | 0.9224 | 0.8954 | 0.9087 | 239 |
| Male | 0.7895 | 0.8468 | 0.8171 | 124 |
| Unknown | 0.8077 | 0.7778 | 0.7925 | 54 |
| accuracy | - | - | 0.8657 | 417 |
| macro-avg | 0.8399 | 0.8400 | 0.8394 | 417 |
| weighted-avg | 0.8680 | 0.8657 | 0.8664 | 417 |
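The macro and weighted averages can be recomputed from the per-class rows, which helps when comparing this model against others evaluated on a different class balance. This is a small sketch using only the numbers reported above; the helper names `macro_average` and `weighted_average` are illustrative and not part of any library.

```python
# Per-class metrics copied from the benchmark: (precision, recall, f1, support)
per_class = {
    "Female": (0.9224, 0.8954, 0.9087, 239),
    "Male": (0.7895, 0.8468, 0.8171, 124),
    "Unknown": (0.8077, 0.7778, 0.7925, 54),
}


def macro_average(rows, index):
    """Unweighted mean of one metric across classes."""
    return sum(row[index] for row in rows.values()) / len(rows)


def weighted_average(rows, index):
    """Mean of one metric weighted by each class's support."""
    total_support = sum(row[3] for row in rows.values())
    return sum(row[index] * row[3] for row in rows.values()) / total_support


macro_f1 = macro_average(per_class, 2)
weighted_f1 = weighted_average(per_class, 2)
weighted_recall = weighted_average(per_class, 1)

print(f"macro F1: {macro_f1:.4f}")
print(f"weighted F1: {weighted_f1:.4f}")
print(f"weighted recall: {weighted_recall:.4f}")
```

Rounded to four decimals, these reproduce the reported macro-avg and weighted-avg F1 values. The support-weighted recall matches the reported accuracy, as expected for single-label classification. The gap between Female and the two smaller classes is why the macro average sits below the weighted one.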