Description
This model classifies the gender of the patient described in a clinical document.
Predicted Entities
Female, Male, Unknown
How to use
To classify your text, use this model as part of an NLP pipeline with the following stages: DocumentAssembler, BertSentenceEmbeddings (sbiobert_base_cased_mli), ClassifierDLModel.
# Python
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler, LightPipeline
from sparknlp.annotator import BertSentenceEmbeddings, ClassifierDLModel

# Assumes an active `spark` session with access to the licensed clinical models.
document_assembler = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")
sbert_embedder = BertSentenceEmbeddings.pretrained("sbiobert_base_cased_mli", "en", "clinical/models")\
    .setInputCols(["document"])\
    .setOutputCol("sentence_embeddings")\
    .setMaxSentenceLength(512)
gender_classifier = ClassifierDLModel.pretrained("classifierdl_gender_sbert", "en", "clinical/models")\
    .setInputCols(["document", "sentence_embeddings"])\
    .setOutputCol("class")
nlp_pipeline = Pipeline(stages=[document_assembler, sbert_embedder, gender_classifier])
# All stages are pretrained, so fitting on an empty DataFrame only assembles the pipeline model.
light_pipeline = LightPipeline(nlp_pipeline.fit(spark.createDataFrame([[""]]).toDF("text")))
annotations = light_pipeline.fullAnnotate("""social history: shows that does not smoke cigarettes or drink alcohol, lives in a nursing home. family history: shows a family history of breast cancer.""")
// Scala
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
import spark.implicits._ // needed for toDS(); assumes an active `spark` session

val document_assembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")
val sentence_embeddings = BertSentenceEmbeddings.pretrained("sbiobert_base_cased_mli", "en", "clinical/models")
  .setInputCols("document")
  .setOutputCol("sentence_embeddings")
  .setMaxSentenceLength(512)
val gender_classifier = ClassifierDLModel.pretrained("classifierdl_gender_sbert", "en", "clinical/models")
  .setInputCols(Array("document", "sentence_embeddings"))
  .setOutputCol("class")
val pipeline = new Pipeline().setStages(Array(document_assembler, sentence_embeddings, gender_classifier))
val data = Seq("""social history: shows that does not smoke cigarettes or drink alcohol, lives in a nursing home. family history: shows a family history of breast cancer.""").toDS().toDF("text")
val result = pipeline.fit(data).transform(data)
import nlu
nlu.load("en.classify.gender.sbert").predict("""social history: shows that does not smoke cigarettes or drink alcohol, lives in a nursing home. family history: shows a family history of breast cancer.""")
Results
Female
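The result above is shown as a bare label. As a minimal sketch of how that label might be read from the list returned by `fullAnnotate`, the helper below assumes each element is a dict keyed by output column ("class" here) whose values are annotation objects exposing `.result` and `.metadata`, and that the metadata carries a per-label confidence score. The `extract_gender` helper, the `FakeAnnotation` stand-in, and the scores in `sample` are hypothetical and exist only so the snippet runs without a Spark session.

```python
from collections import namedtuple

# Hypothetical stand-in for a Spark NLP annotation; only the two attributes
# used below (.result and .metadata) are modeled.
FakeAnnotation = namedtuple("FakeAnnotation", ["result", "metadata"])


def extract_gender(annotations, column="class"):
    """Return one (label, confidence) pair per annotated text.

    Assumes metadata maps each label name to its confidence as a string;
    confidence is None when no score is found for the predicted label.
    """
    pairs = []
    for doc in annotations:
        ann = doc[column][0]  # one prediction per document
        score = ann.metadata.get(ann.result)
        pairs.append((ann.result, float(score) if score is not None else None))
    return pairs


# Illustrative input only: the scores here are made up, not model output.
sample = [{"class": [FakeAnnotation("Female", {"Female": "0.98", "Male": "0.01", "Unknown": "0.01"})]}]
print(extract_gender(sample))
```

With the pipeline from the How to use section, the call would be `extract_gender(annotations)`.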
Model Information
Model Name: classifierdl_gender_sbert
Type: ClassifierDLModel
Compatibility: Healthcare NLP 2.6.5+
Edition: Official
License: Licensed
Input Labels: [sentence_embeddings]
Output Labels: [class]
Language: [en]
Case sensitive: True
Data Source
This model is trained on more than four thousand clinical documents (radiology reports, pathology reports, clinical visits, etc.), annotated internally.
Benchmarking
label         precision  recall  f1-score  support
Female        0.9224     0.8954  0.9087    239
Male          0.7895     0.8468  0.8171    124
Unknown       0.8077     0.7778  0.7925    54
accuracy      -          -       0.8657    417
macro-avg     0.8399     0.8400  0.8394    417
weighted-avg  0.8680     0.8657  0.8664    417
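To make the summary rows easier to interpret, the sketch below recomputes them from the per-class rows, assuming the usual definitions: the macro average is the unweighted mean over the three labels, and the weighted average weights each label by its support. The variable and function names are illustrative and not part of any library.

```python
# Per-class metrics copied from the table: (precision, recall, f1, support)
per_class = {
    "Female": (0.9224, 0.8954, 0.9087, 239),
    "Male": (0.7895, 0.8468, 0.8171, 124),
    "Unknown": (0.8077, 0.7778, 0.7925, 54),
}

total_support = sum(row[3] for row in per_class.values())


def macro_avg(idx):
    """Unweighted mean of one metric column across labels."""
    return sum(row[idx] for row in per_class.values()) / len(per_class)


def weighted_avg(idx):
    """Support-weighted mean of one metric column across labels."""
    return sum(row[idx] * row[3] for row in per_class.values()) / total_support


macro = [macro_avg(i) for i in range(3)]        # precision, recall, f1
weighted = [weighted_avg(i) for i in range(3)]  # precision, recall, f1

print("support:", total_support)
print("macro:", [round(v, 4) for v in macro])
print("weighted:", [round(v, 4) for v in weighted])
```

The support-weighted recall coincides with the accuracy row (0.8657), since both count correctly classified documents over the 417 test documents. The gap between Female and the other two labels also tracks support: Female accounts for 239 of the 417 test documents.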