Description
This model classifies the gender of the patient described in a clinical document, based on the context of the text.
Predicted Entities
Female, Male, Unknown
How to use
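# Python. Assumes an active Spark session named `spark` with the licensed clinical models available.
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler, LightPipeline
from sparknlp.annotator import BertSentenceEmbeddings, ClassifierDLModel

# Wrap the raw text column as a document annotation.
document_assembler = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")
# Embed the whole document with the sentence embeddings this classifier depends on.
sbert_embedder = BertSentenceEmbeddings\
    .pretrained("sbiobert_base_cased_mli", 'en', 'clinical/models')\
    .setInputCols(["document"])\
    .setOutputCol("sentence_embeddings")
# Classify the embedded document as Female, Male, or Unknown.
gender_classifier = ClassifierDLModel.pretrained('classifierdl_gender_sbert', 'en', 'clinical/models') \
    .setInputCols(["sentence_embeddings"])\
    .setOutputCol("class")
nlp_pipeline = Pipeline(stages=[document_assembler, sbert_embedder, gender_classifier])
# Fit on an empty DataFrame (all stages are pretrained), then wrap for fast local inference.
light_pipeline = LightPipeline(nlp_pipeline.fit(spark.createDataFrame([['']]).toDF("text")))
annotations = light_pipeline.fullAnnotate("""social history: shows that does not smoke cigarettes or drink alcohol, lives in a nursing home. family history: shows a family history of breast cancer.""")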
// Scala. Assumes an active SparkSession named `spark` with the licensed clinical models available.
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
import spark.implicits._

val document_assembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")
val sbert_embedder = BertSentenceEmbeddings
  .pretrained("sbiobert_base_cased_mli","en","clinical/models")
  .setInputCols(Array("document"))
  .setOutputCol("sentence_embeddings")
val gender_classifier = ClassifierDLModel.pretrained("classifierdl_gender_sbert","en","clinical/models")
  .setInputCols(Array("sentence_embeddings"))
  .setOutputCol("class")
val nlp_pipeline = new Pipeline().setStages(Array(
  document_assembler,
  sbert_embedder,
  gender_classifier))
val data = Seq("""social history: shows that does not smoke cigarettes or drink alcohol, lives in a nursing home. family history: shows a family history of breast cancer.""").toDF("text")
val result = nlp_pipeline.fit(data).transform(data)
import nlu
nlu.load("en.classify.gender.sbert").predict("""social history: shows that does not smoke cigarettes or drink alcohol, lives in a nursing home. family history: shows a family history of breast cancer.""")
Results
Female
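The label shown above comes from the `class` output column of the pipeline. As a minimal sketch of how it could be pulled out of the Python `annotations` value, the helper below assumes that `fullAnnotate` returns one dictionary per input text, keyed by output column name, whose values are lists of annotation objects exposing a `result` attribute. `StubAnnotation` and `predicted_labels` are hypothetical names introduced only for this illustration; the stub stands in for the real annotation object so the snippet runs without a Spark session.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StubAnnotation:
    """Hypothetical stand-in for an annotation; only the fields used here."""
    result: str
    metadata: Dict[str, str] = field(default_factory=dict)


def predicted_labels(full_annotate_output, output_col="class") -> List[List[str]]:
    """Return one list of predicted labels per annotated input text."""
    return [
        [annotation.result for annotation in doc.get(output_col, [])]
        for doc in full_annotate_output
    ]


# Stub mimicking the assumed shape of `annotations` for a single input text.
stub_output = [{"class": [StubAnnotation(result="Female")]}]
print(predicted_labels(stub_output))  # prints [['Female']]
```

With a real pipeline, the same helper would be called as `predicted_labels(annotations)`, provided the output shape matches the assumption stated above.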
Model Information
Model Name: | classifierdl_gender_sbert |
Compatibility: | Spark NLP 2.7.1+ |
License: | Licensed |
Edition: | Official |
Input Labels: | [sentence_embeddings] |
Output Labels: | [class] |
Language: | en |
Dependencies: | sbiobert_base_cased_mli |
Data Source
This model is trained on more than four thousand clinical documents (radiology reports, pathology reports, clinical visits, etc.), annotated internally.
Benchmarking
label         precision  recall  f1-score  support
Female        0.9390     0.9747  0.9565    237
Male          0.9561     0.8720  0.9121    125
Unknown       0.8491     0.8824  0.8654    51
accuracy                         0.9322    413
macro avg     0.9147     0.9097  0.9113    413
weighted avg  0.9331     0.9322  0.9318    413
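The summary rows in the table follow from the per-class rows. The sketch below recomputes them in plain Python, assuming the usual definitions: F1 as the harmonic mean of precision and recall, the macro average as an unweighted mean over classes, and the weighted average as a mean weighted by support. The function names are illustrative, and the inputs are the rounded figures from the table, so the results can differ from the published ones in the fourth decimal place.

```python
# Per-class precision, recall, and support copied from the benchmark table.
report = {
    "Female": {"precision": 0.9390, "recall": 0.9747, "support": 237},
    "Male": {"precision": 0.9561, "recall": 0.8720, "support": 125},
    "Unknown": {"precision": 0.8491, "recall": 0.8824, "support": 51},
}


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def macro_avg(values):
    """Unweighted mean over classes."""
    return sum(values) / len(values)


def weighted_avg(values, weights):
    """Mean over classes, weighted by each class's support."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


supports = [row["support"] for row in report.values()]
f1_scores = {label: f1(row["precision"], row["recall"]) for label, row in report.items()}

macro_f1 = macro_avg(list(f1_scores.values()))
weighted_f1 = weighted_avg(list(f1_scores.values()), supports)

# recall * support recovers the number of correctly classified documents per class.
correct = sum(round(row["recall"] * row["support"]) for row in report.values())
accuracy = correct / sum(supports)

print(f"macro F1 = {macro_f1:.3f}, weighted F1 = {weighted_f1:.3f}, accuracy = {accuracy:.3f}")
```

Note that the weighted-average recall and the accuracy coincide (0.9322), because both reduce to the number of correct predictions divided by the total of 413 documents.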