Mapping Phenotype Entities with Corresponding HPO Codes (Pretrained Pipeline)

Description

This pipeline maps phenotype entities extracted from clinical or biomedical text to their corresponding Human Phenotype Ontology (HPO) codes, so that observed symptoms, signs, and clinical abnormalities are expressed in standardized HPO terminology.

How to use


from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("hpo_mapper_pipeline", "en", "clinical/models")

result = pipeline.fullAnnotate("""APNEA: Presumed apnea of prematurity since < 34 wks gestation at birth.
HYPERBILIRUBINEMIA: At risk for hyperbilirubinemia d/t prematurity.
1/25-1/30: Received Amp/Gent while undergoing sepsis evaluation.""")


from johnsnowlabs import nlp

pipeline = nlp.PretrainedPipeline("hpo_mapper_pipeline", "en", "clinical/models")


result = pipeline.fullAnnotate("""APNEA: Presumed apnea of prematurity since < 34 wks gestation at birth.
HYPERBILIRUBINEMIA: At risk for hyperbilirubinemia d/t prematurity.
1/25-1/30: Received Amp/Gent while undergoing sepsis evaluation.""")


import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val pipeline = PretrainedPipeline("hpo_mapper_pipeline", "en", "clinical/models")

val result = pipeline.fullAnnotate("""APNEA: Presumed apnea of prematurity since < 34 wks gestation at birth.
HYPERBILIRUBINEMIA: At risk for hyperbilirubinemia d/t prematurity.
1/25-1/30: Received Amp/Gent while undergoing sepsis evaluation.""")
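
fullAnnotate returns one dictionary per input text, mapping each output column to a list of annotation objects that carry result, begin, end, and metadata fields. The following is a minimal Python sketch of flattening such a dictionary into rows like those in the Results table. It rests on assumptions not stated in this card: the column names hpo_term and hpo_code are hypothetical (inspect result[0].keys() for the real ones), the label is assumed to sit under the "entity" metadata key, and codes are assumed to come back in the same order as chunks. A namedtuple stands in for the Spark NLP annotation class so the sketch runs without a Spark session.

```python
from collections import namedtuple

# Stand-in for a Spark NLP annotation; only the fields used below are modeled.
FakeAnnotation = namedtuple("FakeAnnotation", ["begin", "end", "result", "metadata"])


def to_rows(annotated, chunk_col="hpo_term", code_col="hpo_code"):
    """Pair each chunk with the code at the same index (assumed ordering)."""
    rows = []
    for chunk, code in zip(annotated[chunk_col], annotated[code_col]):
        rows.append({
            "chunk": chunk.result,
            "begin": chunk.begin,
            "end": chunk.end,
            # "entity" is an assumed metadata key; fall back to an empty label.
            "label": chunk.metadata.get("entity", ""),
            "hpo_code": code.result,
        })
    return rows


# Hand-built sample mirroring the first two rows of the Results table.
sample = {
    "hpo_term": [
        FakeAnnotation(0, 4, "APNEA", {"entity": "HPO"}),
        FakeAnnotation(16, 20, "apnea", {"entity": "HPO"}),
    ],
    "hpo_code": [
        FakeAnnotation(0, 4, "HP:0002104", {}),
        FakeAnnotation(16, 20, "HP:0002104", {}),
    ],
}

rows = to_rows(sample)
for row in rows:
    print(row)
```

With a real pipeline result, the call would be to_rows(result[0], ...) with the actual column names substituted.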

Results


+------------------+-----+---+-----+----------+
|             chunk|begin|end|label|  hpo_code|
+------------------+-----+---+-----+----------+
|             APNEA|    0|  4|  HPO|HP:0002104|
|             apnea|   16| 20|  HPO|HP:0002104|
|HYPERBILIRUBINEMIA|   66| 83|  HPO|HP:0002904|
|hyperbilirubinemia|   91|108|  HPO|HP:0002904|
|            sepsis|  167|172|  HPO|HP:0100806|
+------------------+-----+---+-----+----------+
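
In the table above, upper-case and lower-case mentions of the same term resolve to the same HPO code, so downstream code will often want one entry per code rather than one per mention. The following is a minimal Python sketch of that step, using the chunk and code values copied from the table. The helper name group_by_code is hypothetical, and the format check assumes the usual HPO identifier shape of "HP:" followed by seven digits.

```python
import re
from collections import defaultdict

# HPO identifiers have the form "HP:" followed by seven digits.
HPO_ID = re.compile(r"HP:\d{7}")

# (chunk, hpo_code) pairs copied from the Results table.
table_rows = [
    ("APNEA", "HP:0002104"),
    ("apnea", "HP:0002104"),
    ("HYPERBILIRUBINEMIA", "HP:0002904"),
    ("hyperbilirubinemia", "HP:0002904"),
    ("sepsis", "HP:0100806"),
]


def group_by_code(rows):
    """Collect the distinct lower-cased mentions found for each HPO code."""
    grouped = defaultdict(set)
    for chunk, code in rows:
        if not HPO_ID.fullmatch(code):
            raise ValueError(f"not a well-formed HPO identifier: {code!r}")
        grouped[code].add(chunk.lower())
    return {code: sorted(terms) for code, terms in grouped.items()}


codes = group_by_code(table_rows)
for code, terms in codes.items():
    print(code, terms)
```

Rejecting malformed identifiers early keeps an unmapped or empty result from being silently counted as a phenotype.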

Model Information

Model Name: hpo_mapper_pipeline
Type: pipeline
Compatibility: Healthcare NLP 6.0.0+
License: Licensed
Edition: Official
Language: en
Size: 4.0 MB

Included Models

  • DocumentAssembler
  • TokenizerModel
  • StopWordsCleaner
  • TokenAssembler
  • SentenceDetectorDLModel
  • TokenizerModel
  • TextMatcherInternalModel
  • ChunkMapperModel