Mapping Phenotype Entities with Corresponding HPO Codes (Pretrained Pipeline)

Description

This pipeline extracts phenotype entities from clinical or biomedical text and maps them to their corresponding Human Phenotype Ontology (HPO) codes, so that observed symptoms, signs, and clinical abnormalities are standardized in HPO terminology.

How to use

from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("hpo_mapper_pipeline", "en", "clinical/models")

result = pipeline.fullAnnotate("""APNEA: Presumed apnea of prematurity since < 34 wks gestation at birth.
HYPERBILIRUBINEMIA: At risk for hyperbilirubinemia d/t prematurity.
1/25-1/30: Received Amp/Gent while undergoing sepsis evaluation.""")

from johnsnowlabs import nlp

pipeline = nlp.PretrainedPipeline("hpo_mapper_pipeline", "en", "clinical/models")

result = pipeline.fullAnnotate("""APNEA: Presumed apnea of prematurity since < 34 wks gestation at birth.
HYPERBILIRUBINEMIA: At risk for hyperbilirubinemia d/t prematurity.
1/25-1/30: Received Amp/Gent while undergoing sepsis evaluation.""")

import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val pipeline = PretrainedPipeline("hpo_mapper_pipeline", "en", "clinical/models")

val result = pipeline.fullAnnotate("""APNEA: Presumed apnea of prematurity since < 34 wks gestation at birth.
HYPERBILIRUBINEMIA: At risk for hyperbilirubinemia d/t prematurity.
1/25-1/30: Received Amp/Gent while undergoing sepsis evaluation.""")

Results

+------------------+-----+---+-----+----------+
|             chunk|begin|end|label|  hpo_code|
+------------------+-----+---+-----+----------+
|             APNEA|    0|  4|  HPO|HP:0002104|
|             apnea|   16| 20|  HPO|HP:0002104|
|HYPERBILIRUBINEMIA|   66| 83|  HPO|HP:0002904|
|hyperbilirubinemia|   91|108|  HPO|HP:0002904|
|            sepsis|  167|172|  HPO|HP:0100806|
+------------------+-----+---+-----+----------+
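
When reading the begin and end columns, it can help to check them against the input text. The sketch below is plain Python and does not call the pipeline: the rows are copied by hand from the table above, and the helper name matches_raw_text is made up for illustration. It treats begin and end as inclusive character offsets. Under that assumption every span has the same width as its chunk, but only the first two rows line up with the raw input string. The later rows are shifted left, which would be consistent with offsets being computed on stop-word-cleaned text (the pipeline includes StopWordsCleaner and TokenAssembler), although the model card does not state this.

```python
# Input text from the example above, written with explicit newlines.
text = (
    "APNEA: Presumed apnea of prematurity since < 34 wks gestation at birth.\n"
    "HYPERBILIRUBINEMIA: At risk for hyperbilirubinemia d/t prematurity.\n"
    "1/25-1/30: Received Amp/Gent while undergoing sepsis evaluation."
)

# Rows copied by hand from the results table: (chunk, begin, end, hpo_code).
rows = [
    ("APNEA", 0, 4, "HP:0002104"),
    ("apnea", 16, 20, "HP:0002104"),
    ("HYPERBILIRUBINEMIA", 66, 83, "HP:0002904"),
    ("hyperbilirubinemia", 91, 108, "HP:0002904"),
    ("sepsis", 167, 172, "HP:0100806"),
]


def matches_raw_text(text, chunk, begin, end):
    """Return True if the inclusive [begin, end] span of `text` equals `chunk`."""
    return text[begin:end + 1] == chunk


# Does each row's span, applied to the raw input, reproduce the chunk?
alignment = [matches_raw_text(text, c, b, e) for c, b, e, _ in rows]

# Is each span exactly as wide as its chunk (inclusive offsets)?
widths_ok = [e - b + 1 == len(c) for c, b, e, _ in rows]

for (chunk, begin, end, code), ok in zip(rows, alignment):
    print(f"{chunk:>18} {begin:>3}-{end:<3} {code} raw-text match: {ok}")
```

If offsets into the original document are needed downstream, locating each chunk in the raw text (for example with str.find) may be safer than reusing the begin and end values directly.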

Model Information

Model Name:	hpo_mapper_pipeline
Type:	pipeline
Compatibility:	Healthcare NLP 6.0.0+
License:	Licensed
Edition:	Official
Language:	en
Size:	4.0 MB

Included Models

DocumentAssembler
TokenizerModel
StopWordsCleaner
TokenAssembler
SentenceDetectorDLModel
TokenizerModel
TextMatcherInternalModel
ChunkMapperModel
