ZIP Code Contextual Parser Pipeline

Description

This pipeline extracts ZIP code entities from clinical text.

How to use

# Python (Spark NLP). Assumes `spark` is an active Spark session with Healthcare NLP loaded.
from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("zip_regex_matcher_pipeline", "en", "clinical/models")

sample_text = """John Doe lives at 1234 Maple Street, Springfield, IL 62704. He works at 5678 Oak Avenue, Austin, TX 73301. His previous address was 4321 Pine Street, Los Angeles, CA 90001. His cousin Jane lives at 7890 Elm Street, Chicago, IL 60614."""

result = pipeline.transform(spark.createDataFrame([[sample_text]]).toDF("text"))

# Python (johnsnowlabs library). Assumes `spark` is an active Spark session with Healthcare NLP loaded.
from johnsnowlabs import nlp, medical

pipeline = nlp.PretrainedPipeline("zip_regex_matcher_pipeline", "en", "clinical/models")

sample_text = """John Doe lives at 1234 Maple Street, Springfield, IL 62704. He works at 5678 Oak Avenue, Austin, TX 73301. His previous address was 4321 Pine Street, Los Angeles, CA 90001. His cousin Jane lives at 7890 Elm Street, Chicago, IL 60614."""

result = pipeline.transform(spark.createDataFrame([[sample_text]]).toDF("text"))

// Scala. Assumes `spark` is an active SparkSession with Healthcare NLP loaded.
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
import spark.implicits._ // provides toDF on Seq

val pipeline = PretrainedPipeline("zip_regex_matcher_pipeline", "en", "clinical/models")

val sample_text = """John Doe lives at 1234 Maple Street, Springfield, IL 62704. He works at 5678 Oak Avenue, Austin, TX 73301. His previous address was 4321 Pine Street, Los Angeles, CA 90001. His cousin Jane lives at 7890 Elm Street, Chicago, IL 60614."""

val result = pipeline.transform(Seq(sample_text).toDF("text"))

Results

|   chunk |   begin |   end | label   |
|--------:|--------:|------:|:--------|
|   62704 |      53 |    57 | ZIP     |
|   73301 |     100 |   104 | ZIP     |
|   90001 |     166 |   170 | ZIP     |
|   60614 |     227 |   231 | ZIP     |
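
The begin and end columns read as zero-based character offsets into the input text, with the end offset inclusive. The following is a minimal plain-Python sketch of that offset convention, not the pipeline's actual implementation: the regular expression (five digits directly after a two-letter state abbreviation), the name `ZIP_PATTERN`, and the helper `find_zip_codes` are all hypothetical, and the sample text is assumed to start at "John" with no leading whitespace.

```python
import re

# Hypothetical pattern, assumed for illustration only: five digits that
# immediately follow a two-letter uppercase state abbreviation and a space.
ZIP_PATTERN = re.compile(r"(?<=[A-Z]{2} )\d{5}\b")


def find_zip_codes(text):
    """Return (chunk, begin, end, label) tuples; `end` is inclusive."""
    return [
        (m.group(), m.start(), m.end() - 1, "ZIP")
        for m in ZIP_PATTERN.finditer(text)
    ]


sample_text = (
    "John Doe lives at 1234 Maple Street, Springfield, IL 62704. "
    "He works at 5678 Oak Avenue, Austin, TX 73301. "
    "His previous address was 4321 Pine Street, Los Angeles, CA 90001. "
    "His cousin Jane lives at 7890 Elm Street, Chicago, IL 60614."
)

for row in find_zip_codes(sample_text):
    print(row)
```

The four-digit street numbers are skipped because the pattern requires exactly five digits after a state abbreviation. The actual pipeline relies on its bundled ContextualParserModel rules, which may differ from this pattern.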

Model Information

Model Name:	zip_regex_matcher_pipeline
Type:	pipeline
Compatibility:	Healthcare NLP 6.3.0+
License:	Licensed
Edition:	Official
Language:	en
Size:	396.7 KB

Included Models

DocumentAssembler
SentenceDetectorDLModel
TokenizerModel
ContextualParserModel
ChunkConverter
