Description
This pipeline extracts ZIP code entities from clinical text.
How to use
# Python (sparknlp). Assumes an active Spark session named `spark` with Healthcare NLP available.
from sparknlp.pretrained import PretrainedPipeline
pipeline = PretrainedPipeline("zip_regex_matcher_pipeline", "en", "clinical/models")
sample_text = """John Doe lives at 1234 Maple Street, Springfield, IL 62704. He works at 5678 Oak Avenue, Austin, TX 73301. His previous address was 4321 Pine Street, Los Angeles, CA 90001. His cousin Jane lives at 7890 Elm Street, Chicago, IL 60614."""
result = pipeline.transform(spark.createDataFrame([[sample_text]]).toDF("text"))
# Python (johnsnowlabs). Assumes an active Spark session named `spark` with Healthcare NLP available.
from johnsnowlabs import nlp, medical
pipeline = nlp.PretrainedPipeline("zip_regex_matcher_pipeline", "en", "clinical/models")
sample_text = """John Doe lives at 1234 Maple Street, Springfield, IL 62704. He works at 5678 Oak Avenue, Austin, TX 73301. His previous address was 4321 Pine Street, Los Angeles, CA 90001. His cousin Jane lives at 7890 Elm Street, Chicago, IL 60614."""
result = pipeline.transform(spark.createDataFrame([[sample_text]]).toDF("text"))
// Scala. Assumes an active SparkSession bound to the value `spark`.
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
// Brings toDF into scope for local Scala collections.
import spark.implicits._
val pipeline = PretrainedPipeline("zip_regex_matcher_pipeline", "en", "clinical/models")
val sample_text = """John Doe lives at 1234 Maple Street, Springfield, IL 62704. He works at 5678 Oak Avenue, Austin, TX 73301. His previous address was 4321 Pine Street, Los Angeles, CA 90001. His cousin Jane lives at 7890 Elm Street, Chicago, IL 60614."""
val result = pipeline.transform(Seq(sample_text).toDF("text"))
Results
| chunk | begin | end | label |
|--------:|--------:|------:|:--------|
| 62704 | 53 | 57 | ZIP |
| 73301 | 100 | 104 | ZIP |
| 90001 | 166 | 170 | ZIP |
| 60614 | 227 | 231 | ZIP |
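The begin and end values are zero-based character offsets into the input text, with an inclusive end. As a rough, library-free illustration of how such rows arise, the sketch below applies a plain five-digit regular expression to the sample text, counting offsets from the "J" of "John". The pattern and the helper name `find_zip_chunks` are assumptions made for this example; the rules the pipeline actually uses are not shown in this card and may differ.

```python
import re

# Hypothetical stand-in pattern: exactly five digits between word boundaries.
# This is NOT the pipeline's own rule set, only an approximation for illustration.
ZIP_PATTERN = re.compile(r"\b\d{5}\b")


def find_zip_chunks(text):
    """Return (chunk, begin, end, label) tuples; end is an inclusive offset."""
    return [
        (m.group(), m.start(), m.end() - 1, "ZIP")
        for m in ZIP_PATTERN.finditer(text)
    ]


# Sample text from this card, starting at "John" with no leading whitespace.
text = (
    "John Doe lives at 1234 Maple Street, Springfield, IL 62704. "
    "He works at 5678 Oak Avenue, Austin, TX 73301. "
    "His previous address was 4321 Pine Street, Los Angeles, CA 90001. "
    "His cousin Jane lives at 7890 Elm Street, Chicago, IL 60614."
)

rows = find_zip_chunks(text)
for row in rows:
    print(row)
```

The four-digit street numbers (1234, 5678, 4321, 7890) are skipped because the pattern requires exactly five digits. A pattern this simple would also match any other five-digit number in a note, so it should be read only as an explanation of the offsets, not as a substitute for the pipeline.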
Model Information
| Model Name: | zip_regex_matcher_pipeline |
| Type: | pipeline |
| Compatibility: | Healthcare NLP 6.3.0+ |
| License: | Licensed |
| Edition: | Official |
| Language: | en |
| Size: | 396.7 KB |
Included Models
- DocumentAssembler
- SentenceDetectorDLModel
- TokenizerModel
- ContextualParserModel
- ChunkConverter