Email Regex Matcher

Description

This pipeline extracts email addresses from clinical notes using the rule-based RegexMatcherInternal annotator.
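To make the idea of rule-based matching concrete, here is a minimal sketch in plain Python. It does not use the pipeline, and `EMAIL_PATTERN` and `find_emails` are hypothetical names. The pattern is a common simplified email expression chosen for illustration; it is an assumption, not the rule set shipped with the pipeline.

```python
import re

# Illustrative pattern only (an assumption): the pipeline's real rules are not shown on this page.
# Local part, "@", then a domain that ends in a dot followed by at least two letters.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_emails(text):
    """Return every substring of `text` that matches EMAIL_PATTERN, in order of appearance."""
    return [match.group(0) for match in EMAIL_PATTERN.finditer(text)]


note = "Contact: info@domain.net, mail: tech@support.org, e-mail: hale@gmail.com ."
print(find_emails(note))
```

Trailing punctuation such as the comma after an address is left out of the match because the pattern requires the domain to end in letters.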

How to use

# Python (sparknlp): assumes an active Spark session named `spark`
from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("email_regex_matcher_pipeline", "en", "clinical/models")

sample_text = """ ID: 1231511863, The driver's license no:A334455B, the SSN:324598674 and info@domain.net, mail: tech@support.org, e-mail: hale@gmail.com .
 E-mail: sales@gmail.com.
"""

result = pipeline.transform(spark.createDataFrame([[sample_text]]).toDF("text"))

# Python (johnsnowlabs library): assumes an active Spark session named `spark`
from johnsnowlabs import nlp, medical

pipeline = nlp.PretrainedPipeline("email_regex_matcher_pipeline", "en", "clinical/models")

sample_text = """ ID: 1231511863, The driver's license no:A334455B, the SSN:324598674 and info@domain.net, mail: tech@support.org, e-mail: hale@gmail.com .
 E-mail: sales@gmail.com.
"""

result = pipeline.transform(spark.createDataFrame([[sample_text]]).toDF("text"))

// Scala: assumes an active SparkSession named `spark`
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val pipeline = PretrainedPipeline("email_regex_matcher_pipeline", "en", "clinical/models")

val sample_text = """ ID: 1231511863, The driver's license no:A334455B, the SSN:324598674 and info@domain.net, mail: tech@support.org, e-mail: hale@gmail.com .
 E-mail: sales@gmail.com.
"""

val result = pipeline.transform(spark.createDataFrame([[sample_text]]).toDF("text"))

Results

| chunk            |   begin |   end | label   |
|:-----------------|--------:|------:|:--------|
| info@domain.net  |      72 |    86 | EMAIL   |
| tech@support.org |      95 |   110 | EMAIL   |
| hale@gmail.com   |     121 |   134 | EMAIL   |
| sales@gmail.com  |     147 |   161 | EMAIL   |
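The begin and end columns read as character offsets with an inclusive end: info@domain.net is 15 characters long and spans 72 to 86. The sketch below shows that convention in plain Python, assuming zero-based offsets (the table alone does not confirm this). `chunk_at` and `inclusive_span` are hypothetical helpers, not part of the library.

```python
def chunk_at(text, begin, end):
    """Slice `text` with an inclusive `end`, matching the convention in the table."""
    return text[begin:end + 1]


def inclusive_span(text, chunk):
    """Find the first occurrence of `chunk` and return (begin, end) with `end` inclusive."""
    begin = text.index(chunk)  # raises ValueError if `chunk` is absent
    return begin, begin + len(chunk) - 1


example = "mail: tech@support.org"
begin, end = inclusive_span(example, "tech@support.org")
print(begin, end)                     # 6 21
print(chunk_at(example, begin, end))  # tech@support.org
```

With this convention the chunk length is always `end - begin + 1`, which is why a Python slice needs `end + 1` as its upper bound.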

Model Information

Model Name:	email_regex_matcher_pipeline
Type:	pipeline
Compatibility:	Healthcare NLP 6.3.0+
License:	Licensed
Edition:	Official
Language:	en
Size:	6.9 KB

Included Models

DocumentAssembler
RegexMatcherInternalModel
ChunkConverter
