Email Regex Matcher

Description

This model extracts email addresses from clinical notes using the rule-based RegexMatcherInternal annotator.

Predicted Entities

EMAIL

How to use

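# Python
# Import paths below follow the usual Spark NLP / Healthcare NLP layout (assumed; the card omits them)
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp_jsl.annotator import RegexMatcherInternalModel

# Convert the raw "text" column into Spark NLP documents
documentAssembler = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

# Load the pretrained rule-based email matcher
email_regex_matcher = RegexMatcherInternalModel.pretrained("email_matcher", "en", "clinical/models")\
    .setInputCols(["document"])\
    .setOutputCol("EMAIL")

email_regex_matcher_pipeline = Pipeline(
    stages=[
        documentAssembler,
        email_regex_matcher
    ])

# `spark` is assumed to be an active session started with Healthcare NLP
data = spark.createDataFrame([["""ID: 1231511863, The driver's license no:A334455B, the SSN:324598674 and info@domain.net, mail: tech@support.org, e-mail: hale@gmail.com .
 E-mail: sales@gmail.com."""]]).toDF("text")

email_regex_matcher_model = email_regex_matcher_pipeline.fit(data)
result = email_regex_matcher_model.transform(data)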

// Scala
// RegexMatcherInternalModel comes from the Healthcare NLP library; its import is omitted here as in the original
import com.johnsnowlabs.nlp.DocumentAssembler
import org.apache.spark.ml.Pipeline
import spark.implicits._ // needed for .toDF on a Seq

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val email_regex_matcher = RegexMatcherInternalModel.pretrained("email_matcher", "en", "clinical/models")
  .setInputCols(Array("document"))
  .setOutputCol("EMAIL")

val email_regex_pipeline = new Pipeline().setStages(Array(
  documentAssembler,
  email_regex_matcher
))

val data = Seq("""ID: 1231511863, The driver's license no:A334455B, the SSN:324598674 and info@domain.net, mail: tech@support.org, e-mail: hale@gmail.com .
 E-mail: sales@gmail.com.""").toDF("text")

val result = email_regex_pipeline.fit(data).transform(data)

Results

+----------------+-----+---+-----+
|chunk           |begin|end|label|
+----------------+-----+---+-----+
|info@domain.net |72   |86 |EMAIL|
|tech@support.org|95   |110|EMAIL|
|hale@gmail.com  |121  |134|EMAIL|
|sales@gmail.com |147  |161|EMAIL|
+----------------+-----+---+-----+
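
To make the begin and end columns easier to interpret, the offsets in the table can be reproduced with Python's standard `re` module. This is only a minimal sketch: the pattern `EMAIL_PATTERN` and the helper `find_emails` are illustrative assumptions, not the rules packaged inside `email_matcher`, which this card does not show. The sketch treats `end` as an inclusive character index, matching the table.

```python
import re

# Illustrative pattern only (an assumption); the pretrained model's own rules are not published here.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Same sample text as in the pipeline example, including the line break and leading space.
TEXT = """ID: 1231511863, The driver's license no:A334455B, the SSN:324598674 and info@domain.net, mail: tech@support.org, e-mail: hale@gmail.com .
 E-mail: sales@gmail.com."""


def find_emails(text):
    """Return (chunk, begin, end, label) tuples; `end` is the inclusive index of the last character."""
    return [
        (m.group(), m.start(), m.end() - 1, "EMAIL")
        for m in EMAIL_PATTERN.finditer(text)
    ]


matches = find_emails(TEXT)
for chunk, begin, end, label in matches:
    print(f"{chunk:<17}{begin:<6}{end:<5}{label}")
```

Because `end` is inclusive, slicing the original text back out needs `text[begin:end + 1]`; for example, `TEXT[72:87]` yields `info@domain.net`.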

Model Information

Model Name:	email_matcher
Compatibility:	Healthcare NLP 5.4.0+
License:	Licensed
Edition:	Official
Input Labels:	[document]
Output Labels:	[EMAIL]
Language:	en
Size:	2.2 KB
