Specimen Contextual Parser Pipeline

Description

This pipeline extracts specimen entities from clinical text.

How to use

# Python (Spark NLP)
# Assumes an active SparkSession named `spark` with Healthcare NLP loaded.
from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("specimen_parser_pipeline", "en", "clinical/models")

sample_text = """ Specimen ID: AB123-456789 was collected from the patient.
The laboratory processed Specimen Number CD987654 yesterday.
Use Specimen Code: XYZ12-3456 for tracking purposes.
Sample was labeled as Specimen#EF34-789.
Specimen No. GH56-123456 was sent to the pathology department."""

result = pipeline.transform(spark.createDataFrame([[sample_text]]).toDF("text"))

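# Python (johnsnowlabs library)
# Assumes an active SparkSession named `spark` with Healthcare NLP loaded.
from johnsnowlabs import nlp, medical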

pipeline = nlp.PretrainedPipeline("specimen_parser_pipeline", "en", "clinical/models")

sample_text = """ Specimen ID: AB123-456789 was collected from the patient.
The laboratory processed Specimen Number CD987654 yesterday.
Use Specimen Code: XYZ12-3456 for tracking purposes.
Sample was labeled as Specimen#EF34-789.
Specimen No. GH56-123456 was sent to the pathology department."""

result = pipeline.transform(spark.createDataFrame([[sample_text]]).toDF("text"))

// Scala
// Assumes an active SparkSession named `spark` with Healthcare NLP loaded.
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val pipeline = PretrainedPipeline("specimen_parser_pipeline", "en", "clinical/models")

val sample_text = """ Specimen ID: AB123-456789 was collected from the patient.
The laboratory processed Specimen Number CD987654 yesterday.
Use Specimen Code: XYZ12-3456 for tracking purposes.
Sample was labeled as Specimen#EF34-789.
Specimen No. GH56-123456 was sent to the pathology department."""

// toDF on a Seq requires the session implicits
import spark.implicits._

val result = pipeline.transform(Seq(sample_text).toDF("text"))

Results

| specimen_id   |   begin |   end | label    |
|:--------------|--------:|------:|:---------|
| AB123         |      13 |    18 | SPECIMEN |
| CD987654      |     100 |   108 | SPECIMEN |
| XYZ12-3456    |     140 |   150 | SPECIMEN |
| EF34-789      |     206 |   214 | SPECIMEN |
| GH56-123456   |     230 |   241 | SPECIMEN |

Model Information

Model Name:	specimen_parser_pipeline
Type:	pipeline
Compatibility:	Healthcare NLP 6.3.0+
License:	Licensed
Edition:	Official
Language:	en
Size:	396.5 KB

Included Models

DocumentAssembler
SentenceDetectorDLModel
TokenizerModel
ContextualParserModel
ChunkConverter
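
To give a rough sense of what the ContextualParserModel stage contributes, here is a minimal standalone sketch of context-cued pattern matching in plain Python. It does not use Spark NLP. The regular expression, the list of cue words, and the function name `find_specimen_ids` are assumptions made for illustration; they are not the rules shipped with this pipeline.

```python
import re

# Hypothetical rule: an ID-like token that directly follows a "Specimen ..." cue.
# The rules packaged with the real pipeline may differ.
SPECIMEN_PATTERN = re.compile(
    r"Specimen\s*(?:ID|Number|Code|No\.?|#)\s*:?\s*"  # context cue before the ID
    r"([A-Z]{2,3}\d+(?:-\d+)?)"                       # the ID itself (captured)
)

def find_specimen_ids(text):
    """Return (chunk, begin, end, label) tuples; `end` is exclusive."""
    return [
        (m.group(1), m.start(1), m.end(1), "SPECIMEN")
        for m in SPECIMEN_PATTERN.finditer(text)
    ]

sample_text = """ Specimen ID: AB123-456789 was collected from the patient.
The laboratory processed Specimen Number CD987654 yesterday.
Use Specimen Code: XYZ12-3456 for tracking purposes.
Sample was labeled as Specimen#EF34-789.
Specimen No. GH56-123456 was sent to the pathology department."""

matches = find_specimen_ids(sample_text)
for chunk, begin, end, label in matches:
    print(chunk, begin, end, label)
```

Because the pattern here is only a guess, the chunk boundaries and offsets it prints need not agree with the Results table for the actual pipeline.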
