Specimen Contextual Parser Pipeline

Description

This pipeline extracts specimen entities from clinical text.

How to use


from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("specimen_parser_pipeline", "en", "clinical/models")

sample_text = """ Specimen ID: AB123-456789 was collected from the patient.
The laboratory processed Specimen Number CD987654 yesterday.
Use Specimen Code: XYZ12-3456 for tracking purposes.
Sample was labeled as Specimen#EF34-789.
Specimen No. GH56-123456 was sent to the pathology department."""

# Assumes an active SparkSession named `spark` with Healthcare NLP loaded.
result = pipeline.transform(spark.createDataFrame([[sample_text]]).toDF("text"))


from johnsnowlabs import nlp, medical

pipeline = nlp.PretrainedPipeline("specimen_parser_pipeline", "en", "clinical/models")

sample_text = """ Specimen ID: AB123-456789 was collected from the patient.
The laboratory processed Specimen Number CD987654 yesterday.
Use Specimen Code: XYZ12-3456 for tracking purposes.
Sample was labeled as Specimen#EF34-789.
Specimen No. GH56-123456 was sent to the pathology department."""

# Assumes an active SparkSession named `spark` with Healthcare NLP loaded.
result = pipeline.transform(spark.createDataFrame([[sample_text]]).toDF("text"))


import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val pipeline = PretrainedPipeline("specimen_parser_pipeline", "en", "clinical/models")

val sample_text = """ Specimen ID: AB123-456789 was collected from the patient.
The laboratory processed Specimen Number CD987654 yesterday.
Use Specimen Code: XYZ12-3456 for tracking purposes.
Sample was labeled as Specimen#EF34-789.
Specimen No. GH56-123456 was sent to the pathology department."""

// Assumes an active SparkSession named `spark`; the import enables Seq(...).toDF.
import spark.implicits._

val result = pipeline.transform(Seq(sample_text).toDF("text"))

Results


| specimen_id   |   begin |   end | label    |
|:--------------|--------:|------:|:---------|
| AB123         |      13 |    18 | SPECIMEN |
| CD987654      |     100 |   108 | SPECIMEN |
| XYZ12-3456    |     140 |   150 | SPECIMEN |
| EF34-789      |     206 |   214 | SPECIMEN |
| GH56-123456   |     230 |   241 | SPECIMEN |
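
The table reports each match together with character offsets into the input text. As a rough illustration of the keyword-plus-pattern matching that a contextual parser performs, the plain-Python sketch below looks for an ID that follows the word "Specimen" and records where it starts and ends. The regular expression and the `find_specimen_ids` helper are assumptions made for this example, not the rules shipped in `specimen_parser_pipeline`, so its matches and offsets are not expected to agree with the table.

```python
import re

# Hypothetical pattern (NOT the pipeline's actual rule): the context word
# "Specimen", an optional qualifier (ID, Number, Code, No.), an optional
# ':' or '#', then an ID made of letters, digits and hyphens.
SPECIMEN_PATTERN = re.compile(
    r"Specimen\s*(?:ID|Number|Code|No\.)?\s*[:#]?\s*([A-Z]{2,3}\d[\dA-Z-]*\d)"
)


def find_specimen_ids(text):
    """Return (chunk, begin, end, label) tuples; `end` is exclusive here."""
    rows = []
    for match in SPECIMEN_PATTERN.finditer(text):
        # Group 1 is the ID itself, without the "Specimen ..." prefix.
        rows.append((match.group(1), match.start(1), match.end(1), "SPECIMEN"))
    return rows


sample_text = (
    "Specimen ID: AB123-456789 was collected from the patient.\n"
    "The laboratory processed Specimen Number CD987654 yesterday.\n"
    "Use Specimen Code: XYZ12-3456 for tracking purposes.\n"
    "Sample was labeled as Specimen#EF34-789.\n"
    "Specimen No. GH56-123456 was sent to the pathology department."
)

rows = find_specimen_ids(sample_text)
for chunk, begin, end, label in rows:
    # Slicing the text with the recorded offsets recovers the chunk.
    assert sample_text[begin:end] == chunk
    print(f"{chunk:<14} {begin:>5} {end:>5}  {label}")
```

In this sketch the offsets are zero-based and `end` is exclusive, which is simply Python's slicing convention; check the pipeline's own annotation output before assuming it follows the same convention.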

Model Information

Model Name: specimen_parser_pipeline
Type: pipeline
Compatibility: Healthcare NLP 6.3.0+
License: Licensed
Edition: Official
Language: en
Size: 396.5 KB

Included Models

  • DocumentAssembler
  • SentenceDetectorDLModel
  • TokenizerModel
  • ContextualParserModel
  • ChunkConverter