resolution

package resolution

Ordering

Alphabetic

Visibility

Public
All

Type Members

case class DistanceResult(distance: Double, weightedDistance: Double) extends Product with Serializable
Class that contains distance in both representations: weighted and non-weighted, for using later in DistancePooling
class JDataReader extends AnyRef
case class JTreeComponent(embeddings: Array[Float], data: JTreeData) extends Product with Serializable
case class JTreeData(code: String, trained: Array[String], normalized: String) extends Product with Serializable
class JTreeReader extends StorageReader[JTreeComponent]
class JTreeWriter extends StorageBatchWriter[JTreeComponent]
trait ReadablePretrainedSentenceEntityResolver extends ParamsAndFeaturesReadable[SentenceEntityResolverModel] with HasPretrained[SentenceEntityResolverModel] with EvalEntityResolver

class Resolution2Chunk extends AnnotatorModel[Resolution2Chunk] with HasSimpleAnnotate[Resolution2Chunk]

This annotator converts 'Resolution' type annotations into 'CHUNK' type to create new chunk-type column, compatible with annotators that use chunk type as input.

Example

Define dataset

val testDS = Seq(
"Has a past history of gastroenteritis and stomach pain, however patient shows no stomach pain now. " +
"We don't care about gastroenteritis here, but we do care about heart failure. " +
"Test for asma, no asma.").toDF("text")

Define a pipeline

val documentAssembler = new DocumentAssembler().setInputCol("text").setOutputCol("ner_chunk")

val sbert_embedder =  BertSentenceEmbeddings
 .pretrained("sbiobert_base_cased_mli","en","clinical/models" )
 .setInputCols(Array("ner_chunk"))
 .setOutputCol("sentence_embeddings")
 .setCaseSensitive(false)

val resolver = SentenceEntityResolverModel
 .pretrained("sbiobertresolve_rxnorm_augmented", "en", "clinical/models")
 .setInputCols(Array("sentence_embeddings"))
 .setOutputCol("resolve")

val resolver2chunk = new Resolution2Chunk()
 .setInputCols(Array("resolve"))
 .setOutputCol("chunk")

val pipeline = new Pipeline().setStages(Array(documentAssembler, sbert_embedder, resolver, resolver2chunk)).fit(testDS)

val result = pipeline.transform(testDS).selectExpr("chunk.result","chunk.annotatorType").show(false)

  +---------+-------------+
  |result   |annotatorType|
  +---------+-------------+
  |[2550737]|[chunk]      |
  +---------+-------------+

class ResolverMerger extends AnnotatorModel[ResolverMerger] with HasSimpleAnnotate[ResolverMerger] with CheckLicense
class SentenceEntityResolverApproach extends AnnotatorApproach[SentenceEntityResolverModel] with SentenceResolverParams with HasCaseSensitiveProperties with HandleExceptionParams with CheckLicense
Contains all the parameters and methods to train a SentenceEntityResolverModel.
Contains all the parameters and methods to train a SentenceEntityResolverModel. The model transforms a dataset with Input Annotation type SENTENCE_EMBEDDINGS, coming from e.g. BertSentenceEmbeddings and returns the normalized entity for a particular trained ontology / curated dataset. (e.g. ICD-10, RxNorm, SNOMED etc.)
To use pretrained models please use SentenceEntityResolverModel and see the Models Hub for available models.
Example
Training a SNOMED resolution model using BERT sentence embeddings
Define pre-processing pipeline for training data. It needs consists of columns for the normalized training data and their labels.
```
val documentAssembler = new DocumentAssembler()
   .setInputCol("normalized_text")
   .setOutputCol("document")
 val bertEmbeddings = BertSentenceEmbeddings.pretrained("sent_biobert_pubmed_base_cased")
   .setInputCols("sentence")
   .setOutputCol("bert_embeddings")
 val snomedTrainingPipeline = new Pipeline().setStages(Array(
   documentAssembler,
   bertEmbeddings
 ))
 val snomedTrainingModel = snomedTrainingPipeline.fit(data)
 val snomedData = snomedTrainingModel.transform(data).cache()
```
Then the Resolver can be trained with
```
val bertExtractor = new SentenceEntityResolverApproach()
  .setNeighbours(25)
  .setThreshold(1000)
  .setInputCols("bert_embeddings")
  .setNormalizedCol("normalized_text")
  .setLabelCol("label")
  .setOutputCol("snomed_code")
  .setDistanceFunction("EUCLIDIAN")
  .setCaseSensitive(false)

val snomedModel = bertExtractor.fit(snomedData)
```
See also
SentenceEntityResolverModel

class SentenceEntityResolverModel extends AnnotatorModel[SentenceEntityResolverModel] with SentenceResolverParams with HasStorageModel with HasEmbeddingsProperties with HasCaseSensitiveProperties with HasSimpleAnnotate[SentenceEntityResolverModel] with HandleExceptionParams with HasSafeAnnotate[SentenceEntityResolverModel] with CheckLicense

The model transforms a dataset with Input Annotation type SENTENCE_EMBEDDINGS, coming from e.g.

The model transforms a dataset with Input Annotation type SENTENCE_EMBEDDINGS, coming from e.g. BertSentenceEmbeddings and returns the normalized entity for a particular trained ontology / curated dataset. (e.g. ICD-10, RxNorm, SNOMED etc.)

To use pretrained models please see the Models Hub for available models.

Example

Resolving CPT

First define pipeline stages to extract entities

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")
val sentenceDetector = SentenceDetectorDLModel.pretrained()
  .setInputCols("document")
  .setOutputCol("sentence")
val tokenizer = new Tokenizer()
  .setInputCols("sentence")
  .setOutputCol("token")
val word_embeddings = WordEmbeddingsModel.pretrained("embeddings_clinical", "en", "clinical/models")
  .setInputCols("sentence", "token")
  .setOutputCol("embeddings")
val clinical_ner = MedicalNerModel.pretrained("jsl_ner_wip_clinical", "en", "clinical/models")
  .setInputCols("sentence", "token", "embeddings")
  .setOutputCol("ner")
val ner_converter = new NerConverter()
  .setInputCols("sentence", "token", "ner")
  .setOutputCol("ner_chunk")
  .setWhiteList("Test","Procedure")
val c2doc = new Chunk2Doc()
  .setInputCols("ner_chunk")
  .setOutputCol("ner_chunk_doc")
val sbert_embedder = BertSentenceEmbeddings
  .pretrained("sbiobert_base_cased_mli","en","clinical/models")
  .setInputCols("ner_chunk_doc")
  .setOutputCol("sbert_embeddings")

Then the resolver is defined on the extracted entities and sentence embeddings

val cpt_resolver = SentenceEntityResolverModel.pretrained("sbiobertresolve_cpt_procedures_augmented","en", "clinical/models")
  .setInputCols("sbert_embeddings")
  .setOutputCol("cpt_code")
  .setDistanceFunction("EUCLIDEAN")
val sbert_pipeline_cpt = new Pipeline().setStages(Array(
  documentAssembler,
  sentenceDetector,
  tokenizer,
  word_embeddings,
  clinical_ner,
  ner_converter,
  c2doc,
  sbert_embedder,
  cpt_resolver))

Show results

sbert_outputs
  .select("explode(arrays_zip(ner_chunk.result ,ner_chunk.metadata, cpt_code.result, cpt_code.metadata, ner_chunk.begin, ner_chunk.end)) as cpt_code")
  .selectExpr(
    "cpt_code['0'] as chunk",
    "cpt_code['1'].entity as entity",
    "cpt_code['2'] as code",
    "cpt_code['3'].confidence as confidence",
    "cpt_code['3'].all_k_resolutions as all_k_resolutions",
    "cpt_code['3'].all_k_results as all_k_results"
  ).show(5)
+--------------------+---------+-----+----------+--------------------+--------------------+
|               chunk|   entity| code|confidence|   all_k_resolutions|         all_k_codes|
+--------------------+---------+-----+----------+--------------------+--------------------+
|          heart cath|Procedure|93566|    0.1180|CCA - Cardiac cat...|93566:::62319:::9...|
|selective coronar...|     Test|93460|    0.1000|Coronary angiogra...|93460:::93458:::9...|
|common femoral an...|     Test|35884|    0.1808|Femoral artery by...|35884:::35883:::3...|
|   StarClose closure|Procedure|33305|    0.1197|Heart closure:::H...|33305:::33300:::3...|
|         stress test|     Test|93351|    0.2795|Cardiovascular st...|93351:::94621:::9...|
+--------------------+---------+-----+----------+--------------------+--------------------+

See also: SentenceEntityResolverApproach for training a custom model

case class TreeData(code: String, trained: Array[String], normalized: String) extends Product with Serializable

Value Members

object ConfidenceFunction
Helper object to use while setting confidenceFunction parameter
object DistanceFunction
Helper object to use while setting distanceFunction parameter
object PoolingStrategy
Helper object to use while setting poolingStrategy parameter
object Resolution2Chunk extends DefaultParamsReadable[Resolution2Chunk] with Serializable
object ResolverMerger extends ParamsAndFeaturesReadable[ResolverMerger] with Serializable
object SentenceEntityResolverModel extends ReadablePretrainedSentenceEntityResolver with Serializable

Packages

resolution

package resolution

Type Members

Example

Example

Training a SNOMED resolution model using BERT sentence embeddings

Example

Resolving CPT

Value Members

Ungrouped

Packages

resolution 

package resolution

Type Members

Example

Example

Training a SNOMED resolution model using BERT sentence embeddings

Example

Resolving CPT

Value Members

Ungrouped

resolution