package merge

Type Members

  1. class ChunkMergeApproach extends AnnotatorApproach[ChunkMergeModel] with CheckLicense

    Merges two chunk columns produced by two annotators (NER, ContextualParser, or any other annotator that produces chunks). The merge selects one chunk from one of the two columns according to certain criteria. The decision on which chunk to select is based on the chunk indices in the source document: chunks with greater length and the most information are kept from each source. Labels can be changed with setReplaceDictResource.
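
    The selection rule described above can be illustrated with a minimal, library-free sketch. It is not the ChunkMergeApproach implementation: the `Chunk` case class, the `mergeChunks` helper, and the exact rule (longest chunk wins among overlapping spans, ties go to the first column) are assumptions made for illustration, and the second input column contains made-up values.

```scala
// Hypothetical, simplified stand-in for a CHUNK annotation (not the library's Annotation class).
final case class Chunk(begin: Int, end: Int, text: String, entity: String) {
  // Character indices are inclusive, as in the example output on this page.
  def length: Int = end - begin + 1
  def overlaps(other: Chunk): Boolean = begin <= other.end && other.begin <= end
}

object ChunkMergeSketch {
  // Assumed rule: visit chunks longest first and keep a chunk only if it does not
  // overlap a chunk that was already kept. The sort is stable, so on a full tie
  // the chunk from the first column wins.
  def mergeChunks(a: Seq[Chunk], b: Seq[Chunk]): Seq[Chunk] = {
    val byPriority = (a ++ b).sortBy(c => (-c.length, c.begin))
    val kept = byPriority.foldLeft(Vector.empty[Chunk]) { (acc, c) =>
      if (acc.exists(_.overlaps(c))) acc else acc :+ c
    }
    kept.sortBy(_.begin) // return the result in document order
  }
}

// Illustrative input only: the second column is invented for this sketch.
val first = Seq(
  Chunk(5, 15, "63-year-old", "Age"),
  Chunk(98, 107, "cellulitis", "Diagnosis")
)
val second = Seq(
  Chunk(8, 15, "year-old", "Duration"),
  Chunk(17, 19, "man", "Gender")
)
val merged = ChunkMergeSketch.mergeChunks(first, second)
```

    In this sketch the shorter "year-old" span is dropped because it overlaps the longer "63-year-old" span, while the non-overlapping chunks from both columns are kept.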

    Example

    Define a pipeline with two different NER models and a ChunkMergeApproach at the end

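    // Import paths assume the standard Spark NLP and Spark NLP for Healthcare package layout.
    import org.apache.spark.ml.Pipeline
    import com.johnsnowlabs.nlp.DocumentAssembler
    import com.johnsnowlabs.nlp.annotators.Tokenizer
    import com.johnsnowlabs.nlp.annotators.sbd.pragmatic.SentenceDetector
    import com.johnsnowlabs.nlp.embeddings.WordEmbeddingsModel
    import com.johnsnowlabs.nlp.annotators.ner.{MedicalNerModel, NerConverter}
    import com.johnsnowlabs.nlp.annotators.merge.ChunkMergeApproach
    import spark.implicits._ // assumes an active SparkSession named `spark`; needed for toDF

    val data = Seq(("A 63-year-old man presents to the hospital ...")).toDF("text")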
    val pipeline = new Pipeline().setStages(Array(
      new DocumentAssembler().setInputCol("text").setOutputCol("document"),
      new SentenceDetector().setInputCols("document").setOutputCol("sentence"),
      new Tokenizer().setInputCols("sentence").setOutputCol("token"),
      WordEmbeddingsModel.pretrained("embeddings_clinical", "en", "clinical/models").setOutputCol("embs"),
      MedicalNerModel.pretrained("ner_jsl", "en", "clinical/models")
        .setInputCols("sentence", "token", "embs").setOutputCol("jsl_ner"),
      new NerConverter().setInputCols("sentence", "token", "jsl_ner").setOutputCol("jsl_ner_chunk"),
      MedicalNerModel.pretrained("ner_bionlp", "en", "clinical/models")
        .setInputCols("sentence", "token", "embs").setOutputCol("bionlp_ner"),
      new NerConverter().setInputCols("sentence", "token", "bionlp_ner")
        .setOutputCol("bionlp_ner_chunk"),
      new ChunkMergeApproach().setInputCols("jsl_ner_chunk", "bionlp_ner_chunk").setOutputCol("merged_chunk")
    ))

    Show results

    val result = pipeline.fit(data).transform(data).cache()
    result.selectExpr("explode(merged_chunk) as a")
      .selectExpr("a.begin","a.end","a.result as chunk","a.metadata.entity as entity")
      .show(5, false)
    +-----+---+-----------+---------+
    |begin|end|chunk      |entity   |
    +-----+---+-----------+---------+
    |5    |15 |63-year-old|Age      |
    |17   |19 |man        |Gender   |
    |64   |72 |recurrent  |Modifier |
    |98   |107|cellulitis |Diagnosis|
    |110  |119|pneumonias |Diagnosis|
    +-----+---+-----------+---------+

  2. class ChunkMergeModel extends AnnotatorModel[ChunkMergeModel] with CheckLicense with HasSimpleAnnotate[ChunkMergeModel]

    Merges entities coming from different CHUNK annotations.

  3. sealed abstract class Order extends AnyRef
  4. trait ReadablePretrainedChunkMerge extends ParamsAndFeaturesReadable[ChunkMergeModel] with HasPretrained[ChunkMergeModel]

Value Members

  1. object Asc extends Order with Product with Serializable
  2. object ChunkMergeModel extends ReadablePretrainedChunkMerge with Serializable
  3. object Desc extends Order with Product with Serializable
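
    The signatures of Order, Asc and Desc have the shape Scaladoc renders for a sealed class with case objects. The sketch below shows how such a hierarchy is typically consumed with an exhaustive pattern match. The definitions are local stand-ins, not the library's own, and `sortInts` is a hypothetical helper; how the library itself uses Order is not stated on this page.

```scala
// Local stand-ins that mirror the documented shapes; not the library's definitions.
sealed abstract class Order
case object Asc extends Order
case object Desc extends Order

object OrderSketch {
  // Hypothetical helper: sort integers in the requested direction.
  // Because Order is sealed, the compiler can check that the match is exhaustive.
  def sortInts(xs: Seq[Int], order: Order): Seq[Int] = order match {
    case Asc  => xs.sorted
    case Desc => xs.sorted(Ordering[Int].reverse)
  }
}

val ascending = OrderSketch.sortInts(Seq(3, 1, 2), Asc)
val descending = OrderSketch.sortInts(Seq(3, 1, 2), Desc)
```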
