class AssertionMerger extends AnnotatorModel[AssertionMerger] with HasSimpleAnnotate[AssertionMerger] with WhiteAndBlackListParams with AssertionPrioritizationParams

Merges a variety of assertion columns coming from assertion annotators such as com.johnsnowlabs.nlp.annotators.assertion.dl.AssertionDLModel. AssertionMerger can filter, prioritize, and merge assertion annotations by setting the appropriate parameters.

See also

com.johnsnowlabs.nlp.annotators.assertion.dl.AssertionDLModel

AssertionPrioritizationParams

WhiteAndBlackListParams

Example

val document_assembler = new DocumentAssembler()
 .setInputCol("text").setOutputCol("document")
val sentence_detector = SentenceDetectorDLModel.pretrained("sentence_detector_dl_healthcare", "en", "clinical/models")
 .setInputCols(Array("document")).setOutputCol("sentence")
val tokenizer = new Tokenizer()
 .setInputCols(Array("sentence")).setOutputCol("token")
val word_embeddings = WordEmbeddingsModel.pretrained("embeddings_clinical", "en", "clinical/models")
 .setInputCols(Array("sentence", "token")).setOutputCol("embeddings")
val ner_model = MedicalNerModel.pretrained("ner_opioid", "en", "clinical/models")
 .setInputCols(Array("sentence", "token", "embeddings")).setOutputCol("ner")
val ner_converter = new NerConverterInternal()
 .setInputCols(Array("sentence", "token", "ner")).setOutputCol("ner_chunk")
 .setWhiteList(Array("opioid_drug", "other_drug"))
val assertion = AssertionDLModel.pretrained("assertion_opioid_drug_status_wip", "en", "clinical/models")
 .setInputCols(Array("sentence", "ner_chunk", "embeddings")).setOutputCol("assertion")
val assertion2 = AssertionDLModel.pretrained("assertion_opioid_wip", "en", "clinical/models")
 .setInputCols(Array("sentence", "ner_chunk", "embeddings")).setOutputCol("assertion2")

val assertion_merger = new AssertionMerger()
 .setInputCols("assertion", "assertion2")
 .setOutputCol("assertion_merger")
 .setMergeOverlapping(true)
 .setSelectionStrategy("Sequential")
 .setAssertionSourcePrecedence("assertion2,assertion")
 .setBlackList(Array("HYPothetical"))
 .setCaseSensitive(false)
 .setAssertionsConfidence(Map("history" -> 0.80f))
 .setOrderingFeatures(Array("length", "source", "confidence"))

val pipeline = new Pipeline().setStages(Array(document_assembler,
 sentence_detector,
 tokenizer,
 word_embeddings,
 ner_model,
 ner_converter,
 assertion,
 assertion2,
 assertion_merger))

val data = Seq("""The patient presented to the hospital for a neurological evaluation, with a documented prescription for Percocet to manage chronic back pain. Assessment revealed ongoing discomfort localized to the lumbar region, with associated numbness and tingling in the lower extremities.""",
 """The patient, with a known history of hypertension managed with atenolol 50mg and verapamil 40mg, presented after a fall resulting in an ankle injury. Examination revealed swelling and tenderness, indicative of a twisted ankle. Considering the patient's medical history and pain management needs, a prescription for tramadol was provided to alleviate discomfort while ensuring minimal impact on blood pressure control.""",
 """The patient presented to the rehabilitation facility with a documented history of opioid abuse, primarily stemming from misuse of prescription percocet pills intended for their partner's use. Initial assessment revealed withdrawal symptoms consistent with opioid dependency, including agitation, diaphoresis, and myalgias.""",
 """The patient presented to the emergency department following an overdose on cocaine. On examination, the patient displayed signs of sympathetic nervous system stimulation, including tachycardia, hypertension, dilated pupils, and agitation.""",
 """The patient, with a documented history of chronic pain syndrome, was admitted following an accidental overdose of prescribed OxyContin. Upon assessment, the patient displayed symptoms indicative of opioid toxicity, including respiratory depression, altered mental status, and pinpoint pupils. Immediate resuscitative measures were undertaken, including airway management, administration of naloxone, and close monitoring of vital signs.""")
 .toDF("text") // toDF on a Seq needs the SparkSession implicits in scope (e.g. import spark.implicits._)

Show results

val resultDF = pipeline.fit(data).transform(data)
resultDF.selectExpr("explode(assertion_merger) as merger").show(false)
+---------------------------------------------------------------------------------------------------------------------+
|merger                                                                                                               |
+---------------------------------------------------------------------------------------------------------------------+
|{assertion, 104, 111, present, {sentence -> 0, chunk -> 0, assertion_source -> assertion2, confidence -> 0.9802}, []}|
|{assertion, 63, 70, history, {sentence -> 0, chunk -> 0, assertion_source -> assertion2, confidence -> 0.8833}, []}  |
|{assertion, 143, 150, present, {sentence -> 0, chunk -> 1, assertion_source -> assertion2, confidence -> 0.905}, []} |
|{assertion, 256, 261, present, {sentence -> 1, chunk -> 2, assertion_source -> assertion2, confidence -> 0.5283}, []}|
|{assertion, 75, 81, present, {sentence -> 0, chunk -> 0, assertion_source -> assertion2, confidence -> 0.6853}, []}  |
|{assertion, 125, 133, present, {sentence -> 0, chunk -> 0, assertion_source -> assertion2, confidence -> 0.5923}, []}|
|{assertion, 198, 203, present, {sentence -> 1, chunk -> 1, assertion_source -> assertion2, confidence -> 0.8479}, []}|
+---------------------------------------------------------------------------------------------------------------------+
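The example sets a blacklist, case-insensitive label matching, and a per-label confidence threshold. The following is a minimal pure-Scala sketch of how such filtering could behave; it is not the library's implementation. The `MergedAssertion` case class and the `FilterSketch` object are hypothetical names, the rows marked "made up" are illustrative rather than model output, and the matching rules (thresholds keyed by exact label, labels without a threshold always pass) are assumptions.

```scala
// Hypothetical, simplified stand-in for an ASSERTION annotation (not the Spark NLP class).
final case class MergedAssertion(begin: Int, end: Int, label: String, source: String, confidence: Float)

object FilterSketch {
  // Assumption: with caseSensitive = false, blacklist entries match labels regardless of case.
  def isBlacklisted(label: String, blackList: Seq[String], caseSensitive: Boolean): Boolean =
    if (caseSensitive) blackList.contains(label)
    else blackList.exists(_.equalsIgnoreCase(label))

  // Assumption: a label with a threshold is kept only if its confidence is not lower
  // than that threshold; labels without a threshold always pass.
  def passesThreshold(a: MergedAssertion, thresholds: Map[String, Float]): Boolean =
    thresholds.get(a.label).forall(t => a.confidence >= t)

  def keep(anns: Seq[MergedAssertion], blackList: Seq[String], caseSensitive: Boolean,
           thresholds: Map[String, Float]): Seq[MergedAssertion] =
    anns.filter(a => !isBlacklisted(a.label, blackList, caseSensitive) && passesThreshold(a, thresholds))

  val input: Seq[MergedAssertion] = Seq(
    MergedAssertion(63, 70, "history", "assertion2", 0.8833f),   // values from the output above
    MergedAssertion(10, 20, "history", "assertion", 0.75f),      // made up: below the 0.80 threshold
    MergedAssertion(30, 40, "hypothetical", "assertion", 0.99f), // made up: blacklisted label
    MergedAssertion(104, 111, "present", "assertion2", 0.9802f)  // values from the output above
  )

  // Same settings as the example: blacklist "HYPothetical", case-insensitive, history -> 0.80.
  val kept: Seq[MergedAssertion] =
    keep(input, Seq("HYPothetical"), caseSensitive = false, Map("history" -> 0.80f))

  def main(args: Array[String]): Unit = kept.foreach(println)
}
```

Under these assumptions only the two rows taken from the output survive: the low-confidence "history" row fails its threshold and the "hypothetical" row matches the blacklist despite the different casing.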
Linear Supertypes
AssertionPrioritizationParams, WhiteAndBlackListParams, HasSimpleAnnotate[AssertionMerger], AnnotatorModel[AssertionMerger], CanBeLazy, RawAnnotator[AssertionMerger], HasOutputAnnotationCol, HasInputAnnotationCols, HasOutputAnnotatorType, ParamsAndFeaturesWritable, HasFeatures, DefaultParamsWritable, MLWritable, Model[AssertionMerger], Transformer, PipelineStage, Logging, Params, Serializable, Serializable, Identifiable, AnyRef, Any

Parameters

  1. val applyFilterBeforeMerge: BooleanParam

    Whether to apply filtering before the merging process. If true, filtering is applied before merging; if false, filtering is applied after merging. Default: false.

  2. val assertionSourcePrecedence: Param[String]

    Specifies the assertion sources used to prioritize overlapping annotations when the 'source' ordering feature is in use. The value is a comma-separated list of assertion sources; annotations are prioritized in the order in which the sources appear in the string.

    Definition Classes
    AssertionPrioritizationParams
  3. val assertionsConfidence: MapFeature[String, Float]

    Pairs of (assertion, confidenceThreshold) used to filter out assertions whose confidence is lower than the given threshold.

  4. val caseSensitive: BooleanParam

    Whether the white-listed and black-listed entity definitions are case sensitive. Default: true

    Definition Classes
    WhiteAndBlackListParams
  5. val defaultConfidence: FloatParam

    The confidence value to use when 'confidence' is included in orderingFeatures and a given annotation has no confidence. Default: 0f.

    Definition Classes
    AssertionPrioritizationParams
  6. val majorityVoting: BooleanParam

    Whether to use majority voting to resolve conflicts. Default: false. Majority voting applies when there are more than 2 annotations in the same overlapping group. When confidence is one of the ordering features, the sum of the confidence values is used for the vote.

  7. val mergeOverlapping: BooleanParam

    Whether to merge overlapping matched assertion annotations. Default: true

  8. val orderingFeatures: StringArrayParam

    Specifies the ordering features to use for overlapping entities. Possible values: 'begin', 'end', 'length', 'source', 'confidence'. Default: Array("begin", "length", "source")

    Definition Classes
    AssertionPrioritizationParams
  9. val selectionStrategy: Param[String]

    Determines the strategy for selecting annotations.

    Annotations can be selected either sequentially based on their order (Sequential) or using a more diverse strategy (DiverseLonger). These are the only two options currently available. Default: Sequential.

    Definition Classes
    AssertionPrioritizationParams
  10. val sortByBegin: BooleanParam

    Whether to sort the annotations by begin once the merge and filter process has finished. Default: false

  11. val whiteList: StringArrayParam

    If defined, the list of entities to process; all other entities are ignored. Labels should not include the IOB prefix. Default: Array()

    Definition Classes
    WhiteAndBlackListParams
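The interplay of orderingFeatures, assertionSourcePrecedence and defaultConfidence can be pictured with a small pure-Scala sketch. This is only an illustration under stated assumptions, not the library's code: `Candidate` and `PrioritizationSketch` are hypothetical names, the candidate labelled "past" is made up, and the direction of each feature (earlier begin/end, longer span, earlier source in the precedence string, and higher confidence win) is assumed rather than taken from the documentation.

```scala
// Hypothetical, simplified candidate annotation; confidence may be missing.
final case class Candidate(begin: Int, end: Int, label: String, source: String, confidence: Option[Float])

object PrioritizationSketch {
  // Position of a source in the comma-separated precedence string; unknown sources rank last.
  def sourceRank(source: String, precedence: String): Int = {
    val i = precedence.split(",").map(_.trim).indexOf(source)
    if (i >= 0) i else Int.MaxValue
  }

  // Sort key for one ordering feature; a smaller key means higher priority (assumed directions).
  def key(c: Candidate, feature: String, precedence: String, defaultConfidence: Float): Double =
    feature match {
      case "begin"      => c.begin.toDouble
      case "end"        => c.end.toDouble
      case "length"     => -(c.end - c.begin + 1).toDouble
      case "source"     => sourceRank(c.source, precedence).toDouble
      case "confidence" => -c.confidence.getOrElse(defaultConfidence).toDouble
      case other        => throw new IllegalArgumentException(s"Unknown ordering feature: $other")
    }

  // Compares feature by feature, in the given order, until one feature differs.
  def compare(a: Candidate, b: Candidate, features: Seq[String],
              precedence: String, defaultConfidence: Float): Int =
    features.iterator
      .map(f => java.lang.Double.compare(
        key(a, f, precedence, defaultConfidence), key(b, f, precedence, defaultConfidence)))
      .find(_ != 0)
      .getOrElse(0)

  def best(group: Seq[Candidate], features: Seq[String],
           precedence: String, defaultConfidence: Float = 0f): Candidate = {
    require(group.nonEmpty, "group must not be empty")
    group.reduceLeft((x, y) => if (compare(x, y, features, precedence, defaultConfidence) <= 0) x else y)
  }

  val fromAssertion2: Candidate = Candidate(104, 111, "present", "assertion2", Some(0.9802f))
  val fromAssertion: Candidate  = Candidate(104, 111, "past", "assertion", Some(0.99f)) // made up
  val group: Seq[Candidate]     = Seq(fromAssertion, fromAssertion2)

  // Same lengths, so "source" decides: assertion2 comes first in the precedence string.
  val winner: Candidate = best(group, Seq("length", "source", "confidence"), "assertion2,assertion")

  def main(args: Array[String]): Unit = println(winner)
}
```

In this sketch the order of the features matters: with Seq("length", "source", "confidence") the candidate from assertion2 wins the tie on length, whereas with Seq("confidence") alone the made-up candidate with the higher confidence would be selected.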

Members

  1. type AnnotatorType = String
    Definition Classes
    HasOutputAnnotatorType
  1. def annotate(annotations: Seq[Annotation]): Seq[Annotation]

    annotations

    The annotations per row that need to be merged and filtered. They should be of type ASSERTION.

    returns

    The merged and filtered ASSERTION annotations.

    Definition Classes
    AssertionMerger → HasSimpleAnnotate
  2. lazy val assertionsConfidenceMap: Map[String, Float]
  3. final def clear(param: Param[_]): AssertionMerger.this.type
    Definition Classes
    Params
  4. def copy(extra: ParamMap): AssertionMerger
    Definition Classes
    RawAnnotator → Model → Transformer → PipelineStage → Params
  5. def dfAnnotate: UserDefinedFunction
    Definition Classes
    HasSimpleAnnotate
  6. def explainParam(param: Param[_]): String
    Definition Classes
    Params
  7. def explainParams(): String
    Definition Classes
    Params
  8. final def extractParamMap(): ParamMap
    Definition Classes
    Params
  9. final def extractParamMap(extra: ParamMap): ParamMap
    Definition Classes
    Params
  10. val features: ArrayBuffer[Feature[_, _, _]]
    Definition Classes
    HasFeatures
  11. final def get[T](param: Param[T]): Option[T]
    Definition Classes
    Params
  12. final def getDefault[T](param: Param[T]): Option[T]
    Definition Classes
    Params
  13. def getInputCols: Array[String]
    Definition Classes
    HasInputAnnotationCols
  14. def getLazyAnnotator: Boolean
    Definition Classes
    CanBeLazy
  15. final def getOrDefault[T](param: Param[T]): T
    Definition Classes
    Params
  16. final def getOutputCol: String
    Definition Classes
    HasOutputAnnotationCol
  17. def getParam(paramName: String): Param[Any]
    Definition Classes
    Params
  18. final def hasDefault[T](param: Param[T]): Boolean
    Definition Classes
    Params
  19. def hasParam(paramName: String): Boolean
    Definition Classes
    Params
  20. def hasParent: Boolean
    Definition Classes
    Model
  21. val inputAnnotatorTypes: Array[String]
    Definition Classes
    AssertionMerger → HasInputAnnotationCols
  22. final def isDefined(param: Param[_]): Boolean
    Definition Classes
    Params
  23. final def isSet(param: Param[_]): Boolean
    Definition Classes
    Params
  24. val lazyAnnotator: BooleanParam
    Definition Classes
    CanBeLazy
  25. val optionalInputAnnotatorTypes: Array[String]
    Definition Classes
    HasInputAnnotationCols
  26. val outputAnnotatorType: AnnotatorType
    Definition Classes
    AssertionMerger → HasOutputAnnotatorType
  27. lazy val params: Array[Param[_]]
    Definition Classes
    Params
  28. var parent: Estimator[AssertionMerger]
    Definition Classes
    Model
  29. def save(path: String): Unit
    Definition Classes
    MLWritable
    Annotations
    @Since( "1.6.0" ) @throws( ... )
  30. final def set[T](param: Param[T], value: T): AssertionMerger.this.type
    Definition Classes
    Params
  31. def setAllowList(list: String*): AssertionMerger.this.type
    Definition Classes
    WhiteAndBlackListParams
  32. def setAllowList(list: Array[String]): AssertionMerger.this.type
    Definition Classes
    WhiteAndBlackListParams
  33. def setBlackList(list: String*): AssertionMerger.this.type
    Definition Classes
    WhiteAndBlackListParams
  34. def setDenyList(list: String*): AssertionMerger.this.type
    Definition Classes
    WhiteAndBlackListParams
  35. def setDenyList(list: Array[String]): AssertionMerger.this.type
    Definition Classes
    WhiteAndBlackListParams
  36. final def setInputCols(value: String*): AssertionMerger.this.type
    Definition Classes
    HasInputAnnotationCols
  37. def setLazyAnnotator(value: Boolean): AssertionMerger.this.type
    Definition Classes
    CanBeLazy
  38. final def setOutputCol(value: String): AssertionMerger.this.type
    Definition Classes
    HasOutputAnnotationCol
  39. def setParent(parent: Estimator[AssertionMerger]): AssertionMerger
    Definition Classes
    Model
  40. def setWhiteList(list: String*): AssertionMerger.this.type
    Definition Classes
    WhiteAndBlackListParams
  41. def toString(): String
    Definition Classes
    Identifiable → AnyRef → Any
  42. final def transform(dataset: Dataset[_]): DataFrame
    Definition Classes
    AnnotatorModel → Transformer
  43. def transform(dataset: Dataset[_], paramMap: ParamMap): DataFrame
    Definition Classes
    Transformer
    Annotations
    @Since( "2.0.0" )
  44. def transform(dataset: Dataset[_], firstParamPair: ParamPair[_], otherParamPairs: ParamPair[_]*): DataFrame
    Definition Classes
    Transformer
    Annotations
    @Since( "2.0.0" ) @varargs()
  45. final def transformSchema(schema: StructType): StructType
    Definition Classes
    RawAnnotator → PipelineStage
  46. val uid: String
    Definition Classes
    AssertionMerger → Identifiable
  47. def write: MLWriter
    Definition Classes
    ParamsAndFeaturesWritable → DefaultParamsWritable → MLWritable

Parameter setters

  1. val blackList: StringArrayParam

    If defined, the list of entities to ignore; all other entities are processed. Labels should not include the IOB prefix. Default: Array()

    Definition Classes
    WhiteAndBlackListParams
  2. def setApplyFilterBeforeMerge(value: Boolean): AssertionMerger.this.type

    Sets whether to apply filtering before the merging process. If true, filtering is applied before merging; if false, filtering is applied after merging. Default: false.

  3. def setAssertionSourcePrecedence(value: String): AssertionMerger.this.type

    Sets the assertion sources used to prioritize overlapping annotations when the 'source' ordering feature is in use. The value is a comma-separated list of assertion sources; annotations are prioritized in the order in which the sources appear in the string.

    Definition Classes
    AssertionPrioritizationParams
  4. def setAssertionsConfidence(value: HashMap[String, Double]): AssertionMerger.this.type

    Sets pairs of (assertion, confidenceThreshold) used to filter out assertions whose confidence is lower than the given threshold.

  5. def setAssertionsConfidence(value: Map[String, Float]): AssertionMerger.this.type

    Sets pairs of (assertion, confidenceThreshold) used to filter out assertions whose confidence is lower than the given threshold.

  6. def setBlackList(list: Array[String]): AssertionMerger.this.type

    Sets the list of entities to ignore; all other entities are processed. Labels should not include the IOB prefix. Default: Array()

    Definition Classes
    WhiteAndBlackListParams
  7. def setCaseSensitive(value: Boolean): AssertionMerger.this.type

    Sets whether the white-listed and black-listed entity definitions are case sensitive. Default: true

    Definition Classes
    WhiteAndBlackListParams
  8. def setDefaultConfidence(confidence: Float): AssertionMerger.this.type

    Sets the confidence value to use when 'confidence' is included in orderingFeatures and a given annotation has no confidence. Default: 0f.

    Definition Classes
    AssertionPrioritizationParams
  9. def setInputCols(value: Array[String]): AssertionMerger.this.type

    Sets the input columns for the annotator.

    Definition Classes
    AssertionMerger → HasInputAnnotationCols
  10. def setMajorityVoting(value: Boolean): AssertionMerger.this.type

    Sets the value of the majorityVoting parameter. Majority voting is used to resolve conflicts when there are more than 2 annotations in the same overlapping group. When confidence is one of the ordering features, the sum of the confidence values is used for the vote.

  11. def setMergeOverlapping(v: Boolean): AssertionMerger.this.type

    Sets whether to merge overlapping matched assertion annotations. Default: true

  12. def setOrderingFeatures(values: Array[String]): AssertionMerger.this.type

    Sets the ordering features to use for overlapping entities, as an array of strings. Possible values: 'begin', 'end', 'length', 'source', 'confidence'. Default: Array("begin", "length", "source")

    Definition Classes
    AssertionPrioritizationParams
  13. def setSelectionStrategy(strategy: String): AssertionMerger.this.type

    Sets the strategy for selecting annotations.

    Annotations can be selected either sequentially based on their order (Sequential) or using a different strategy (DiverseLonger). These are the only two options currently available. Default: Sequential.

    Definition Classes
    AssertionPrioritizationParams
  14. def setSortByBegin(value: Boolean): AssertionMerger.this.type

    Sets whether to sort the annotations by begin once the merge and filter process has finished. Default: false

  15. def setWhiteList(list: Array[String]): AssertionMerger.this.type

    Sets the list of entities to process; all other entities are ignored. Labels should not include the IOB prefix. Default: Array()

    Definition Classes
    WhiteAndBlackListParams
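The effect of setApplyFilterBeforeMerge can be illustrated with a deliberately simplified pure-Scala sketch. It assumes a toy merge step that only groups annotations with an identical span and keeps the one whose source comes first in the precedence list, and a toy filter step that applies per-label confidence thresholds. `Ann`, `FilterOrderSketch` and the sample values are hypothetical; the real merging logic handles more general overlaps and is not reproduced here.

```scala
// Hypothetical, simplified annotation used only for this illustration.
final case class Ann(begin: Int, end: Int, label: String, source: String, confidence: Float)

object FilterOrderSketch {
  // Toy filter: drop annotations whose confidence is lower than their label's threshold.
  def filter(anns: Seq[Ann], thresholds: Map[String, Float]): Seq[Ann] =
    anns.filter(a => thresholds.get(a.label).forall(t => a.confidence >= t))

  // Toy merge: among annotations with the same span, keep the one whose source
  // appears first in the precedence list (unknown sources rank last).
  def merge(anns: Seq[Ann], precedence: Seq[String]): Seq[Ann] = {
    def rank(a: Ann): Int = {
      val i = precedence.indexOf(a.source)
      if (i >= 0) i else Int.MaxValue
    }
    anns.groupBy(a => (a.begin, a.end)).values.map(_.minBy(rank)).toSeq.sortBy(_.begin)
  }

  def run(anns: Seq[Ann], thresholds: Map[String, Float], precedence: Seq[String],
          applyFilterBeforeMerge: Boolean): Seq[Ann] =
    if (applyFilterBeforeMerge) merge(filter(anns, thresholds), precedence)
    else filter(merge(anns, precedence), thresholds)

  // Made-up sample: two annotations on the same span from two sources.
  val sample: Seq[Ann] = Seq(
    Ann(63, 70, "history", "assertion2", 0.70f),
    Ann(63, 70, "present", "assertion", 0.60f)
  )
  val thresholds: Map[String, Float] = Map("history" -> 0.80f)
  val precedence: Seq[String] = Seq("assertion2", "assertion")

  val filterAfterMerge: Seq[Ann]  = run(sample, thresholds, precedence, applyFilterBeforeMerge = false)
  val filterBeforeMerge: Seq[Ann] = run(sample, thresholds, precedence, applyFilterBeforeMerge = true)

  def main(args: Array[String]): Unit = {
    println(s"filter after merge:  $filterAfterMerge")
    println(s"filter before merge: $filterBeforeMerge")
  }
}
```

In this toy model, filtering after merging leaves nothing, because the preferred "history" annotation wins the merge and is then removed by its threshold. Filtering first removes it before the merge, so the "present" annotation survives.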

Parameter getters

  1. def getApplyFilterBeforeMerge: Boolean

    Gets applyFilterBeforeMerge param.

  2. def getAssertionSourcePrecedence: String

    Gets the value of the assertionSourcePrecedence parameter.

    Definition Classes
    AssertionPrioritizationParams
  3. def getBlackList: Array[String]

    Gets blackList param

    Definition Classes
    WhiteAndBlackListParams
  4. def getCaseSensitive: Boolean

    Gets caseSensitive param

    Definition Classes
    WhiteAndBlackListParams
  5. def getDefaultConfidence: Float

    Gets the value of the defaultConfidence parameter.

    Definition Classes
    AssertionPrioritizationParams
  6. def getMajorityVoting: Boolean

    Gets the value of the majorityVoting parameter.

  7. def getMergeOverlapping: Boolean

    Gets mergeOverlapping param.

  8. def getOrderingFeatures: Array[String]

    Gets the value of the orderingFeatures parameter.

    Definition Classes
    AssertionPrioritizationParams
  9. def getSelectionStrategy: String

    Gets selectionStrategy param.

    Definition Classes
    AssertionPrioritizationParams
  10. def getSortByBegin: Boolean

    Gets sortByBegin param.

  11. def getWhiteList: Array[String]

    Gets whiteList param

    Definition Classes
    WhiteAndBlackListParams
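As a closing illustration of the majorityVoting parameter (read through getMajorityVoting), the sketch below shows one way a vote inside an overlapping group could be resolved: by counting labels, or by summing confidence values when confidence is among the ordering features. It is a pure-Scala sketch under assumptions, not the library's implementation; `MajorityVoteSketch`, the `useConfidence` flag and the sample votes are hypothetical.

```scala
object MajorityVoteSketch {
  // Each vote is a (label, confidence) pair from one overlapping group.
  // Assumption: without confidence every annotation counts as one vote;
  // with confidence the votes are weighted by summing confidence values per label.
  def vote(votes: Seq[(String, Float)], useConfidence: Boolean): String = {
    require(votes.nonEmpty, "votes must not be empty")
    val scores: Map[String, Double] = votes.groupBy(_._1).map { case (label, vs) =>
      label -> (if (useConfidence) vs.map(_._2.toDouble).sum else vs.size.toDouble)
    }
    // Pick the label with the highest score (ties are resolved arbitrarily in this sketch).
    scores.reduceLeft((a, b) => if (b._2 > a._2) b else a)._1
  }

  // Made-up group with more than 2 annotations, as required for majority voting.
  val votes: Seq[(String, Float)] = Seq(("present", 0.40f), ("present", 0.45f), ("history", 0.95f))

  def main(args: Array[String]): Unit = {
    println(vote(votes, useConfidence = false))
    println(vote(votes, useConfidence = true))
  }
}
```

With plain counting, "present" wins two votes to one. With summed confidence, "history" wins because 0.95 exceeds 0.40 + 0.45.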