com.johnsnowlabs.nlp.annotators.assertion.merger
AssertionMerger
Companion object AssertionMerger
class AssertionMerger extends AnnotatorModel[AssertionMerger] with HasSimpleAnnotate[AssertionMerger] with WhiteAndBlackListParams with AssertionPrioritizationParams
Merges assertion columns produced by assertion annotators such as com.johnsnowlabs.nlp.annotators.assertion.dl.AssertionDLModel. AssertionMerger can filter, prioritize, and merge assertion annotations through its parameters.
- See also
com.johnsnowlabs.nlp.annotators.assertion.dl.AssertionDLModel
WhiteAndBlackListParams
Example
val document_assembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val sentence_detector = SentenceDetectorDLModel.pretrained("sentence_detector_dl_healthcare", "en", "clinical/models")
  .setInputCols(Array("document"))
  .setOutputCol("sentence")

val tokenizer = new Tokenizer()
  .setInputCols(Array("sentence"))
  .setOutputCol("token")

val word_embeddings = WordEmbeddingsModel.pretrained("embeddings_clinical", "en", "clinical/models")
  .setInputCols(Array("sentence", "token"))
  .setOutputCol("embeddings")

val ner_model = MedicalNerModel.pretrained("ner_opioid", "en", "clinical/models")
  .setInputCols(Array("sentence", "token", "embeddings"))
  .setOutputCol("ner")

val ner_converter = new NerConverterInternal()
  .setInputCols(Array("sentence", "token", "ner"))
  .setOutputCol("ner_chunk")
  .setWhiteList(Array("opioid_drug", "other_drug"))

val assertion = AssertionDLModel.pretrained("assertion_opioid_drug_status_wip", "en", "clinical/models")
  .setInputCols(Array("sentence", "ner_chunk", "embeddings"))
  .setOutputCol("assertion")

val assertion2 = AssertionDLModel.pretrained("assertion_opioid_wip", "en", "clinical/models")
  .setInputCols(Array("sentence", "ner_chunk", "embeddings"))
  .setOutputCol("assertion2")

val assertion_merger = new AssertionMerger()
  .setInputCols("assertion", "assertion2")
  .setOutputCol("assertion_merger")
  .setMergeOverlapping(true)
  .setSelectionStrategy("Sequential")
  .setAssertionSourcePrecedence("assertion2,assertion")
  .setBlackList(Array("HYPothetical"))
  .setCaseSensitive(false)
  .setAssertionsConfidence(Map("history" -> 0.80f))
  .setOrderingFeatures(Array("length", "source", "confidence"))

val pipeline = new Pipeline().setStages(Array(
  document_assembler,
  sentence_detector,
  tokenizer,
  word_embeddings,
  ner_model,
  ner_converter,
  assertion,
  assertion2,
  assertion_merger))

val data = Seq(
  """The patient presented to the hospital for a neurological evaluation, with a documented prescription for Percocet to manage chronic back pain.
Assessment revealed ongoing discomfort localized to the lumbar region, with associated numbness and tingling in the lower extremities.""",
  """The patient, with a known history of hypertension managed with atenolol 50mg and verapamil 40mg, presented after a fall resulting in an ankle injury. Examination revealed swelling and tenderness, indicative of a twisted ankle. Considering the patient's medical history and pain management needs, a prescription for tramadol was provided to alleviate discomfort while ensuring minimal impact on blood pressure control.""",
  """The patient presented to the rehabilitation facility with a documented history of opioid abuse, primarily stemming from misuse of prescription percocet pills intended for their partner's use. Initial assessment revealed withdrawal symptoms consistent with opioid dependency, including agitation, diaphoresis, and myalgias.""",
  """The patient presented to the emergency department following an overdose on cocaine. On examination, the patient displayed signs of sympathetic nervous system stimulation, including tachycardia, hypertension, dilated pupils, and agitation.""",
  """The patient, with a documented history of chronic pain syndrome, was admitted following an accidental overdose of prescribed OxyContin. Upon assessment, the patient displayed symptoms indicative of opioid toxicity, including respiratory depression, altered mental status, and pinpoint pupils. Immediate resuscitative measures were undertaken, including airway management, administration of naloxone, and close monitoring of vital signs."""
).toDF("text")
Show results
val resultDF = pipeline.fit(data).transform(data)
resultDF.selectExpr("explode(assertion_merger) as merger").show(false)

+---------------------------------------------------------------------------------------------------------------------+
|merger                                                                                                               |
+---------------------------------------------------------------------------------------------------------------------+
|{assertion, 104, 111, present, {sentence -> 0, chunk -> 0, assertion_source -> assertion2, confidence -> 0.9802}, []}|
|{assertion, 63, 70, history, {sentence -> 0, chunk -> 0, assertion_source -> assertion2, confidence -> 0.8833}, []}  |
|{assertion, 143, 150, present, {sentence -> 0, chunk -> 1, assertion_source -> assertion2, confidence -> 0.905}, []} |
|{assertion, 256, 261, present, {sentence -> 1, chunk -> 2, assertion_source -> assertion2, confidence -> 0.5283}, []}|
|{assertion, 75, 81, present, {sentence -> 0, chunk -> 0, assertion_source -> assertion2, confidence -> 0.6853}, []}  |
|{assertion, 125, 133, present, {sentence -> 0, chunk -> 0, assertion_source -> assertion2, confidence -> 0.5923}, []}|
|{assertion, 198, 203, present, {sentence -> 1, chunk -> 1, assertion_source -> assertion2, confidence -> 0.8479}, []}|
+---------------------------------------------------------------------------------------------------------------------+
- AssertionMerger
- AssertionPrioritizationParams
- WhiteAndBlackListParams
- HasSimpleAnnotate
- AnnotatorModel
- CanBeLazy
- RawAnnotator
- HasOutputAnnotationCol
- HasInputAnnotationCols
- HasOutputAnnotatorType
- ParamsAndFeaturesWritable
- HasFeatures
- DefaultParamsWritable
- MLWritable
- Model
- Transformer
- PipelineStage
- Logging
- Params
- Serializable
- Serializable
- Identifiable
- AnyRef
- Any
Instance Constructors
Type Members
Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
final
def
$[T](param: Param[T]): T
- Attributes
- protected
- Definition Classes
- Params
-
def
$$[T](feature: StructFeature[T]): T
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
$$[K, V](feature: MapFeature[K, V]): Map[K, V]
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
$$[T](feature: SetFeature[T]): Set[T]
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
$$[T](feature: ArrayFeature[T]): Array[T]
- Attributes
- protected
- Definition Classes
- HasFeatures
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
_transform(dataset: Dataset[_], recursivePipeline: Option[PipelineModel]): DataFrame
- Attributes
- protected
- Definition Classes
- AnnotatorModel
-
def
afterAnnotate(dataset: DataFrame): DataFrame
- Attributes
- protected
- Definition Classes
- AnnotatorModel
-
def
annotate(annotations: Seq[Annotation]): Seq[Annotation]
- annotations
The annotations per row to merge and filter. They should be of type ASSERTION.
- returns
The merged and filtered ASSERTION annotations.
- Definition Classes
- AssertionMerger → HasSimpleAnnotate
-
val
applyFilterBeforeMerge: BooleanParam
Whether to apply filtering before the merging process. If true, filtering is applied before merging; if false, it is applied after merging. Default: false.
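The order of the two steps can change the result. The following is a minimal sketch in plain Scala, not the library's implementation: `Assertion`, `filter`, and `merge` are hypothetical stand-ins, and the merge rule (keep the highest-confidence assertion per identical span) is an assumption made only for illustration.

```scala
object FilterOrderSketch {
  // Hypothetical stand-in for an ASSERTION annotation (not the Spark NLP Annotation class).
  final case class Assertion(begin: Int, end: Int, result: String, confidence: Float)

  // Stand-in filter: drop assertions whose label is on the black list.
  def filter(xs: Seq[Assertion], blackList: Set[String]): Seq[Assertion] =
    xs.filterNot(a => blackList.contains(a.result))

  // Stand-in merge (assumption): for each identical span, keep the highest-confidence assertion.
  def merge(xs: Seq[Assertion]): Seq[Assertion] =
    xs.groupBy(a => (a.begin, a.end))
      .values
      .map(_.reduce((a, b) => if (b.confidence > a.confidence) b else a))
      .toSeq
      .sortBy(_.begin)

  // The flag only changes the order in which the two steps run.
  def run(xs: Seq[Assertion], blackList: Set[String], applyFilterBeforeMerge: Boolean): Seq[Assertion] =
    if (applyFilterBeforeMerge) merge(filter(xs, blackList))
    else filter(merge(xs), blackList)
}
```

With two assertions on the same span, a black-listed "hypothetical" at 0.9 and a "present" at 0.6, filtering first leaves "present", while merging first lets "hypothetical" win and then removes it, leaving nothing.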
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
val
assertionSourcePrecedence: Param[String]
Specifies the assertion sources used to prioritize overlapping annotations when the 'source' ordering feature is used. The value is a comma-separated list of assertion sources; annotations are prioritized in the order the sources appear in that string.
- Definition Classes
- AssertionPrioritizationParams
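As an illustration of how a comma-separated precedence string can drive prioritization, here is a small sketch in plain Scala. The helper names `ranks` and `pick` are hypothetical, and ranking unlisted sources last is an assumption of this sketch rather than documented behavior.

```scala
object SourcePrecedenceSketch {
  // Parse "assertion2,assertion" into a rank per source: lower rank means higher priority.
  def ranks(precedence: String): Map[String, Int] =
    precedence.split(",").map(_.trim).filter(_.nonEmpty).zipWithIndex.toMap

  // Pick the source with the best rank; sources missing from the list rank last (assumption).
  def pick(sources: Seq[String], precedence: String): Option[String] = {
    val rank = ranks(precedence)
    if (sources.isEmpty) None
    else Some(sources.minBy(s => rank.getOrElse(s, Int.MaxValue)))
  }
}
```

For the value "assertion2,assertion" used in the example pipeline, `assertion2` gets rank 0 and is preferred over `assertion`.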
-
val
assertionsConfidence: MapFeature[String, Float]
Pairs (assertion, confidenceThreshold) used to filter out assertions whose confidence is lower than the threshold.
- lazy val assertionsConfidenceMap: Map[String, Float]
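A per-label confidence threshold can be pictured as follows. This is a sketch under stated assumptions, not the annotator's code: `Assertion` and `keep` are hypothetical names, labels are compared exactly, and labels without an entry in the map are assumed to pass through unfiltered.

```scala
object ConfidenceThresholdSketch {
  // Hypothetical stand-in carrying only the fields this sketch needs.
  final case class Assertion(result: String, confidence: Float)

  // Keep an assertion unless its label has a threshold and its confidence is lower than it.
  def keep(a: Assertion, thresholds: Map[String, Float]): Boolean =
    thresholds.get(a.result).forall(threshold => a.confidence >= threshold)
}
```

With `Map("history" -> 0.80f)` as in the example pipeline, a "history" assertion at 0.8833 is kept, a "history" assertion at 0.5 would be dropped, and "present" assertions are unaffected.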
-
def
beforeAnnotate(dataset: Dataset[_]): Dataset[_]
- Attributes
- protected
- Definition Classes
- AnnotatorModel
-
val
blackList: StringArrayParam
If defined, list of entities to ignore. The rest will be processed. Should not include IOB prefix on labels. Default: Array()
- Definition Classes
- WhiteAndBlackListParams
-
val
caseSensitive: BooleanParam
Determines whether white-listed and black-listed entity definitions are matched case-sensitively. Default: true
- Definition Classes
- WhiteAndBlackListParams
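The interaction between the white list, the black list, and caseSensitive can be sketched in plain Scala. This is an illustrative model only: `accept` is a hypothetical helper, and treating an empty white list as "no restriction" is an assumption based on the "If defined" wording.

```scala
object ListFilterSketch {
  // Membership test that optionally ignores case.
  def isValueInList(value: String, list: Seq[String], caseSensitive: Boolean): Boolean =
    if (caseSensitive) list.contains(value)
    else list.exists(_.equalsIgnoreCase(value))

  // A label passes when it is allowed by the white list (if any) and not on the black list.
  def accept(label: String, whiteList: Seq[String], blackList: Seq[String], caseSensitive: Boolean): Boolean = {
    val allowed = whiteList.isEmpty || isValueInList(label, whiteList, caseSensitive)
    val denied = isValueInList(label, blackList, caseSensitive)
    allowed && !denied
  }
}
```

This mirrors the example pipeline, where the black list entry "HYPothetical" only matches the label "hypothetical" because caseSensitive is set to false.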
-
final
def
checkSchema(schema: StructType, inputAnnotatorType: String): Boolean
- Attributes
- protected
- Definition Classes
- HasInputAnnotationCols
-
final
def
clear(param: Param[_]): AssertionMerger.this.type
- Definition Classes
- Params
-
def
clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
def
copy(extra: ParamMap): AssertionMerger
- Definition Classes
- RawAnnotator → Model → Transformer → PipelineStage → Params
-
def
copyValues[T <: Params](to: T, extra: ParamMap): T
- Attributes
- protected
- Definition Classes
- Params
-
val
defaultConfidence: FloatParam
The confidence value to use when 'confidence' is included in orderingFeatures and an annotation has no confidence value. Default: 0f.
- Definition Classes
- AssertionPrioritizationParams
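The fallback can be sketched as a metadata lookup with a default. The example output above shows confidence stored as a string under the "confidence" metadata key; the helper `confidenceOf` is hypothetical, and falling back on unparsable values as well as missing ones is an assumption of this sketch.

```scala
import scala.util.Try

object DefaultConfidenceSketch {
  // Read the "confidence" metadata entry, falling back to defaultConfidence
  // when the key is missing or (assumption) the value is not a valid number.
  def confidenceOf(metadata: Map[String, String], defaultConfidence: Float = 0f): Float =
    metadata.get("confidence")
      .flatMap(raw => Try(raw.trim.toFloat).toOption)
      .getOrElse(defaultConfidence)
}
```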
-
final
def
defaultCopy[T <: Params](extra: ParamMap): T
- Attributes
- protected
- Definition Classes
- Params
-
def
dfAnnotate: UserDefinedFunction
- Definition Classes
- HasSimpleAnnotate
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
equals(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
evaluateFilter(filter: String): Boolean
Filter annotations by blackList and whiteList, taking into account the caseSensitive param.
- Attributes
- protected
- Definition Classes
- WhiteAndBlackListParams
-
def
explainParam(param: Param[_]): String
- Definition Classes
- Params
-
def
explainParams(): String
- Definition Classes
- Params
-
def
extraValidate(structType: StructType): Boolean
- Attributes
- protected
- Definition Classes
- RawAnnotator
-
def
extraValidateMsg: String
- Attributes
- protected
- Definition Classes
- RawAnnotator
-
final
def
extractParamMap(): ParamMap
- Definition Classes
- Params
-
final
def
extractParamMap(extra: ParamMap): ParamMap
- Definition Classes
- Params
-
val
features: ArrayBuffer[Feature[_, _, _]]
- Definition Classes
- HasFeatures
-
def
filterByEntityField(annotation: Annotation): Boolean
Filters an annotation by blackList and whiteList, taking the caseSensitive param into account. The value compared is annotation.metadata.getOrElse("entity", annotation.metadata.getOrElse("identifier", "")).toString
- returns
Boolean
- Attributes
- protected
- Definition Classes
- WhiteAndBlackListParams
-
def
filterByEntityField(annotations: Seq[Annotation]): Seq[Annotation]
Filters annotations by blackList and whiteList, taking the caseSensitive param into account. The value compared is annotation.metadata.getOrElse("entity", annotation.metadata.getOrElse("identifier", "")).toString
- Attributes
- protected
- Definition Classes
- WhiteAndBlackListParams
-
def
filterByWhiteAndBlackList(annotation: Annotation): Boolean
Filters an annotation by blackList and whiteList, taking the caseSensitive param into account. The value compared is annotation.result
- returns
Boolean
- Attributes
- protected
- Definition Classes
- WhiteAndBlackListParams
-
def
filterByWhiteAndBlackList(annotations: Seq[Annotation]): Seq[Annotation]
Filters annotations by blackList and whiteList, taking the caseSensitive param into account. The value compared is annotation.result
- Attributes
- protected
- Definition Classes
- WhiteAndBlackListParams
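The two filter families differ only in which value is compared against the lists. A minimal sketch of the two keys, using a hypothetical `Annotation` stand-in (not the Spark NLP class) and the lookup expression quoted in the descriptions above:

```scala
object FilterKeySketch {
  // Hypothetical stand-in with only the fields the two filters read.
  final case class Annotation(result: String, metadata: Map[String, String])

  // Key used by filterByEntityField: "entity", then "identifier", then the empty string.
  def entityKey(a: Annotation): String =
    a.metadata.getOrElse("entity", a.metadata.getOrElse("identifier", ""))

  // Key used by filterByWhiteAndBlackList: the annotation result itself.
  def resultKey(a: Annotation): String = a.result
}
```

For an assertion annotation, the result key would be the assertion label (for example "present"), while the entity key depends on what the upstream annotator wrote into the metadata.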
-
def
finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
-
def
get[T](feature: StructFeature[T]): Option[T]
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
get[K, V](feature: MapFeature[K, V]): Option[Map[K, V]]
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
get[T](feature: SetFeature[T]): Option[Set[T]]
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
get[T](feature: ArrayFeature[T]): Option[Array[T]]
- Attributes
- protected
- Definition Classes
- HasFeatures
-
final
def
get[T](param: Param[T]): Option[T]
- Definition Classes
- Params
-
def
getApplyFilterBeforeMerge: Boolean
Gets applyFilterBeforeMerge param.
-
def
getAssertionSourcePrecedence: String
Gets the value of the assertionSourcePrecedence parameter.
- Definition Classes
- AssertionPrioritizationParams
-
def
getBlackList: Array[String]
Gets blackList param
- Definition Classes
- WhiteAndBlackListParams
-
def
getCaseSensitive: Boolean
Gets caseSensitive param
- Definition Classes
- WhiteAndBlackListParams
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
final
def
getDefault[T](param: Param[T]): Option[T]
- Definition Classes
- Params
-
def
getDefaultConfidence: Float
Gets the value of the defaultConfidence parameter.
- Definition Classes
- AssertionPrioritizationParams
-
def
getInputCols: Array[String]
- Definition Classes
- HasInputAnnotationCols
-
def
getLazyAnnotator: Boolean
- Definition Classes
- CanBeLazy
-
def
getMajorityVoting: Boolean
Gets the value of the majorityVoting parameter.
-
def
getMergeOverlapping: Boolean
Gets mergeOverlapping param.
-
final
def
getOrDefault[T](param: Param[T]): T
- Definition Classes
- Params
-
def
getOrderingFeatures: Array[String]
Gets the value of the orderingFeatures parameter.
- Definition Classes
- AssertionPrioritizationParams
-
final
def
getOutputCol: String
- Definition Classes
- HasOutputAnnotationCol
-
def
getParam(paramName: String): Param[Any]
- Definition Classes
- Params
-
def
getSelectionStrategy: String
Gets selectionStrategy param.
- Definition Classes
- AssertionPrioritizationParams
-
def
getSortByBegin: Boolean
Gets sortByBegin param.
-
def
getWhiteList: Array[String]
Gets whiteList param
- Definition Classes
- WhiteAndBlackListParams
-
final
def
hasDefault[T](param: Param[T]): Boolean
- Definition Classes
- Params
-
def
hasParam(paramName: String): Boolean
- Definition Classes
- Params
-
def
hasParent: Boolean
- Definition Classes
- Model
-
def
hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
- Attributes
- protected
- Definition Classes
- Logging
-
def
initializeLogIfNecessary(isInterpreter: Boolean): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
val
inputAnnotatorTypes: Array[String]
- Definition Classes
- AssertionMerger → HasInputAnnotationCols
-
final
val
inputCols: StringArrayParam
- Attributes
- protected
- Definition Classes
- HasInputAnnotationCols
-
final
def
isDefined(param: Param[_]): Boolean
- Definition Classes
- Params
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
final
def
isSet(param: Param[_]): Boolean
- Definition Classes
- Params
-
def
isTraceEnabled(): Boolean
- Attributes
- protected
- Definition Classes
- Logging
-
def
isValueInList(value: String, list: Array[String]): Boolean
- Attributes
- protected
- Definition Classes
- WhiteAndBlackListParams
-
def
isWhiteListAndBlacklistEmpty: Boolean
- Attributes
- protected
- Definition Classes
- WhiteAndBlackListParams
-
val
lazyAnnotator: BooleanParam
- Definition Classes
- CanBeLazy
-
def
log: Logger
- Attributes
- protected
- Definition Classes
- Logging
-
def
logDebug(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logDebug(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logError(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logError(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logInfo(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logInfo(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logName: String
- Attributes
- protected
- Definition Classes
- Logging
-
def
logTrace(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logTrace(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logWarning(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logWarning(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
val
majorityVoting: BooleanParam
Whether to use majority voting to resolve conflicts. Majority voting applies when an overlapping group contains more than two annotations. When confidence is among the ordering features, the sum of confidence values is used for the vote. Default: false.
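The difference between a plain vote and a confidence-weighted vote can be sketched as follows. This is an illustrative model, not the annotator's implementation: `Assertion` and `vote` are hypothetical, the `useConfidence` flag stands in for "confidence is among the ordering features", and breaking ties alphabetically by label is an assumption.

```scala
object MajorityVotingSketch {
  // Hypothetical stand-in for one assertion in an overlapping group.
  final case class Assertion(result: String, confidence: Float)

  // Score each label by count, or by the sum of confidences when useConfidence is true,
  // and return the label with the highest score.
  def vote(group: Seq[Assertion], useConfidence: Boolean): Option[String] = {
    val scores: Seq[(String, Double)] = group
      .groupBy(_.result)
      .toSeq
      .map { case (label, members) =>
        val score =
          if (useConfidence) members.map(_.confidence.toDouble).sum
          else members.size.toDouble
        (label, score)
      }
      .sortBy(_._1) // deterministic tie-break by label (assumption)
    scores.reduceOption((a, b) => if (b._2 > a._2) b else a).map(_._1)
  }
}
```

For a group with two "present" assertions at 0.4 each and one "absent" assertion at 0.95, the count-based vote picks "present", while the confidence-sum vote picks "absent".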
-
val
mergeOverlapping: BooleanParam
Whether to merge overlapping matched assertion annotations. Default: true
-
def
msgHelper(schema: StructType): String
- Attributes
- protected
- Definition Classes
- HasInputAnnotationCols
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
def
onWrite(path: String, spark: SparkSession): Unit
- Attributes
- protected
- Definition Classes
- ParamsAndFeaturesWritable
-
val
optionalInputAnnotatorTypes: Array[String]
- Definition Classes
- HasInputAnnotationCols
-
val
orderingFeatures: StringArrayParam
Specifies the ordering features to use for overlapping entities. Possible values include: 'begin', 'end', 'length', 'source', 'confidence'. Default: Array("begin", "length", "source")
- Definition Classes
- AssertionPrioritizationParams
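One way to read an ordering-feature list is as a lexicographic sort key: the first feature decides, and later features only break ties. The sketch below models that idea in plain Scala. The sort direction of each feature (earlier begin and end first, longer spans first, higher confidence first, better-ranked source first) is an assumption of this sketch, and `Assertion` and `prioritize` are hypothetical names.

```scala
object OrderingFeaturesSketch {
  // Hypothetical stand-in with the fields the ordering features refer to.
  final case class Assertion(begin: Int, end: Int, source: String, confidence: Float)

  // Map a feature name to a numeric key where a smaller key means higher priority.
  private def key(feature: String, a: Assertion, sourceRank: Map[String, Int]): Double =
    feature match {
      case "begin"      => a.begin.toDouble
      case "end"        => a.end.toDouble
      case "length"     => -(a.end - a.begin).toDouble // longer spans first (assumption)
      case "source"     => sourceRank.getOrElse(a.source, Int.MaxValue).toDouble
      case "confidence" => -a.confidence.toDouble // higher confidence first (assumption)
      case other        => throw new IllegalArgumentException(s"Unknown ordering feature: $other")
    }

  // Sort by the features in order; the first feature that differs decides.
  def prioritize(xs: Seq[Assertion], features: Seq[String], sourceRank: Map[String, Int]): Seq[Assertion] =
    xs.sortWith { (x, y) =>
      features.iterator
        .map(f => java.lang.Double.compare(key(f, x, sourceRank), key(f, y, sourceRank)))
        .find(_ != 0)
        .exists(_ < 0)
    }
}
```

With the example pipeline's features ("length", "source", "confidence") and a precedence that ranks `assertion2` first, two equal-length candidates are decided by source, so the `assertion2` candidate comes first even if its confidence is lower.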
-
val
outputAnnotatorType: AnnotatorType
- Definition Classes
- AssertionMerger → HasOutputAnnotatorType
-
final
val
outputCol: Param[String]
- Attributes
- protected
- Definition Classes
- HasOutputAnnotationCol
-
lazy val
params: Array[Param[_]]
- Definition Classes
- Params
-
var
parent: Estimator[AssertionMerger]
- Definition Classes
- Model
-
def
prioritize(annotations: Seq[Annotation]): Seq[Annotation]
- Attributes
- protected
- Definition Classes
- AssertionPrioritizationParams
-
def
save(path: String): Unit
- Definition Classes
- MLWritable
- Annotations
- @Since( "1.6.0" ) @throws( ... )
-
val
selectionStrategy: Param[String]
Determines the strategy for selecting annotations: either sequentially, based on their order (Sequential), or with a more diverse strategy (DiverseLonger). These are currently the only two options. Default: Sequential.
- Definition Classes
- AssertionPrioritizationParams
-
def
set[T](feature: StructFeature[T], value: T): AssertionMerger.this.type
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
set[K, V](feature: MapFeature[K, V], value: Map[K, V]): AssertionMerger.this.type
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
set[T](feature: SetFeature[T], value: Set[T]): AssertionMerger.this.type
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
set[T](feature: ArrayFeature[T], value: Array[T]): AssertionMerger.this.type
- Attributes
- protected
- Definition Classes
- HasFeatures
-
final
def
set(paramPair: ParamPair[_]): AssertionMerger.this.type
- Attributes
- protected
- Definition Classes
- Params
-
final
def
set(param: String, value: Any): AssertionMerger.this.type
- Attributes
- protected
- Definition Classes
- Params
-
final
def
set[T](param: Param[T], value: T): AssertionMerger.this.type
- Definition Classes
- Params
-
def
setAllowList(list: String*): AssertionMerger.this.type
- Definition Classes
- WhiteAndBlackListParams
-
def
setAllowList(list: Array[String]): AssertionMerger.this.type
- Definition Classes
- WhiteAndBlackListParams
-
def
setApplyFilterBeforeMerge(value: Boolean): AssertionMerger.this.type
Sets whether to apply filtering before the merging process. If true, filtering is applied before merging; if false, it is applied after merging. Default: false.
-
def
setAssertionSourcePrecedence(value: String): AssertionMerger.this.type
Sets the assertion sources used to prioritize overlapping annotations when the 'source' ordering feature is used. The value is a comma-separated list of assertion sources; annotations are prioritized in the order the sources appear in that string.
- Definition Classes
- AssertionPrioritizationParams
-
def
setAssertionsConfidence(value: HashMap[String, Double]): AssertionMerger.this.type
Sets pairs (assertion, confidenceThreshold) used to filter out assertions whose confidence is lower than the threshold.
-
def
setAssertionsConfidence(value: Map[String, Float]): AssertionMerger.this.type
Sets pairs (assertion, confidenceThreshold) used to filter out assertions whose confidence is lower than the threshold.
-
def
setBlackList(list: String*): AssertionMerger.this.type
- Definition Classes
- WhiteAndBlackListParams
-
def
setBlackList(list: Array[String]): AssertionMerger.this.type
If defined, list of entities to ignore. The rest will be processed. Should not include IOB prefix on labels. Default: Array()
- Definition Classes
- WhiteAndBlackListParams
-
def
setCaseSensitive(value: Boolean): AssertionMerger.this.type
Sets whether white-listed and black-listed entity definitions are matched case-sensitively. Default: true
- Definition Classes
- WhiteAndBlackListParams
-
def
setDefault[T](feature: StructFeature[T], value: () ⇒ T): AssertionMerger.this.type
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
setDefault[K, V](feature: MapFeature[K, V], value: () ⇒ Map[K, V]): AssertionMerger.this.type
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
setDefault[T](feature: SetFeature[T], value: () ⇒ Set[T]): AssertionMerger.this.type
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
setDefault[T](feature: ArrayFeature[T], value: () ⇒ Array[T]): AssertionMerger.this.type
- Attributes
- protected
- Definition Classes
- HasFeatures
-
final
def
setDefault(paramPairs: ParamPair[_]*): AssertionMerger.this.type
- Attributes
- protected
- Definition Classes
- Params
-
final
def
setDefault[T](param: Param[T], value: T): AssertionMerger.this.type
- Attributes
- protected[org.apache.spark.ml]
- Definition Classes
- Params
-
def
setDefaultConfidence(confidence: Float): AssertionMerger.this.type
Sets the confidence value to use when 'confidence' is included in orderingFeatures and an annotation has no confidence value. Default: 0f.
- Definition Classes
- AssertionPrioritizationParams
-
def
setDenyList(list: String*): AssertionMerger.this.type
- Definition Classes
- WhiteAndBlackListParams
-
def
setDenyList(list: Array[String]): AssertionMerger.this.type
- Definition Classes
- WhiteAndBlackListParams
-
def
setInputCols(value: Array[String]): AssertionMerger.this.type
Set input columns for the Annotator.
- Definition Classes
- AssertionMerger → HasInputAnnotationCols
-
final
def
setInputCols(value: String*): AssertionMerger.this.type
- Definition Classes
- HasInputAnnotationCols
-
def
setLazyAnnotator(value: Boolean): AssertionMerger.this.type
- Definition Classes
- CanBeLazy
-
def
setMajorityVoting(value: Boolean): AssertionMerger.this.type
Sets the value of the majorityVoting parameter. Majority voting resolves conflicts when an overlapping group contains more than two annotations. When confidence is among the ordering features, the sum of confidence values is used for the vote.
-
def
setMergeOverlapping(v: Boolean): AssertionMerger.this.type
Sets whether to merge overlapping matched assertion annotations. Default: true
-
def
setOrderingFeatures(values: Array[String]): AssertionMerger.this.type
Sets the array of strings specifying the ordering features to use for overlapping entities. Possible values are 'begin', 'end', 'length', 'source', 'confidence'. Default: Array("begin", "length", "source")
- Definition Classes
- AssertionPrioritizationParams
-
final
def
setOutputCol(value: String): AssertionMerger.this.type
- Definition Classes
- HasOutputAnnotationCol
-
def
setParent(parent: Estimator[AssertionMerger]): AssertionMerger
- Definition Classes
- Model
-
def
setSelectionStrategy(strategy: String): AssertionMerger.this.type
Sets the strategy for selecting annotations: either sequentially, based on their order (Sequential), or with a different strategy (DiverseLonger). These are currently the only two options. Default: Sequential.
- Definition Classes
- AssertionPrioritizationParams
-
def
setSortByBegin(value: Boolean): AssertionMerger.this.type
Sets whether to sort the annotations by begin at the end of the merge and filter process. Default: false
-
def
setWhiteList(list: String*): AssertionMerger.this.type
- Definition Classes
- WhiteAndBlackListParams
-
def
setWhiteList(list: Array[String]): AssertionMerger.this.type
Sets the list of entities to process. The rest will be ignored. Should not include IOB prefix on labels. Default: Array()
- Definition Classes
- WhiteAndBlackListParams
-
val
sortByBegin: BooleanParam
Whether to sort the annotations by begin at the end of the merge and filter process. Default: false
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
def
toString(): String
- Definition Classes
- Identifiable → AnyRef → Any
-
final
def
transform(dataset: Dataset[_]): DataFrame
- Definition Classes
- AnnotatorModel → Transformer
-
def
transform(dataset: Dataset[_], paramMap: ParamMap): DataFrame
- Definition Classes
- Transformer
- Annotations
- @Since( "2.0.0" )
-
def
transform(dataset: Dataset[_], firstParamPair: ParamPair[_], otherParamPairs: ParamPair[_]*): DataFrame
- Definition Classes
- Transformer
- Annotations
- @Since( "2.0.0" ) @varargs()
-
final
def
transformSchema(schema: StructType): StructType
- Definition Classes
- RawAnnotator → PipelineStage
-
def
transformSchema(schema: StructType, logging: Boolean): StructType
- Attributes
- protected
- Definition Classes
- PipelineStage
- Annotations
- @DeveloperApi()
-
val
uid: String
- Definition Classes
- AssertionMerger → Identifiable
-
def
validate(schema: StructType): Boolean
- Attributes
- protected
- Definition Classes
- AssertionMerger → RawAnnotator
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
val
whiteList: StringArrayParam
If defined, list of entities to process. The rest will be ignored. Should not include IOB prefix on labels. Default: Array()
- Definition Classes
- WhiteAndBlackListParams
-
def
wrapColumnMetadata(col: Column): Column
- Attributes
- protected
- Definition Classes
- RawAnnotator
-
def
write: MLWriter
- Definition Classes
- ParamsAndFeaturesWritable → DefaultParamsWritable → MLWritable