Class

com.johnsnowlabs.nlp.annotators.resolution

ChunkEntityResolverApproach

Related Doc: package resolution


class ChunkEntityResolverApproach extends AnnotatorApproach[ChunkEntityResolverModel] with Licensed

Linear Supertypes
Licensed, AnnotatorApproach[ChunkEntityResolverModel], CanBeLazy, DefaultParamsWritable, MLWritable, HasOutputAnnotatorType, HasOutputAnnotationCol, HasInputAnnotationCols, Estimator[ChunkEntityResolverModel], PipelineStage, Logging, Params, Serializable, Serializable, Identifiable, AnyRef, Any

Instance Constructors

  1. new ChunkEntityResolverApproach()

  2. new ChunkEntityResolverApproach(uid: String)


Type Members

  1. type AnnotatorType = String

    Definition Classes
    HasOutputAnnotatorType

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def $[T](param: Param[T]): T

    Attributes
    protected
    Definition Classes
    Params
  4. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  5. def _fit(dataset: Dataset[_], recursiveStages: Option[PipelineModel]): ChunkEntityResolverModel

    Attributes
    protected
    Definition Classes
    AnnotatorApproach
  6. val alternatives: IntParam

    Number of results to return in the metadata after sorting by the last distance calculated.
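To illustrate what returning a fixed number of results after sorting by distance could look like, here is a minimal sketch in plain Scala. The `Candidate` case class and the `topAlternatives` helper are hypothetical names used only for illustration and are not part of the library; the sketch also assumes that a smaller distance means a better match.

```scala
// Hypothetical candidate type: a knowledge-base code and its final distance.
final case class Candidate(code: String, distance: Double)

object AlternativesSketch {
  // Sort ascending by distance and keep at most `alternatives` results.
  def topAlternatives(candidates: Seq[Candidate], alternatives: Int): Seq[Candidate] =
    candidates.sortBy(_.distance).take(alternatives)
}
```

Asking for more alternatives than there are candidates simply returns every candidate, already sorted.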

  7. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  8. def beforeTraining(spark: SparkSession): Unit

    Definition Classes
    AnnotatorApproach
  9. final def checkSchema(schema: StructType, inputAnnotatorType: String): Boolean

    Attributes
    protected
    Definition Classes
    HasInputAnnotationCols
  10. final def clear(param: Param[_]): ChunkEntityResolverApproach.this.type

    Definition Classes
    Params
  11. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  12. final def copy(extra: ParamMap): Estimator[ChunkEntityResolverModel]

    Definition Classes
    AnnotatorApproach → Estimator → PipelineStage → Params
  13. def copyValues[T <: Params](to: T, extra: ParamMap): T

    Attributes
    protected
    Definition Classes
    Params
  14. final def defaultCopy[T <: Params](extra: ParamMap): T

    Attributes
    protected
    Definition Classes
    Params
  15. val description: String

    Definition Classes
    ChunkEntityResolverApproach → AnnotatorApproach
  16. val distanceFunction: Param[String]

    Distance function to use for KNN: 'EUCLIDEAN' or 'COSINE'.
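For readers unfamiliar with the two options, the following sketch shows the textbook definitions of Euclidean and cosine distance between two embedding vectors. It is written in plain Scala under the assumption that vectors are `Array[Double]`; the object and method names are hypothetical and this is not the library's internal code.

```scala
object DistanceSketch {
  // Euclidean distance: square root of the summed squared differences.
  def euclidean(a: Array[Double], b: Array[Double]): Double = {
    require(a.length == b.length, "vectors must have the same dimension")
    math.sqrt(a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum)
  }

  // Cosine distance: 1 minus the cosine similarity of the two vectors.
  def cosine(a: Array[Double], b: Array[Double]): Double = {
    require(a.length == b.length, "vectors must have the same dimension")
    val dot = a.zip(b).map { case (x, y) => x * y }.sum
    val normA = math.sqrt(a.map(x => x * x).sum)
    val normB = math.sqrt(b.map(x => x * x).sum)
    require(normA > 0 && normB > 0, "cosine distance is undefined for zero vectors")
    1.0 - dot / (normA * normB)
  }

  // Dispatch on the same string values the parameter accepts.
  def distance(name: String)(a: Array[Double], b: Array[Double]): Double =
    name.toUpperCase match {
      case "EUCLIDEAN" => euclidean(a, b)
      case "COSINE"    => cosine(a, b)
      case other       => throw new IllegalArgumentException(s"Unknown distance function: $other")
    }
}
```

Cosine distance ignores vector length and compares direction only, while Euclidean distance is sensitive to both.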

  17. val distanceWeights: DoubleArrayParam

    Distance weights to apply during pooling, in the order [WMD, TFIDF, Jaccard, SorensenDice, JaroWinkler, Levenshtein].

  18. lazy val embeddingsColumnName: String

  19. val enableJaccard: BooleanParam

    Whether to use Jaccard token distance.
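As a reminder of what a Jaccard token distance measures, here is a small sketch using the standard definition over token sets. The names are hypothetical, and treating two empty token lists as identical is an assumption of this sketch rather than documented library behaviour.

```scala
object JaccardSketch {
  // Jaccard distance = 1 - |A intersect B| / |A union B| over the sets of tokens.
  def jaccardDistance(a: Seq[String], b: Seq[String]): Double = {
    val (sa, sb) = (a.toSet, b.toSet)
    val union = (sa union sb).size
    if (union == 0) 0.0 // assumption: two empty token lists count as identical
    else 1.0 - (sa intersect sb).size.toDouble / union
  }
}
```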

  20. val enableJaroWinkler: BooleanParam

    Whether to use Jaro-Winkler character distance.

  21. val enableLevenshtein: BooleanParam

    Whether to use Levenshtein character distance.
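The Levenshtein distance counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. The sketch below is the classic dynamic-programming formulation in plain Scala; the object name is hypothetical, and any normalization the library may apply to the raw count is not shown.

```scala
object LevenshteinSketch {
  // Classic dynamic-programming edit distance, keeping only the previous row.
  def levenshtein(s: String, t: String): Int = {
    var prev = Array.range(0, t.length + 1) // distance from "" to each prefix of t
    for (i <- 1 to s.length) {
      val curr = new Array[Int](t.length + 1)
      curr(0) = i // distance from the first i characters of s to ""
      for (j <- 1 to t.length) {
        val cost = if (s(i - 1) == t(j - 1)) 0 else 1
        curr(j) = math.min(math.min(curr(j - 1) + 1, prev(j) + 1), prev(j - 1) + cost)
      }
      prev = curr
    }
    prev(t.length)
  }
}
```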

  22. val enableSorensenDice: BooleanParam

    Whether to use Sorensen-Dice token distance.
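The Sorensen-Dice coefficient is closely related to Jaccard but weights the shared tokens twice. A minimal sketch of the corresponding distance over token sets follows; the names are hypothetical and the handling of empty inputs is an assumption of this sketch.

```scala
object SorensenDiceSketch {
  // Sorensen-Dice distance = 1 - 2|A intersect B| / (|A| + |B|) over token sets.
  def sorensenDiceDistance(a: Seq[String], b: Seq[String]): Double = {
    val (sa, sb) = (a.toSet, b.toSet)
    val total = sa.size + sb.size
    if (total == 0) 0.0 // assumption: two empty token lists count as identical
    else 1.0 - 2.0 * (sa intersect sb).size / total
  }
}
```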

  23. val enableTfidf: BooleanParam

    Whether to use TF-IDF token distance.

  24. val enableWmd: BooleanParam

    Whether to use WMD (Word Mover's Distance) token distance.

  25. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  26. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  27. def explainParam(param: Param[_]): String

    Definition Classes
    Params
  28. def explainParams(): String

    Definition Classes
    Params
  29. final def extractParamMap(): ParamMap

    Definition Classes
    Params
  30. final def extractParamMap(extra: ParamMap): ParamMap

    Definition Classes
    Params
  31. val extramassPenalty: DoubleParam

    Penalty for extra words in the knowledge base match during WMD calculation.

  32. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  33. final def fit(dataset: Dataset[_]): ChunkEntityResolverModel

    Definition Classes
    AnnotatorApproach → Estimator
  34. def fit(dataset: Dataset[_], paramMaps: Array[ParamMap]): Seq[ChunkEntityResolverModel]

    Definition Classes
    Estimator
    Annotations
    @Since( "2.0.0" )
  35. def fit(dataset: Dataset[_], paramMap: ParamMap): ChunkEntityResolverModel

    Definition Classes
    Estimator
    Annotations
    @Since( "2.0.0" )
  36. def fit(dataset: Dataset[_], firstParamPair: ParamPair[_], otherParamPairs: ParamPair[_]*): ChunkEntityResolverModel

    Definition Classes
    Estimator
    Annotations
    @Since( "2.0.0" ) @varargs()
  37. final def get[T](param: Param[T]): Option[T]

    Definition Classes
    Params
  38. def getAlternatives: Int

  39. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  40. final def getDefault[T](param: Param[T]): Option[T]

    Definition Classes
    Params
  41. def getDistanceFunction: String

  42. def getDistanceWeights: Array[Double]

  43. def getEnableJaccard: Boolean

  44. def getEnableJaroWinkler: Boolean

  45. def getEnableLevenshtein: Boolean

  46. def getEnableSorensenDice: Boolean

  47. def getEnableTfidf: Boolean

  48. def getEnableWmd: Boolean

  49. def getExtramassPenalty: Double

  50. def getInputCols: Array[String]

    Definition Classes
    HasInputAnnotationCols
  51. def getLabelCol: String

  52. def getLazyAnnotator: Boolean

    Definition Classes
    CanBeLazy
  53. def getMissAsEmpty: Boolean

  54. def getNeighbours: Int

  55. def getNormalizedCol: String

  56. final def getOrDefault[T](param: Param[T]): T

    Definition Classes
    Params
  57. final def getOutputCol: String

    Definition Classes
    HasOutputAnnotationCol
  58. def getParam(paramName: String): Param[Any]

    Definition Classes
    Params
  59. def getPoolingStrategy: String

  60. def getThreshold: Double

  61. final def hasDefault[T](param: Param[T]): Boolean

    Definition Classes
    Params
  62. def hasParam(paramName: String): Boolean

    Definition Classes
    Params
  63. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  64. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean

    Attributes
    protected
    Definition Classes
    Logging
  65. def initializeLogIfNecessary(isInterpreter: Boolean): Unit

    Attributes
    protected
    Definition Classes
    Logging
  66. val inputAnnotatorTypes: Array[String]

    Definition Classes
    ChunkEntityResolverApproach → HasInputAnnotationCols
  67. final val inputCols: StringArrayParam

    Attributes
    protected
    Definition Classes
    HasInputAnnotationCols
  68. final def isDefined(param: Param[_]): Boolean

    Definition Classes
    Params
  69. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  70. final def isSet(param: Param[_]): Boolean

    Definition Classes
    Params
  71. def isTraceEnabled(): Boolean

    Attributes
    protected
    Definition Classes
    Logging
  72. val labelCol: Param[String]

    Column name for the value we are trying to resolve.

  73. lazy val labelColumnName: String

  74. val lazyAnnotator: BooleanParam

    Definition Classes
    CanBeLazy
  75. def log: Logger

    Attributes
    protected
    Definition Classes
    Logging
  76. def logDebug(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  77. def logDebug(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  78. def logError(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  79. def logError(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  80. def logInfo(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  81. def logInfo(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  82. def logName: String

    Attributes
    protected
    Definition Classes
    Logging
  83. def logTrace(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  84. def logTrace(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  85. def logWarning(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  86. def logWarning(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  87. val missAsEmpty: BooleanParam

    Whether to return an empty annotation for unmatched chunks.

  88. def msgHelper(schema: StructType): String

    Attributes
    protected
    Definition Classes
    HasInputAnnotationCols
  89. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  90. val neighbours: IntParam

    Number of neighbours to consider in the KNN query used to calculate WMD.
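To make the role of the neighbour count concrete, the sketch below runs a brute-force k-nearest-neighbours query in plain Scala. It is a deliberately simple stand-in: the documentation for `train` mentions a KD-tree, which this sketch does not implement. The `KnnSketch` object and `nearest` method are hypothetical names, and squared Euclidean distance is assumed for ranking.

```scala
object KnnSketch {
  // Squared Euclidean distance is enough for ranking neighbours.
  private def squaredDistance(a: Array[Double], b: Array[Double]): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  // Return the indices of the k points closest to `query`, nearest first.
  def nearest(points: IndexedSeq[Array[Double]], query: Array[Double], k: Int): Seq[Int] =
    points.indices.sortBy(i => squaredDistance(points(i), query)).take(k)
}
```

A larger `k` yields more candidates for the later, more expensive distance calculations, at the cost of more work per chunk.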

  91. val normalizedCol: Param[String]

    Column name for the original, normalized description.

  92. lazy val normalizedColumnName: String

  93. final def notify(): Unit

    Definition Classes
    AnyRef
  94. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  95. def onTrained(model: ChunkEntityResolverModel, spark: SparkSession): Unit

    Definition Classes
    AnnotatorApproach
  96. val outputAnnotatorType: AnnotatorType

    Definition Classes
    ChunkEntityResolverApproach → HasOutputAnnotatorType
  97. final val outputCol: Param[String]

    Attributes
    protected
    Definition Classes
    HasOutputAnnotationCol
  98. lazy val params: Array[Param[_]]

    Definition Classes
    Params
  99. val poolingStrategy: Param[String]

    Pooling strategy used to aggregate distances: AVERAGE or SUM.
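One plausible reading of how a pooling strategy and per-metric weights could combine several distances into a single score is sketched below. This is an assumption for illustration only: the exact formula used by the library is not stated in this documentation, and in particular the choice to divide by the total weight for AVERAGE is a guess. The `PoolingSketch` object and `pool` method are hypothetical names.

```scala
object PoolingSketch {
  // Combine per-metric distances with their weights using AVERAGE or SUM.
  def pool(distances: Array[Double], weights: Array[Double], strategy: String): Double = {
    require(distances.length == weights.length, "one weight per distance is required")
    val weightedSum = distances.zip(weights).map { case (d, w) => d * w }.sum
    strategy.toUpperCase match {
      case "SUM" => weightedSum
      case "AVERAGE" =>
        // Assumption: a weighted average divides by the total weight.
        val totalWeight = weights.sum
        require(totalWeight > 0, "weights must sum to a positive value")
        weightedSum / totalWeight
      case other => throw new IllegalArgumentException(s"Unknown pooling strategy: $other")
    }
  }
}
```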

  100. def save(path: String): Unit

    Definition Classes
    MLWritable
    Annotations
    @Since( "1.6.0" ) @throws( ... )
  101. final def set(paramPair: ParamPair[_]): ChunkEntityResolverApproach.this.type

    Attributes
    protected
    Definition Classes
    Params
  102. final def set(param: String, value: Any): ChunkEntityResolverApproach.this.type

    Attributes
    protected
    Definition Classes
    Params
  103. final def set[T](param: Param[T], value: T): ChunkEntityResolverApproach.this.type

    Definition Classes
    Params
  104. def setAlternatives(a: Int): ChunkEntityResolverApproach.this.type

  105. final def setDefault(paramPairs: ParamPair[_]*): ChunkEntityResolverApproach.this.type

    Attributes
    protected
    Definition Classes
    Params
  106. final def setDefault[T](param: Param[T], value: T): ChunkEntityResolverApproach.this.type

    Attributes
    protected
    Definition Classes
    Params
  107. def setDistanceFunction(value: String): ChunkEntityResolverApproach.this.type

  108. def setDistanceWeights(v: Array[Double]): ChunkEntityResolverApproach.this.type

  109. def setEnableJaccard(v: Boolean): ChunkEntityResolverApproach.this.type

  110. def setEnableJaroWinkler(v: Boolean): ChunkEntityResolverApproach.this.type

  111. def setEnableLevenshtein(v: Boolean): ChunkEntityResolverApproach.this.type

  112. def setEnableSorensenDice(v: Boolean): ChunkEntityResolverApproach.this.type

  113. def setEnableTfidf(v: Boolean): ChunkEntityResolverApproach.this.type

  114. def setEnableWmd(v: Boolean): ChunkEntityResolverApproach.this.type

  115. def setExtramassPenalty(emp: Double): ChunkEntityResolverApproach.this.type

  116. final def setInputCols(value: String*): ChunkEntityResolverApproach.this.type

    Definition Classes
    HasInputAnnotationCols
  117. final def setInputCols(value: Array[String]): ChunkEntityResolverApproach.this.type

    Definition Classes
    HasInputAnnotationCols
  118. def setLabelCol(value: String): ChunkEntityResolverApproach.this.type

  119. def setLazyAnnotator(value: Boolean): ChunkEntityResolverApproach.this.type

    Definition Classes
    CanBeLazy
  120. def setMissAsEmpty(v: Boolean): ChunkEntityResolverApproach.this.type

  121. def setNeighbours(k: Int): ChunkEntityResolverApproach.this.type

  122. def setNormalizedCol(value: String): ChunkEntityResolverApproach.this.type

  123. final def setOutputCol(value: String): ChunkEntityResolverApproach.this.type

    Definition Classes
    HasOutputAnnotationCol
  124. def setPoolingStrategy(value: String): ChunkEntityResolverApproach.this.type

  125. def setThreshold(dist: Double): ChunkEntityResolverApproach.this.type

  126. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  127. val threshold: DoubleParam

    Threshold value for the aggregated distance.
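A threshold on an aggregated distance is typically used to discard candidates that are too far from the chunk. The sketch below shows that idea in plain Scala, assuming that candidates at or below the threshold are kept; the direction of the comparison, the `ThresholdSketch` object, and the `withinThreshold` method are all assumptions made for illustration.

```scala
object ThresholdSketch {
  // Keep only candidates whose aggregated distance does not exceed the threshold.
  def withinThreshold(distancesByCode: Map[String, Double], threshold: Double): Map[String, Double] =
    distancesByCode.filter { case (_, d) => d <= threshold }
}
```

When every candidate exceeds the threshold the result is empty, which is the kind of unmatched case a setting such as `missAsEmpty` would govern.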

  128. def toString(): String

    Definition Classes
    Identifiable → AnyRef → Any
  129. lazy val tokensColumnName: String

  130. def train(dataset: Dataset[_], recursivePipeline: Option[PipelineModel]): ChunkEntityResolverModel

    Returns the ChunkEntityResolverModel Transformer, which can be used to transform input datasets.

    The dataset provided to the fit method should have one chunk per row and contain the following columns: ChunkTokens, ChunkEmbeddings, ResolverLabel, [ResolverNormalized]

    The dataset should not exceed 100,000 data points, since searching a KD-tree of that size becomes impractical.

    This method is called inside the AnnotatorApproach's fit method.

    dataset

    a Dataset containing ChunkTokens, ChunkEmbeddings, ClassifierLabel, ResolverLabel, [ResolverNormalized]

    returns

    a trained ChunkEntityResolverModel

    Definition Classes
    ChunkEntityResolverApproach → AnnotatorApproach
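The training-data requirements described for `train` (one chunk per row, the listed columns, and the size limit) can be checked before fitting. The following is a minimal sketch in plain Scala that does not use Spark; `ResolverTrainingRow`, its field types, and `TrainingDataSketch.validate` are hypothetical and only mirror the column names mentioned above.

```scala
// Hypothetical row type mirroring the columns named in the train documentation.
final case class ResolverTrainingRow(
  chunkTokens: Seq[String],
  chunkEmbeddings: Array[Float],
  resolverLabel: String,
  resolverNormalized: Option[String] = None // optional column
)

object TrainingDataSketch {
  val MaxRows = 100000 // size limit stated in the documentation

  // Return a list of problems; an empty list means the data looks usable.
  // Only the first problem found in each row is reported.
  def validate(rows: Seq[ResolverTrainingRow]): Seq[String] = {
    val sizeIssue =
      if (rows.size > MaxRows) Seq(s"${rows.size} rows exceed the limit of $MaxRows")
      else Seq.empty[String]
    val rowIssues = rows.zipWithIndex.collect {
      case (r, i) if r.chunkTokens.isEmpty     => s"row $i has no tokens"
      case (r, i) if r.chunkEmbeddings.isEmpty => s"row $i has no embeddings"
      case (r, i) if r.resolverLabel.isEmpty   => s"row $i has no label"
    }
    sizeIssue ++ rowIssues
  }
}
```

Running a check like this on a sample of the data before calling `fit` can surface missing labels or empty embeddings early.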
  131. final def transformSchema(schema: StructType): StructType

    Definition Classes
    AnnotatorApproach → PipelineStage
  132. def transformSchema(schema: StructType, logging: Boolean): StructType

    Attributes
    protected
    Definition Classes
    PipelineStage
    Annotations
    @DeveloperApi()
  133. val uid: String

    Definition Classes
    ChunkEntityResolverApproach → Identifiable
  134. def validate(schema: StructType): Boolean

    Attributes
    protected
    Definition Classes
    AnnotatorApproach
  135. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  136. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  137. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  138. def write: MLWriter

    Definition Classes
    DefaultParamsWritable → MLWritable
