Class

com.johnsnowlabs.nlp.annotators.resolution

EnsembleEntityResolverApproach

Related Doc: package resolution

class EnsembleEntityResolverApproach extends AnnotatorApproach[EnsembleEntityResolverModel] with Licensed with EnsembleApproachClassifierParams with EnsembleModelResolverParams with EnsembleApproachResolverParams with StringFunctions

Trains an EnsembleEntityResolverModel from two input annotations of types TOKEN and WORD_EMBEDDINGS, produced by the ChunkTokenizer and ChunkEmbeddings annotators

The returned EnsembleEntityResolverModel consists of two layers:
  1. A TFIDF + OvrLogRegClassifier on top of the TOKEN annotations
  2. A set of ChunkEntityResolverModels, one for each distinct class predicted by the first layer

This approach allows Spark NLP's Entity Resolution architecture to scale to a few million rows (codes)
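
The two-layer design described above can be pictured as a routing step followed by a per-class lookup. The following is only a minimal sketch in plain Scala, not the library's implementation: `classify`, `resolvers`, the class names and the codes are all hypothetical stand-ins, and a trivial keyword rule replaces the TFIDF + one-vs-rest classifier.

```scala
object EnsembleRoutingSketch {
  // Hypothetical stand-in for one per-class resolver: chunk tokens in, code out.
  type Resolver = Seq[String] => String

  // Layer 1 stand-in: a keyword rule instead of the TFIDF + one-vs-rest classifier.
  def classify(tokens: Seq[String]): String =
    if (tokens.exists(_.toLowerCase == "fracture")) "injury" else "disease"

  // Layer 2: one resolver per class known to layer 1 (the codes are made up).
  val resolvers: Map[String, Resolver] = Map(
    "injury"  -> ((_: Seq[String]) => "CODE-INJ-1"),
    "disease" -> ((_: Seq[String]) => "CODE-DIS-1")
  )

  // Route a chunk: classify first, then resolve within the predicted class only.
  def resolve(tokens: Seq[String]): Option[String] =
    resolvers.get(classify(tokens)).map(resolver => resolver(tokens))
}
```

Because each resolver only searches the codes of its own class, the search space per chunk stays far smaller than the full code set, which is presumably what lets the ensemble scale.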

Linear Supertypes
StringFunctions, EnsembleApproachResolverParams, EnsembleModelResolverParams, EnsembleApproachClassifierParams, Licensed, AnnotatorApproach[EnsembleEntityResolverModel], CanBeLazy, DefaultParamsWritable, MLWritable, HasOutputAnnotatorType, HasOutputAnnotationCol, HasInputAnnotationCols, Estimator[EnsembleEntityResolverModel], PipelineStage, Logging, Params, Serializable, Serializable, Identifiable, AnyRef, Any

Instance Constructors

  1. new EnsembleEntityResolverApproach()

  2. new EnsembleEntityResolverApproach(uid: String)


Type Members

  1. type AnnotatorType = String

    Definition Classes
    HasOutputAnnotatorType

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def $[T](param: Param[T]): T

    Attributes
    protected
    Definition Classes
    Params
  4. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  5. def _fit(dataset: Dataset[_], recursiveStages: Option[PipelineModel]): EnsembleEntityResolverModel

    Attributes
    protected
    Definition Classes
    AnnotatorApproach
  6. val alternatives: IntParam

    number of results to return in the metadata after sorting by last distance calculated

    Definition Classes
    EnsembleModelResolverParams
  7. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  8. def beforeTraining(spark: SparkSession): Unit

    Definition Classes
    AnnotatorApproach
  9. final def checkSchema(schema: StructType, inputAnnotatorType: String): Boolean

    Attributes
    protected
    Definition Classes
    HasInputAnnotationCols
  10. val classifierLabelCol: Param[String]

    column with the value to predict with the classifier

    Definition Classes
    EnsembleApproachClassifierParams
  11. lazy val classifierLabelEncodedCol: String

  12. lazy val classifierLabelPredictedCol: String

  13. lazy val classifierLabelRawCol: String

  14. val classifierLabels: StringArrayParam

    array to output the label in the original form

    Definition Classes
    EnsembleApproachClassifierParams
  15. final def clear(param: Param[_]): EnsembleEntityResolverApproach.this.type

    Definition Classes
    Params
  16. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  17. final def copy(extra: ParamMap): Estimator[EnsembleEntityResolverModel]

    Definition Classes
    AnnotatorApproach → Estimator → PipelineStage → Params
  18. def copyValues[T <: Params](to: T, extra: ParamMap): T

    Attributes
    protected
    Definition Classes
    Params
  19. final def defaultCopy[T <: Params](extra: ParamMap): T

    Attributes
    protected
    Definition Classes
    Params
  20. val description: String

    Definition Classes
    EnsembleEntityResolverApproach → AnnotatorApproach
  21. val distanceFunction: Param[String]

    distance function to use for KNN: 'EUCLIDEAN' or 'COSINE'

    Definition Classes
    EnsembleModelResolverParams
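
To make the two `distanceFunction` options concrete, here is a minimal plain-Scala sketch of Euclidean and cosine distance between two embedding vectors. The object and method names are hypothetical, and the library's own KNN implementation may normalise or optimise differently; only the option names 'EUCLIDEAN' and 'COSINE' come from this page.

```scala
object VectorDistanceSketch {
  // Straight-line distance between two vectors of equal dimension.
  def euclidean(a: Array[Double], b: Array[Double]): Double = {
    require(a.length == b.length, "vectors must have the same dimension")
    math.sqrt(a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum)
  }

  // One minus the cosine similarity: 0 for same direction, 1 for orthogonal vectors.
  def cosine(a: Array[Double], b: Array[Double]): Double = {
    require(a.length == b.length, "vectors must have the same dimension")
    val dot = a.zip(b).map { case (x, y) => x * y }.sum
    val norms = math.sqrt(a.map(x => x * x).sum) * math.sqrt(b.map(x => x * x).sum)
    require(norms > 0.0, "cosine distance is undefined for zero vectors")
    1.0 - dot / norms
  }

  // Dispatch on the option name, mirroring the two documented values.
  def distance(name: String, a: Array[Double], b: Array[Double]): Double =
    name.toUpperCase match {
      case "EUCLIDEAN" => euclidean(a, b)
      case "COSINE"    => cosine(a, b)
      case other       => throw new IllegalArgumentException(s"unsupported distance function: $other")
    }
}
```

Cosine distance ignores vector magnitude, so it tends to suit embeddings whose length carries little meaning, while Euclidean distance is sensitive to it.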
  22. val distanceWeights: DoubleArrayParam

    distance weights to apply before pooling: [WMD, TFIDF, Jaccard, SorensenDice, JaroWinkler, Levenshtein]

    Definition Classes
    EnsembleModelResolverParams
  23. val enableJaccard: BooleanParam

    whether or not to use Jaccard token distance

    Definition Classes
    EnsembleModelResolverParams
  24. val enableJaroWinkler: BooleanParam

    whether or not to use Jaro-Winkler character distance

    Definition Classes
    EnsembleModelResolverParams
  25. val enableLevenshtein: BooleanParam

    whether or not to use Levenshtein character distance

    Definition Classes
    EnsembleModelResolverParams
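
For readers unfamiliar with the Levenshtein character distance that `enableLevenshtein` toggles, the textbook dynamic-programming formulation is sketched below in plain Scala. This is the standard algorithm under a hypothetical name, not necessarily how the library computes or normalises the value.

```scala
object LevenshteinSketch {
  // Minimum number of single-character insertions, deletions and substitutions
  // needed to turn s into t, using two rolling rows of the DP table.
  def distance(s: String, t: String): Int = {
    // prev(j) = distance between the already processed prefix of s and t.take(j)
    var prev: Array[Int] = (0 to t.length).toArray
    for (i <- 1 to s.length) {
      val curr = new Array[Int](t.length + 1)
      curr(0) = i // deleting the first i characters of s
      for (j <- 1 to t.length) {
        val cost = if (s(i - 1) == t(j - 1)) 0 else 1
        curr(j) = math.min(math.min(curr(j - 1) + 1, prev(j) + 1), prev(j - 1) + cost)
      }
      prev = curr
    }
    prev(t.length)
  }
}
```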
  26. val enableSorensenDice: BooleanParam

    whether or not to use Sorensen-Dice token distance

    Definition Classes
    EnsembleModelResolverParams
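
The Jaccard and Sorensen-Dice token distances controlled by `enableJaccard` and `enableSorensenDice` both compare the token sets of two chunks. A minimal plain-Scala sketch of the usual set-based definitions follows; the object name is hypothetical, and treating two empty sets as distance 0 is an assumption of this sketch rather than documented library behaviour.

```scala
object TokenSetDistanceSketch {
  // Jaccard distance: 1 - |A ∩ B| / |A ∪ B|
  def jaccard(a: Set[String], b: Set[String]): Double =
    if (a.isEmpty && b.isEmpty) 0.0 // assumption: two empty sets are identical
    else 1.0 - (a intersect b).size.toDouble / (a union b).size

  // Sorensen-Dice distance: 1 - 2|A ∩ B| / (|A| + |B|)
  def sorensenDice(a: Set[String], b: Set[String]): Double =
    if (a.isEmpty && b.isEmpty) 0.0 // assumption: two empty sets are identical
    else 1.0 - 2.0 * (a intersect b).size / (a.size + b.size)
}
```

For the same pair of sets, Sorensen-Dice never exceeds Jaccard, because it weights the shared tokens twice.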
  27. val enableTfidf: BooleanParam

    whether or not to use TFIDF token distance

    Definition Classes
    EnsembleModelResolverParams
  28. val enableWmd: BooleanParam

    whether or not to use WMD token distance

    Definition Classes
    EnsembleModelResolverParams
  29. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  30. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  31. def explainParam(param: Param[_]): String

    Definition Classes
    Params
  32. def explainParams(): String

    Definition Classes
    Params
  33. final def extractParamMap(): ParamMap

    Definition Classes
    Params
  34. final def extractParamMap(extra: ParamMap): ParamMap

    Definition Classes
    Params
  35. val extramassPenalty: DoubleParam

    penalty for extra words in the knowledge base match during WMD calculation

    Definition Classes
    EnsembleModelResolverParams
  36. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  37. final def fit(dataset: Dataset[_]): EnsembleEntityResolverModel

    Definition Classes
    AnnotatorApproach → Estimator
  38. def fit(dataset: Dataset[_], paramMaps: Array[ParamMap]): Seq[EnsembleEntityResolverModel]

    Definition Classes
    Estimator
    Annotations
    @Since( "2.0.0" )
  39. def fit(dataset: Dataset[_], paramMap: ParamMap): EnsembleEntityResolverModel

    Definition Classes
    Estimator
    Annotations
    @Since( "2.0.0" )
  40. def fit(dataset: Dataset[_], firstParamPair: ParamPair[_], otherParamPairs: ParamPair[_]*): EnsembleEntityResolverModel

    Definition Classes
    Estimator
    Annotations
    @Since( "2.0.0" ) @varargs()
  41. val fitIntercept: Param[Boolean]

    whether to fit an intercept term; default is true

    Definition Classes
    EnsembleApproachClassifierParams
  42. final def get[T](param: Param[T]): Option[T]

    Definition Classes
    Params
  43. def getAlternatives: Int

    Definition Classes
    EnsembleModelResolverParams
  44. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  45. def getClassifierLabelCol: String

  46. def getClassifierLabels: Array[String]

  47. final def getDefault[T](param: Param[T]): Option[T]

    Definition Classes
    Params
  48. def getDistanceFunction: String

    Definition Classes
    EnsembleModelResolverParams
  49. def getDistanceWeights: Array[Double]

    Definition Classes
    EnsembleModelResolverParams
  50. def getEnableJaccard: Boolean

    Definition Classes
    EnsembleModelResolverParams
  51. def getEnableJaroWinkler: Boolean

    Definition Classes
    EnsembleModelResolverParams
  52. def getEnableLevenshtein: Boolean

    Definition Classes
    EnsembleModelResolverParams
  53. def getEnableSorensenDice: Boolean

    Definition Classes
    EnsembleModelResolverParams
  54. def getEnableTfidf: Boolean

    Definition Classes
    EnsembleModelResolverParams
  55. def getEnableWmd: Boolean

    Definition Classes
    EnsembleModelResolverParams
  56. def getExtramassPenalty: Double

    Definition Classes
    EnsembleModelResolverParams
  57. def getFitIntercept: Boolean

  58. def getIdfModelPath: String

  59. def getInputCols: Array[String]

    Definition Classes
    HasInputAnnotationCols
  60. def getLazyAnnotator: Boolean

    Definition Classes
    CanBeLazy
  61. def getMaxIter: Int

  62. def getMergeChunks: Boolean

  63. def getMissAsEmpty: Boolean

    Definition Classes
    EnsembleModelResolverParams
  64. def getNeighbours: Int

    Definition Classes
    EnsembleModelResolverParams
  65. def getNormalizedCol: String

  66. final def getOrDefault[T](param: Param[T]): T

    Definition Classes
    Params
  67. final def getOutputCol: String

    Definition Classes
    HasOutputAnnotationCol
  68. def getOvrModelPath: String

  69. def getParam(paramName: String): Param[Any]

    Definition Classes
    Params
  70. def getPoolingStrategy: String

    Definition Classes
    EnsembleModelResolverParams
  71. def getResolverLabelCol: String

  72. def getThreshold: Double

    Definition Classes
    EnsembleModelResolverParams
  73. def getTol: Double

  74. final def hasDefault[T](param: Param[T]): Boolean

    Definition Classes
    Params
  75. def hasParam(paramName: String): Boolean

    Definition Classes
    Params
  76. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  77. lazy val idf: IDF

  78. val idfModelPath: Param[String]

    specify the vectorization model if it has already been trained

    Definition Classes
    EnsembleApproachClassifierParams
  79. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean

    Attributes
    protected
    Definition Classes
    Logging
  80. def initializeLogIfNecessary(isInterpreter: Boolean): Unit

    Attributes
    protected
    Definition Classes
    Logging
  81. val inputAnnotatorTypes: Array[String]

    inputAnnotatorTypes are TOKEN coming from ChunkTokenizer and WORD_EMBEDDINGS coming from ChunkEmbeddings

    Definition Classes
    EnsembleEntityResolverApproach → HasInputAnnotationCols
  82. final val inputCols: StringArrayParam

    Attributes
    protected
    Definition Classes
    HasInputAnnotationCols
  83. final def isDefined(param: Param[_]): Boolean

    Definition Classes
    Params
  84. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  85. final def isSet(param: Param[_]): Boolean

    Definition Classes
    Params
  86. def isTraceEnabled(): Boolean

    Attributes
    protected
    Definition Classes
    Logging
  87. def label2path(label: String): String

    Convenience function to use when naming folders after a string that may not comply with filesystem requirements

    label

    string with special characters to transform into lowercase letters and numbers

    returns

    lowercase letters and numbers replacing special characters with _

    Definition Classes
    StringFunctions
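
The behaviour described for `label2path` (lowercase letters and numbers, with special characters replaced by `_`) can be approximated in one line. This is a sketch under a hypothetical object name; the library's actual rules, for example whether runs of special characters collapse into a single underscore, are not stated on this page.

```scala
object Label2PathSketch {
  // Lowercase the label, then replace every character outside a-z and 0-9 with '_'.
  def label2path(label: String): String =
    label.toLowerCase.replaceAll("[^a-z0-9]", "_")
}
```

A label such as `ICD-10 CM/2020` would then become a name that is safe to use as a folder on common filesystems.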
  88. val lazyAnnotator: BooleanParam

    Definition Classes
    CanBeLazy
  89. def log: Logger

    Attributes
    protected
    Definition Classes
    Logging
  90. def logDebug(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  91. def logDebug(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  92. def logError(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  93. def logError(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  94. def logInfo(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  95. def logInfo(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  96. def logName: String

    Attributes
    protected
    Definition Classes
    Logging
  97. def logTrace(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  98. def logTrace(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  99. def logWarning(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  100. def logWarning(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  101. lazy val lr: LogisticRegression

  102. val maxIter: Param[Int]

    maximum number of iterations

    Definition Classes
    EnsembleApproachClassifierParams
  103. val mergeChunks: BooleanParam

    whether to merge all chunks in a document or not

    Definition Classes
    EnsembleApproachClassifierParams
  104. val missAsEmpty: BooleanParam

    whether or not to return an empty annotation on unmatched chunks

    Definition Classes
    EnsembleModelResolverParams
  105. def msgHelper(schema: StructType): String

    Attributes
    protected
    Definition Classes
    HasInputAnnotationCols
  106. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  107. val neighbours: IntParam

    number of neighbours to consider in the KNN query to calculate WMD

    Definition Classes
    EnsembleModelResolverParams
  108. val normalizedCol: Param[String]

    column name for the original, normalized description

    Definition Classes
    EnsembleApproachResolverParams
  109. final def notify(): Unit

    Definition Classes
    AnyRef
  110. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  111. def onTrained(model: EnsembleEntityResolverModel, spark: SparkSession): Unit

    Definition Classes
    AnnotatorApproach
  112. val outputAnnotatorType: AnnotatorType

    outputAnnotatorType is ENTITY

    Definition Classes
    EnsembleEntityResolverApproach → HasOutputAnnotatorType
  113. final val outputCol: Param[String]

    Attributes
    protected
    Definition Classes
    HasOutputAnnotationCol
  114. lazy val ovr: OneVsRest

  115. val ovrModelPath: Param[String]

    specify the classification model if it has already been trained

    Definition Classes
    EnsembleApproachClassifierParams
  116. lazy val params: Array[Param[_]]

    Definition Classes
    Params
  117. lazy val partitionCol: String

  118. val poolingStrategy: Param[String]

    pooling strategy to aggregate distances: AVERAGE or SUM

    Definition Classes
    EnsembleModelResolverParams
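
How `distanceWeights` and `poolingStrategy` might interact can be illustrated with a small plain-Scala sketch: each enabled distance is multiplied by its weight, and the weighted values are then summed or averaged. The object and method names are hypothetical, and dividing by the number of distances (rather than by the sum of weights) for AVERAGE is an assumption of this sketch, not something this page specifies.

```scala
object DistancePoolingSketch {
  // Apply one weight per distance, then aggregate with the chosen strategy.
  def pool(distances: Array[Double], weights: Array[Double], strategy: String): Double = {
    require(distances.nonEmpty, "at least one distance is required")
    require(distances.length == weights.length, "one weight per distance is required")
    val weighted = distances.zip(weights).map { case (d, w) => d * w }
    strategy.toUpperCase match {
      case "SUM"     => weighted.sum
      case "AVERAGE" => weighted.sum / weighted.length // assumption: plain mean of weighted values
      case other     => throw new IllegalArgumentException(s"unsupported pooling strategy: $other")
    }
  }
}
```

The pooled value is the kind of aggregated distance that a cut-off such as `threshold` would then be compared against.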
  119. val resolverLabelCol: Param[String]

    column name for the value we are trying to resolve

    Definition Classes
    EnsembleApproachResolverParams
  120. lazy val resolverLabelRawCol: String

  121. lazy val resolverNormalizedRawCol: String

  122. def save(path: String): Unit

    Definition Classes
    MLWritable
    Annotations
    @Since( "1.6.0" ) @throws( ... )
  123. final def set(paramPair: ParamPair[_]): EnsembleEntityResolverApproach.this.type

    Attributes
    protected
    Definition Classes
    Params
  124. final def set(param: String, value: Any): EnsembleEntityResolverApproach.this.type

    Attributes
    protected
    Definition Classes
    Params
  125. final def set[T](param: Param[T], value: T): EnsembleEntityResolverApproach.this.type

    Definition Classes
    Params
  126. def setAlternatives(a: Int): EnsembleEntityResolverApproach.this.type

    Definition Classes
    EnsembleModelResolverParams
  127. def setClassifierLabelCol(value: String): EnsembleEntityResolverApproach.this.type

  128. def setClassifierLabels(value: Array[String]): EnsembleEntityResolverApproach.this.type

  129. final def setDefault(paramPairs: ParamPair[_]*): EnsembleEntityResolverApproach.this.type

    Attributes
    protected
    Definition Classes
    Params
  130. final def setDefault[T](param: Param[T], value: T): EnsembleEntityResolverApproach.this.type

    Attributes
    protected
    Definition Classes
    Params
  131. def setDistanceFunction(value: String): EnsembleEntityResolverApproach.this.type

    Definition Classes
    EnsembleModelResolverParams
  132. def setDistanceWeights(v: Array[Double]): EnsembleEntityResolverApproach.this.type

    Definition Classes
    EnsembleModelResolverParams
  133. def setEnableJaccard(v: Boolean): EnsembleEntityResolverApproach.this.type

    Definition Classes
    EnsembleModelResolverParams
  134. def setEnableJaroWinkler(v: Boolean): EnsembleEntityResolverApproach.this.type

    Definition Classes
    EnsembleModelResolverParams
  135. def setEnableLevenshtein(v: Boolean): EnsembleEntityResolverApproach.this.type

    Definition Classes
    EnsembleModelResolverParams
  136. def setEnableSorensenDice(v: Boolean): EnsembleEntityResolverApproach.this.type

    Definition Classes
    EnsembleModelResolverParams
  137. def setEnableTfidf(v: Boolean): EnsembleEntityResolverApproach.this.type

    Definition Classes
    EnsembleModelResolverParams
  138. def setEnableWmd(v: Boolean): EnsembleEntityResolverApproach.this.type

    Definition Classes
    EnsembleModelResolverParams
  139. def setExtramassPenalty(emp: Double): EnsembleEntityResolverApproach.this.type

    Definition Classes
    EnsembleModelResolverParams
  140. def setFitIntercept(value: Boolean): EnsembleEntityResolverApproach.this.type

  141. def setIdfModelPath(value: String): EnsembleEntityResolverApproach.this.type

  142. final def setInputCols(value: String*): EnsembleEntityResolverApproach.this.type

    Definition Classes
    HasInputAnnotationCols
  143. final def setInputCols(value: Array[String]): EnsembleEntityResolverApproach.this.type

    Definition Classes
    HasInputAnnotationCols
  144. def setLazyAnnotator(value: Boolean): EnsembleEntityResolverApproach.this.type

    Definition Classes
    CanBeLazy
  145. def setMaxIter(value: Int): EnsembleEntityResolverApproach.this.type

  146. def setMergeChunks(value: Boolean): EnsembleEntityResolverApproach.this.type

  147. def setMissAsEmpty(v: Boolean): EnsembleEntityResolverApproach.this.type

    Definition Classes
    EnsembleModelResolverParams
  148. def setNeighbours(k: Int): EnsembleEntityResolverApproach.this.type

    Definition Classes
    EnsembleModelResolverParams
  149. def setNormalizedCol(value: String): EnsembleEntityResolverApproach.this.type

  150. final def setOutputCol(value: String): EnsembleEntityResolverApproach.this.type

    Definition Classes
    HasOutputAnnotationCol
  151. def setOvrModelPath(value: String): EnsembleEntityResolverApproach.this.type

  152. def setPoolingStrategy(value: String): EnsembleEntityResolverApproach.this.type

    Definition Classes
    EnsembleModelResolverParams
  153. def setResolverLabelCol(value: String): EnsembleEntityResolverApproach.this.type

  154. def setThreshold(dist: Double): EnsembleEntityResolverApproach.this.type

    Definition Classes
    EnsembleModelResolverParams
  155. def setTol(value: Double): EnsembleEntityResolverApproach.this.type

  156. lazy val sidx: StringIndexer

  157. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  158. lazy val tf: HashingTF

  159. lazy val tfCol: String

  160. lazy val tfidfCol: String

  161. val threshold: DoubleParam

    threshold value for the aggregated distance

    Definition Classes
    EnsembleModelResolverParams
  162. def toString(): String

    Definition Classes
    Identifiable → AnyRef → Any
  163. lazy val tokenAnnotationCol: String

  164. lazy val tokenRawCol: String

  165. val tol: Param[Double]

    convergence tolerance after each iteration

    Definition Classes
    EnsembleApproachClassifierParams
  166. def train(dataset: Dataset[_], recursivePipeline: Option[PipelineModel]): EnsembleEntityResolverModel

    Returns the EnsembleEntityResolverModel Transformer, which can be used to transform input datasets

    The dataset provided to the fit method should have one chunk per row and contain the following columns: ChunkTokens, ChunkEmbeddings, ClassifierLabel, ResolverLabel, [ResolverNormalized]

    Each ClassifierLabel should map to no more than 100,000 ResolverLabels, since searching a KD-tree that large becomes impractical

    This method is called inside the AnnotatorApproach's fit method

    dataset

    a Dataset containing ChunkTokens, ChunkEmbeddings, ClassifierLabel, ResolverLabel, [ResolverNormalized]

    returns

    a trained EnsembleEntityResolverModel

    Definition Classes
    EnsembleEntityResolverApproach → AnnotatorApproach
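
The cardinality guideline in the `train` description (no more than 100,000 ResolverLabels per ClassifierLabel) can be checked on the training data before fitting. The sketch below works on an in-memory sequence of (ClassifierLabel, ResolverLabel) pairs in plain Scala; the object and method names are hypothetical, and on a real Spark Dataset the same idea would be expressed as a group-by with a distinct count.

```scala
object LabelCardinalitySketch {
  // Limit taken from the guideline in the train description.
  val MaxResolverLabelsPerClass = 100000

  // Returns every classifier label whose number of distinct resolver labels
  // exceeds the limit, together with that count.
  def oversizedClasses(
      rows: Seq[(String, String)],
      limit: Int = MaxResolverLabelsPerClass
  ): Map[String, Int] =
    rows
      .groupBy(_._1)
      .map { case (cls, pairs) => cls -> pairs.map(_._2).distinct.size }
      .filter { case (_, n) => n > limit }
}
```

An empty result means every class stays within the limit; a non-empty one suggests splitting the offending classes into finer-grained ClassifierLabels.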
  167. final def transformSchema(schema: StructType): StructType

    Definition Classes
    AnnotatorApproach → PipelineStage
  168. def transformSchema(schema: StructType, logging: Boolean): StructType

    Attributes
    protected
    Definition Classes
    PipelineStage
    Annotations
    @DeveloperApi()
  169. val uid: String

    Definition Classes
    EnsembleEntityResolverApproach → Identifiable
  170. def validate(schema: StructType): Boolean

    Attributes
    protected
    Definition Classes
    AnnotatorApproach
  171. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  172. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  173. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  174. def write: MLWriter

    Definition Classes
    DefaultParamsWritable → MLWritable
