com.johnsnowlabs.nlp.annotators.embeddings

EntityChunkEmbeddings

Companion object EntityChunkEmbeddings

class EntityChunkEmbeddings extends BertSentenceEmbeddings with CheckLicense

Entity Chunk Embeddings uses BERT Sentence embeddings to compute a weighted average vector representation of related entity chunks. The input of the model consists of chunks of recognized named entities. One or more entities are selected as target entities, and for each of them a list of related entities is specified (if the list is empty, all other entities are assumed to be related).

The model looks for chunks of the target entities and then tries to pair each target entity (e.g. DRUG) with other related entities (e.g. DOSAGE, STRENGTH, FORM, etc.). A target entity is paired with a related entity when both appear in the same sentence and the maximal syntactic distance is below a predefined threshold. The relationship between target and related entities is one-to-many: if there are multiple instances of the same target entity (e.g. DRUG) within a sentence, the model maps a related entity (e.g. DOSAGE) to at most one of those instances. For example, given the sentence "The patient was given 125 mg of paracetamol and metformin", the model pairs "125 mg" with "paracetamol" but not with "metformin".

The output of the model is an average embedding of the chunks of each target entity and its related entities. A particular weight can be specified for each entity type. An entity can be defined both as a target entity and as a related entity for some other target entity. For example, we may want to compute the embeddings of SYMPTOMs and their related entities, as well as the embeddings of DRUGs and their related entities, one of which is also SYMPTOM. In such cases, the TARGET_ENTITY:RELATED_ENTITY notation specifies the weight of a related entity (e.g. "DRUG:SYMPTOM" sets the weight of SYMPTOM when it appears as a related entity of the target entity DRUG). The relative weights of entities for particular entity chunk embeddings are available in the annotations metadata.
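The pairing rule described above (same sentence, distance within a threshold, each related chunk attached to at most one target) can be illustrated with a small standalone sketch. This is not the library's implementation: the Chunk case class, the pair function and the caller-supplied distance function are hypothetical names introduced here, and choosing the nearest eligible target is an assumption, since the documentation only states that at most one target instance is chosen.

```scala
// Illustrative sketch only; not how EntityChunkEmbeddings is implemented.
object PairingSketch {
  // Hypothetical, simplified view of a NER chunk.
  final case class Chunk(text: String, entity: String, sentence: Int)

  /** Assigns each related chunk to at most one target chunk: the closest one
    * in the same sentence whose distance does not exceed maxDistance.
    * `distance` is a caller-supplied stand-in for syntactic distance. */
  def pair(
      chunks: Seq[Chunk],
      target: String,
      related: Set[String], // empty means "every other entity type"
      maxDistance: Int,
      distance: (Chunk, Chunk) => Int): Map[Chunk, Seq[Chunk]] = {
    val targets = chunks.filter(_.entity.equalsIgnoreCase(target))
    val candidates = chunks.filter { c =>
      !c.entity.equalsIgnoreCase(target) &&
      (related.isEmpty || related.exists(_.equalsIgnoreCase(c.entity)))
    }
    // For every candidate keep only the single nearest eligible target (assumption).
    val assignments = for {
      c <- candidates
      eligible = targets.filter(t => t.sentence == c.sentence && distance(t, c) <= maxDistance)
      if eligible.nonEmpty
    } yield (eligible.minBy(t => distance(t, c)), c)
    // Targets without any paired chunk map to an empty sequence.
    targets.map(t => t -> assignments.filter(_._1 == t).map(_._2)).toMap
  }
}
```

With the sentence from the description and token offsets standing in for syntactic distance, "125 mg" ends up attached to "paracetamol" only, while "metformin" keeps an empty list of related chunks.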

This model is a subclass of BertSentenceEmbeddings and shares all parameters with it. It can load any pretrained BertSentenceEmbeddings model. Available models can be found at Models Hub.

Two input columns are required: chunks and dependency annotations.

val embeddings = EntityChunkEmbeddings.pretrained()
  .setInputCols("ner_chunk", "dependencies")
  .setOutputCol("entity_chunk_embeddings")

The default model is "sbiobert_base_cased_mli" from "clinical/models".

Sources:

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks

Example

import spark.implicits._
import com.johnsnowlabs.nlp.base.DocumentAssembler
import com.johnsnowlabs.nlp.annotator.SentenceDetector
import com.johnsnowlabs.nlp.annotators.Tokenizer
import com.johnsnowlabs.nlp.embeddings.WordEmbeddingsModel
import com.johnsnowlabs.nlp.annotators.parser.dep.DependencyParserModel
import com.johnsnowlabs.nlp.annotators.pos.perceptron.PerceptronModel
import com.johnsnowlabs.nlp.annotators.ner.{MedicalNerModel, NerConverterInternal}
import com.johnsnowlabs.nlp.annotators.embeddings.EntityChunkEmbeddings
import org.apache.spark.ml.Pipeline

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val sentenceDetector = new SentenceDetector()
  .setInputCols("document")
  .setOutputCol("sentence")

val tokenizer = new Tokenizer()
  .setInputCols("sentence")
  .setOutputCol("tokens")

val wordEmbeddings = WordEmbeddingsModel
  .pretrained("embeddings_clinical", "en", "clinical/models")
  .setInputCols(Array("sentence", "tokens"))
  .setOutputCol("word_embeddings")

val nerModel = MedicalNerModel
  .pretrained("ner_posology_large", "en", "clinical/models")
  .setInputCols(Array("sentence", "tokens", "word_embeddings"))
  .setOutputCol("ner")

// Groups the token-level NER tags into entity chunks (DRUG, STRENGTH, ...).
val nerConverter = new NerConverterInternal()
  .setInputCols("sentence", "tokens", "ner")
  .setOutputCol("ner_chunk")

val posTagger = PerceptronModel
  .pretrained("pos_clinical", "en", "clinical/models")
  .setInputCols("sentence", "tokens")
  .setOutputCol("pos_tags")

val dependencyParser = DependencyParserModel
  .pretrained("dependency_conllu", "en")
  .setInputCols(Array("sentence", "pos_tags", "tokens"))
  .setOutputCol("dependencies")

// DRUG is the only target entity; the empty list means all other entities are related.
val drugChunkEmbeddings = EntityChunkEmbeddings
  .pretrained("sbiobert_base_cased_mli", "en", "clinical/models")
  .setInputCols(Array("ner_chunk", "dependencies"))
  .setOutputCol("drug_chunk_embeddings")
  .setMaxSyntacticDistance(3)
  .setTargetEntities(Map("DRUG" -> List()))
  .setEntityWeights(Map[String, Float]("DRUG" -> 0.8f, "STRENGTH" -> 0.2f, "DOSAGE" -> 0.2f, "FORM" -> 0.5f))

val pipeline = new Pipeline()
  .setStages(Array(
    documentAssembler,
    sentenceDetector,
    tokenizer,
    wordEmbeddings,
    nerModel,
    nerConverter,
    posTagger,
    dependencyParser,
    drugChunkEmbeddings))

val sampleText = "The patient was given metformin 125 mg, 250 mg of coumadin and then one pill paracetamol."

// All stages are pretrained or rule-based, so an empty dataset is enough to fit the pipeline.
val emptyDataset = Seq("").toDS.toDF("text")
val testDataset = Seq(sampleText).toDS.toDF("text")
val result = pipeline.fit(emptyDataset).transform(testDataset)

// Show each chunk with the first five components of its embedding.
result
  .selectExpr("explode(drug_chunk_embeddings) AS drug_chunk")
  .selectExpr("drug_chunk.result", "slice(drug_chunk.embeddings, 1, 5) AS drugEmbedding")
  .show(truncate = false)

+--------------------+-------------------------------------------------------------+
|result              |drugEmbedding                                                |
+--------------------+-------------------------------------------------------------+
|metformin 125 mg    |[-0.267413, 0.07614058, -0.5620966, 0.83838946, 0.8911504]   |
|250 mg coumadin     |[0.22319649, -0.07094894, -0.6885556, 0.79176235, 0.82672405]|
|one pill paracetamol|[-0.10939768, -0.29242, -0.3574444, 0.3981813, 0.79609615]   |
+--------------------+-------------------------------------------------------------+

See also: BertEmbeddings for token-level embeddings
BertSentenceEmbeddings for sentence-level embeddings
Annotators Main Page for a list of transformer based embeddings

Linear Supertypes

CheckLicense, BertSentenceEmbeddings, HasEngine, HasCaseSensitiveProperties, HasStorageRef, HasEmbeddingsProperties, HasProtectedParams, WriteOnnxModel, WriteOpenvinoModel, WriteTensorflowModel, HasBatchedAnnotate[BertSentenceEmbeddings], AnnotatorModel[BertSentenceEmbeddings], CanBeLazy, RawAnnotator[BertSentenceEmbeddings], HasOutputAnnotationCol, HasInputAnnotationCols, HasOutputAnnotatorType, ParamsAndFeaturesWritable, HasFeatures, DefaultParamsWritable, MLWritable, Model[BertSentenceEmbeddings], Transformer, PipelineStage, Logging, Params, Serializable, Serializable, Identifiable, AnyRef, Any


Instance Constructors

new EntityChunkEmbeddings()
new EntityChunkEmbeddings(uid: String)

Type Members

type AnnotationContent = Seq[Row]

Attributes
protected
Definition Classes
AnnotatorModel
type AnnotatorType = String

Definition Classes
HasOutputAnnotatorType
implicit class ProtectedParam[T] extends Param[T]

Definition Classes
HasProtectedParams

Value Members

final def !=(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def ##(): Int

Definition Classes
AnyRef → Any
final def $[T](param: Param[T]): T

Attributes
protected
Definition Classes
Params
def $$[T](feature: StructFeature[T]): T

Attributes
protected
Definition Classes
HasFeatures
def $$[K, V](feature: MapFeature[K, V]): Map[K, V]

Attributes
protected
Definition Classes
HasFeatures
def $$[T](feature: SetFeature[T]): Set[T]

Attributes
protected
Definition Classes
HasFeatures
def $$[T](feature: ArrayFeature[T]): Array[T]

Attributes
protected
Definition Classes
HasFeatures
final def ==(arg0: Any): Boolean

Definition Classes
AnyRef → Any
def _transform(dataset: Dataset[_], recursivePipeline: Option[PipelineModel]): DataFrame

Attributes
protected
Definition Classes
AnnotatorModel
def afterAnnotate(dataset: DataFrame): DataFrame

Attributes
protected
Definition Classes
BertSentenceEmbeddings → AnnotatorModel
def areSameTargetEntityAnnotations(targetEntityAnno1: Annotation, targetEntityAnno2: Annotation): Boolean

Attributes
protected
final def asInstanceOf[T0]: T0

Definition Classes
Any
def averageEmbeddings(embeddings: Array[Array[Float]], weights: Array[Float]): Array[Float]

Attributes
protected
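The averageEmbeddings member is listed without a description. To make the idea of a weighted average of chunk vectors concrete, here is a minimal standalone sketch with the same parameter shapes. It is an illustration under stated assumptions, not the library's code: the object and function names are hypothetical, and dividing by the sum of the weights is an assumption about how the average is normalised.

```scala
// Illustrative sketch only; normalising by the weight sum is an assumption.
object WeightedAverageSketch {
  /** Weighted mean of equally sized vectors, divided by the sum of the weights. */
  def weightedAverage(embeddings: Array[Array[Float]], weights: Array[Float]): Array[Float] = {
    require(embeddings.nonEmpty, "at least one embedding is required")
    require(embeddings.length == weights.length, "one weight per embedding")
    val dim = embeddings.head.length
    require(embeddings.forall(_.length == dim), "all embeddings must share one dimension")
    val total = weights.sum
    require(total > 0f, "weights must sum to a positive value")
    Array.tabulate(dim) { i =>
      // Accumulate the weighted i-th component over all embeddings.
      var acc = 0f
      var k = 0
      while (k < embeddings.length) {
        acc += embeddings(k)(i) * weights(k)
        k += 1
      }
      acc / total
    }
  }
}
```

Under this scheme a chunk with weight 0 contributes nothing to the result, which matches the documented effect of leaving an entity out of a non-empty entityWeights map.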
def batchAnnotate(batchedAnnotations: Seq[Array[Annotation]]): Seq[Seq[Annotation]]

Definition Classes
EntityChunkEmbeddings → BertSentenceEmbeddings → HasBatchedAnnotate
def batchProcess(rows: Iterator[_]): Iterator[Row]

Definition Classes
HasBatchedAnnotate
val batchSize: IntParam

Definition Classes
HasBatchedAnnotate
def beforeAnnotate(dataset: Dataset[_]): Dataset[_]

Attributes
protected
Definition Classes
AnnotatorModel
val caseSensitive: BooleanParam

Definition Classes
HasCaseSensitiveProperties
final def checkSchema(schema: StructType, inputAnnotatorType: String): Boolean

Attributes
protected
Definition Classes
HasInputAnnotationCols
def checkValidEnvironment(spark: Option[SparkSession], scopes: Seq[String]): Unit

Definition Classes
CheckLicense
def checkValidScope(scope: String): Unit

Definition Classes
CheckLicense
def checkValidScopeAndEnvironment(scope: String, spark: Option[SparkSession], checkLp: Boolean): Unit

Definition Classes
CheckLicense
def checkValidScopesAndEnvironment(scopes: Seq[String], spark: Option[SparkSession], checkLp: Boolean): Unit

Definition Classes
CheckLicense
final def clear(param: Param[_]): EntityChunkEmbeddings.this.type

Definition Classes
Params
def clone(): AnyRef

Attributes
protected[lang]
Definition Classes
AnyRef
Annotations
@throws( ... ) @native()
val configProtoBytes: IntArrayParam

Definition Classes
BertSentenceEmbeddings
def copy(extra: ParamMap): BertSentenceEmbeddings

Definition Classes
RawAnnotator → Model → Transformer → PipelineStage → Params
def copyValues[T <: Params](to: T, extra: ParamMap): T

Attributes
protected
Definition Classes
Params
def createDatabaseConnection(database: Name): RocksDBConnection

Definition Classes
HasStorageRef
final def defaultCopy[T <: Params](extra: ParamMap): T

Attributes
protected
Definition Classes
Params
val dimension: ProtectedParam[Int]

Definition Classes
HasEmbeddingsProperties
val engine: Param[String]

Definition Classes
HasEngine
val entityWeights: MapFeature[String, Float]
The relative weights of drug-related entities. If not set, all entities have equal weights. If the map is non-empty and an entity is missing from it, that entity's weight is set to 0. The notation TARGET_ENTITY:RELATED_ENTITY can be used to specify the weight of an entity that is related to a specific target entity (e.g. "DRUG:SYMPTOM" -> 0.3f). Entity names are case-insensitive.
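The lookup rules stated for entityWeights can be sketched as a plain function. This is a hedged illustration rather than the library's logic: resolveWeight is a hypothetical name, using 1.0 as the "equal weight" value is an assumption, and so is the precedence of a TARGET_ENTITY:RELATED_ENTITY key over a plain entity key.

```scala
// Illustrative sketch only; key precedence and the 1.0 default are assumptions.
object EntityWeightSketch {
  /** Resolves the weight of `related` when it is paired with `target`. */
  def resolveWeight(weights: Map[String, Float], target: String, related: String): Float = {
    if (weights.isEmpty) 1.0f // nothing configured: every entity weighs the same
    else {
      // Entity names are case-insensitive, so compare upper-cased keys.
      val normalised = weights.map { case (k, v) => k.toUpperCase -> v }
      val specific = s"${target.toUpperCase}:${related.toUpperCase}"
      normalised.get(specific)
        .orElse(normalised.get(related.toUpperCase))
        .getOrElse(0.0f) // map is set but the entity is absent: weight 0
    }
  }
}
```

For example, with Map("DRUG" -> 0.8f, "SYMPTOM" -> 0.5f, "DRUG:SYMPTOM" -> 0.3f), a SYMPTOM chunk related to a DRUG target resolves to 0.3f under these assumptions, while a DOSAGE chunk resolves to 0.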
final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
def equals(arg0: Any): Boolean

Definition Classes
AnyRef → Any
def explainParam(param: Param[_]): String

Definition Classes
Params
def explainParams(): String

Definition Classes
Params
def extraValidate(structType: StructType): Boolean

Attributes
protected
Definition Classes
RawAnnotator
def extraValidateMsg: String

Attributes
protected
Definition Classes
RawAnnotator
final def extractParamMap(): ParamMap

Definition Classes
Params
final def extractParamMap(extra: ParamMap): ParamMap

Definition Classes
Params
val features: ArrayBuffer[Feature[_, _, _]]

Definition Classes
HasFeatures
def finalize(): Unit

Attributes
protected[lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
def get[T](feature: StructFeature[T]): Option[T]

Attributes
protected
Definition Classes
HasFeatures
def get[K, V](feature: MapFeature[K, V]): Option[Map[K, V]]

Attributes
protected
Definition Classes
HasFeatures
def get[T](feature: SetFeature[T]): Option[Set[T]]

Attributes
protected
Definition Classes
HasFeatures
def get[T](feature: ArrayFeature[T]): Option[Array[T]]

Attributes
protected
Definition Classes
HasFeatures
final def get[T](param: Param[T]): Option[T]

Definition Classes
Params
def getAnnotationEmbeddingWeight(targetEntityAnno: Annotation, anno: Annotation): Float

Attributes
protected
def getBatchSize: Int

Definition Classes
HasBatchedAnnotate
def getCaseSensitive: Boolean

Definition Classes
HasCaseSensitiveProperties
final def getClass(): Class[_]

Definition Classes
AnyRef → Any
Annotations
@native()
def getConfigProtoBytes: Option[Array[Byte]]

Definition Classes
BertSentenceEmbeddings
final def getDefault[T](param: Param[T]): Option[T]

Definition Classes
Params
def getDimension: Int

Definition Classes
HasEmbeddingsProperties
def getEngine: String

Definition Classes
HasEngine
def getEntityWeight(entity: String): Float
def getEntityWeights: Map[String, Float]
def getInputCols: Array[String]

Definition Classes
HasInputAnnotationCols
def getIsLong: Boolean

Definition Classes
BertSentenceEmbeddings
def getLazyAnnotator: Boolean

Definition Classes
CanBeLazy
def getMaxSentenceLength: Int

Definition Classes
BertSentenceEmbeddings
def getMaxSyntacticDistance: Int
Gets the maximal syntactic distance used as the pairing threshold (Default: 2, as documented for maxSyntacticDistance)
def getModelIfNotSet: Bert

Definition Classes
BertSentenceEmbeddings
final def getOrDefault[T](param: Param[T]): T

Definition Classes
Params
final def getOutputCol: String

Definition Classes
HasOutputAnnotationCol
def getParam(paramName: String): Param[Any]

Definition Classes
Params
def getSignatures: Option[Map[String, String]]

Definition Classes
BertSentenceEmbeddings
def getStorageRef: String

Definition Classes
HasStorageRef
def getSyntacticDistance(anno1: Annotation, anno2: Annotation, deps: Array[Annotation]): Int

Attributes
protected
def getTargetEntities: Map[String, List[String]]
def getTargetEntity(entity: String): Option[List[String]]
final def hasDefault[T](param: Param[T]): Boolean

Definition Classes
Params
def hasParam(paramName: String): Boolean

Definition Classes
Params
def hasParent: Boolean

Definition Classes
Model
def hashCode(): Int

Definition Classes
AnyRef → Any
Annotations
@native()
def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean

Attributes
protected
Definition Classes
Logging
def initializeLogIfNecessary(isInterpreter: Boolean): Unit

Attributes
protected
Definition Classes
Logging
val inputAnnotatorTypes: Array[AnnotatorType]
Input annotator types: CHUNK, DEPENDENCY

Definition Classes
EntityChunkEmbeddings → BertSentenceEmbeddings → HasInputAnnotationCols
final val inputCols: StringArrayParam

Attributes
protected
Definition Classes
HasInputAnnotationCols
final def isDefined(param: Param[_]): Boolean

Definition Classes
Params
final def isInstanceOf[T0]: Boolean

Definition Classes
Any
val isLong: ProtectedParam[Boolean]

Definition Classes
BertSentenceEmbeddings
final def isSet(param: Param[_]): Boolean

Definition Classes
Params
def isTargetEntityAnnotation(anno: Annotation): Boolean

Attributes
protected
def isTargetEntityRelatedAnnotation(targetEntityAnno: Annotation, anno: Annotation): Boolean

Attributes
protected
def isTraceEnabled(): Boolean

Attributes
protected
Definition Classes
Logging
val lazyAnnotator: BooleanParam

Definition Classes
CanBeLazy
def log: Logger

Attributes
protected
Definition Classes
Logging
def logDebug(msg: ⇒ String, throwable: Throwable): Unit

Attributes
protected
Definition Classes
Logging
def logDebug(msg: ⇒ String): Unit

Attributes
protected
Definition Classes
Logging
def logError(msg: ⇒ String, throwable: Throwable): Unit

Attributes
protected
Definition Classes
Logging
def logError(msg: ⇒ String): Unit

Attributes
protected
Definition Classes
Logging
def logInfo(msg: ⇒ String, throwable: Throwable): Unit

Attributes
protected
Definition Classes
Logging
def logInfo(msg: ⇒ String): Unit

Attributes
protected
Definition Classes
Logging
def logName: String

Attributes
protected
Definition Classes
Logging
def logTrace(msg: ⇒ String, throwable: Throwable): Unit

Attributes
protected
Definition Classes
Logging
def logTrace(msg: ⇒ String): Unit

Attributes
protected
Definition Classes
Logging
def logWarning(msg: ⇒ String, throwable: Throwable): Unit

Attributes
protected
Definition Classes
Logging
def logWarning(msg: ⇒ String): Unit

Attributes
protected
Definition Classes
Logging
val maxSentenceLength: IntParam

Definition Classes
BertSentenceEmbeddings
val maxSyntacticDistance: IntParam
Maximal syntactic distance between the target entity (e.g. DRUG) and its related entities. Default value is 2.
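The documentation does not spell out how syntactic distance is measured. One common reading, assumed here, is the number of dependency edges on the path between two tokens in the parse tree. The sketch below shows that computation on a hypothetical head-index array (heads(i) is the index of the head of token i, and -1 marks the root); it is not taken from the library and assumes a well-formed tree without cycles.

```scala
// Illustrative sketch only; the edge-count definition of distance is an assumption.
object SyntacticDistanceSketch {
  /** Path from a token up to the root, starting with the token itself. */
  private def pathToRoot(heads: Array[Int], token: Int): List[Int] = {
    @annotation.tailrec
    def loop(current: Int, acc: List[Int]): List[Int] =
      if (current < 0) acc.reverse else loop(heads(current), current :: acc)
    loop(token, Nil)
  }

  /** Number of dependency edges on the path between tokens a and b. */
  def treeDistance(heads: Array[Int], a: Int, b: Int): Int = {
    val pathA = pathToRoot(heads, a)
    // Maps every ancestor of b (including b) to its number of steps from b.
    val stepsFromB = pathToRoot(heads, b).zipWithIndex.toMap
    // The first node on a's path that is also on b's path is the lowest common ancestor.
    pathA.zipWithIndex
      .collectFirst { case (node, stepsFromA) if stepsFromB.contains(node) =>
        stepsFromA + stepsFromB(node)
      }
      .getOrElse(throw new IllegalArgumentException("tokens are not in the same tree"))
  }
}
```

A pair of chunks would then be kept only when such a distance does not exceed the configured threshold.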
def msgHelper(schema: StructType): String

Attributes
protected
Definition Classes
HasInputAnnotationCols
final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def notify(): Unit

Definition Classes
AnyRef
Annotations
@native()
final def notifyAll(): Unit

Definition Classes
AnyRef
Annotations
@native()
def onWrite(path: String, spark: SparkSession): Unit

Definition Classes
BertSentenceEmbeddings → ParamsAndFeaturesWritable
val optionalInputAnnotatorTypes: Array[String]

Definition Classes
HasInputAnnotationCols
val outputAnnotatorType: AnnotatorType
Output annotator types: CHUNK

Definition Classes
EntityChunkEmbeddings → BertSentenceEmbeddings → HasOutputAnnotatorType
final val outputCol: Param[String]

Attributes
protected
Definition Classes
HasOutputAnnotationCol
lazy val params: Array[Param[_]]

Definition Classes
Params
var parent: Estimator[BertSentenceEmbeddings]

Definition Classes
Model
def save(path: String): Unit

Definition Classes
MLWritable
Annotations
@Since( "1.6.0" ) @throws( ... )
def sentenceEndTokenId: Int

Definition Classes
BertSentenceEmbeddings
def sentenceStartTokenId: Int

Definition Classes
BertSentenceEmbeddings
def set[T](param: ProtectedParam[T], value: T): EntityChunkEmbeddings.this.type

Definition Classes
HasProtectedParams
def set[T](feature: StructFeature[T], value: T): EntityChunkEmbeddings.this.type

Attributes
protected
Definition Classes
HasFeatures
def set[K, V](feature: MapFeature[K, V], value: Map[K, V]): EntityChunkEmbeddings.this.type

Attributes
protected
Definition Classes
HasFeatures
def set[T](feature: SetFeature[T], value: Set[T]): EntityChunkEmbeddings.this.type

Attributes
protected
Definition Classes
HasFeatures
def set[T](feature: ArrayFeature[T], value: Array[T]): EntityChunkEmbeddings.this.type

Attributes
protected
Definition Classes
HasFeatures
final def set(paramPair: ParamPair[_]): EntityChunkEmbeddings.this.type

Attributes
protected
Definition Classes
Params
final def set(param: String, value: Any): EntityChunkEmbeddings.this.type

Attributes
protected
Definition Classes
Params
final def set[T](param: Param[T], value: T): EntityChunkEmbeddings.this.type

Definition Classes
Params
def setBatchSize(size: Int): EntityChunkEmbeddings.this.type

Definition Classes
HasBatchedAnnotate
def setCaseSensitive(value: Boolean): EntityChunkEmbeddings.this.type

Definition Classes
BertSentenceEmbeddings → HasCaseSensitiveProperties
def setConfigProtoBytes(bytes: Array[Int]): EntityChunkEmbeddings.this.type

Definition Classes
BertSentenceEmbeddings
def setDefault[T](feature: StructFeature[T], value: () ⇒ T): EntityChunkEmbeddings.this.type

Attributes
protected
Definition Classes
HasFeatures
def setDefault[K, V](feature: MapFeature[K, V], value: () ⇒ Map[K, V]): EntityChunkEmbeddings.this.type

Attributes
protected
Definition Classes
HasFeatures
def setDefault[T](feature: SetFeature[T], value: () ⇒ Set[T]): EntityChunkEmbeddings.this.type

Attributes
protected
Definition Classes
HasFeatures
def setDefault[T](feature: ArrayFeature[T], value: () ⇒ Array[T]): EntityChunkEmbeddings.this.type

Attributes
protected
Definition Classes
HasFeatures
final def setDefault(paramPairs: ParamPair[_]*): EntityChunkEmbeddings.this.type

Attributes
protected
Definition Classes
Params
final def setDefault[T](param: Param[T], value: T): EntityChunkEmbeddings.this.type

Attributes
protected[org.apache.spark.ml]
Definition Classes
Params
def setDimension(value: Int): EntityChunkEmbeddings.this.type

Definition Classes
BertSentenceEmbeddings → HasEmbeddingsProperties
def setEntityWeights(w: HashMap[String, Double]): EntityChunkEmbeddings.this.type
def setEntityWeights(weights: Map[String, Float]): EntityChunkEmbeddings.this.type
Sets the relative weights of the entity types used when averaging chunk embeddings (see entityWeights). Each weight should be between 0 and 1.
final def setInputCols(value: String*): EntityChunkEmbeddings.this.type

Definition Classes
HasInputAnnotationCols
def setInputCols(value: Array[String]): EntityChunkEmbeddings.this.type

Definition Classes
HasInputAnnotationCols
def setIsLong(value: Boolean): EntityChunkEmbeddings.this.type

Definition Classes
BertSentenceEmbeddings
def setLazyAnnotator(value: Boolean): EntityChunkEmbeddings.this.type

Definition Classes
CanBeLazy
def setMaxSentenceLength(value: Int): EntityChunkEmbeddings.this.type

Definition Classes
BertSentenceEmbeddings
def setMaxSyntacticDistance(maxSyntacticDistance: Int): EntityChunkEmbeddings.this.type
Sets the maximal syntactic distance.
def setModelIfNotSet(spark: SparkSession, tensorflowWrapper: Option[TensorflowWrapper], onnxWrapper: Option[OnnxWrapper], openvinoWrapper: Option[OpenvinoWrapper]): EntityChunkEmbeddings.this.type

Definition Classes
BertSentenceEmbeddings
final def setOutputCol(value: String): EntityChunkEmbeddings.this.type

Definition Classes
HasOutputAnnotationCol
def setParent(parent: Estimator[BertSentenceEmbeddings]): BertSentenceEmbeddings

Definition Classes
Model
def setSignatures(value: Map[String, String]): EntityChunkEmbeddings.this.type

Definition Classes
BertSentenceEmbeddings
def setStorageRef(value: String): EntityChunkEmbeddings.this.type

Definition Classes
HasStorageRef
def setTargetEntities(entities: HashMap[String, List[String]]): EntityChunkEmbeddings.this.type
def setTargetEntities(entities: Map[String, List[String]]): EntityChunkEmbeddings.this.type
def setVocabulary(value: Map[String, Int]): EntityChunkEmbeddings.this.type

Definition Classes
BertSentenceEmbeddings
val signatures: MapFeature[String, String]

Definition Classes
BertSentenceEmbeddings
val storageRef: Param[String]

Definition Classes
HasStorageRef
final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
val targetEntities: MapFeature[String, List[String]]
The target entities mapped to lists of their related entities. A target entity with an empty list of related entities means all other entities are assumed to be related to it. Entity names are case-insensitive.
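The semantics of the targetEntities map (an empty list meaning "all other entities", names compared case-insensitively) can be shown with a short standalone check. This is a sketch under those documented rules, with isRelated as a hypothetical helper name rather than a member of this class.

```scala
// Illustrative sketch only; isRelated is a hypothetical helper, not a library method.
object TargetEntitiesSketch {
  /** True when `candidate` counts as related to `target` under `targetEntities`. */
  def isRelated(
      targetEntities: Map[String, List[String]],
      target: String,
      candidate: String): Boolean = {
    // Upper-case keys and values so that the comparison is case-insensitive.
    val normalised = targetEntities.map { case (k, v) => k.toUpperCase -> v.map(_.toUpperCase) }
    normalised.get(target.toUpperCase) match {
      case None => false // not configured as a target entity
      case Some(Nil) => candidate.toUpperCase != target.toUpperCase // empty list: all other entities
      case Some(related) => related.contains(candidate.toUpperCase)
    }
  }
}
```

With Map("DRUG" -> List(), "SYMPTOM" -> List("DRUG")), every non-DRUG entity is related to DRUG, while only DRUG is related to SYMPTOM.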
def toString(): String

Definition Classes
Identifiable → AnyRef → Any
def tokenize(sentences: Seq[Sentence]): Seq[WordpieceTokenizedSentence]

Definition Classes
BertSentenceEmbeddings
final def transform(dataset: Dataset[_]): DataFrame

Definition Classes
AnnotatorModel → Transformer
def transform(dataset: Dataset[_], paramMap: ParamMap): DataFrame

Definition Classes
Transformer
Annotations
@Since( "2.0.0" )
def transform(dataset: Dataset[_], firstParamPair: ParamPair[_], otherParamPairs: ParamPair[_]*): DataFrame

Definition Classes
Transformer
Annotations
@Since( "2.0.0" ) @varargs()
final def transformSchema(schema: StructType): StructType

Definition Classes
RawAnnotator → PipelineStage
def transformSchema(schema: StructType, logging: Boolean): StructType

Attributes
protected
Definition Classes
PipelineStage
Annotations
@DeveloperApi()
val uid: String

Definition Classes
EntityChunkEmbeddings → BertSentenceEmbeddings → Identifiable
def validate(schema: StructType): Boolean

Attributes
protected
Definition Classes
RawAnnotator
def validateStorageRef(dataset: Dataset[_], inputCols: Array[String], annotatorType: String): Unit

Definition Classes
HasStorageRef
val vocabulary: MapFeature[String, Int]

Definition Classes
BertSentenceEmbeddings
final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@throws( ... ) @native()
def wrapColumnMetadata(col: Column): Column

Attributes
protected
Definition Classes
RawAnnotator
def wrapEmbeddingsMetadata(col: Column, embeddingsDim: Int, embeddingsRef: Option[String]): Column

Attributes
protected
Definition Classes
HasEmbeddingsProperties
def wrapSentenceEmbeddingsMetadata(col: Column, embeddingsDim: Int, embeddingsRef: Option[String]): Column

Attributes
protected
Definition Classes
HasEmbeddingsProperties
def write: MLWriter

Definition Classes
ParamsAndFeaturesWritable → DefaultParamsWritable → MLWritable
def writeOnnxModel(path: String, spark: SparkSession, onnxWrapper: OnnxWrapper, suffix: String, fileName: String): Unit

Definition Classes
WriteOnnxModel
def writeOnnxModels(path: String, spark: SparkSession, onnxWrappersWithNames: Seq[(OnnxWrapper, String)], suffix: String): Unit

Definition Classes
WriteOnnxModel
def writeOpenvinoModel(path: String, spark: SparkSession, openvinoWrapper: OpenvinoWrapper, suffix: String, fileName: String): Unit

Definition Classes
WriteOpenvinoModel
def writeOpenvinoModels(path: String, spark: SparkSession, ovWrappersWithNames: Seq[(OpenvinoWrapper, String)], suffix: String): Unit

Definition Classes
WriteOpenvinoModel
def writeTensorflowHub(path: String, tfPath: String, spark: SparkSession, suffix: String): Unit

Definition Classes
WriteTensorflowModel
def writeTensorflowModel(path: String, spark: SparkSession, tensorflow: TensorflowWrapper, suffix: String, filename: String, configProtoBytes: Option[Array[Byte]]): Unit

Definition Classes
WriteTensorflowModel
def writeTensorflowModelV2(path: String, spark: SparkSession, tensorflow: TensorflowWrapper, suffix: String, filename: String, configProtoBytes: Option[Array[Byte]], savedSignatures: Option[Map[String, String]]): Unit

Definition Classes
WriteTensorflowModel
