class AssertionDLApproach extends nlp.annotators.assertion.dl.AssertionDLApproach
Contains all the methods for training an AssertionDLModel. For pretrained models, use AssertionDLModel and see the Models Hub for available models.
Example
First, pipeline stages for pre-processing the dataset (containing columns for text and label) are defined.
val document = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val chunk = new Doc2Chunk()
  .setInputCols("document")
  .setOutputCol("chunk")

val token = new Tokenizer()
  .setInputCols("document")
  .setOutputCol("token")

val embeddings = WordEmbeddingsModel.pretrained("embeddings_clinical", "en", "clinical/models")
  .setInputCols("document", "token")
  .setOutputCol("embeddings")
Next, define AssertionDLApproach with its parameters and start training.
val assertionStatus = new AssertionDLApproach()
  .setLabelCol("label")
  .setInputCols("document", "chunk", "embeddings")
  .setOutputCol("assertion")
  .setBatchSize(128)
  .setDropout(0.012f)
  .setLearningRate(0.015f)
  .setEpochs(1)
  .setStartCol("start")
  .setEndCol("end")
  .setMaxSentLen(250)

val trainingPipeline = new Pipeline().setStages(Array(
  document,
  chunk,
  token,
  embeddings,
  assertionStatus
))

val assertionModel = trainingPipeline.fit(data)
val assertionResults = assertionModel.transform(data).cache()
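The training example names a start and an end column, which hold the token numbers of the first and last target token. The following is a minimal sketch of how such indices might be derived for a training row. It assumes plain whitespace tokenization and 0-based, inclusive indices; both are assumptions, and the pipeline's Tokenizer may split text differently. TokenSpan and its methods are hypothetical helper names, not part of the library.

```scala
// Hypothetical helper for illustration; not part of the library.
object TokenSpan {
  // Assumption: tokens are separated by whitespace only.
  def tokenize(text: String): Vector[String] =
    text.trim.split("\\s+").filter(_.nonEmpty).toVector

  // Returns the 0-based, inclusive (start, end) token indices of the first
  // occurrence of `target` in `text`, or None when it does not occur.
  def find(text: String, target: String): Option[(Int, Int)] = {
    val tokens = tokenize(text)
    val wanted = tokenize(target)
    if (wanted.isEmpty) None
    else {
      val start = tokens.indexOfSlice(wanted)
      if (start < 0) None else Some((start, start + wanted.length - 1))
    }
  }
}
```

The two numbers of a returned pair would be stored in the columns passed to setStartCol and setEndCol, alongside the text and label columns.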
- See also
AssertionDLModel for using pretrained models
AssertionLogRegModel for extraction that is not based on deep learning
Instance Constructors
Type Members
-
type
AnnotatorType = String
- Definition Classes
- HasOutputAnnotatorType
Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
final
def
$[T](param: Param[T]): T
- Attributes
- protected
- Definition Classes
- Params
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
_fit(dataset: Dataset[_], recursiveStages: Option[PipelineModel]): nlp.annotators.assertion.dl.AssertionDLModel
- Attributes
- protected
- Definition Classes
- AnnotatorApproach
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
val
batchSize: IntParam
Size for each batch in the optimization process (Default: 64)
- Definition Classes
- AssertionDLParams
-
def
beforeTraining(spark: SparkSession): Unit
- Definition Classes
- AnnotatorApproach
-
final
def
checkSchema(schema: StructType, inputAnnotatorType: String): Boolean
- Attributes
- protected
- Definition Classes
- HasInputAnnotationCols
-
def
checkValidEnvironment(spark: Option[SparkSession], scopes: Seq[String]): Unit
- Definition Classes
- CheckLicense
-
def
checkValidScope(scope: String): Unit
- Definition Classes
- CheckLicense
-
def
checkValidScopeAndEnvironment(scope: String, spark: Option[SparkSession], checkLp: Boolean): Unit
- Definition Classes
- CheckLicense
-
def
checkValidScopesAndEnvironment(scopes: Seq[String], spark: Option[SparkSession], checkLp: Boolean): Unit
- Definition Classes
- CheckLicense
-
val
chunkCol: Param[String]
Column with extracted NER chunks
- Definition Classes
- AssertionDLParams
-
final
def
clear(param: Param[_]): AssertionDLApproach.this.type
- Definition Classes
- Params
-
def
clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
val
configProtoBytes: IntArrayParam
ConfigProto from TensorFlow, serialized into a byte array. Obtain it with config_proto.SerializeToString().
- Definition Classes
- AssertionDLParams
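Because this parameter is an IntArrayParam while a serialized ConfigProto is a sequence of bytes, the bytes have to be widened to integers before they are passed to setConfigProtoBytes. Below is a minimal sketch of that conversion in plain Scala. The object name ConfigProtoCodec and both method names are hypothetical, and representing each byte as an unsigned value (0 to 255) is an assumption, not documented library behavior.

```scala
// Hypothetical helper for illustration; not part of the library.
object ConfigProtoCodec {
  // Widen each byte to an Int in the range 0..255 (assumed representation).
  def toIntArray(bytes: Array[Byte]): Array[Int] = bytes.map(b => b & 0xff)

  // Narrow the Ints back to bytes; toByte keeps only the low 8 bits.
  def toByteArray(ints: Array[Int]): Array[Byte] = ints.map(_.toByte)
}
```

The Array[Int] produced by toIntArray has the type that setConfigProtoBytes accepts, and toByteArray recovers the original bytes.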
-
final
def
copy(extra: ParamMap): Estimator[nlp.annotators.assertion.dl.AssertionDLModel]
- Definition Classes
- AnnotatorApproach → Estimator → PipelineStage → Params
-
def
copyValues[T <: Params](to: T, extra: ParamMap): T
- Attributes
- protected
- Definition Classes
- Params
-
val
datasetInfo: Param[String]
Descriptive information about the dataset being used.
- Definition Classes
- AssertionDLParams
-
final
def
defaultCopy[T <: Params](extra: ParamMap): T
- Attributes
- protected
- Definition Classes
- Params
-
val
description: String
- Definition Classes
- AssertionDLApproach → AnnotatorApproach
-
val
doExceptionHandling: BooleanParam
If true, exceptions are handled. If data that causes an exception is passed to the model, an error annotation containing the exception message is emitted, and processing continues with the next item. This comes with a performance penalty.
- Definition Classes
- HandleExceptionParams
-
val
dropout: FloatParam
Dropout at the output of each layer (Default: 0.05f)
- Definition Classes
- AssertionDLParams
-
val
enableOutputLogs: BooleanParam
Whether to output to annotators log folder (Default: false)
- Definition Classes
- AssertionDLParams
-
val
endCol: Param[String]
Column with token number for last target token
- Definition Classes
- AssertionDLParams
-
val
epochs: IntParam
Number of epochs for the optimization process (Default: 5)
- Definition Classes
- AssertionDLParams
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
equals(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
explainParam(param: Param[_]): String
- Definition Classes
- Params
-
def
explainParams(): String
- Definition Classes
- Params
-
final
def
extractParamMap(): ParamMap
- Definition Classes
- Params
-
final
def
extractParamMap(extra: ParamMap): ParamMap
- Definition Classes
- Params
-
def
finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
-
final
def
fit(dataset: Dataset[_]): nlp.annotators.assertion.dl.AssertionDLModel
- Definition Classes
- AnnotatorApproach → Estimator
-
def
fit(dataset: Dataset[_], paramMaps: Seq[ParamMap]): Seq[nlp.annotators.assertion.dl.AssertionDLModel]
- Definition Classes
- Estimator
- Annotations
- @Since( "2.0.0" )
-
def
fit(dataset: Dataset[_], paramMap: ParamMap): nlp.annotators.assertion.dl.AssertionDLModel
- Definition Classes
- Estimator
- Annotations
- @Since( "2.0.0" )
-
def
fit(dataset: Dataset[_], firstParamPair: ParamPair[_], otherParamPairs: ParamPair[_]*): nlp.annotators.assertion.dl.AssertionDLModel
- Definition Classes
- Estimator
- Annotations
- @Since( "2.0.0" ) @varargs()
-
final
def
get[T](param: Param[T]): Option[T]
- Definition Classes
- Params
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
getConfigProtoBytes: Option[Array[Byte]]
ConfigProto from TensorFlow, serialized into a byte array. Obtain it with config_proto.SerializeToString().
- Definition Classes
- AssertionDLParams
-
def
getDatasetInfo: String
Gets descriptive information about the dataset being used
- Definition Classes
- AssertionDLParams
-
final
def
getDefault[T](param: Param[T]): Option[T]
- Definition Classes
- Params
-
def
getEnableOutputLogs: Boolean
Whether to output to annotators log folder
- Definition Classes
- AssertionDLParams
-
def
getIncludeConfidence: Boolean
Whether to include confidence scores in annotation metadata
- Definition Classes
- AssertionDLParams
-
def
getInputCols: Array[String]
- Definition Classes
- HasInputAnnotationCols
-
def
getLazyAnnotator: Boolean
- Definition Classes
- CanBeLazy
-
def
getLogName: String
- Definition Classes
- Logging
-
final
def
getOrDefault[T](param: Param[T]): T
- Definition Classes
- Params
-
final
def
getOutputCol: String
- Definition Classes
- HasOutputAnnotationCol
-
def
getOutputLogsPath: String
Folder path to save training logs
- Definition Classes
- AssertionDLParams
-
def
getParam(paramName: String): Param[Any]
- Definition Classes
- Params
-
def
getScopeWindow: (Int, Int)
Get scope window
- Definition Classes
- AssertionDLParams
-
val
graphFile: Param[String]
Path to the file that contains the external graph. When specified, the provided file is used and no graph search takes place. The path can be a local file path, a distributed file path (HDFS, DBFS), or cloud storage (S3).
- Definition Classes
- AssertionDLParams
-
val
graphFolder: Param[String]
Folder path that contains external graph files. The path can be a local file path, a distributed file path (HDFS, DBFS), or cloud storage (S3).
- Definition Classes
- AssertionDLParams
-
final
def
hasDefault[T](param: Param[T]): Boolean
- Definition Classes
- Params
-
def
hasParam(paramName: String): Boolean
- Definition Classes
- Params
-
def
hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
val
includeConfidence: BooleanParam
Whether to include confidence scores in annotation metadata
- Definition Classes
- AssertionDLParams
-
def
initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
- Attributes
- protected
- Definition Classes
- Logging
-
def
initializeLogIfNecessary(isInterpreter: Boolean): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
val
inputAnnotatorTypes: Array[String]
Input Annotator Types: DOCUMENT, CHUNK, WORD_EMBEDDINGS
- Definition Classes
- AssertionDLApproach → HasInputAnnotationCols
-
final
val
inputCols: StringArrayParam
- Attributes
- protected
- Definition Classes
- HasInputAnnotationCols
-
final
def
isDefined(param: Param[_]): Boolean
- Definition Classes
- Params
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
final
def
isSet(param: Param[_]): Boolean
- Definition Classes
- Params
-
def
isTraceEnabled(): Boolean
- Attributes
- protected
- Definition Classes
- Logging
-
val
labelCol: Param[String]
Column with one label per document. Example of possible values: “present”, “absent”, “hypothetical”, “conditional”, “associated_with_other_person”, etc.
- Definition Classes
- AssertionDLParams
-
val
lazyAnnotator: BooleanParam
- Definition Classes
- CanBeLazy
-
val
learningRate: FloatParam
Learning rate for the optimization process (Default: 0.0012f)
- Definition Classes
- AssertionDLParams
-
def
log(value: ⇒ String, minLevel: Level): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
log: Logger
- Attributes
- protected
- Definition Classes
- Logging
-
def
logDebug(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logDebug(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logError(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logError(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logInfo(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logInfo(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logName: String
- Attributes
- protected
- Definition Classes
- Logging
-
def
logTrace(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logTrace(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logWarning(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logWarning(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
val
logger: Logger
- Attributes
- protected
- Definition Classes
- Logging
-
val
maxSentLen: IntParam
Max possible length of a sentence, must match graph model (Default: 250)
- Definition Classes
- AssertionDLParams
-
def
msgHelper(schema: StructType): String
- Attributes
- protected
- Definition Classes
- HasInputAnnotationCols
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
def
onTrained(model: nlp.annotators.assertion.dl.AssertionDLModel, spark: SparkSession): Unit
- Definition Classes
- AnnotatorApproach
-
val
optionalInputAnnotatorTypes: Array[String]
- Definition Classes
- HasInputAnnotationCols
-
val
outputAnnotatorType: AnnotatorType
Output annotator type: ASSERTION
- Definition Classes
- AssertionDLApproach → HasOutputAnnotatorType
-
final
val
outputCol: Param[String]
- Attributes
- protected
- Definition Classes
- HasOutputAnnotationCol
-
def
outputLog(value: ⇒ String, uuid: String, shouldLog: Boolean, outputLogsPath: String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
val
outputLogsPath: Param[String]
Folder path to save training logs. If no path is specified, the logs are not stored on disk. The path can be a local file path, a distributed file path (HDFS, DBFS), or cloud storage (S3).
- Definition Classes
- AssertionDLParams
-
lazy val
params: Array[Param[_]]
- Definition Classes
- Params
-
def
save(path: String): Unit
- Definition Classes
- MLWritable
- Annotations
- @Since( "1.6.0" ) @throws( ... )
-
val
scopeWindow: IntArrayParam
The scope window of the assertion (whole sentence by default)
- Definition Classes
- AssertionDLParams
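One way to picture a scope window: instead of the whole sentence, only a limited number of tokens before and after the target chunk is considered. The sketch below illustrates this on a plain token sequence. It assumes the pair is read as (tokens before, tokens after) and that a negative entry means no limit on that side; neither reading is confirmed by this page. ScopeWindow and its slice method are hypothetical names, not part of the library.

```scala
// Hypothetical helper for illustration; not part of the library.
object ScopeWindow {
  // tokens: the sentence tokens.
  // start, end: 0-based, inclusive token indices of the target chunk.
  // window: (before, after); a negative entry is taken to mean "no limit" (assumption).
  def slice(tokens: Vector[String], start: Int, end: Int, window: (Int, Int)): Vector[String] = {
    val (before, after) = window
    val from = if (before < 0) 0 else math.max(0, start - before)
    val until = if (after < 0) tokens.length else math.min(tokens.length, end + after + 1)
    tokens.slice(from, until)
  }
}
```

Under these assumptions, a window of (-1, -1) keeps the whole sentence, which matches the default described for this parameter.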
-
final
def
set(paramPair: ParamPair[_]): AssertionDLApproach.this.type
- Attributes
- protected
- Definition Classes
- Params
-
final
def
set(param: String, value: Any): AssertionDLApproach.this.type
- Attributes
- protected
- Definition Classes
- Params
-
final
def
set[T](param: Param[T], value: T): AssertionDLApproach.this.type
- Definition Classes
- Params
-
def
setBatchSize(size: Int): AssertionDLApproach.this.type
Size for each batch in the optimization process
- Definition Classes
- AssertionDLParams
-
def
setChunkCol(c: String): AssertionDLApproach.this.type
Column with extracted NER chunks
- Definition Classes
- AssertionDLParams
-
def
setConfigProtoBytes(bytes: Array[Int]): AssertionDLApproach.this.type
ConfigProto from TensorFlow, serialized into a byte array. Obtain it with config_proto.SerializeToString().
- Definition Classes
- AssertionDLParams
-
def
setDatasetInfo(value: String): AssertionDLApproach.this.type
Sets descriptive information about the dataset being used
- Definition Classes
- AssertionDLParams
-
final
def
setDefault(paramPairs: ParamPair[_]*): AssertionDLApproach.this.type
- Attributes
- protected
- Definition Classes
- Params
-
final
def
setDefault[T](param: Param[T], value: T): AssertionDLApproach.this.type
- Attributes
- protected[org.apache.spark.ml]
- Definition Classes
- Params
-
def
setDoExceptionHandling(value: Boolean): AssertionDLApproach.this.type
If true, exceptions are handled. If data that causes an exception is passed to the model, an error annotation containing the exception message is emitted, and processing continues with the next item. This comes with a performance penalty.
- Definition Classes
- HandleExceptionParams
-
def
setDropout(factor: Float): AssertionDLApproach.this.type
Dropout at the output of each layer
- Definition Classes
- AssertionDLParams
-
def
setEnableOutputLogs(v: Boolean): AssertionDLApproach.this.type
Whether to output to annotators log folder
- Definition Classes
- AssertionDLParams
-
def
setEndCol(e: String): AssertionDLApproach.this.type
Column with token number for last target token
- Definition Classes
- AssertionDLParams
-
def
setEpochs(number: Int): AssertionDLApproach.this.type
Number of epochs for the optimization process
- Definition Classes
- AssertionDLParams
-
def
setGraphFile(path: String): AssertionDLApproach.this.type
Path to the file that contains the external graph
- Definition Classes
- AssertionDLParams
-
def
setGraphFolder(path: String): AssertionDLApproach.this.type
Folder path that contains external graph files
- Definition Classes
- AssertionDLParams
-
def
setIncludeConfidence(value: Boolean): AssertionDLApproach.this.type
Whether to include confidence scores in annotation metadata
- Definition Classes
- AssertionDLParams
-
final
def
setInputCols(value: String*): AssertionDLApproach.this.type
- Definition Classes
- HasInputAnnotationCols
-
def
setInputCols(value: Array[String]): AssertionDLApproach.this.type
- Definition Classes
- HasInputAnnotationCols
-
def
setLabelCol(label: String): AssertionDLApproach.this.type
Column with one label per document
- Definition Classes
- AssertionDLParams
-
def
setLazyAnnotator(value: Boolean): AssertionDLApproach.this.type
- Definition Classes
- CanBeLazy
-
def
setLearningRate(rate: Float): AssertionDLApproach.this.type
Learning rate for the optimization process
- Definition Classes
- AssertionDLParams
-
def
setMaxSentLen(len: Int): AssertionDLApproach.this.type
Max possible length of a sentence, must match graph model
- Definition Classes
- AssertionDLParams
-
final
def
setOutputCol(value: String): AssertionDLApproach.this.type
- Definition Classes
- HasOutputAnnotationCol
-
def
setOutputLogsPath(v: String): AssertionDLApproach.this.type
Folder path to save training logs
- Definition Classes
- AssertionDLParams
-
def
setScopeWindow(window: (Int, Int)): AssertionDLApproach.this.type
Sets the scope window of the assertion (whole sentence by default)
- Definition Classes
- AssertionDLParams
-
def
setStartCol(s: String): AssertionDLApproach.this.type
Column with token number for first target token
- Definition Classes
- AssertionDLParams
-
def
setValidationSplit(validationSplit: Float): AssertionDLApproach.this.type
Sets the proportion of the training dataset to be validated against the model on each epoch. The value should be between 0.0 and 1.0. The default is 0.0, which turns validation off.
- Definition Classes
- AssertionDLParams
-
def
setVerbose(verbose: Level): AssertionDLApproach.this.type
Level of verbosity during training
- Definition Classes
- AssertionDLParams
-
val
startCol: Param[String]
Column with token number for first target token
- Definition Classes
- AssertionDLParams
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
val
testDataset: ExternalResourceParam
Path to the test dataset. If set, statistics are calculated on it during training.
- Definition Classes
- AssertionDLApproach
-
def
toString(): String
- Definition Classes
- Identifiable → AnyRef → Any
-
def
train(dataset: Dataset[_], recursivePipeline: Option[PipelineModel]): AssertionDLModel
Trains on the dataset with a recursive pipeline, using the methods trainWithChunk() and trainwithStartEnd(). Which of the two is used depends on the startCol value of the approach.
- dataset
a collection of inputs to train
- recursivePipeline
an instance of PipelineModel
- returns
an instance of trained AssertionDLModel
- Definition Classes
- AssertionDLApproach → AssertionDLApproach → AnnotatorApproach
-
final
def
transformSchema(schema: StructType): StructType
- Definition Classes
- AnnotatorApproach → PipelineStage
-
def
transformSchema(schema: StructType, logging: Boolean): StructType
- Attributes
- protected
- Definition Classes
- PipelineStage
- Annotations
- @DeveloperApi()
-
val
uid: String
- Definition Classes
- AssertionDLApproach → AssertionDLApproach → Identifiable
-
def
validate(schema: StructType): Boolean
- Attributes
- protected
- Definition Classes
- AnnotatorApproach
-
val
validationSplit: FloatParam
The proportion of the training dataset to be used as a validation set. The model is validated against this set on each epoch, and the set is not used for training. The value should be between 0.0 and 1.0.
- Definition Classes
- AssertionDLParams
-
val
verbose: IntParam
Level of verbosity during training
- Definition Classes
- AssertionDLParams
-
val
verboseLevel: Level
- Definition Classes
- AssertionDLApproach → Logging
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
def
write: MLWriter
- Definition Classes
- DefaultParamsWritable → MLWritable
Inherited from nlp.annotators.assertion.dl.AssertionDLApproach
Inherited from CheckLicense
Inherited from HandleExceptionParams
Inherited from Logging
Inherited from AssertionDLParams
Inherited from AnnotatorApproach[nlp.annotators.assertion.dl.AssertionDLModel]
Inherited from CanBeLazy
Inherited from DefaultParamsWritable
Inherited from MLWritable
Inherited from HasOutputAnnotatorType
Inherited from HasOutputAnnotationCol
Inherited from HasInputAnnotationCols
Inherited from Estimator[nlp.annotators.assertion.dl.AssertionDLModel]
Inherited from PipelineStage
Inherited from Logging
Inherited from Params
Inherited from Serializable
Inherited from Serializable
Inherited from Identifiable
Inherited from AnyRef
Inherited from Any
Parameters
Annotator types
Required input and expected output annotator types