trait MedicalNerParams extends Params with HasFeatures
Abstract Value Members
Concrete Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
final
def
$[T](param: Param[T]): T
- Attributes
- protected
- Definition Classes
- Params
-
def
$$[T](feature: StructFeature[T]): T
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
$$[K, V](feature: MapFeature[K, V]): Map[K, V]
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
$$[T](feature: SetFeature[T]): Set[T]
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
$$[T](feature: ArrayFeature[T]): Array[T]
- Attributes
- protected
- Definition Classes
- HasFeatures
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
final
def
clear(param: Param[_]): MedicalNerParams.this.type
- Definition Classes
- Params
-
def
clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
val
configProtoBytes: IntArrayParam
ConfigProto from tensorflow, serialized into byte array. Get with config_proto.SerializeToString()
-
def
copyValues[T <: Params](to: T, extra: ParamMap): T
- Attributes
- protected
- Definition Classes
- Params
-
val
datasetInfo: Param[String]
Descriptive information about the dataset being used.
-
final
def
defaultCopy[T <: Params](extra: ParamMap): T
- Attributes
- protected
- Definition Classes
- Params
-
val
dropout: FloatParam
Dropout coefficient, by default 0.5.
The coefficient of the dropout layer. The value should be between 0.0 and 1.0. Internally, it is used by Tensorflow as:
rate = 1.0 - dropout
when adding a dropout layer on top of the recurrent layers.
-
val
earlyStoppingCriterion: FloatParam
If set, this param specifies the criterion to stop training if performance is not improving.
Default value is 0, which means that early stopping is not used.
The criterion is the F1-score if testDataset is defined or validationSplit is greater than 0.0; otherwise it is the model loss. The priority is as follows:
- If testDataset is defined, the criterion is the F1-score on the test set.
- If validationSplit is greater than 0.0, the criterion is the F1-score on the validation set.
- Otherwise, the criterion is the model loss.
Note that while the F1-score ranges from 0.0 to 1.0, the loss ranges from 0.0 to infinity, so a suitable value for the criterion depends on which case applies. For example, if validationSplit is 0.1, a criterion of 0.01 means that training should stop if the F1-score on the validation set improves by less than 0.01 compared with the previous epoch. However, if no validation or test set is defined, a criterion of 2.0 means that training should stop if the loss difference between the last epoch and the current one is less than 2.0.
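The priority rules for choosing the criterion can be sketched as a small selection function. This is only an illustration and not the library's implementation: the names CriterionKind, selectCriterion, improvement, and criterionMet are hypothetical, and the stop test assumes that the criterion is met when the epoch-to-epoch improvement falls below the configured value.

```scala
object EarlyStoppingCriterionSketch {
  // Hypothetical names for the three metrics described in the text.
  sealed trait CriterionKind
  case object TestF1 extends CriterionKind
  case object ValidationF1 extends CriterionKind
  case object ModelLoss extends CriterionKind

  // Priority: test set first, then validation split, otherwise model loss.
  def selectCriterion(hasTestDataset: Boolean, validationSplit: Float): CriterionKind =
    if (hasTestDataset) TestF1
    else if (validationSplit > 0.0f) ValidationF1
    else ModelLoss

  // F1 improves when it goes up; loss improves when it goes down.
  def improvement(kind: CriterionKind, previous: Double, current: Double): Double =
    kind match {
      case ModelLoss => previous - current
      case _         => current - previous
    }

  // Assumption: the criterion is met when the improvement is smaller than
  // the threshold. A threshold of 0 means early stopping is not used.
  def criterionMet(improvement: Double, criterion: Double): Boolean =
    criterion > 0.0 && improvement < criterion
}
```

Because the F1-score and the loss live on different scales, the same numeric threshold behaves very differently depending on which branch of selectCriterion applies.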
-
val
earlyStoppingPatience: IntParam
Number of epochs to wait before early stopping if no improvement, by default 5.
Given the earlyStoppingCriterion, if the performance does not improve for the given number of epochs, then the training will stop. If the value is 0, then early stopping will occur as soon as the criterion is met (no patience).
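One way to picture how patience interacts with the criterion is the sketch below. It is not the library's code: stopEpoch is a hypothetical helper, and it assumes one particular reading of the description, namely that training stops once the criterion has been met in more than `patience` consecutive epochs (so a patience of 0 stops at the first epoch in which the criterion is met).

```scala
object EarlyStoppingPatienceSketch {
  // criterionMet(i) states whether the early stopping criterion was met at
  // the end of epoch i + 1. Returns the 1-based epoch after which training
  // would stop, or None if training runs through every epoch.
  def stopEpoch(criterionMet: Seq[Boolean], patience: Int): Option[Int] = {
    // Running count of consecutive epochs without sufficient improvement;
    // the count resets to 0 whenever an epoch improves enough.
    val stalled = criterionMet
      .scanLeft(0)((count, met) => if (met) count + 1 else 0)
      .tail
    val index = stalled.indexWhere(_ > patience)
    if (index >= 0) Some(index + 1) else None
  }
}
```

With this reading, a larger patience tolerates short plateaus, while any epoch that improves enough resets the counter.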
-
val
enableMemoryOptimizer: BooleanParam
Whether to optimize for large datasets or not. Enabling this option can slow down training.
In practice, if set to true, the training iterates over the Spark DataFrame and retrieves the batches from the DataFrame iterator. This can be slower than the default option, as it has to collect the batches on every batch of every epoch, but it can be useful if the dataset is too large to fit in memory.
It controls whether the features are collected and generated all at once and then fed into the network batch by batch (false), or collected and generated batch by batch and then fed into the network (true).
If the training data fits in memory, it is recommended to set this option to false (the default value).
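The trade-off between the two modes can be illustrated with plain Scala collections. This is a conceptual sketch only: eagerBatches and lazyBatches are hypothetical names, a simple Iterator stands in for the DataFrame iterator, and featurize stands in for whatever feature generation the trainer performs.

```scala
object BatchingSketch {
  // Corresponds to the option being false: generate every feature once,
  // keep them all in memory, then slice them into batches.
  def eagerBatches[A, B](rows: Iterator[A], batchSize: Int)(featurize: A => B): Vector[Vector[B]] =
    rows.map(featurize).toVector.grouped(batchSize).toVector

  // Corresponds to the option being true: pull one batch at a time from the
  // row iterator and generate its features on demand, so only the current
  // batch needs to be held in memory.
  def lazyBatches[A, B](rows: Iterator[A], batchSize: Int)(featurize: A => B): Iterator[Vector[B]] =
    rows.grouped(batchSize).map(_.map(featurize).toVector)
}
```

Both functions yield the same batches; the lazy variant has to redo the collection and feature generation on every pass over the data, which is why it can be slower.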
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
equals(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
explainParam(param: Param[_]): String
- Definition Classes
- Params
-
def
explainParams(): String
- Definition Classes
- Params
-
final
def
extractParamMap(): ParamMap
- Definition Classes
- Params
-
final
def
extractParamMap(extra: ParamMap): ParamMap
- Definition Classes
- Params
-
val
features: ArrayBuffer[Feature[_, _, _]]
- Definition Classes
- HasFeatures
-
def
finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
-
def
get[T](feature: StructFeature[T]): Option[T]
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
get[K, V](feature: MapFeature[K, V]): Option[Map[K, V]]
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
get[T](feature: SetFeature[T]): Option[Set[T]]
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
get[T](feature: ArrayFeature[T]): Option[Array[T]]
- Attributes
- protected
- Definition Classes
- HasFeatures
-
final
def
get[T](param: Param[T]): Option[T]
- Definition Classes
- Params
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
getConfigProtoBytes: Option[Array[Byte]]
ConfigProto from tensorflow, serialized into byte array. Get with config_proto.SerializeToString()
-
def
getDatasetInfo: String
get descriptive information about the dataset being used
-
final
def
getDefault[T](param: Param[T]): Option[T]
- Definition Classes
- Params
-
def
getDropout: Float
Dropout coefficient
-
def
getEarlyStoppingCriterion: Float
Early stopping criterion
-
def
getEarlyStoppingPatience: Int
Early stopping patience
-
def
getEnableMemoryOptimizer: Boolean
Whether to optimize for large datasets or not. Enabling this option can slow down training.
-
def
getIncludeAllConfidenceScores: Boolean
whether to include all confidence scores in annotation metadata or just the score of the predicted tag
-
def
getIncludeConfidence: Boolean
whether to include confidence scores in annotation metadata
-
def
getLr: Float
Learning Rate
-
final
def
getOrDefault[T](param: Param[T]): T
- Definition Classes
- Params
-
def
getOverrideExistingTags: Boolean
Whether to override already learned tags when using a pretrained model to initialize the new model.
-
def
getParam(paramName: String): Param[Any]
- Definition Classes
- Params
-
def
getPo: Float
Learning rate decay coefficient. Real learning rate = lr / (1 + po * epoch)
-
def
getRandomValidationSplitPerEpoch: Boolean
Checks if a random validation split is done after each epoch or at the beginning of training only.
-
def
getSentenceTokenIndex: Boolean
whether to include the token index for each sentence in annotation metadata.
-
def
getUseBestModel: Boolean
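Whether to restore and use the model from the epoch that achieved the best performance at the end of the training.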
-
def
getUseContrib: Boolean
Whether to use contrib LSTM Cells. Not compatible with Windows. Might slightly improve accuracy.
-
val
graphFile: Param[String]
Path that contains the external graph file.
When specified, the provided file will be used, and no graph search will happen. The path can be a local file path, a distributed file path (HDFS, DBFS), or a cloud storage (S3).
-
val
graphFolder: Param[String]
Folder path that contains external graph files.
The path can be a local file path, a distributed file path (HDFS, DBFS), or a cloud storage (S3).
When instantiating the Tensorflow model, uses this folder to search for the adequate Tensorflow graph. The search is done using the name of the .pb file, which should be in this format: blstm_{ntags}_{embedding_dim}_{lstm_size}_{nchars}.pb.
Then, the search follows these rules:
- Embedding dimension should be exactly the same as the one used to train the model.
- Number of unique tags should be greater than or equal to the number of unique tags in the training data.
- Number of unique chars should be greater than or equal to the number of unique chars in the training data.
The returned file will be the first one that satisfies all the conditions.
If the name of the file is ill-formed, errors will occur during training.
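The search rules can be sketched as a filter over parsed file names. This is a minimal illustration under stated assumptions, not the library's implementation: GraphSpec, parse, and findGraph are hypothetical names, the regular expression accepts any prefix before the four numeric fields, and files are considered in the order given.

```scala
object GraphSearchSketch {
  // Parsed form of a name such as <prefix>_{ntags}_{embedding_dim}_{lstm_size}_{nchars}.pb
  final case class GraphSpec(fileName: String, nTags: Int, embeddingDim: Int, lstmSize: Int, nChars: Int)

  // Any prefix without underscores, followed by four integer fields.
  private val NamePattern = """[^_]+_(\d+)_(\d+)_(\d+)_(\d+)\.pb""".r

  def parse(fileName: String): Option[GraphSpec] = fileName match {
    case NamePattern(tags, emb, lstm, chars) =>
      Some(GraphSpec(fileName, tags.toInt, emb.toInt, lstm.toInt, chars.toInt))
    case _ => None
  }

  // Returns the first file whose embedding dimension matches exactly and
  // whose tag and char capacities are at least as large as required.
  def findGraph(fileNames: Seq[String], nTags: Int, embeddingDim: Int, nChars: Int): Option[String] =
    fileNames
      .flatMap(parse)
      .find(g => g.embeddingDim == embeddingDim && g.nTags >= nTags && g.nChars >= nChars)
      .map(_.fileName)
}
```

Note one deliberate difference: this sketch silently skips names it cannot parse, whereas the description above says that ill-formed names lead to errors during training.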
-
final
def
hasDefault[T](param: Param[T]): Boolean
- Definition Classes
- Params
-
def
hasParam(paramName: String): Boolean
- Definition Classes
- Params
-
def
hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
val
includeAllConfidenceScores: BooleanParam
Whether to include confidence scores for all tags in annotation metadata or just the score of the predicted tag, by default False.
Needs the includeConfidence parameter to be set to true.
Enabling this may slow down the inference speed.
-
val
includeConfidence: BooleanParam
Whether to include confidence scores in annotation metadata, by default False.
Setting this parameter to True will add the confidence score to the metadata of the NAMED_ENTITY annotation. In addition, if includeAllConfidenceScores is set to true, then the confidence scores of all the tags will be added to the metadata, otherwise only for the predicted tag (the one with maximum score).
-
final
def
isDefined(param: Param[_]): Boolean
- Definition Classes
- Params
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
final
def
isSet(param: Param[_]): Boolean
- Definition Classes
- Params
-
val
logPrefix: Param[String]
A prefix that will be appended to every log, default value is empty.
-
val
lr: FloatParam
Learning Rate, by default 0.001.
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
val
overrideExistingTags: BooleanParam
Controls whether to override already learned tags when using a pretrained model to initialize the new model. A value of true will override existing tags.
-
lazy val
params: Array[Param[_]]
- Definition Classes
- Params
-
val
po: FloatParam
Learning rate decay coefficient (time-based), by default 0.005.
This is used to calculate the decayed learning rate as: real learning rate = lr / (1 + po * epoch), meaning that the value of the learning rate is updated on each epoch.
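The time-based decay formula can be written out directly. In this small sketch, decayedLr is a hypothetical helper, and counting epochs from 0 is an assumption, since the description does not say where epoch numbering starts.

```scala
object LearningRateDecaySketch {
  // Time-based decay as described above: lr / (1 + po * epoch).
  // Assumption: epochs are counted from 0, so epoch 0 uses the base rate.
  def decayedLr(lr: Float, po: Float, epoch: Int): Float =
    lr / (1.0f + po * epoch)
}
```

With the defaults lr = 0.001 and po = 0.005, the rate shrinks slowly: after 10 epochs the divisor is only 1.05.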
-
val
pretrainedModelPath: Param[String]
Path to an already trained MedicalNerModel.
This pretrained model will be used as a starting point for training the new one. The path can be a local file path, a distributed file path (HDFS, DBFS), or a cloud storage (S3).
-
val
randomValidationSplitPerEpoch: BooleanParam
Do a random validation split after each epoch rather than at the beginning of training only.
-
val
sentenceTokenIndex: BooleanParam
whether to include the token index for each sentence in annotation metadata, by default false. If the value is true, the process might be slowed down.
-
def
set[T](feature: StructFeature[T], value: T): MedicalNerParams.this.type
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
set[K, V](feature: MapFeature[K, V], value: Map[K, V]): MedicalNerParams.this.type
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
set[T](feature: SetFeature[T], value: Set[T]): MedicalNerParams.this.type
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
set[T](feature: ArrayFeature[T], value: Array[T]): MedicalNerParams.this.type
- Attributes
- protected
- Definition Classes
- HasFeatures
-
final
def
set(paramPair: ParamPair[_]): MedicalNerParams.this.type
- Attributes
- protected
- Definition Classes
- Params
-
final
def
set(param: String, value: Any): MedicalNerParams.this.type
- Attributes
- protected
- Definition Classes
- Params
-
final
def
set[T](param: Param[T], value: T): MedicalNerParams.this.type
- Definition Classes
- Params
-
def
setConfigProtoBytes(bytes: Array[Int]): MedicalNerParams.this.type
ConfigProto from tensorflow, serialized into byte array. Get with config_proto.SerializeToString()
-
def
setDatasetInfo(value: String): MedicalNerParams.this.type
set descriptive information about the dataset being used
-
def
setDefault[T](feature: StructFeature[T], value: () ⇒ T): MedicalNerParams.this.type
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
setDefault[K, V](feature: MapFeature[K, V], value: () ⇒ Map[K, V]): MedicalNerParams.this.type
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
setDefault[T](feature: SetFeature[T], value: () ⇒ Set[T]): MedicalNerParams.this.type
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
setDefault[T](feature: ArrayFeature[T], value: () ⇒ Array[T]): MedicalNerParams.this.type
- Attributes
- protected
- Definition Classes
- HasFeatures
-
final
def
setDefault(paramPairs: ParamPair[_]*): MedicalNerParams.this.type
- Attributes
- protected
- Definition Classes
- Params
-
final
def
setDefault[T](param: Param[T], value: T): MedicalNerParams.this.type
- Attributes
- protected[org.apache.spark.ml]
- Definition Classes
- Params
-
def
setDropout(dropout: Float): MedicalNerParams.this.type
Dropout coefficient
-
def
setEarlyStoppingCriterion(value: Float): MedicalNerParams.this.type
-
def
setEarlyStoppingPatience(value: Int): MedicalNerParams.this.type
-
def
setEnableMemoryOptimizer(value: Boolean): MedicalNerParams.this.type
-
def
setGraphFile(path: String): MedicalNerParams.this.type
Path that contains the external graph file
-
def
setGraphFolder(path: String): MedicalNerParams.this.type
Folder path that contains external graph files
-
def
setIncludeAllConfidenceScores(value: Boolean): MedicalNerParams.this.type
Whether to include confidence scores for all tags rather than just for the predicted one
-
def
setIncludeConfidence(value: Boolean): MedicalNerParams.this.type
Whether to include confidence scores in annotation metadata
-
def
setLogPrefix(value: String): MedicalNerParams.this.type
a string prefix to be included in the logs
-
def
setLr(lr: Float): MedicalNerParams.this.type
Learning Rate
-
def
setOverrideExistingTags(value: Boolean): MedicalNerParams.this.type
Controls whether to override already learned tags when using a pretrained model to initialize the new model. A value of true will override existing tags.
-
def
setPo(po: Float): MedicalNerParams.this.type
Learning rate decay coefficient. Real learning rate = lr / (1 + po * epoch)
-
def
setPretrainedModelPath(path: String): MedicalNerParams.this.type
Set the location of an already trained MedicalNerModel, which is used as a starting point for training the new model.
-
def
setRandomValidationSplitPerEpoch(value: Boolean): MedicalNerParams.this.type
Do a random validation split after each epoch rather than at the beginning of training only.
-
def
setSentenceTokenIndex(value: Boolean): MedicalNerParams.this.type
whether to include the token index for each sentence in annotation metadata, by default false. If the value is true, the process might be slowed down.
-
def
setTagsMapping(mapping: Map[String, String]): MedicalNerParams.this.type
A map specifying how old tags are mapped to new ones. Maps are specified either using a list of comma separated strings, e.g. ("OLDTAG1,NEWTAG1", "OLDTAG2,NEWTAG2", ...) or by a Map data structure.
-
def
setTagsMapping(mapping: ArrayList[String]): MedicalNerParams.this.type
-
def
setTagsMapping(mapping: Array[String]): MedicalNerParams.this.type
A map specifying how old tags are mapped to new ones. Maps are specified either using a list of comma separated strings, e.g. ("OLDTAG1,NEWTAG1", "OLDTAG2,NEWTAG2", ...) or by a Map data structure. It only works if setOverrideExistingTags is false.
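The relationship between the two input styles can be sketched as a conversion from comma-separated strings to a Map. This is an illustration only: parseTagsMapping is a hypothetical helper, and both the whitespace trimming and the exception on malformed entries are assumptions, not documented behaviour.

```scala
object TagsMappingSketch {
  // Turns entries such as "OLDTAG1,NEWTAG1" into Map("OLDTAG1" -> "NEWTAG1").
  def parseTagsMapping(entries: Array[String]): Map[String, String] =
    entries.map { entry =>
      entry.split(",").map(_.trim) match {
        case Array(oldTag, newTag) => oldTag -> newTag
        // Assumption: anything other than exactly two fields is rejected.
        case _ =>
          throw new IllegalArgumentException(s"Expected 'OLDTAG,NEWTAG' but got: $entry")
      }
    }.toMap
}
```

For example, parseTagsMapping(Array("OLDTAG1,NEWTAG1", "OLDTAG2,NEWTAG2")) yields the same mapping as passing Map("OLDTAG1" -> "NEWTAG1", "OLDTAG2" -> "NEWTAG2") directly.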
-
def
setUseBestModel(value: Boolean): MedicalNerParams.this.type
-
def
setUseContrib(value: Boolean): MedicalNerParams.this.type
Whether to use contrib LSTM Cells. Not compatible with Windows. Might slightly improve accuracy.
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
val
tagsMapping: MapFeature[String, String]
A map specifying how old tags are mapped to new ones.
It only works if overrideExistingTags is set to false.
-
def
toString(): String
- Definition Classes
- Identifiable → AnyRef → Any
-
val
useBestModel: BooleanParam
Whether to restore and use the model from the epoch that has achieved the best performance at the end of the training.
By default false (keep the model from the last trained epoch).
The best model depends on the earlyStoppingCriterion, which can be F1-score on test/validation dataset or the value of loss.
-
val
useContrib: BooleanParam
whether to use contrib LSTM Cells. Not compatible with Windows. Might slightly improve accuracy. By default true.
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()