class MedicalNerApproach extends AnnotatorApproach[MedicalNerModel] with MedicalNerParams with NerApproach[MedicalNerApproach] with Logging with ParamsAndFeaturesWritable with EvaluationDLParams with CheckLicense

Trains generic NER models based on Neural Networks.

The neural network architecture is Char CNNs - BiLSTM - CRF, which achieves state-of-the-art results on most datasets. For instantiated/pretrained models, see MedicalNerModel.

The training data should be a labeled Spark Dataset, in the CoNLL 2003 IOB format with Annotation type columns. The data should have columns of type DOCUMENT, TOKEN, WORD_EMBEDDINGS and an additional label column of annotator type NAMED_ENTITY.

Excluding the label, these columns can be produced with, for example, the annotators SentenceDetector, Tokenizer, and WordEmbeddingsModel (any embeddings can be chosen, e.g. BertEmbeddings for BERT based embeddings).

For extended examples of usage, see the Spark NLP Workshop.

Notes

Both the DocumentAssembler and SentenceDetector annotators output the DOCUMENT annotation type, so either of them can serve as one of the first annotators in a pipeline.

Example

First extract the prerequisites for the MedicalNerApproach

val document = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")
val sentenceDetector = new SentenceDetector()
  .setInputCols("document")
  .setOutputCol("sentence")
val tokenizer = new Tokenizer()
  .setInputCols("sentence")
  .setOutputCol("token")
val embeddings = BertEmbeddings.pretrained()
  .setInputCols("sentence", "token")
  .setOutputCol("embeddings")

Then define the NER annotator

val nerTagger = new MedicalNerApproach()
  .setInputCols("sentence", "token", "embeddings")
  .setLabelColumn("label")
  .setOutputCol("ner")
  .setMaxEpochs(10)
  .setLr(0.005f)
  .setPo(0.005f)
  .setBatchSize(32)
  .setValidationSplit(0.1f)

Then the training can start

val pipeline = new Pipeline().setStages(Array(
  document,
  sentenceDetector,
  tokenizer,
  embeddings,
  nerTagger
))

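// CoNLL is the Spark NLP reader for CoNLL 2003 formatted files
import com.johnsnowlabs.nlp.training.CoNLL

val trainingData = CoNLL().readDataset(spark, "path/to/train_data.conll")
val pipelineModel = pipeline.fit(trainingData)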
Linear Supertypes
CheckLicense, EvaluationDLParams, ParamsAndFeaturesWritable, Logging, NerApproach[MedicalNerApproach], MedicalNerParams, HasFeatures, AnnotatorApproach[MedicalNerModel], CanBeLazy, DefaultParamsWritable, MLWritable, HasOutputAnnotatorType, HasOutputAnnotationCol, HasInputAnnotationCols, Estimator[MedicalNerModel], PipelineStage, Logging, Params, Serializable, Serializable, Identifiable, AnyRef, Any

Instance Constructors

  1. new MedicalNerApproach()
  2. new MedicalNerApproach(uid: String)

    uid

    a unique identifier for the instantiated AnnotatorModel

Type Members

  1. type AnnotatorType = String
    Definition Classes
    HasOutputAnnotatorType

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def $[T](param: Param[T]): T
    Attributes
    protected
    Definition Classes
    Params
  4. def $$[T](feature: StructFeature[T]): T
    Attributes
    protected
    Definition Classes
    HasFeatures
  5. def $$[K, V](feature: MapFeature[K, V]): Map[K, V]
    Attributes
    protected
    Definition Classes
    HasFeatures
  6. def $$[T](feature: SetFeature[T]): Set[T]
    Attributes
    protected
    Definition Classes
    HasFeatures
  7. def $$[T](feature: ArrayFeature[T]): Array[T]
    Attributes
    protected
    Definition Classes
    HasFeatures
  8. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  9. def _fit(dataset: Dataset[_], recursiveStages: Option[PipelineModel]): MedicalNerModel
    Attributes
    protected
    Definition Classes
    AnnotatorApproach
  10. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  11. val batchSize: IntParam

    Batch size, by default 8.

  12. def beforeTraining(spark: SparkSession): Unit
    Definition Classes
    MedicalNerApproach → AnnotatorApproach
  13. def calculateEmbeddingsDim(sentences: Seq[WordpieceEmbeddingsSentence]): Int
  14. final def checkSchema(schema: StructType, inputAnnotatorType: String): Boolean
    Attributes
    protected
    Definition Classes
    HasInputAnnotationCols
  15. def checkValidEnvironment(spark: Option[SparkSession], scopes: Seq[String]): Unit
    Definition Classes
    CheckLicense
  16. def checkValidScope(scope: String): Unit
    Definition Classes
    CheckLicense
  17. def checkValidScopeAndEnvironment(scope: String, spark: Option[SparkSession], checkLp: Boolean): Unit
    Definition Classes
    CheckLicense
  18. def checkValidScopesAndEnvironment(scopes: Seq[String], spark: Option[SparkSession], checkLp: Boolean): Unit
    Definition Classes
    CheckLicense
  19. final def clear(param: Param[_]): MedicalNerApproach.this.type
    Definition Classes
    Params
  20. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  21. val configProtoBytes: IntArrayParam

    ConfigProto from tensorflow, serialized into byte array. Get with config_proto.SerializeToString()

    Definition Classes
    MedicalNerParams
  22. final def copy(extra: ParamMap): Estimator[MedicalNerModel]
    Definition Classes
    AnnotatorApproach → Estimator → PipelineStage → Params
  23. def copyValues[T <: Params](to: T, extra: ParamMap): T
    Attributes
    protected
    Definition Classes
    Params
  24. val datasetInfo: Param[String]

    Descriptive information about the dataset being used.

    Definition Classes
    MedicalNerParams
  25. final def defaultCopy[T <: Params](extra: ParamMap): T
    Attributes
    protected
    Definition Classes
    Params
  26. val description: String

    Trains Tensorflow based Char-CNN-BLSTM model

    Definition Classes
    MedicalNerApproach → AnnotatorApproach
  27. val dropout: FloatParam

    Dropout coefficient, by default 0.5.

    The coefficient of the dropout layer. The value should be between 0.0 and 1.0. Internally, it is used by Tensorflow as: rate = 1.0 - dropout when adding a dropout layer on top of the recurrent layers.
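
    The relation stated above can be written as a small helper. This is only an illustration of the formula; DropoutRateSketch and tensorflowRate are hypothetical names, not members of the library.

    ```scala
    object DropoutRateSketch {
      // Hypothetical helper: converts the documented dropout coefficient into
      // the rate described above (rate = 1.0 - dropout).
      def tensorflowRate(dropout: Float): Float = {
        require(dropout >= 0.0f && dropout <= 1.0f, "dropout should be between 0.0 and 1.0")
        1.0f - dropout
      }
    }
    ```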

    Definition Classes
    MedicalNerParams
  28. val earlyStoppingCriterion: FloatParam

    If set, this param specifies the criterion to stop training if performance is not improving.

    The default value is 0, which means that early stopping is not used.

    The criterion is set to F1-score if validationSplit is greater than 0.0 (F1-score on the validation set) or testDataset is defined (F1-score on the test set); otherwise it is set to model loss. The priority is as follows:

    - If testDataset is defined, then the criterion is set to F1-score on the test set.
    - If validationSplit is greater than 0.0, then the criterion is set to F1-score on the validation set.
    - Otherwise, the criterion is set to model loss.

    Note that while the F1-score ranges from 0.0 to 1.0, the loss ranges from 0.0 to infinity, so the appropriate criterion value depends heavily on which case applies. For example, if validationSplit is 0.1, then a criterion of 0.01 means that training should stop if the F1-score on the validation set improves by less than 0.01 over the last epoch. However, if no validation or test set is defined, then a criterion of 2.0 means that training should stop if the loss difference between the last epoch and the current one is less than 2.0.
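
    The priority between test set, validation split, and loss can be sketched in plain Scala. This mirrors only the documented priority and is not taken from the library source; StoppingMetric, selectStoppingMetric, and the testDatasetDefined flag are hypothetical names introduced for illustration.

    ```scala
    object EarlyStoppingMetricSketch {
      // Hypothetical types naming the three documented cases.
      sealed trait StoppingMetric
      case object F1OnTestSet extends StoppingMetric
      case object F1OnValidationSet extends StoppingMetric
      case object ModelLoss extends StoppingMetric

      def selectStoppingMetric(testDatasetDefined: Boolean, validationSplit: Float): StoppingMetric =
        if (testDatasetDefined) F1OnTestSet                 // a test dataset takes priority
        else if (validationSplit > 0.0f) F1OnValidationSet  // otherwise the validation split is used
        else ModelLoss                                      // otherwise fall back to the model loss
    }
    ```

    Because the selected metric changes the scale of the criterion (F1-score versus loss), it is worth checking which case applies before choosing a threshold.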

    Definition Classes
    MedicalNerParams
    See also

    earlyStoppingPatience.

  29. val earlyStoppingPatience: IntParam

    Number of epochs to wait before early stopping if no improvement, by default 5.

    Given the earlyStoppingCriterion, if the performance does not improve for the given number of epochs, then the training will stop. If the value is 0, then early stopping will occur as soon as the criterion is met (no patience).
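
    A minimal configuration sketch combining both early stopping parameters is shown here. The import path is an assumption about the licensed library's package layout, and the numeric values are illustrative only, not recommendations.

    ```scala
    // Assumed import path for the licensed library; adjust to your installation.
    import com.johnsnowlabs.nlp.annotators.ner.MedicalNerApproach

    val nerWithEarlyStopping = new MedicalNerApproach()
      .setInputCols("sentence", "token", "embeddings")
      .setLabelColumn("label")
      .setOutputCol("ner")
      .setMaxEpochs(30)
      .setValidationSplit(0.1f)          // with no test dataset, F1-score on the validation set is the metric
      .setEarlyStoppingCriterion(0.01f)  // illustrative threshold
      .setEarlyStoppingPatience(3)       // illustrative; the documented default is 5
    ```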

    Definition Classes
    MedicalNerParams
    See also

    earlyStoppingCriterion.

  30. val enableMemoryOptimizer: BooleanParam

    Whether to optimize for large datasets or not. Enabling this option can slow down training.

    In practice, if set to true, training iterates over the Spark DataFrame and retrieves the batches from the DataFrame iterator. This can be slower than the default option because the batches have to be collected again for every batch of every epoch, but it can be useful if the dataset is too large to fit in memory.

    The parameter controls whether the features are collected and generated all at once and then fed into the network batch by batch (False), or collected and generated batch by batch and then fed into the network (True).

    If the training data fits in memory, it is recommended to set this option to False (the default value).

    Definition Classes
    MedicalNerParams
  31. val enableOutputLogs: BooleanParam
    Definition Classes
    EvaluationDLParams
  32. val entities: StringArrayParam
    Definition Classes
    NerApproach
  33. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  34. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  35. val evaluationLogExtended: BooleanParam
    Definition Classes
    EvaluationDLParams
  36. def explainParam(param: Param[_]): String
    Definition Classes
    Params
  37. def explainParams(): String
    Definition Classes
    Params
  38. final def extractParamMap(): ParamMap
    Definition Classes
    Params
  39. final def extractParamMap(extra: ParamMap): ParamMap
    Definition Classes
    Params
  40. val features: ArrayBuffer[Feature[_, _, _]]
    Definition Classes
    HasFeatures
  41. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  42. final def fit(dataset: Dataset[_]): MedicalNerModel
    Definition Classes
    AnnotatorApproach → Estimator
  43. def fit(dataset: Dataset[_], paramMaps: Seq[ParamMap]): Seq[MedicalNerModel]
    Definition Classes
    Estimator
    Annotations
    @Since( "2.0.0" )
  44. def fit(dataset: Dataset[_], paramMap: ParamMap): MedicalNerModel
    Definition Classes
    Estimator
    Annotations
    @Since( "2.0.0" )
  45. def fit(dataset: Dataset[_], firstParamPair: ParamPair[_], otherParamPairs: ParamPair[_]*): MedicalNerModel
    Definition Classes
    Estimator
    Annotations
    @Since( "2.0.0" ) @varargs()
  46. def get[T](feature: StructFeature[T]): Option[T]
    Attributes
    protected
    Definition Classes
    HasFeatures
  47. def get[K, V](feature: MapFeature[K, V]): Option[Map[K, V]]
    Attributes
    protected
    Definition Classes
    HasFeatures
  48. def get[T](feature: SetFeature[T]): Option[Set[T]]
    Attributes
    protected
    Definition Classes
    HasFeatures
  49. def get[T](feature: ArrayFeature[T]): Option[Array[T]]
    Attributes
    protected
    Definition Classes
    HasFeatures
  50. final def get[T](param: Param[T]): Option[T]
    Definition Classes
    Params
  51. def getBatchSize: Int

    Batch size

  52. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  53. def getConfigProtoBytes: Option[Array[Byte]]

    ConfigProto from tensorflow, serialized into byte array. Get with config_proto.SerializeToString()

    Definition Classes
    MedicalNerParams
  54. def getDatasetInfo: String

    get descriptive information about the dataset being used

    Definition Classes
    MedicalNerParams
  55. final def getDefault[T](param: Param[T]): Option[T]
    Definition Classes
    Params
  56. def getDropout: Float

    Dropout coefficient

    Definition Classes
    MedicalNerParams
  57. def getEarlyStoppingCriterion: Float

    Early stopping criterion

    Definition Classes
    MedicalNerParams
  58. def getEarlyStoppingPatience: Int

    Early stopping patience

    Definition Classes
    MedicalNerParams
  59. def getEnableMemoryOptimizer: Boolean

    Whether to optimize for large datasets or not. Enabling this option can slow down training.

    Definition Classes
    MedicalNerParams
  60. def getEnableOutputLogs: Boolean
    Definition Classes
    EvaluationDLParams
  61. def getIncludeAllConfidenceScores: Boolean

    whether to include all confidence scores in annotation metadata or just the score of the predicted tag

    Definition Classes
    MedicalNerParams
  62. def getIncludeConfidence: Boolean

    whether to include confidence scores in annotation metadata

    Definition Classes
    MedicalNerParams
  63. def getInputCols: Array[String]
    Definition Classes
    HasInputAnnotationCols
  64. def getLazyAnnotator: Boolean
    Definition Classes
    CanBeLazy
  65. def getLogName: String
    Definition Classes
    MedicalNerApproach → Logging
  66. def getLr: Float

    Learning Rate

    Definition Classes
    MedicalNerParams
  67. def getMaxEpochs: Int
    Definition Classes
    NerApproach
  68. def getMinEpochs: Int
    Definition Classes
    NerApproach
  69. final def getOrDefault[T](param: Param[T]): T
    Definition Classes
    Params
  70. final def getOutputCol: String
    Definition Classes
    HasOutputAnnotationCol
  71. def getOutputLogsPath: String
    Definition Classes
    EvaluationDLParams
  72. def getOverrideExistingTags: Boolean

    Whether to override already learned tags when using a pretrained model to initialize the new model.

    Definition Classes
    MedicalNerParams
  73. def getParam(paramName: String): Param[Any]
    Definition Classes
    Params
  74. def getPo: Float

    Learning rate decay coefficient. Real Learning Rate = lr / (1 + po * epoch)

    Definition Classes
    MedicalNerParams
  75. def getRandomSeed: Int
    Definition Classes
    NerApproach
  76. def getRandomValidationSplitPerEpoch: Boolean

    Checks if a random validation split is done after each epoch or at the beginning of training only.

    Definition Classes
    MedicalNerParams
  77. def getSentenceTokenIndex: Boolean

    whether to include the token index for each sentence in annotation metadata.

    Definition Classes
    MedicalNerParams
  78. def getUseBestModel: Boolean

    useBestModel

    Definition Classes
    MedicalNerParams
  79. def getUseContrib: Boolean

    Whether to use contrib LSTM Cells. Not compatible with Windows. Might slightly improve accuracy.

    Definition Classes
    MedicalNerParams
  80. def getValidationSplit: Float
    Definition Classes
    EvaluationDLParams
  81. val graphFile: Param[String]

    Path that contains the external graph file.

    When specified, the provided file will be used, and no graph search will happen. The path can be a local file path, a distributed file path (HDFS, DBFS), or a cloud storage (S3).

    Definition Classes
    MedicalNerParams
  82. val graphFolder: Param[String]

    Folder path that contains external graph files.

    The path can be a local file path, a distributed file path (HDFS, DBFS), or a cloud storage (S3).

    When the Tensorflow model is instantiated, this folder is searched for a suitable Tensorflow graph. The search is done using the name of the .pb file, which should be in this format: blstm_{ntags}_{embedding_dim}_{lstm_size}_{nchars}.pb.

    Then, the search follows these rules:

    - Embedding dimension should be exactly the same as the one used to train the model.
    - Number of unique tags should be greater than or equal to the number of unique tags in the training data.
    - Number of unique chars should be greater than or equal to the number of unique chars in the training data.

    The returned file will be the first one that satisfies all the conditions.

    If the name of the file is ill-formed, errors will occur during training.
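
    The search rules above can be sketched in plain Scala as follows. This is an illustration under stated assumptions, not the library's implementation: GraphSpec, parse, and findGraph are hypothetical names, the file prefix is assumed to be blstm, and ill-formed names are simply skipped here, whereas the library is documented to raise errors during training.

    ```scala
    object GraphSearchSketch {
      // Parsed fields of a name such as blstm_{ntags}_{embedding_dim}_{lstm_size}_{nchars}.pb
      final case class GraphSpec(fileName: String, nTags: Int, embeddingDim: Int, lstmSize: Int, nChars: Int)

      // Assumed prefix "blstm"; the pattern must match the whole file name.
      private val GraphName = """blstm_(\d+)_(\d+)_(\d+)_(\d+)\.pb""".r

      // Returns None for names that do not follow the expected format.
      def parse(fileName: String): Option[GraphSpec] = fileName match {
        case GraphName(t, e, l, c) => Some(GraphSpec(fileName, t.toInt, e.toInt, l.toInt, c.toInt))
        case _ => None
      }

      // First file whose embedding dimension matches exactly and whose tag and
      // character counts are at least as large as the training data requires.
      def findGraph(fileNames: Seq[String], nTags: Int, embeddingDim: Int, nChars: Int): Option[String] =
        fileNames.flatMap(parse).find { g =>
          g.embeddingDim == embeddingDim && g.nTags >= nTags && g.nChars >= nChars
        }.map(_.fileName)
    }
    ```

    Since the first matching file is returned, the order in which candidate files are listed determines which graph is picked when several satisfy the conditions.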

    Definition Classes
    MedicalNerParams
  83. final def hasDefault[T](param: Param[T]): Boolean
    Definition Classes
    Params
  84. def hasParam(paramName: String): Boolean
    Definition Classes
    Params
  85. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  86. val includeAllConfidenceScores: BooleanParam

    Whether to include confidence scores for all tags in annotation metadata or just the score of the predicted tag, by default False.

    Needs the includeConfidence parameter to be set to true.

    Enabling this may slow down the inference speed.

    Definition Classes
    MedicalNerParams
  87. val includeConfidence: BooleanParam

    Whether to include confidence scores in annotation metadata, by default False.

    Setting this parameter to True will add the confidence score to the metadata of the NAMED_ENTITY annotation. In addition, if includeAllConfidenceScores is set to true, then the confidence scores of all the tags will be added to the metadata, otherwise only for the predicted tag (the one with maximum score).
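
    As a hedged usage sketch, the two confidence parameters could be enabled together as shown here. The import path is an assumption about the licensed library's package layout; the column names follow the class-level example.

    ```scala
    // Assumed import path for the licensed library; adjust to your installation.
    import com.johnsnowlabs.nlp.annotators.ner.MedicalNerApproach

    val nerWithConfidence = new MedicalNerApproach()
      .setInputCols("sentence", "token", "embeddings")
      .setLabelColumn("label")
      .setOutputCol("ner")
      .setIncludeConfidence(true)           // add the predicted tag's score to the annotation metadata
      .setIncludeAllConfidenceScores(true)  // also add the scores of all tags; may slow down inference
    ```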

    Definition Classes
    MedicalNerParams
  88. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  89. def initializeLogIfNecessary(isInterpreter: Boolean): Unit
    Attributes
    protected
    Definition Classes
    Logging
  90. val inputAnnotatorTypes: Array[String]

    Input annotator types: DOCUMENT, TOKEN, WORD_EMBEDDINGS

    Definition Classes
    MedicalNerApproach → HasInputAnnotationCols
  91. final val inputCols: StringArrayParam
    Attributes
    protected
    Definition Classes
    HasInputAnnotationCols
  92. final def isDefined(param: Param[_]): Boolean
    Definition Classes
    Params
  93. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  94. final def isSet(param: Param[_]): Boolean
    Definition Classes
    Params
  95. def isTraceEnabled(): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  96. val labelColumn: Param[String]
    Definition Classes
    NerApproach
  97. val lazyAnnotator: BooleanParam
    Definition Classes
    CanBeLazy
  98. def log(value: ⇒ String, minLevel: Level): Unit
    Attributes
    protected
    Definition Classes
    Logging
  99. def log: Logger
    Attributes
    protected
    Definition Classes
    Logging
  100. def logDebug(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  101. def logDebug(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  102. def logError(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  103. def logError(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  104. def logInfo(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  105. def logInfo(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  106. def logName: String
    Attributes
    protected
    Definition Classes
    Logging
  107. val logPrefix: Param[String]

    A prefix that will be added to every log; the default value is empty.

    Definition Classes
    MedicalNerParams
  108. def logTrace(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  109. def logTrace(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  110. def logWarning(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  111. def logWarning(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  112. val logger: Logger
    Attributes
    protected
    Definition Classes
    Logging
  113. val lr: FloatParam

    Learning Rate, by default 0.001.

    Definition Classes
    MedicalNerParams
  114. val maxEpochs: IntParam
    Definition Classes
    NerApproach
  115. val minEpochs: IntParam
    Definition Classes
    NerApproach
  116. def msgHelper(schema: StructType): String
    Attributes
    protected
    Definition Classes
    HasInputAnnotationCols
  117. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  118. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  119. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  120. def onTrained(model: MedicalNerModel, spark: SparkSession): Unit
    Definition Classes
    AnnotatorApproach
  121. def onWrite(path: String, spark: SparkSession): Unit
    Attributes
    protected
    Definition Classes
    ParamsAndFeaturesWritable
  122. val optionalInputAnnotatorTypes: Array[String]
    Definition Classes
    HasInputAnnotationCols
  123. val outputAnnotatorType: String

    Output annotator type: NAMED_ENTITY

    Definition Classes
    MedicalNerApproach → HasOutputAnnotatorType
  124. final val outputCol: Param[String]
    Attributes
    protected
    Definition Classes
    HasOutputAnnotationCol
  125. def outputLog(value: ⇒ String, uuid: String, shouldLog: Boolean, outputLogsPath: String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  126. val outputLogsPath: Param[String]
    Definition Classes
    EvaluationDLParams
  127. val overrideExistingTags: BooleanParam

    Controls whether to override already learned tags when using a pretrained model to initialize the new model. A value of true will override existing tags.

    Definition Classes
    MedicalNerParams
  128. lazy val params: Array[Param[_]]
    Definition Classes
    Params
  129. val po: FloatParam

    Learning rate decay coefficient (time-based).

    This is used to calculate the decayed learning rate at each step as: lr = lr / (1 + po * epoch), meaning that the value of the learning rate is updated on each epoch. By default 0.005.
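
    The decay formula can be tried out with a few lines of plain Scala. This sketch assumes the formula is applied to the initial learning rate at each epoch; LearningRateDecaySketch and decayedLr are hypothetical names, not library members.

    ```scala
    object LearningRateDecaySketch {
      // Time-based decay as documented: lr / (1 + po * epoch).
      // Assumption: lr is the initial learning rate, not the previous epoch's value.
      def decayedLr(lr: Float, po: Float, epoch: Int): Float =
        lr / (1.0f + po * epoch)

      def main(args: Array[String]): Unit = {
        // Documented defaults: lr = 0.001, po = 0.005
        (0 to 3).foreach { epoch =>
          println(f"epoch $epoch: lr = ${decayedLr(0.001f, 0.005f, epoch)}%.6f")
        }
      }
    }
    ```

    With po set to 0 the learning rate stays constant, and larger values of po make it shrink faster as the epoch number grows.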

    Definition Classes
    MedicalNerParams
  130. val pretrainedModelPath: Param[String]

    Path to an already trained MedicalNerModel.

    This pretrained model will be used as a starting point for training the new one. The path can be a local file path, a distributed file path (HDFS, DBFS), or a cloud storage (S3).
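
    A hedged sketch of starting training from an existing model follows. The import path and the model path are assumptions made for illustration; only setters listed on this page are used.

    ```scala
    // Assumed import path for the licensed library; adjust to your installation.
    import com.johnsnowlabs.nlp.annotators.ner.MedicalNerApproach

    val nerFromPretrained = new MedicalNerApproach()
      .setInputCols("sentence", "token", "embeddings")
      .setLabelColumn("label")
      .setOutputCol("ner")
      .setPretrainedModelPath("path/to/existing_ner_model")  // hypothetical location of a saved MedicalNerModel
      .setOverrideExistingTags(true)                          // true overrides the tags already learned by that model
    ```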

    Definition Classes
    MedicalNerParams
  131. val randomSeed: IntParam
    Definition Classes
    NerApproach
  132. val randomValidationSplitPerEpoch: BooleanParam

    Do a random validation split after each epoch rather than at the beginning of training only.

    Definition Classes
    MedicalNerParams
  133. def resumeTrainingFromModel(model: MedicalNerModel): MedicalNerApproach.this.type
  134. def save(path: String): Unit
    Definition Classes
    MLWritable
    Annotations
    @Since( "1.6.0" ) @throws( ... )
  135. val sentenceTokenIndex: BooleanParam

    whether to include the token index for each sentence in annotation metadata, by default false. If the value is true, the process might be slowed down.

    Definition Classes
    MedicalNerParams
  136. def set[T](feature: StructFeature[T], value: T): MedicalNerApproach.this.type
    Attributes
    protected
    Definition Classes
    HasFeatures
  137. def set[K, V](feature: MapFeature[K, V], value: Map[K, V]): MedicalNerApproach.this.type
    Attributes
    protected
    Definition Classes
    HasFeatures
  138. def set[T](feature: SetFeature[T], value: Set[T]): MedicalNerApproach.this.type
    Attributes
    protected
    Definition Classes
    HasFeatures
  139. def set[T](feature: ArrayFeature[T], value: Array[T]): MedicalNerApproach.this.type
    Attributes
    protected
    Definition Classes
    HasFeatures
  140. final def set(paramPair: ParamPair[_]): MedicalNerApproach.this.type
    Attributes
    protected
    Definition Classes
    Params
  141. final def set(param: String, value: Any): MedicalNerApproach.this.type
    Attributes
    protected
    Definition Classes
    Params
  142. final def set[T](param: Param[T], value: T): MedicalNerApproach.this.type
    Definition Classes
    Params
  143. def setBatchSize(batch: Int): MedicalNerApproach.this.type

    Batch size

  144. def setConfigProtoBytes(bytes: Array[Int]): MedicalNerApproach.this.type

    ConfigProto from tensorflow, serialized into byte array. Get with config_proto.SerializeToString()

    Definition Classes
    MedicalNerParams
  145. def setDatasetInfo(value: String): MedicalNerApproach.this.type

    set descriptive information about the dataset being used

    Definition Classes
    MedicalNerParams
  146. def setDefault[T](feature: StructFeature[T], value: () ⇒ T): MedicalNerApproach.this.type
    Attributes
    protected
    Definition Classes
    HasFeatures
  147. def setDefault[K, V](feature: MapFeature[K, V], value: () ⇒ Map[K, V]): MedicalNerApproach.this.type
    Attributes
    protected
    Definition Classes
    HasFeatures
  148. def setDefault[T](feature: SetFeature[T], value: () ⇒ Set[T]): MedicalNerApproach.this.type
    Attributes
    protected
    Definition Classes
    HasFeatures
  149. def setDefault[T](feature: ArrayFeature[T], value: () ⇒ Array[T]): MedicalNerApproach.this.type
    Attributes
    protected
    Definition Classes
    HasFeatures
  150. final def setDefault(paramPairs: ParamPair[_]*): MedicalNerApproach.this.type
    Attributes
    protected
    Definition Classes
    Params
  151. final def setDefault[T](param: Param[T], value: T): MedicalNerApproach.this.type
    Attributes
    protected[org.apache.spark.ml]
    Definition Classes
    Params
  152. def setDropout(dropout: Float): MedicalNerApproach.this.type

    Dropout coefficient

    Definition Classes
    MedicalNerParams
  153. def setEarlyStoppingCriterion(value: Float): MedicalNerApproach.this.type

    Definition Classes
    MedicalNerParams
  154. def setEarlyStoppingPatience(value: Int): MedicalNerApproach.this.type

    Definition Classes
    MedicalNerParams
  155. def setEnableMemoryOptimizer(value: Boolean): MedicalNerApproach.this.type
    Definition Classes
    MedicalNerParams
  156. def setEnableOutputLogs(enableOutputLogs: Boolean): MedicalNerApproach.this.type
    Definition Classes
    EvaluationDLParams
  157. def setEntities(tags: Array[String]): MedicalNerApproach
    Definition Classes
    NerApproach
  158. def setEvaluationLogExtended(evaluationLogExtended: Boolean): MedicalNerApproach.this.type
    Definition Classes
    EvaluationDLParams
  159. def setGraphFile(path: String): MedicalNerApproach.this.type

    Path that contains the external graph file

    Definition Classes
    MedicalNerParams
  160. def setGraphFolder(path: String): MedicalNerApproach.this.type

    Folder path that contains external graph files

    Definition Classes
    MedicalNerParams
  161. def setIncludeAllConfidenceScores(value: Boolean): MedicalNerApproach.this.type

    Whether to include confidence scores for all tags rather than just for the predicted one

    Definition Classes
    MedicalNerParams
  162. def setIncludeConfidence(value: Boolean): MedicalNerApproach.this.type

    Whether to include confidence scores in annotation metadata

    Definition Classes
    MedicalNerParams
  163. final def setInputCols(value: String*): MedicalNerApproach.this.type
    Definition Classes
    HasInputAnnotationCols
  164. def setInputCols(value: Array[String]): MedicalNerApproach.this.type
    Definition Classes
    HasInputAnnotationCols
  165. def setLabelColumn(column: String): MedicalNerApproach
    Definition Classes
    NerApproach
  166. def setLazyAnnotator(value: Boolean): MedicalNerApproach.this.type
    Definition Classes
    CanBeLazy
  167. def setLogPrefix(value: String): MedicalNerApproach.this.type

    a string prefix to be included in the logs

    Definition Classes
    MedicalNerParams
  168. def setLr(lr: Float): MedicalNerApproach.this.type

    Learning Rate

    Definition Classes
    MedicalNerParams
  169. def setMaxEpochs(epochs: Int): MedicalNerApproach
    Definition Classes
    NerApproach
  170. def setMinEpochs(epochs: Int): MedicalNerApproach
    Definition Classes
    NerApproach
  171. final def setOutputCol(value: String): MedicalNerApproach.this.type
    Definition Classes
    HasOutputAnnotationCol
  172. def setOutputLogsPath(path: String): MedicalNerApproach.this.type
    Definition Classes
    EvaluationDLParams
  173. def setOverrideExistingTags(value: Boolean): MedicalNerApproach.this.type

    Controls whether to override already learned tags when using a pretrained model to initialize the new model. A value of true will override existing tags.

    Definition Classes
    MedicalNerParams
  174. def setPo(po: Float): MedicalNerApproach.this.type

    Learning rate decay coefficient. Real learning rate = lr / (1 + po * epoch)

    Definition Classes
    MedicalNerParams
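As a rough illustration of the decay formula in the setPo description, the sketch below computes the effective learning rate for each epoch. The object and function names are hypothetical and not part of the library, the lr and po values are the ones used in the example at the top of this page, and counting epochs from 0 is an assumption.

```scala
object LrDecaySketch {
  // Effective learning rate for a given epoch: lr / (1 + po * epoch).
  // Hypothetical helper for illustration; not a library function.
  def effectiveLr(lr: Float, po: Float, epoch: Int): Float =
    lr / (1 + po * epoch)

  def main(args: Array[String]): Unit = {
    val lr = 0.005f // value passed to setLr in the page example
    val po = 0.005f // value passed to setPo in the page example

    // Assumption: epochs are counted from 0, so the first epoch uses lr unchanged.
    (0 until 10).foreach { epoch =>
      println(s"epoch $epoch: lr = ${effectiveLr(lr, po, epoch)}")
    }
  }
}
```

With a small po the rate shrinks slowly; a larger po makes later epochs take noticeably smaller steps.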
  175. def setPretrainedModelPath(path: String): MedicalNerApproach.this.type

    Set the location of an already trained MedicalNerModel, which is used as a starting point for training the new model.

    Definition Classes
    MedicalNerParams
  176. def setRandomSeed(seed: Int): MedicalNerApproach
    Definition Classes
    NerApproach
  177. def setRandomValidationSplitPerEpoch(value: Boolean): MedicalNerApproach.this.type

    Do a random validation split after each epoch rather than at the beginning of training only.

    Definition Classes
    MedicalNerParams
  178. def setSentenceTokenIndex(value: Boolean): MedicalNerApproach.this.type

    Whether to include the token index for each sentence in annotation metadata. By default false. If the value is true, the process might be slowed down.

    Definition Classes
    MedicalNerParams
  179. def setTagsMapping(mapping: Map[String, String]): MedicalNerApproach.this.type

    A map specifying how old tags are mapped to new ones. Maps are specified either using a list of comma separated strings, e.g. ("OLDTAG1,NEWTAG1", "OLDTAG2,NEWTAG2", ...) or by a Map data structure.

    Definition Classes
    MedicalNerParams
  180. def setTagsMapping(mapping: ArrayList[String]): MedicalNerApproach.this.type
    Definition Classes
    MedicalNerParams
  181. def setTagsMapping(mapping: Array[String]): MedicalNerApproach.this.type

    A map specifying how old tags are mapped to new ones. Maps are specified either using a list of comma separated strings, e.g. ("OLDTAG1,NEWTAG1", "OLDTAG2,NEWTAG2", ...) or by a Map data structure. It only works if setOverrideExistingTags is false.

    Definition Classes
    MedicalNerParams
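The two ways of specifying a tag mapping described above (comma separated strings or a Map) carry the same information. The sketch below shows that equivalence in plain Scala. The helper parsePairs and the tag names are hypothetical and chosen only for illustration; how the library itself parses the strings is not shown in this page.

```scala
object TagsMappingSketch {
  // Hypothetical helper: turns "OLDTAG,NEWTAG" strings into a Map from old tag to new tag.
  // Rejects any entry that does not split into exactly two parts.
  def parsePairs(pairs: Array[String]): Map[String, String] =
    pairs.map { pair =>
      pair.split(",").map(_.trim) match {
        case Array(oldTag, newTag) => oldTag -> newTag
        case _ =>
          throw new IllegalArgumentException(s"Expected 'OLDTAG,NEWTAG' but got: $pair")
      }
    }.toMap

  def main(args: Array[String]): Unit = {
    // Illustrative tag names, not taken from any particular model.
    val asStrings = Array("B-PER,B-VIP", "I-PER,I-VIP")
    val asMap = Map("B-PER" -> "B-VIP", "I-PER" -> "I-VIP")

    // Both forms describe the same mapping.
    println(parsePairs(asStrings) == asMap)
  }
}
```

Either form would then be handed to one of the setTagsMapping overloads listed above, with setOverrideExistingTags left at false as the description requires.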
  182. def setTestDataset(er: ExternalResource): MedicalNerApproach.this.type
    Definition Classes
    EvaluationDLParams
  183. def setTestDataset(path: String, readAs: Format, options: Map[String, String]): MedicalNerApproach.this.type
    Definition Classes
    EvaluationDLParams
  184. def setUseBestModel(value: Boolean): MedicalNerApproach.this.type

    Definition Classes
    MedicalNerParams
  185. def setUseContrib(value: Boolean): MedicalNerApproach.this.type

    Whether to use contrib LSTM Cells. Not compatible with Windows. Might slightly improve accuracy.

    Definition Classes
    MedicalNerParams
  186. def setValidationSplit(validationSplit: Float): MedicalNerApproach.this.type
    Definition Classes
    EvaluationDLParams
  187. def setVerbose(verbose: Level): MedicalNerApproach.this.type
    Definition Classes
    EvaluationDLParams
  188. def setVerbose(verbose: Int): MedicalNerApproach.this.type
    Definition Classes
    EvaluationDLParams
  189. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  190. val tagsMapping: MapFeature[String, String]

    A map specifying how old tags are mapped to new ones.

    It only works if overrideExistingTags is set to false.

    Definition Classes
    MedicalNerParams
  191. val testDataset: ExternalResourceParam
    Definition Classes
    EvaluationDLParams
  192. def toString(): String
    Definition Classes
    Identifiable → AnyRef → Any
  193. def train(dataset: Dataset[_], recursivePipeline: Option[PipelineModel]): MedicalNerModel
    Definition Classes
    MedicalNerApproach → AnnotatorApproach
  194. final def transformSchema(schema: StructType): StructType
    Definition Classes
    AnnotatorApproach → PipelineStage
  195. def transformSchema(schema: StructType, logging: Boolean): StructType
    Attributes
    protected
    Definition Classes
    PipelineStage
    Annotations
    @DeveloperApi()
  196. val uid: String
    Definition Classes
    MedicalNerApproach → Identifiable
  197. val useBestModel: BooleanParam

    Whether to restore and use the model from the epoch that has achieved the best performance at the end of the training.

    By default false (keep the model from the last trained epoch).

    The best model depends on the earlyStoppingCriterion, which can be F1-score on test/validation dataset or the value of loss.

    Definition Classes
    MedicalNerParams
  198. val useContrib: BooleanParam

    Whether to use contrib LSTM Cells. Not compatible with Windows. Might slightly improve accuracy. By default true.

    Definition Classes
    MedicalNerParams
  199. def validate(schema: StructType): Boolean
    Attributes
    protected
    Definition Classes
    AnnotatorApproach
  200. val validationSplit: FloatParam
    Definition Classes
    EvaluationDLParams
  201. val verbose: IntParam
    Definition Classes
    EvaluationDLParams
  202. val verboseLevel: Level
    Definition Classes
    MedicalNerApproach → Logging
  203. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  204. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  205. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  206. def write: MLWriter
    Definition Classes
    ParamsAndFeaturesWritable → DefaultParamsWritable → MLWritable

Inherited from CheckLicense

Inherited from EvaluationDLParams

Inherited from ParamsAndFeaturesWritable

Inherited from Logging

Inherited from NerApproach[MedicalNerApproach]

Inherited from MedicalNerParams

Inherited from HasFeatures

Inherited from AnnotatorApproach[MedicalNerModel]

Inherited from CanBeLazy

Inherited from DefaultParamsWritable

Inherited from MLWritable

Inherited from HasOutputAnnotatorType

Inherited from HasOutputAnnotationCol

Inherited from HasInputAnnotationCols

Inherited from Estimator[MedicalNerModel]

Inherited from PipelineStage

Inherited from Logging

Inherited from Params

Inherited from Serializable

Inherited from Serializable

Inherited from Identifiable

Inherited from AnyRef

Inherited from Any
