class FinanceNerApproach extends MedicalNerApproach

Trains generic NER models based on Neural Networks.

The neural network architecture is Char CNNs - BiLSTM - CRF, which achieves state-of-the-art results on most datasets.

For instantiated/pretrained models, see FinanceNerModel.

The training data should be a labeled Spark Dataset, in the CoNLL 2003 IOB format with Annotation type columns. The data should have columns of type DOCUMENT, TOKEN, WORD_EMBEDDINGS and an additional label column of annotator type NAMED_ENTITY.

Excluding the label, this can be done with, for example, the annotators SentenceDetector, Tokenizer, and WordEmbeddingsModel (any embeddings can be chosen, e.g. BertEmbeddings for BERT based embeddings).
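
As a rough illustration of the CoNLL 2003 IOB layout, the sketch below parses a few hand-written lines into (token, label) pairs with plain Scala. The sample sentence, its tags and the ConllSketch object are made up for illustration and are not part of the library; real training files would normally be loaded with a CoNLL reader rather than parsed by hand.

```scala
object ConllSketch {
  // Hypothetical sample: each line is "token POS chunk label", blank lines separate sentences.
  val sample: String =
    """-DOCSTART- -X- -X- O
      |
      |Acme NNP B-NP B-ORG
      |Corp NNP I-NP I-ORG
      |reported VBD B-VP O
      |earnings NNS B-NP O
      |""".stripMargin

  // Returns one list of (token, label) pairs per sentence, skipping the document marker.
  def parse(text: String): List[List[(String, String)]] =
    text.split("\\n\\s*\\n").toList
      .map { block =>
        block.split("\n").toList
          .map(_.trim)
          .filter(line => line.nonEmpty && !line.startsWith("-DOCSTART-"))
          .map { line =>
            val cols = line.split("\\s+")
            (cols.head, cols.last) // first column is the token, last column is the IOB label
          }
      }
      .filter(_.nonEmpty)
}
```

Here the B- prefix marks the first token of an entity, I- marks a continuation, and O marks tokens outside any entity.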

For extended examples of usage, see the Spark NLP Workshop.

Notes

Both DocumentAssembler and SentenceDetector output the DOCUMENT annotation type. Thus, either of them can be used as the first annotator in a pipeline.

Example

First extract the prerequisites for the FinanceNerApproach

val document = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")
val sentenceDetector = new SentenceDetector()
  .setInputCols("document")
  .setOutputCol("sentence")
val tokenizer = new Tokenizer()
  .setInputCols("sentence")
  .setOutputCol("token")
val embeddings = BertEmbeddings.pretrained()
  .setInputCols("sentence", "token")
  .setOutputCol("embeddings")

Then define the NER annotator

val nerTagger = new FinanceNerApproach()
  .setInputCols("sentence", "token", "embeddings")
  .setLabelColumn("label")
  .setOutputCol("ner")
  .setMaxEpochs(10)
  .setLr(0.005f)
  .setPo(0.005f)
  .setBatchSize(32)
  .setValidationSplit(0.1f)

Then the training can start

val pipeline = new Pipeline().setStages(Array(
  document,
  sentenceDetector,
  tokenizer,
  embeddings,
  nerTagger
))

// CoNLL here is Spark NLP's CoNLL reader (com.johnsnowlabs.nlp.training.CoNLL)
val trainingData = CoNLL().readDataset(spark, "path/to/train_data.conll")
val pipelineModel = pipeline.fit(trainingData)

Linear Supertypes
MedicalNerApproach, CheckLicense, EvaluationDLParams, ParamsAndFeaturesWritable, Logging, NerApproach[MedicalNerApproach], MedicalNerParams, HasFeatures, AnnotatorApproach[MedicalNerModel], CanBeLazy, DefaultParamsWritable, MLWritable, HasOutputAnnotatorType, HasOutputAnnotationCol, HasInputAnnotationCols, Estimator[MedicalNerModel], PipelineStage, Logging, Params, Serializable, Serializable, Identifiable, AnyRef, Any

Parameters

  1. val batchSize: IntParam

    Batch size, by default 8.

    Definition Classes
    MedicalNerApproach
  2. val configProtoBytes: IntArrayParam

    ConfigProto from TensorFlow, serialized into a byte array. Get it with config_proto.SerializeToString().

    Definition Classes
    MedicalNerParams
  3. val datasetInfo: Param[String]

    Descriptive information about the dataset being used.

    Definition Classes
    MedicalNerParams
  4. val dropout: FloatParam

    Dropout coefficient, by default 0.5.

    The coefficient of the dropout layer. The value should be between 0.0 and 1.0. Internally, it is used by Tensorflow as: rate = 1.0 - dropout when adding a dropout layer on top of the recurrent layers.

    Definition Classes
    MedicalNerParams
  5. val earlyStoppingCriterion: FloatParam

    If set, this param specifies the criterion to stop training if performance is not improving.

    The default value is 0, which means that early stopping is not used.

    The criterion is the F1-score if validationSplit is greater than 0.0 (F1-score on the validation set) or testDataset is defined (F1-score on the test set); otherwise it is the model loss. The priority is as follows:
    - If testDataset is defined, the criterion is the F1-score on the test set.
    - If validationSplit is greater than 0.0, the criterion is the F1-score on the validation set.
    - Otherwise, the criterion is the model loss.

    Note that while the F1-score ranges from 0.0 to 1.0, the loss ranges from 0.0 to infinity, so the value to use for the criterion can be very different depending on the case. For example, if validationSplit is 0.1, then a criterion of 0.01 means that training should stop if the F1-score on the validation set changes by less than 0.01 from the last epoch. However, if there is no validation or test set defined, then a criterion of 2.0 means that training should stop if the loss difference between the last epoch and the current one is less than 2.0.

    Definition Classes
    MedicalNerParams
    See also

    earlyStoppingPatience.

  6. val earlyStoppingPatience: IntParam

    Number of epochs to wait before early stopping if no improvement, by default 5.

    Given the earlyStoppingCriterion, if the performance does not improve for the given number of epochs, then the training will stop. If the value is 0, then early stopping will occur as soon as the criterion is met (no patience).

    Definition Classes
    MedicalNerParams
    See also

    earlyStoppingCriterion.
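
One possible reading of how earlyStoppingCriterion and earlyStoppingPatience interact is sketched below in plain Scala. It assumes that an epoch counts as "not improving" when the metric gains less than the criterion over the previous epoch, and that training stops once more than `patience` such epochs occur in a row. EarlyStoppingSketch and stopEpoch are hypothetical names; the library's internal logic may differ.

```scala
object EarlyStoppingSketch {
  /** Index of the epoch at which training would stop, or None if it runs to the end.
    * `metrics` holds one value per epoch; `higherIsBetter` is true for F1-score, false for loss.
    */
  def stopEpoch(
      metrics: Seq[Float],
      criterion: Float,
      patience: Int,
      higherIsBetter: Boolean): Option[Int] =
    if (criterion <= 0f || metrics.length < 2) None // a criterion of 0 disables early stopping
    else {
      // Improvement of each epoch over the previous one (positive means better).
      val gains = metrics.sliding(2).map {
        case Seq(prev, cur) => if (higherIsBetter) cur - prev else prev - cur
      }.toVector
      // Number of consecutive epochs whose gain stayed below the criterion.
      val stale = gains.scanLeft(0)((n, gain) => if (gain < criterion) n + 1 else 0).tail
      // Assumption: stop once the stale streak exceeds the patience.
      val idx = stale.indexWhere(_ > patience)
      if (idx >= 0) Some(idx + 1) else None
    }
}
```

The same function covers both cases from the description: pass F1-scores with higherIsBetter = true and a small criterion such as 0.01, or losses with higherIsBetter = false and a larger criterion such as 2.0.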

  7. val enableMemoryOptimizer: BooleanParam

    Whether to optimize for large datasets or not. Enabling this option can slow down training.

    In practice, if set to true, the training will iterate over the Spark DataFrame and retrieve the batches from the DataFrame iterator. This can be slower than the default option, as it has to collect each batch again in every epoch, but it can be useful if the dataset is too large to fit in memory.

    It controls whether the features are collected and generated all at once and then fed into the network batch by batch (False), or collected and generated batch by batch and then fed into the network in batches (True).

    If the training data fits in memory, it is recommended to set this option to False (the default value).

    Definition Classes
    MedicalNerParams
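
The difference between the two modes of enableMemoryOptimizer can be pictured with ordinary Scala collections. This is only an analogy under the assumption that "features" are computed per row by some function; BatchingSketch and its methods are hypothetical and do not touch Spark or the library.

```scala
object BatchingSketch {
  // Eager mode (False): compute every feature up front, then cut the result into batches.
  def eagerBatches[A, B](rows: Seq[A], batchSize: Int)(featurize: A => B): Iterator[Seq[B]] =
    rows.map(featurize).grouped(batchSize)

  // Lazy mode (True): pull rows from an iterator and featurize one batch at a time.
  def lazyBatches[A, B](rows: Iterator[A], batchSize: Int)(featurize: A => B): Iterator[Seq[B]] =
    rows.grouped(batchSize).map(_.map(featurize))
}
```

Both produce the same batches. The lazy variant only holds one batch of features in memory at a time, at the cost of recomputing them whenever the data is iterated again.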
  8. val graphFile: Param[String]

    Path that contains the external graph file.

    When specified, the provided file will be used, and no graph search will happen. The path can be a local file path, a distributed file path (HDFS, DBFS), or a cloud storage (S3).

    Definition Classes
    MedicalNerParams
  9. val graphFolder: Param[String]

    Folder path that contains external graph files.

    The path can be a local file path, a distributed file path (HDFS, DBFS), or a cloud storage (S3).

    When instantiating the TensorFlow model, this folder is searched for an adequate TensorFlow graph. The search is done using the name of the .pb file, which should be in this format: blstm_{ntags}_{embedding_dim}_{lstm_size}_{nchars}.pb.

    Then, the search follows these rules:
    - The embedding dimension should be exactly the same as the one used to train the model.
    - The number of unique tags should be greater than or equal to the number of unique tags in the training data.
    - The number of unique chars should be greater than or equal to the number of unique chars in the training data.

    The returned file will be the first one that satisfies all the conditions.

    If the name of the file is ill-formed, errors will occur during training.

    Definition Classes
    MedicalNerParams
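
The search rules for graphFolder can be sketched in plain Scala as follows. This is a minimal illustration, not the library's implementation: GraphSearchSketch, GraphSpec and findGraph are hypothetical names, the sketch accepts any alphabetic prefix before the four numeric fields, and it silently skips ill-formed names, whereas the description says such names cause errors during training.

```scala
object GraphSearchSketch {
  final case class GraphSpec(ntags: Int, embeddingDim: Int, lstmSize: Int, nchars: Int)

  // <prefix>_{ntags}_{embedding_dim}_{lstm_size}_{nchars}.pb
  private val GraphName = """[A-Za-z]+_(\d+)_(\d+)_(\d+)_(\d+)\.pb""".r

  // Extracts the four numeric fields from a file name, if it is well-formed.
  def parse(fileName: String): Option[GraphSpec] = fileName match {
    case GraphName(t, e, l, c) => Some(GraphSpec(t.toInt, e.toInt, l.toInt, c.toInt))
    case _                     => None
  }

  // Returns the first file whose embedding dimension matches exactly and whose
  // tag and char capacities are at least as large as what the training data needs.
  def findGraph(fileNames: Seq[String], ntags: Int, embeddingDim: Int, nchars: Int): Option[String] =
    fileNames.find { name =>
      parse(name).exists { g =>
        g.embeddingDim == embeddingDim && g.ntags >= ntags && g.nchars >= nchars
      }
    }
}
```

For example, with 8 tags, 100-dimensional embeddings and 80 distinct characters, a file named blstm_10_100_128_100.pb would qualify, while blstm_5_100_128_50.pb would not.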
  10. val includeAllConfidenceScores: BooleanParam

    Whether to include confidence scores for all tags in annotation metadata or just the score of the predicted tag, by default False.

    Needs the includeConfidence parameter to be set to true.

    Enabling this may slow down the inference speed.

    Definition Classes
    MedicalNerParams
  11. val includeConfidence: BooleanParam

    Whether to include confidence scores in annotation metadata, by default False.

    Setting this parameter to True will add the confidence score to the metadata of the NAMED_ENTITY annotation. In addition, if includeAllConfidenceScores is set to true, then the confidence scores of all the tags will be added to the metadata, otherwise only for the predicted tag (the one with maximum score).

    Definition Classes
    MedicalNerParams
  12. val logPrefix: Param[String]

    A prefix that will be prepended to every log message, empty by default.

    Definition Classes
    MedicalNerParams
  13. val lr: FloatParam

    Learning Rate, by default 0.001.

    Definition Classes
    MedicalNerParams
  14. val overrideExistingTags: BooleanParam

    Controls whether to override already learned tags when using a pretrained model to initialize the new model. A value of true will override existing tags.

    Definition Classes
    MedicalNerParams
  15. val po: FloatParam

    Learning rate decay coefficient (time-based).

    This is used to calculate the decayed learning rate at each step as: lr = lr / (1 + po * epoch), meaning that the value of the learning rate is updated on each epoch. By default 0.005.

    Definition Classes
    MedicalNerParams
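
The time-based decay described for po can be worked through with a few lines of plain Scala. The sketch assumes the formula is applied to the initial learning rate and that epochs are counted from 0; LrDecaySketch and its methods are hypothetical helpers, not library calls.

```scala
object LrDecaySketch {
  // Decayed learning rate for a given epoch: lr / (1 + po * epoch).
  def decayedLr(lr: Float, po: Float, epoch: Int): Float =
    lr / (1.0f + po * epoch)

  // Learning rates for epochs 0 until `epochs`.
  def schedule(lr: Float, po: Float, epochs: Int): Seq[Float] =
    (0 until epochs).map(decayedLr(lr, po, _))
}
```

With the documented defaults lr = 0.001 and po = 0.005, the rate at epoch 10 is 0.001 / 1.05, roughly 0.00095, so the decay is gentle unless po is increased.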
  16. val pretrainedModelPath: Param[String]

    Path to an already trained MedicalNerModel.

    This pretrained model will be used as a starting point for training the new one. The path can be a local file path, a distributed file path (HDFS, DBFS), or a cloud storage (S3).

    Definition Classes
    MedicalNerParams
  17. val randomValidationSplitPerEpoch: BooleanParam

    Do a random validation split after each epoch rather than at the beginning of training only.

    Definition Classes
    MedicalNerParams
  18. val sentenceTokenIndex: BooleanParam

    Whether to include the token index for each sentence in annotation metadata, by default false. Setting it to true might slow down processing.

    Definition Classes
    MedicalNerParams
  19. val tagsMapping: MapFeature[String, String]

    A map specifying how old tags are mapped to new ones.

    It only works if overrideExistingTags is set to false.

    Definition Classes
    MedicalNerParams
  20. val useBestModel: BooleanParam

    Whether to restore and use the model from the epoch that has achieved the best performance at the end of the training.

    By default false (keep the model from the last trained epoch).

    The best model depends on the earlyStoppingCriterion, which can be F1-score on test/validation dataset or the value of loss.

    Definition Classes
    MedicalNerParams
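
Choosing the "best" epoch for useBestModel depends on whether the tracked metric is an F1-score (higher is better) or a loss (lower is better). The sketch below shows that selection in plain Scala for a non-empty list of per-epoch values; BestModelSketch is a hypothetical helper and the tie-breaking rule (first best epoch wins) is an assumption.

```scala
object BestModelSketch {
  // Index of the epoch with the best metric: highest F1-score or lowest loss.
  // Requires a non-empty sequence; ties resolve to the earliest epoch.
  def bestEpoch(metrics: Seq[Float], higherIsBetter: Boolean): Int =
    if (higherIsBetter) metrics.indexOf(metrics.max)
    else metrics.indexOf(metrics.min)
}
```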
  21. val useContrib: BooleanParam

    Whether to use contrib LSTM cells, by default true. Not compatible with Windows. Might slightly improve accuracy.

    Definition Classes
    MedicalNerParams

Annotator types

Required input and expected output annotator types

  1. val inputAnnotatorTypes: Array[String]

    Input annotator types: DOCUMENT, TOKEN, WORD_EMBEDDINGS

    Definition Classes
    MedicalNerApproach → HasInputAnnotationCols
  2. val outputAnnotatorType: String

    Output annotator type: NAMED_ENTITY

    Definition Classes
    MedicalNerApproach → HasOutputAnnotatorType

Members

  1. type AnnotatorType = String
    Definition Classes
    HasOutputAnnotatorType
  1. def beforeTraining(spark: SparkSession): Unit
    Definition Classes
    MedicalNerApproach → AnnotatorApproach
  2. def calculateEmbeddingsDim(sentences: Seq[WordpieceEmbeddingsSentence]): Int
    Definition Classes
    MedicalNerApproach
  3. def checkValidEnvironment(spark: Option[SparkSession], scopes: Seq[String]): Unit
    Definition Classes
    CheckLicense
  4. def checkValidScope(scope: String): Unit
    Definition Classes
    CheckLicense
  5. def checkValidScopeAndEnvironment(scope: String, spark: Option[SparkSession], checkLp: Boolean): Unit
    Definition Classes
    CheckLicense
  6. def checkValidScopesAndEnvironment(scopes: Seq[String], spark: Option[SparkSession], checkLp: Boolean): Unit
    Definition Classes
    CheckLicense
  7. final def clear(param: Param[_]): FinanceNerApproach.this.type
    Definition Classes
    Params
  8. final def copy(extra: ParamMap): Estimator[MedicalNerModel]
    Definition Classes
    AnnotatorApproach → Estimator → PipelineStage → Params
  9. val description: String

    Trains Tensorflow based Char-CNN-BLSTM model

    Definition Classes
    MedicalNerApproach → AnnotatorApproach
  10. val enableOutputLogs: BooleanParam
    Definition Classes
    EvaluationDLParams
  11. val entities: StringArrayParam
    Definition Classes
    NerApproach
  12. val evaluationLogExtended: BooleanParam
    Definition Classes
    EvaluationDLParams
  13. def explainParam(param: Param[_]): String
    Definition Classes
    Params
  14. def explainParams(): String
    Definition Classes
    Params
  15. final def extractParamMap(): ParamMap
    Definition Classes
    Params
  16. final def extractParamMap(extra: ParamMap): ParamMap
    Definition Classes
    Params
  17. val features: ArrayBuffer[Feature[_, _, _]]
    Definition Classes
    HasFeatures
  18. final def fit(dataset: Dataset[_]): MedicalNerModel
    Definition Classes
    AnnotatorApproach → Estimator
  19. def fit(dataset: Dataset[_], paramMaps: Seq[ParamMap]): Seq[MedicalNerModel]
    Definition Classes
    Estimator
    Annotations
    @Since( "2.0.0" )
  20. def fit(dataset: Dataset[_], paramMap: ParamMap): MedicalNerModel
    Definition Classes
    Estimator
    Annotations
    @Since( "2.0.0" )
  21. def fit(dataset: Dataset[_], firstParamPair: ParamPair[_], otherParamPairs: ParamPair[_]*): MedicalNerModel
    Definition Classes
    Estimator
    Annotations
    @Since( "2.0.0" ) @varargs()
  22. final def get[T](param: Param[T]): Option[T]
    Definition Classes
    Params
  23. final def getDefault[T](param: Param[T]): Option[T]
    Definition Classes
    Params
  24. def getEnableOutputLogs: Boolean
    Definition Classes
    EvaluationDLParams
  25. def getInputCols: Array[String]
    Definition Classes
    HasInputAnnotationCols
  26. def getLazyAnnotator: Boolean
    Definition Classes
    CanBeLazy
  27. def getLogName: String
    Definition Classes
    MedicalNerApproach → Logging
  28. def getMaxEpochs: Int
    Definition Classes
    NerApproach
  29. def getMinEpochs: Int
    Definition Classes
    NerApproach
  30. final def getOrDefault[T](param: Param[T]): T
    Definition Classes
    Params
  31. final def getOutputCol: String
    Definition Classes
    HasOutputAnnotationCol
  32. def getOutputLogsPath: String
    Definition Classes
    EvaluationDLParams
  33. def getParam(paramName: String): Param[Any]
    Definition Classes
    Params
  34. def getRandomSeed: Int
    Definition Classes
    NerApproach
  35. def getValidationSplit: Float
    Definition Classes
    EvaluationDLParams
  36. final def hasDefault[T](param: Param[T]): Boolean
    Definition Classes
    Params
  37. def hasParam(paramName: String): Boolean
    Definition Classes
    Params
  38. final def isDefined(param: Param[_]): Boolean
    Definition Classes
    Params
  39. final def isSet(param: Param[_]): Boolean
    Definition Classes
    Params
  40. val labelColumn: Param[String]
    Definition Classes
    NerApproach
  41. val lazyAnnotator: BooleanParam
    Definition Classes
    CanBeLazy
  42. val maxEpochs: IntParam
    Definition Classes
    NerApproach
  43. val minEpochs: IntParam
    Definition Classes
    NerApproach
  44. def onTrained(model: MedicalNerModel, spark: SparkSession): Unit
    Definition Classes
    AnnotatorApproach
  45. val optionalInputAnnotatorTypes: Array[String]
    Definition Classes
    HasInputAnnotationCols
  46. val outputLogsPath: Param[String]
    Definition Classes
    EvaluationDLParams
  47. lazy val params: Array[Param[_]]
    Definition Classes
    Params
  48. val randomSeed: IntParam
    Definition Classes
    NerApproach
  49. def resumeTrainingFromModel(model: FinanceNerApproach): FinanceNerApproach.this.type
  50. def resumeTrainingFromModel(model: MedicalNerModel): FinanceNerApproach.this.type
    Definition Classes
    MedicalNerApproach
  51. def save(path: String): Unit
    Definition Classes
    MLWritable
    Annotations
    @Since( "1.6.0" ) @throws( ... )
  52. final def set[T](param: Param[T], value: T): FinanceNerApproach.this.type
    Definition Classes
    Params
  53. def setEnableMemoryOptimizer(value: Boolean): FinanceNerApproach.this.type
    Definition Classes
    MedicalNerParams
  54. def setEnableOutputLogs(enableOutputLogs: Boolean): FinanceNerApproach.this.type
    Definition Classes
    EvaluationDLParams
  55. def setEntities(tags: Array[String]): MedicalNerApproach
    Definition Classes
    NerApproach
  56. def setEvaluationLogExtended(evaluationLogExtended: Boolean): FinanceNerApproach.this.type
    Definition Classes
    EvaluationDLParams
  57. final def setInputCols(value: String*): FinanceNerApproach.this.type
    Definition Classes
    HasInputAnnotationCols
  58. def setInputCols(value: Array[String]): FinanceNerApproach.this.type
    Definition Classes
    HasInputAnnotationCols
  59. def setLabelColumn(column: String): MedicalNerApproach
    Definition Classes
    NerApproach
  60. def setLazyAnnotator(value: Boolean): FinanceNerApproach.this.type
    Definition Classes
    CanBeLazy
  61. def setMaxEpochs(epochs: Int): MedicalNerApproach
    Definition Classes
    NerApproach
  62. def setMinEpochs(epochs: Int): MedicalNerApproach
    Definition Classes
    NerApproach
  63. final def setOutputCol(value: String): FinanceNerApproach.this.type
    Definition Classes
    HasOutputAnnotationCol
  64. def setOutputLogsPath(path: String): FinanceNerApproach.this.type
    Definition Classes
    EvaluationDLParams
  65. def setRandomSeed(seed: Int): MedicalNerApproach
    Definition Classes
    NerApproach
  66. def setTagsMapping(mapping: ArrayList[String]): FinanceNerApproach.this.type
    Definition Classes
    MedicalNerParams
  67. def setTestDataset(er: ExternalResource): FinanceNerApproach.this.type
    Definition Classes
    EvaluationDLParams
  68. def setTestDataset(path: String, readAs: Format, options: Map[String, String]): FinanceNerApproach.this.type
    Definition Classes
    EvaluationDLParams
  69. def setValidationSplit(validationSplit: Float): FinanceNerApproach.this.type
    Definition Classes
    EvaluationDLParams
  70. def setVerbose(verbose: Level): FinanceNerApproach.this.type
    Definition Classes
    EvaluationDLParams
  71. def setVerbose(verbose: Int): FinanceNerApproach.this.type
    Definition Classes
    EvaluationDLParams
  72. val testDataset: ExternalResourceParam
    Definition Classes
    EvaluationDLParams
  73. def toString(): String
    Definition Classes
    Identifiable → AnyRef → Any
  74. def train(dataset: Dataset[_], recursivePipeline: Option[PipelineModel]): FinanceNerModel
    Definition Classes
    FinanceNerApproach → MedicalNerApproach → AnnotatorApproach
  75. final def transformSchema(schema: StructType): StructType
    Definition Classes
    AnnotatorApproach → PipelineStage
  76. val uid: String
    Definition Classes
    FinanceNerApproach → MedicalNerApproach → Identifiable
  77. val validationSplit: FloatParam
    Definition Classes
    EvaluationDLParams
  78. val verbose: IntParam
    Definition Classes
    EvaluationDLParams
  79. val verboseLevel: Level
    Definition Classes
    MedicalNerApproach → Logging
  80. def write: MLWriter
    Definition Classes
    ParamsAndFeaturesWritable → DefaultParamsWritable → MLWritable

Parameter setters

  1. def getEnableMemoryOptimizer: Boolean

    Whether to optimize for large datasets or not. Enabling this option can slow down training.

    Definition Classes
    MedicalNerParams
  2. def getOverrideExistingTags: Boolean

    Whether to override already learned tags when using a pretrained model to initialize the new model.

    Definition Classes
    MedicalNerParams
  3. def setBatchSize(batch: Int): FinanceNerApproach.this.type

    Batch size

    Definition Classes
    MedicalNerApproach
  4. def setConfigProtoBytes(bytes: Array[Int]): FinanceNerApproach.this.type

    ConfigProto from TensorFlow, serialized into a byte array. Get it with config_proto.SerializeToString().

    Definition Classes
    MedicalNerParams
  5. def setDatasetInfo(value: String): FinanceNerApproach.this.type

    Sets descriptive information about the dataset being used.

    Definition Classes
    MedicalNerParams
  6. def setDropout(dropout: Float): FinanceNerApproach.this.type

    Dropout coefficient

    Definition Classes
    MedicalNerParams
  7. def setEarlyStoppingCriterion(value: Float): FinanceNerApproach.this.type

    Definition Classes
    MedicalNerParams
  8. def setEarlyStoppingPatience(value: Int): FinanceNerApproach.this.type

    Definition Classes
    MedicalNerParams
  9. def setGraphFile(path: String): FinanceNerApproach.this.type

    Path that contains the external graph file.

    Definition Classes
    MedicalNerParams
  10. def setGraphFolder(path: String): FinanceNerApproach.this.type

    Folder path that contains external graph files.

    Definition Classes
    MedicalNerParams
  11. def setIncludeAllConfidenceScores(value: Boolean): FinanceNerApproach.this.type

    Whether to include confidence scores for all tags rather than just for the predicted one.

    Definition Classes
    MedicalNerParams
  12. def setIncludeConfidence(value: Boolean): FinanceNerApproach.this.type

    Whether to include confidence scores in annotation metadata.

    Definition Classes
    MedicalNerParams
  13. def setLogPrefix(value: String): FinanceNerApproach.this.type

    A string prefix to be included in the logs.

    Definition Classes
    MedicalNerParams
  14. def setLr(lr: Float): FinanceNerApproach.this.type

    Learning Rate

    Definition Classes
    MedicalNerParams
  15. def setOverrideExistingTags(value: Boolean): FinanceNerApproach.this.type

    Controls whether to override already learned tags when using a pretrained model to initialize the new model. A value of true will override existing tags.

    Definition Classes
    MedicalNerParams
  16. def setPo(po: Float): FinanceNerApproach.this.type

    Learning rate decay coefficient. Real learning rate = lr / (1 + po * epoch)

    Definition Classes
    MedicalNerParams
  17. def setPretrainedModelPath(path: String): FinanceNerApproach.this.type

    Set the location of an already trained MedicalNerModel, which is used as a starting point for training the new model.

    Definition Classes
    MedicalNerParams
  18. def setSentenceTokenIndex(value: Boolean): FinanceNerApproach.this.type

    Whether to include the token index for each sentence in annotation metadata, by default false. Setting it to true might slow down processing.

    Definition Classes
    MedicalNerParams
  19. def setTagsMapping(mapping: Map[String, String]): FinanceNerApproach.this.type

    A map specifying how old tags are mapped to new ones. Maps are specified either using a list of comma separated strings, e.g. ("OLDTAG1,NEWTAG1", "OLDTAG2,NEWTAG2", ...) or by a Map data structure.

    Definition Classes
    MedicalNerParams
  20. def setTagsMapping(mapping: Array[String]): FinanceNerApproach.this.type

    A map specifying how old tags are mapped to new ones. Maps are specified either using a list of comma separated strings, e.g. ("OLDTAG1,NEWTAG1", "OLDTAG2,NEWTAG2", ...) or by a Map data structure. It only works if setOverrideExistingTags is false.

    Definition Classes
    MedicalNerParams
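
The two input shapes accepted by setTagsMapping, comma separated strings and a Map, carry the same information. The sketch below converts the string form into a Map in plain Scala and applies it to a tag. TagsMappingSketch is a hypothetical helper, the tag names are invented, and how the library itself applies the mapping is not shown here.

```scala
object TagsMappingSketch {
  // Turns entries such as "OLDTAG1,NEWTAG1" into Map("OLDTAG1" -> "NEWTAG1").
  def parse(entries: Seq[String]): Map[String, String] =
    entries.map { entry =>
      entry.split(",").map(_.trim) match {
        case Array(oldTag, newTag) => oldTag -> newTag
        case _ =>
          throw new IllegalArgumentException(s"Expected 'OLD,NEW' but got: $entry")
      }
    }.toMap

  // Rewrites a tag if a mapping exists for it, otherwise leaves it unchanged.
  def remap(tag: String, mapping: Map[String, String]): String =
    mapping.getOrElse(tag, tag)
}
```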
  21. def setUseBestModel(value: Boolean): FinanceNerApproach.this.type

    Definition Classes
    MedicalNerParams
  22. def setUseContrib(value: Boolean): FinanceNerApproach.this.type

    Whether to use contrib LSTM Cells. Not compatible with Windows. Might slightly improve accuracy.

    Definition Classes
    MedicalNerParams

Parameter getters

  1. def getBatchSize: Int

    Batch size

    Definition Classes
    MedicalNerApproach
  2. def getConfigProtoBytes: Option[Array[Byte]]

    ConfigProto from TensorFlow, serialized into a byte array. Get it with config_proto.SerializeToString().

    Definition Classes
    MedicalNerParams
  3. def getDatasetInfo: String

    Gets descriptive information about the dataset being used.

    Definition Classes
    MedicalNerParams
  4. def getDropout: Float

    Dropout coefficient

    Definition Classes
    MedicalNerParams
  5. def getEarlyStoppingCriterion: Float

    Early stopping criterion

    Definition Classes
    MedicalNerParams
  6. def getEarlyStoppingPatience: Int

    Early stopping patience

    Definition Classes
    MedicalNerParams
  7. def getIncludeAllConfidenceScores: Boolean

    Whether to include all confidence scores in annotation metadata or just the score of the predicted tag.

    Definition Classes
    MedicalNerParams
  8. def getIncludeConfidence: Boolean

    Whether to include confidence scores in annotation metadata.

    Definition Classes
    MedicalNerParams
  9. def getLr: Float

    Learning Rate

    Definition Classes
    MedicalNerParams
  10. def getPo: Float

    Learning rate decay coefficient. Real learning rate = lr / (1 + po * epoch)

    Definition Classes
    MedicalNerParams
  11. def getRandomValidationSplitPerEpoch: Boolean

    Checks if a random validation split is done after each epoch or at the beginning of training only.

    Definition Classes
    MedicalNerParams
  12. def getSentenceTokenIndex: Boolean

    Whether to include the token index for each sentence in annotation metadata.

    Definition Classes
    MedicalNerParams
  13. def getUseBestModel: Boolean

    Whether to restore and use the model from the epoch that achieved the best performance at the end of the training.

    Definition Classes
    MedicalNerParams
  14. def getUseContrib: Boolean

    Whether to use contrib LSTM Cells. Not compatible with Windows. Might slightly improve accuracy.

    Definition Classes
    MedicalNerParams
  15. def setRandomValidationSplitPerEpoch(value: Boolean): FinanceNerApproach.this.type

    Do a random validation split after each epoch rather than at the beginning of training only.

    Definition Classes
    MedicalNerParams