class AssertionLogRegApproach extends AnnotatorApproach[AssertionLogRegModel] with Windowing with CheckLicense

A classification method that uses the LogisticRegression algorithm. Contains all the methods for training an AssertionLogRegModel, together with trainWithChunk and trainWithStartEnd.

Example

Training with Glove Embeddings

First define pipeline stages to extract embeddings and text chunks

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols("document")
  .setOutputCol("token")

val glove = WordEmbeddingsModel.pretrained("embeddings_clinical", "en", "clinical/models")
  .setInputCols("document", "token")
  .setOutputCol("word_embeddings")
  .setCaseSensitive(false)

val chunkAssembler = new Doc2Chunk()
  .setInputCols("document")
  .setChunkCol("target")
  .setOutputCol("chunk")

Then define the AssertionLogRegApproach annotator. The training dataset must contain a label column.

val assertion = new AssertionLogRegApproach()
  .setLabelCol("label")
  .setInputCols("document", "chunk", "word_embeddings")
  .setOutputCol("assertion")
  .setReg(0.01)
  .setBefore(11)
  .setAfter(13)
  .setStartCol("start")
  .setEndCol("end")

val assertionPipeline = new Pipeline().setStages(Array(
  documentAssembler,
  chunkAssembler,
  tokenizer,
  glove,
  assertion
))

val assertionModel = assertionPipeline.fit(dataset)
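
The before and after settings in the example above control how many context tokens around the target are considered. The following is a minimal, library-independent sketch of that idea, not the Windowing implementation itself. The object name ContextWindowSketch, the function contextWindow, and the choice to treat end as an inclusive token index are assumptions made for illustration only.

```scala
// Illustrative sketch only: this is NOT the library's Windowing code.
// Assumption: `start` and `end` are token indices and `end` is inclusive.
object ContextWindowSketch {

  /** Splits a tokenized sentence into (before, target, after) token groups.
    *
    * @param tokens tokens of one sentence
    * @param start  index of the first target token
    * @param end    index of the last target token (inclusive, by assumption)
    * @param before maximum number of context tokens kept to the left
    * @param after  maximum number of context tokens kept to the right
    */
  def contextWindow(
      tokens: Seq[String],
      start: Int,
      end: Int,
      before: Int,
      after: Int): (Seq[String], Seq[String], Seq[String]) = {
    require(start >= 0 && start <= end && end < tokens.length,
      "target span must lie inside the sentence")
    // Clamp the window to the sentence boundaries.
    val left = tokens.slice(math.max(0, start - before), start)
    val target = tokens.slice(start, end + 1)
    val right = tokens.slice(end + 1, math.min(tokens.length, end + 1 + after))
    (left, target, right)
  }

  def main(args: Array[String]): Unit = {
    val tokens = "The patient denies any chest pain at this time".split(" ").toSeq
    val (left, target, right) =
      contextWindow(tokens, start = 4, end = 5, before = 2, after = 2)
    println(s"before: ${left.mkString(" ")}")   // before: denies any
    println(s"target: ${target.mkString(" ")}") // target: chest pain
    println(s"after:  ${right.mkString(" ")}")  // after:  at this
  }
}
```

In this sketch a window that would run past the sentence boundary is simply truncated. Whether the library pads, truncates, or handles this case differently is not stated here.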

Linear Supertypes
CheckLicense, Windowing, AnnotatorApproach[AssertionLogRegModel], CanBeLazy, DefaultParamsWritable, MLWritable, HasOutputAnnotatorType, HasOutputAnnotationCol, HasInputAnnotationCols, Estimator[AssertionLogRegModel], PipelineStage, Logging, Params, Serializable, Serializable, Identifiable, AnyRef, Any

Instance Constructors

  1. new AssertionLogRegApproach()
  2. new AssertionLogRegApproach(uid: String)

    uid

    a unique identifier for the instantiated AnnotatorModel

Type Members

  1. type AnnotatorType = String
    Definition Classes
    HasOutputAnnotatorType
  2. case class VectorizedChunk(vector: Vector, begin: Int, end: Int, sentenceId: Int, chunkId: Int) extends Product with Serializable
    Attributes
    protected
    Definition Classes
    Windowing

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def $[T](param: Param[T]): T
    Attributes
    protected
    Definition Classes
    Params
  4. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  5. def _fit(dataset: Dataset[_], recursiveStages: Option[PipelineModel]): AssertionLogRegModel
    Attributes
    protected
    Definition Classes
    AnnotatorApproach
  6. lazy val after: Int
    Definition Classes
    AssertionLogRegApproach → Windowing
  7. val afterParam: IntParam

    Amount of tokens from the context after the target (Default: 10)

  8. def applyWindow(tokenizedSentence: WordpieceEmbeddingsSentence, s: Int, e: Int, embeddingsDim: Int): Array[Double]
    Definition Classes
    Windowing
  9. def applyWindowContext(tokenizedSentence: WordpieceEmbeddingsSentence, s: Int, e: Int, embeddingsDim: Int): (Array[Array[Float]], Array[Array[Float]], Array[Array[Float]])
    Definition Classes
    Windowing
  10. def applyWindowUdf(embeddingsDim: Int): UserDefinedFunction
    Definition Classes
    Windowing
  11. def applyWindowUdfChunk(embeddingsDim: Int): UserDefinedFunction
    Definition Classes
    Windowing
  12. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  13. lazy val before: Int
    Definition Classes
    AssertionLogRegApproach → Windowing
  14. val beforeParam: IntParam

    Amount of tokens from the context before the target (Default: 10)

  15. def beforeTraining(spark: SparkSession): Unit
    Definition Classes
    AnnotatorApproach
  16. final def checkSchema(schema: StructType, inputAnnotatorType: String): Boolean
    Attributes
    protected
    Definition Classes
    HasInputAnnotationCols
  17. def checkValidEnvironment(spark: Option[SparkSession], scopes: Seq[String]): Unit
    Definition Classes
    CheckLicense
  18. def checkValidScope(scope: String): Unit
    Definition Classes
    CheckLicense
  19. def checkValidScopeAndEnvironment(scope: String, spark: Option[SparkSession], checkLp: Boolean): Unit
    Definition Classes
    CheckLicense
  20. def checkValidScopesAndEnvironment(scopes: Seq[String], spark: Option[SparkSession], checkLp: Boolean): Unit
    Definition Classes
    CheckLicense
  21. final def clear(param: Param[_]): AssertionLogRegApproach.this.type
    Definition Classes
    Params
  22. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  23. final def copy(extra: ParamMap): Estimator[AssertionLogRegModel]
    Definition Classes
    AnnotatorApproach → Estimator → PipelineStage → Params
  24. def copyValues[T <: Params](to: T, extra: ParamMap): T
    Attributes
    protected
    Definition Classes
    Params
  25. final def defaultCopy[T <: Params](extra: ParamMap): T
    Attributes
    protected
    Definition Classes
    Params
  26. val description: String
    Definition Classes
    AssertionLogRegApproach → AnnotatorApproach
  27. val eNetParam: DoubleParam

    Elastic net parameter (Default: 0.9)

  28. val endCol: Param[String]

    Column that contains the token number for the end of the target

  29. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  30. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  31. def explainParam(param: Param[_]): String
    Definition Classes
    Params
  32. def explainParams(): String
    Definition Classes
    Params
  33. final def extractParamMap(): ParamMap
    Definition Classes
    Params
  34. final def extractParamMap(extra: ParamMap): ParamMap
    Definition Classes
    Params
  35. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  36. final def fit(dataset: Dataset[_]): AssertionLogRegModel
    Definition Classes
    AnnotatorApproach → Estimator
  37. def fit(dataset: Dataset[_], paramMaps: Seq[ParamMap]): Seq[AssertionLogRegModel]
    Definition Classes
    Estimator
    Annotations
    @Since( "2.0.0" )
  38. def fit(dataset: Dataset[_], paramMap: ParamMap): AssertionLogRegModel
    Definition Classes
    Estimator
    Annotations
    @Since( "2.0.0" )
  39. def fit(dataset: Dataset[_], firstParamPair: ParamPair[_], otherParamPairs: ParamPair[_]*): AssertionLogRegModel
    Definition Classes
    Estimator
    Annotations
    @Since( "2.0.0" ) @varargs()
  40. final def get[T](param: Param[T]): Option[T]
    Definition Classes
    Params
  41. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  42. final def getDefault[T](param: Param[T]): Option[T]
    Definition Classes
    Params
  43. def getInputCols: Array[String]
    Definition Classes
    HasInputAnnotationCols
  44. def getLazyAnnotator: Boolean
    Definition Classes
    CanBeLazy
  45. final def getOrDefault[T](param: Param[T]): T
    Definition Classes
    Params
  46. final def getOutputCol: String
    Definition Classes
    HasOutputAnnotationCol
  47. def getParam(paramName: String): Param[Any]
    Definition Classes
    Params
  48. final def hasDefault[T](param: Param[T]): Boolean
    Definition Classes
    Params
  49. def hasParam(paramName: String): Boolean
    Definition Classes
    Params
  50. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  51. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  52. def initializeLogIfNecessary(isInterpreter: Boolean): Unit
    Attributes
    protected
    Definition Classes
    Logging
  53. val inputAnnotatorTypes: Array[String]

    Input annotator types: DOCUMENT, CHUNK, WORD_EMBEDDINGS

    Definition Classes
    AssertionLogRegApproach → HasInputAnnotationCols
  54. final val inputCols: StringArrayParam
    Attributes
    protected
    Definition Classes
    HasInputAnnotationCols
  55. final def isDefined(param: Param[_]): Boolean
    Definition Classes
    Params
  56. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  57. final def isSet(param: Param[_]): Boolean
    Definition Classes
    Params
  58. def isTraceEnabled(): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  59. def l2norm(xs: Array[Double]): Double
    Definition Classes
    Windowing
  60. val label: Param[String]

    Column with one label per document

  61. val lazyAnnotator: BooleanParam
    Definition Classes
    CanBeLazy
  62. def log: Logger
    Attributes
    protected
    Definition Classes
    Logging
  63. def logDebug(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  64. def logDebug(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  65. def logError(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  66. def logError(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  67. def logInfo(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  68. def logInfo(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  69. def logName: String
    Attributes
    protected
    Definition Classes
    Logging
  70. def logTrace(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  71. def logTrace(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  72. def logWarning(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  73. def logWarning(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  74. val maxIter: IntParam

    Max number of iterations for algorithm (Default: 26)

  75. def msgHelper(schema: StructType): String
    Attributes
    protected
    Definition Classes
    HasInputAnnotationCols
  76. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  77. def normalize(vec: Array[Double]): Array[Double]
    Definition Classes
    Windowing
  78. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  79. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  80. def onTrained(model: AssertionLogRegModel, spark: SparkSession): Unit
    Definition Classes
    AnnotatorApproach
  81. val optionalInputAnnotatorTypes: Array[String]
    Definition Classes
    HasInputAnnotationCols
  82. val outputAnnotatorType: AnnotatorType

    Output annotator types: ASSERTION

    Definition Classes
    AssertionLogRegApproach → HasOutputAnnotatorType
  83. final val outputCol: Param[String]
    Attributes
    protected
    Definition Classes
    HasOutputAnnotationCol
  84. lazy val params: Array[Param[_]]
    Definition Classes
    Params
  85. val regParam: DoubleParam

    Regularization parameter (Default: 0.00192)

  86. def save(path: String): Unit
    Definition Classes
    MLWritable
    Annotations
    @Since( "1.6.0" ) @throws( ... )
  87. final def set(paramPair: ParamPair[_]): AssertionLogRegApproach.this.type
    Attributes
    protected
    Definition Classes
    Params
  88. final def set(param: String, value: Any): AssertionLogRegApproach.this.type
    Attributes
    protected
    Definition Classes
    Params
  89. final def set[T](param: Param[T], value: T): AssertionLogRegApproach.this.type
    Definition Classes
    Params
  90. def setAfter(a: Int): AssertionLogRegApproach.this.type

    Amount of tokens from the context after the target (Default: 10)

  91. def setBefore(b: Int): AssertionLogRegApproach.this.type

    Amount of tokens from the context before the target (Default: 10)

  92. final def setDefault(paramPairs: ParamPair[_]*): AssertionLogRegApproach.this.type
    Attributes
    protected
    Definition Classes
    Params
  93. final def setDefault[T](param: Param[T], value: T): AssertionLogRegApproach.this.type
    Attributes
    protected[org.apache.spark.ml]
    Definition Classes
    Params
  94. def setEndCol(end: String): AssertionLogRegApproach.this.type

    Column that contains the token number for the end of the target

  95. def setEnet(enet: Double): AssertionLogRegApproach.this.type

    Elastic net parameter (Default: 0.9)

  96. final def setInputCols(value: String*): AssertionLogRegApproach.this.type
    Definition Classes
    HasInputAnnotationCols
  97. def setInputCols(value: Array[String]): AssertionLogRegApproach.this.type
    Definition Classes
    HasInputAnnotationCols
  98. def setLabelCol(label: String): AssertionLogRegApproach.this.type

    Column with one label per document

  99. def setLazyAnnotator(value: Boolean): AssertionLogRegApproach.this.type
    Definition Classes
    CanBeLazy
  100. def setMaxIter(max: Int): AssertionLogRegApproach.this.type

    Max number of iterations for algorithm (Default: 26)

  101. final def setOutputCol(value: String): AssertionLogRegApproach.this.type
    Definition Classes
    HasOutputAnnotationCol
  102. def setReg(lambda: Double): AssertionLogRegApproach.this.type

    Regularization parameter (Default: 0.00192)

  103. def setStartCol(start: String): AssertionLogRegApproach.this.type

    Column that contains the token number for the start of the target

  104. val startCol: Param[String]

    Column that contains the token number for the start of the target

  105. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  106. def toString(): String
    Definition Classes
    Identifiable → AnyRef → Any
  107. def tokenIndexToChunkIndex(doc: Array[TokenPieceEmbeddings], start: Int, end: Int): (Int, Int)
    Definition Classes
    Windowing
  108. def train(dataset: Dataset[_], recursivePipeline: Option[PipelineModel] = None): AssertionLogRegModel

    This is a main point of interest of this class. It trains on the dataset with the recursive pipeline and uses the methods trainWithChunk() and trainWithStartEnd(). The choice between them is based on the startCol value of the approach.

    dataset

    a collection of inputs to train

    recursivePipeline

    an instance of PipelineModel

    returns

    an instance of trained AssertionLogRegModel

    Definition Classes
    AssertionLogRegApproach → AnnotatorApproach
  109. final def transformSchema(schema: StructType): StructType
    Definition Classes
    AnnotatorApproach → PipelineStage
  110. def transformSchema(schema: StructType, logging: Boolean): StructType
    Attributes
    protected
    Definition Classes
    PipelineStage
    Annotations
    @DeveloperApi()
  111. val uid: String
    Definition Classes
    AssertionLogRegApproach → Identifiable
  112. def validate(schema: StructType): Boolean
    Attributes
    protected
    Definition Classes
    AnnotatorApproach
  113. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  114. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  115. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  116. def write: MLWriter
    Definition Classes
    DefaultParamsWritable → MLWritable
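
The regParam and eNetParam members listed above are named after the regularization and elastic net parameters of logistic regression. As a rough illustration of how two such values usually interact, the sketch below computes the elastic-net penalty in the form documented for Spark MLlib's LogisticRegression (regParam and elasticNetParam). Whether AssertionLogRegApproach forwards its values to MLlib unchanged is an assumption, and the names ElasticNetSketch and penalty are hypothetical.

```scala
// Illustrative sketch only. Assumes the conventional elastic-net penalty
//   reg * (enet * ||w||_1 + (1 - enet) / 2 * ||w||_2^2)
// as documented for Spark MLlib; not taken from AssertionLogRegApproach itself.
object ElasticNetSketch {

  /** Elastic-net penalty for a weight vector.
    *
    * @param weights model coefficients
    * @param reg     regularization strength (lambda)
    * @param enet    mixing ratio in [0, 1]: 0 gives pure L2, 1 gives pure L1
    */
  def penalty(weights: Array[Double], reg: Double, enet: Double): Double = {
    require(enet >= 0.0 && enet <= 1.0, "enet must be in [0, 1]")
    val l1 = weights.map(w => math.abs(w)).sum // ||w||_1
    val l2Squared = weights.map(w => w * w).sum // ||w||_2^2
    reg * (enet * l1 + (1.0 - enet) / 2.0 * l2Squared)
  }

  def main(args: Array[String]): Unit = {
    val weights = Array(1.0, -2.0)
    // Defaults quoted in this page: reg = 0.00192, enet = 0.9
    println(penalty(weights, reg = 0.00192, enet = 0.9))
  }
}
```

With a mixing ratio close to 1, as in the documented default of 0.9, the penalty is dominated by the L1 term, which tends to push small coefficients to zero.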
