
class Flattener extends Transformer with ParamsAndFeaturesWritable

Converts annotation results into an exploded and flattened format, which makes it easy to extract results from Spark NLP pipelines. The Flattener outputs annotation values as Strings.

Example

 val dataSet = Seq("GENERAL: He is an elderly gentleman in no acute distress. He is sitting up in bed eating his breakfast." +
" He is alert and oriented and answering questions appropriately.\nHEENT: Sclerae showed mild arcus senilis in the right." +
" Left was clear. Pupils are equally round and reactive to light. Extraocular movements are intact. Oropharynx is clear." +
"\nNECK: Supple. Trachea is midline. No jugular venous pressure distention is noted. No adenopathy in the cervical, " +
"supraclavicular, or axillary areas.\nABDOMEN: Soft and not tender. There may be some fullness in the left upper quadrant, " +
"although I do not appreciate a true spleen with inspiration.\nEXTREMITIES: There is some edema, but no cyanosis and " ).toDS.toDF("text")


val documentAssembler = new DocumentAssembler().setInputCol("text").setOutputCol("document")
val sentenceDetector = new SentenceDetector().setInputCols(Array("document")).setOutputCol("sentence")
val tokenizer = new Tokenizer().setInputCols(Array("sentence")).setOutputCol("token")
val wordEmbeddings = WordEmbeddingsModel.pretrained("embeddings_clinical", "en", "clinical/models").setInputCols(Array("sentence", "token")).setOutputCol("embeddings")
val clinicalNer = MedicalNerModel.pretrained("ner_jsl", "en", "clinical/models").setInputCols(Array("sentence", "token", "embeddings")).setOutputCol("ner")
val nerConverter = new NerConverter().setInputCols(Array("sentence", "token", "ner")).setOutputCol("ner_chunk")
val clinicalAssertion = AssertionDLModel.pretrained("assertion_jsl_augmented", "en", "clinical/models").setInputCols(Array("sentence", "ner_chunk", "embeddings")).setOutputCol("assertion").setEntityAssertionCaseSensitive(false)

val flattener = new Flattener()
 .setInputCols("sentence", "ner_chunk", "assertion")
 .setExplodeSelectedFields(Map("ner_chunk" -> Array("result","metadata.entity"),
                               "assertion"->Array("result","metadata.confidence")))

val pipeline = new Pipeline().setStages(
 Array(
   documentAssembler,
   sentenceDetector,
   tokenizer,
   wordEmbeddings,
   clinicalNer,
   nerConverter,
   clinicalAssertion,
   flattener
 ))

 val result = pipeline.fit(dataSet).transform(dataSet)
 result.show(false)

    +----------------------------------+-------------------------+----------------+-----------------------------+
    |ner_chunk_result                  |ner_chunk_metadata_entity|assertion_result|assertion_metadata_confidence|
    +----------------------------------+-------------------------+----------------+-----------------------------+
    |distress                          |Symptom                  |Absent          |1.0                          |
    |arcus senilis                     |Disease_Syndrome_Disorder|Past            |1.0                          |
    |jugular venous pressure distention|Symptom                  |Absent          |1.0                          |
    |adenopathy                        |Symptom                  |Absent          |1.0                          |
    |tender                            |Symptom                  |Absent          |1.0                          |
    |fullness                          |Symptom                  |Possible        |0.9999                       |
    |edema                             |Symptom                  |Present         |1.0                          |
    |cyanosis                          |VS_Finding               |Absent          |1.0                          |
    +----------------------------------+-------------------------+----------------+-----------------------------+
Linear Supertypes
ParamsAndFeaturesWritable, HasFeatures, DefaultParamsWritable, MLWritable, Transformer, PipelineStage, Logging, Params, Serializable, Serializable, Identifiable, AnyRef, Any

Instance Constructors

  1. new Flattener()
  2. new Flattener(uid: String)

    uid

    Required uid for storing the annotator to disk

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def $[T](param: Param[T]): T
    Attributes
    protected
    Definition Classes
    Params
  4. def $$[T](feature: StructFeature[T]): T
    Attributes
    protected
    Definition Classes
    HasFeatures
  5. def $$[K, V](feature: MapFeature[K, V]): Map[K, V]
    Attributes
    protected
    Definition Classes
    HasFeatures
  6. def $$[T](feature: SetFeature[T]): Set[T]
    Attributes
    protected
    Definition Classes
    HasFeatures
  7. def $$[T](feature: ArrayFeature[T]): Array[T]
    Attributes
    protected
    Definition Classes
    HasFeatures
  8. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  9. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  10. val cleanAnnotations: BooleanParam

    Whether to remove annotation columns (Default: true)

  11. final def clear(param: Param[_]): Flattener.this.type
    Definition Classes
    Params
  12. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  13. def copy(extra: ParamMap): Transformer
    Definition Classes
    Flattener → Transformer → PipelineStage → Params
  14. def copyValues[T <: Params](to: T, extra: ParamMap): T
    Attributes
    protected
    Definition Classes
    Params
  15. final def defaultCopy[T <: Params](extra: ParamMap): T
    Attributes
    protected
    Definition Classes
    Params
  16. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  17. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  18. def explainParam(param: Param[_]): String
    Definition Classes
    Params
  19. def explainParams(): String
    Definition Classes
    Params
  20. val explodeSelectedFields: MapFeature[String, Array[String]]

    When set to an array of specific fields, the transformation returns an exploded column for each specified field containing annotation data, letting you choose and explode only the desired fields.

    If explodeSelectedFields is not set, the transformation returns all information for the specified columns.

    An alias can be given with as

    (e.g., Map("ner_chunk" -> Array("result","metadata.entity as entity1")))
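    The default output column names in the example above (e.g. ner_chunk_result, ner_chunk_metadata_entity) follow an `<inputCol>_<field>` pattern, which an `as` alias overrides. A minimal sketch of that naming rule, using a hypothetical helper that is not part of the library:

    ```scala
    // Hypothetical helper (illustration only, not a library API): derives the
    // output column name for a field spec, honoring an optional "as" alias.
    object FieldSpecNaming {
      def outputColumnName(annotationCol: String, fieldSpec: String): String =
        fieldSpec.split("\\s+as\\s+") match {
          case Array(_, alias) => alias.trim // explicit alias wins
          case Array(path)     => s"${annotationCol}_${path.trim.replace(".", "_")}"
        }
    }
    ```

    For instance, outputColumnName("ner_chunk", "metadata.entity") yields "ner_chunk_metadata_entity", matching the column names in the example output, while "metadata.entity as entity1" yields "entity1".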

  21. final def extractParamMap(): ParamMap
    Definition Classes
    Params
  22. final def extractParamMap(extra: ParamMap): ParamMap
    Definition Classes
    Params
  23. val features: ArrayBuffer[Feature[_, _, _]]
    Definition Classes
    HasFeatures
  24. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  25. val flattenExplodedColumns: BooleanParam

    When true (the default), the transformation returns flattened and exploded columns containing annotation data, providing a comprehensive view of the annotated information.

    When set to false, the transformation returns exploded columns without flattening.
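    The two modes can be pictured with plain Scala collections (a conceptual sketch only; the real transformer operates on Spark DataFrames):

    ```scala
    // Conceptual sketch (plain Scala, no Spark): exploding turns one row holding
    // several annotations into one row per annotation; flattening then lifts the
    // selected annotation fields into top-level columns.
    case class Annotation(result: String, metadata: Map[String, String])

    object ExplodeFlattenSketch {
      // One input row whose "ner_chunk" column holds two annotations.
      val chunks: Seq[Annotation] = Seq(
        Annotation("distress", Map("entity" -> "Symptom")),
        Annotation("edema",    Map("entity" -> "Symptom"))
      )

      // flattenExplodedColumns = false: exploded rows, each still a whole annotation.
      val exploded: Seq[Annotation] = chunks

      // flattenExplodedColumns = true: each row becomes flat (result, entity) columns.
      val flattened: Seq[(String, String)] =
        exploded.map(a => (a.result, a.metadata.getOrElse("entity", "")))
    }
    ```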

  26. def get[T](feature: StructFeature[T]): Option[T]
    Attributes
    protected
    Definition Classes
    HasFeatures
  27. def get[K, V](feature: MapFeature[K, V]): Option[Map[K, V]]
    Attributes
    protected
    Definition Classes
    HasFeatures
  28. def get[T](feature: SetFeature[T]): Option[Set[T]]
    Attributes
    protected
    Definition Classes
    HasFeatures
  29. def get[T](feature: ArrayFeature[T]): Option[Array[T]]
    Attributes
    protected
    Definition Classes
    HasFeatures
  30. final def get[T](param: Param[T]): Option[T]
    Definition Classes
    Params
  31. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  32. final def getDefault[T](param: Param[T]): Option[T]
    Definition Classes
    Params
  33. def getExplodeSelectedFields: Map[String, Array[String]]
  34. def getInputCols: Array[String]

    Names of the Flattener input columns

  35. final def getOrDefault[T](param: Param[T]): T
    Definition Classes
    Params
  36. def getParam(paramName: String): Param[Any]
    Definition Classes
    Params
  37. final def hasDefault[T](param: Param[T]): Boolean
    Definition Classes
    Params
  38. def hasParam(paramName: String): Boolean
    Definition Classes
    Params
  39. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  40. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  41. def initializeLogIfNecessary(isInterpreter: Boolean): Unit
    Attributes
    protected
    Definition Classes
    Logging
  42. val inputCols: StringArrayParam

    Names of the input annotation columns for the transformation. If explodeSelectedFields is not set, the transformation returns all information for the specified columns.

  43. final def isDefined(param: Param[_]): Boolean
    Definition Classes
    Params
  44. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  45. final def isSet(param: Param[_]): Boolean
    Definition Classes
    Params
  46. def isTraceEnabled(): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  47. val keepOriginalColumns: StringArrayParam

    An array of column names to keep in the DataFrame after the flattening process. These columns are not affected by the flattening operation and are included in the final output as they are.

  48. def log: Logger
    Attributes
    protected
    Definition Classes
    Logging
  49. def logDebug(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  50. def logDebug(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  51. def logError(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  52. def logError(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  53. def logInfo(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  54. def logInfo(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  55. def logName: String
    Attributes
    protected
    Definition Classes
    Logging
  56. def logTrace(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  57. def logTrace(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  58. def logWarning(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  59. def logWarning(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  60. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  61. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  62. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  63. def onWrite(path: String, spark: SparkSession): Unit
    Attributes
    protected
    Definition Classes
    ParamsAndFeaturesWritable
  64. val orderByColumn: Param[String]

    Param specifying the column by which the DataFrame should be ordered when transformed. flattenExplodedColumns must be true for ordering to take effect.

  65. val orderDescending: BooleanParam

    Specifies whether to order the DataFrame in descending order. If true, the DataFrame is ordered in descending order; if false (the default), in ascending order.

    flattenExplodedColumns must be true for ordering to take effect.
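    The ordering semantics amount to an ascending sort by default and a reversed sort when orderDescending is true, as in this plain-Scala sketch (illustrative only; the transformer sorts DataFrame rows by the chosen column):

    ```scala
    // Illustrative sketch of orderByColumn/orderDescending semantics on plain
    // Scala values: ascending by default, reversed when orderDescending = true.
    object OrderingSketch {
      val confidences: Seq[Double] = Seq(1.0, 0.9999, 1.0)

      val ascending: Seq[Double]  = confidences.sorted                           // default
      val descending: Seq[Double] = confidences.sorted(Ordering[Double].reverse) // orderDescending = true
    }
    ```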

  66. lazy val params: Array[Param[_]]
    Definition Classes
    Params
  67. def save(path: String): Unit
    Definition Classes
    MLWritable
    Annotations
    @Since( "1.6.0" ) @throws( ... )
  68. def set[T](feature: StructFeature[T], value: T): Flattener.this.type
    Attributes
    protected
    Definition Classes
    HasFeatures
  69. def set[K, V](feature: MapFeature[K, V], value: Map[K, V]): Flattener.this.type
    Attributes
    protected
    Definition Classes
    HasFeatures
  70. def set[T](feature: SetFeature[T], value: Set[T]): Flattener.this.type
    Attributes
    protected
    Definition Classes
    HasFeatures
  71. def set[T](feature: ArrayFeature[T], value: Array[T]): Flattener.this.type
    Attributes
    protected
    Definition Classes
    HasFeatures
  72. final def set(paramPair: ParamPair[_]): Flattener.this.type
    Attributes
    protected
    Definition Classes
    Params
  73. final def set(param: String, value: Any): Flattener.this.type
    Attributes
    protected
    Definition Classes
    Params
  74. final def set[T](param: Param[T], value: T): Flattener.this.type
    Definition Classes
    Params
  75. def setCleanAnnotations(value: Boolean): Flattener.this.type

    Whether to remove annotation columns (Default: true)

  76. def setDefault[T](feature: StructFeature[T], value: () ⇒ T): Flattener.this.type
    Attributes
    protected
    Definition Classes
    HasFeatures
  77. def setDefault[K, V](feature: MapFeature[K, V], value: () ⇒ Map[K, V]): Flattener.this.type
    Attributes
    protected
    Definition Classes
    HasFeatures
  78. def setDefault[T](feature: SetFeature[T], value: () ⇒ Set[T]): Flattener.this.type
    Attributes
    protected
    Definition Classes
    HasFeatures
  79. def setDefault[T](feature: ArrayFeature[T], value: () ⇒ Array[T]): Flattener.this.type
    Attributes
    protected
    Definition Classes
    HasFeatures
  80. final def setDefault(paramPairs: ParamPair[_]*): Flattener.this.type
    Attributes
    protected
    Definition Classes
    Params
  81. final def setDefault[T](param: Param[T], value: T): Flattener.this.type
    Attributes
    protected[org.apache.spark.ml]
    Definition Classes
    Params
  82. def setExplodeSelectedFields(explodeSelectedFields: HashMap[String, List[String]]): Flattener.this.type
  83. def setExplodeSelectedFields(map: Map[String, Array[String]]): Flattener.this.type

    When set to an array of specific fields, the transformation returns an exploded column for each specified field containing annotation data, letting you choose and explode only the desired fields.

    If explodeSelectedFields is not set, the transformation returns all information for the specified columns.

    An alias can be given with as

    (e.g., Map("ner_chunk" -> Array("result","metadata.entity as entity1")))

  84. def setFlattenExplodedColumns(bool: Boolean): Flattener.this.type

    When true (the default), the transformation returns flattened and exploded columns containing annotation data, providing a comprehensive view of the annotated information.

    When set to false, the transformation returns exploded columns without flattening.

  85. def setInputCols(value: String*): Flattener.this.type

    Sets the names of input annotation columns for the transformation. If explodeSelectedFields is not set (the default), the transformation returns all information for the specified columns.

  86. def setInputCols(value: Array[String]): Flattener.this.type

    Sets the names of input annotation columns for the transformation. If explodeSelectedFields is not set (the default), the transformation returns all information for the specified columns.

  87. def setKeepOriginalColumns(value: Array[String]): Flattener.this.type

    An array of column names to keep in the DataFrame after the flattening process. These columns are not affected by the flattening operation and are included in the final output as they are.

  88. def setOrderByColumn(value: String): Flattener.this.type

    Sets the column by which the DataFrame should be ordered when transformed.

  89. def setOrderDescending(bool: Boolean): Flattener.this.type

    Sets whether to order the DataFrame in descending order.

  90. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  91. def toString(): String
    Definition Classes
    Identifiable → AnyRef → Any
  92. def transform(dataset: Dataset[_]): Dataset[Row]
    Definition Classes
    Flattener → Transformer
  93. def transform(dataset: Dataset[_], paramMap: ParamMap): DataFrame
    Definition Classes
    Transformer
    Annotations
    @Since( "2.0.0" )
  94. def transform(dataset: Dataset[_], firstParamPair: ParamPair[_], otherParamPairs: ParamPair[_]*): DataFrame
    Definition Classes
    Transformer
    Annotations
    @Since( "2.0.0" ) @varargs()
  95. def transformSchema(schema: StructType): StructType
    Definition Classes
    Flattener → PipelineStage
  96. def transformSchema(schema: StructType, logging: Boolean): StructType
    Attributes
    protected
    Definition Classes
    PipelineStage
    Annotations
    @DeveloperApi()
  97. val uid: String
    Definition Classes
    Flattener → Identifiable
  98. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  99. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  100. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  101. def write: MLWriter
    Definition Classes
    ParamsAndFeaturesWritable → DefaultParamsWritable → MLWritable

Inherited from ParamsAndFeaturesWritable

Inherited from HasFeatures

Inherited from DefaultParamsWritable

Inherited from MLWritable

Inherited from Transformer

Inherited from PipelineStage

Inherited from Logging

Inherited from Params

Inherited from Serializable

Inherited from Serializable

Inherited from Identifiable

Inherited from AnyRef

Inherited from Any

Parameters

A list of (hyper-)parameter keys this annotator can take. Users can set and get the parameter values through setters and getters, respectively.
