class Flattener extends Transformer with DefaultParamsWritable

Converts annotation results into an exploded and flattened format, which is useful for extracting results from Spark NLP pipelines. The Flattener outputs annotation values as strings.
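To make "exploded and flattened" concrete, the plain-Scala sketch below models the idea on an in-memory list. It is a conceptual illustration only, not the Flattener implementation: `Ann` and `FlattenSketch` are hypothetical names, and the `<column>_<field>` naming simply mirrors the column headers in the example output on this page.

```scala
// Conceptual model only: a plain-Scala stand-in for one annotation.
final case class Ann(result: String, metadata: Map[String, String])

object FlattenSketch {
  // Resolve a field path such as "result" or "metadata.entity" for one annotation.
  def field(a: Ann, path: String): String = path match {
    case "result" => a.result
    case p if p.startsWith("metadata.") =>
      a.metadata.getOrElse(p.stripPrefix("metadata."), "")
    case other => throw new IllegalArgumentException(s"unsupported field: $other")
  }

  // Mirror the naming seen in the example output: dots become underscores.
  def columnName(col: String, path: String): String =
    s"${col}_${path.replace('.', '_')}"

  // One output row per annotation (explode), one string per selected field (flatten).
  def flatten(col: String, anns: Seq[Ann], fields: Seq[String]): Seq[Map[String, String]] =
    anns.map(a => fields.map(f => columnName(col, f) -> field(a, f)).toMap)
}

val chunks = Seq(
  Ann("distress", Map("entity" -> "Symptom")),
  Ann("edema", Map("entity" -> "Symptom")))

val rows = FlattenSketch.flatten("ner_chunk", chunks, Seq("result", "metadata.entity"))
rows.foreach(println)
```

Each element of `rows` corresponds to one line of the flattened table: a single annotation reduced to the selected string fields.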

Example

val dataSet = Seq("GENERAL: He is an elderly gentleman in no acute distress. He is sitting up in bed eating his breakfast." +
  " He is alert and oriented and answering questions appropriately.\nHEENT: Sclerae showed mild arcus senilis in the right." +
  " Left was clear. Pupils are equally round and reactive to light. Extraocular movements are intact. Oropharynx is clear." +
  "\nNECK: Supple. Trachea is midline. No jugular venous pressure distention is noted. No adenopathy in the cervical, " +
  "supraclavicular, or axillary areas.\nABDOMEN: Soft and not tender. There may be some fullness in the left upper quadrant, " +
  "although I do not appreciate a true spleen with inspiration.\nEXTREMITIES: There is some edema, but no cyanosis and ").toDS.toDF("text")

val documentAssembler = new DocumentAssembler().setInputCol("text").setOutputCol("document")
val sentenceDetector = new SentenceDetector().setInputCols(Array("document")).setOutputCol("sentence")
val tokenizer = new Tokenizer().setInputCols(Array("sentence")).setOutputCol("token")
val wordEmbeddings = WordEmbeddingsModel.pretrained("embeddings_clinical", "en", "clinical/models").setInputCols(Array("sentence", "token")).setOutputCol("embeddings")
val clinicalNer = MedicalNerModel.pretrained("ner_jsl", "en", "clinical/models").setInputCols(Array("sentence", "token", "embeddings")).setOutputCol("ner")
val nerConverter = new NerConverter().setInputCols(Array("sentence", "token", "ner")).setOutputCol("ner_chunk")
val clinicalAssertion = AssertionDLModel.pretrained("assertion_jsl_augmented", "en", "clinical/models").setInputCols(Array("sentence", "ner_chunk", "embeddings")).setOutputCol("assertion").setEntityAssertionCaseSensitive(false)

val flattener = new Flattener()
 .setInputCols("sentence", "ner_chunk", "assertion")
 .setExplodeSelectedFields(Map("ner_chunk" -> Array("result","metadata.entity"),
                               "assertion"->Array("result","metadata.confidence")))

val pipeline = new Pipeline().setStages(
 Array(
   documentAssembler,
   sentenceDetector,
   tokenizer,
   wordEmbeddings,
   clinicalNer,
   nerConverter,
   clinicalAssertion,
   flattener
 ))

val result = pipeline.fit(dataSet).transform(dataSet)
result.show(false)

    +----------------------------------+-------------------------+----------------+-----------------------------+
    |ner_chunk_result                  |ner_chunk_metadata_entity|assertion_result|assertion_metadata_confidence|
    +----------------------------------+-------------------------+----------------+-----------------------------+
    |distress                          |Symptom                  |Absent          |1.0                          |
    |arcus senilis                     |Disease_Syndrome_Disorder|Past            |1.0                          |
    |jugular venous pressure distention|Symptom                  |Absent          |1.0                          |
    |adenopathy                        |Symptom                  |Absent          |1.0                          |
    |tender                            |Symptom                  |Absent          |1.0                          |
    |fullness                          |Symptom                  |Possible        |0.9999                       |
    |edema                             |Symptom                  |Present         |1.0                          |
    |cyanosis                          |VS_Finding               |Absent          |1.0                          |
    +----------------------------------+-------------------------+----------------+-----------------------------+
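As a variation on the example above, the sketch below configures a Flattener with a field alias and with ordering, using only the setters listed on this page. Two things are assumptions rather than documented facts: the import path for `Flattener`, and that `setOrderByColumn` expects the name of an output column such as `ner_chunk_result`.

```scala
// Assumption: this import path is a guess at where Flattener lives in the
// licensed library; adjust it to match your installation.
import com.johnsnowlabs.nlp.annotators.flattener.Flattener

val orderedFlattener = new Flattener()
  .setInputCols("ner_chunk", "assertion")
  .setExplodeSelectedFields(Map(
    // "as" gives the exploded field an alias, as in the parameter description
    "ner_chunk" -> Array("result", "metadata.entity as entity1"),
    "assertion" -> Array("result")))
  .setFlattenExplodedColumns(true)     // ordering requires flattened output
  .setOrderByColumn("ner_chunk_result") // assumption: an output column name
  .setOrderDescending(true)
  .setCleanAnnotations(true)            // drop the original annotation columns
```

`orderedFlattener` can take the place of `flattener` as the last stage of the pipeline shown above.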
Linear Supertypes
DefaultParamsWritable, MLWritable, Transformer, PipelineStage, Logging, Params, Serializable, Serializable, Identifiable, AnyRef, Any

Instance Constructors

  1. new Flattener()
  2. new Flattener(uid: String)

    uid

    Required unique identifier (uid) used when storing the annotator to disk
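A minimal sketch of the two constructors follows. The import path is an assumption, and the uid string is an arbitrary illustrative value.

```scala
// Assumption: the import path may differ in your installation.
import com.johnsnowlabs.nlp.annotators.flattener.Flattener

// Explicit uid: handy when a stable identifier is wanted in saved metadata.
val namedFlattener = new Flattener("clinical_flattener")
println(namedFlattener.uid)

// No-argument constructor: a uid is generated for you.
val autoFlattener = new Flattener()
println(autoFlattener.uid)
```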

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def $[T](param: Param[T]): T
    Attributes
    protected
    Definition Classes
    Params
  4. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  5. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  6. val cleanAnnotations: BooleanParam

    Whether to remove annotation columns (Default: true)

  7. final def clear(param: Param[_]): Flattener.this.type
    Definition Classes
    Params
  8. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  9. def copy(extra: ParamMap): Transformer
    Definition Classes
    Flattener → Transformer → PipelineStage → Params
  10. def copyValues[T <: Params](to: T, extra: ParamMap): T
    Attributes
    protected
    Definition Classes
    Params
  11. final def defaultCopy[T <: Params](extra: ParamMap): T
    Attributes
    protected
    Definition Classes
    Params
  12. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  13. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  14. def explainParam(param: Param[_]): String
    Definition Classes
    Params
  15. def explainParams(): String
    Definition Classes
    Params
  16. val explodeSelectedFields: Param[Map[String, Array[String]]]

    When set to a map from input column names to arrays of field names, the transformation returns an exploded column for each specified field containing annotation data. This allows you to choose and explode only the desired fields.

    If explodeSelectedFields is not set, the transformation returns all information for the specified columns.

    An alias can be given with "as"

    (e.g., Map("ner_chunk" -> Array("result","metadata.entity as entity1")))

  17. final def extractParamMap(): ParamMap
    Definition Classes
    Params
  18. final def extractParamMap(extra: ParamMap): ParamMap
    Definition Classes
    Params
  19. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  20. val flattenExplodedColumns: BooleanParam

    When true (the default), the transformation returns flattened and exploded columns containing annotation data, providing a comprehensive view of the annotated information.

    When set to false, the transformation returns exploded columns without flattening.

  21. final def get[T](param: Param[T]): Option[T]
    Definition Classes
    Params
  22. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  23. final def getDefault[T](param: Param[T]): Option[T]
    Definition Classes
    Params
  24. def getInputCols: Array[String]

    Names of the Flattener input columns

  25. final def getOrDefault[T](param: Param[T]): T
    Definition Classes
    Params
  26. def getParam(paramName: String): Param[Any]
    Definition Classes
    Params
  27. final def hasDefault[T](param: Param[T]): Boolean
    Definition Classes
    Params
  28. def hasParam(paramName: String): Boolean
    Definition Classes
    Params
  29. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  30. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  31. def initializeLogIfNecessary(isInterpreter: Boolean): Unit
    Attributes
    protected
    Definition Classes
    Logging
  32. val inputCols: StringArrayParam

    Names of the input annotation columns for the transformation. If explodeSelectedFields is not set, the transformation returns all information for the specified columns.

  33. final def isDefined(param: Param[_]): Boolean
    Definition Classes
    Params
  34. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  35. final def isSet(param: Param[_]): Boolean
    Definition Classes
    Params
  36. def isTraceEnabled(): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  37. def log: Logger
    Attributes
    protected
    Definition Classes
    Logging
  38. def logDebug(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  39. def logDebug(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  40. def logError(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  41. def logError(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  42. def logInfo(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  43. def logInfo(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  44. def logName: String
    Attributes
    protected
    Definition Classes
    Logging
  45. def logTrace(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  46. def logTrace(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  47. def logWarning(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  48. def logWarning(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  49. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  50. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  51. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  52. val orderByColumn: Param[String]

    Param for specifying the column by which the DataFrame should be ordered when it is transformed. flattenExplodedColumns must be true for ordering.

  53. val orderDescending: BooleanParam

    Whether to order the DataFrame in descending order. If true, the DataFrame is ordered in descending order; if false (the default), it is ordered in ascending order.

    flattenExplodedColumns must be true for ordering.

  54. lazy val params: Array[Param[_]]
    Definition Classes
    Params
  55. def save(path: String): Unit
    Definition Classes
    MLWritable
    Annotations
    @Since( "1.6.0" ) @throws( ... )
  56. final def set(paramPair: ParamPair[_]): Flattener.this.type
    Attributes
    protected
    Definition Classes
    Params
  57. final def set(param: String, value: Any): Flattener.this.type
    Attributes
    protected
    Definition Classes
    Params
  58. final def set[T](param: Param[T], value: T): Flattener.this.type
    Definition Classes
    Params
  59. def setCleanAnnotations(value: Boolean): Flattener.this.type

    Whether to remove annotation columns (Default: true)

  60. final def setDefault(paramPairs: ParamPair[_]*): Flattener.this.type
    Attributes
    protected
    Definition Classes
    Params
  61. final def setDefault[T](param: Param[T], value: T): Flattener.this.type
    Attributes
    protected
    Definition Classes
    Params
  62. def setExplodeSelectedFields(explodeSelectedFields: HashMap[String, List[String]]): Flattener.this.type
  63. def setExplodeSelectedFields(map: Map[String, Array[String]]): Flattener.this.type

    When set to a map from input column names to arrays of field names, the transformation returns an exploded column for each specified field containing annotation data. This allows you to choose and explode only the desired fields.

    If explodeSelectedFields is not set, the transformation returns all information for the specified columns.

    An alias can be given with "as"

    (e.g., Map("ner_chunk" -> Array("result","metadata.entity as entity1")))

  64. def setFlattenExplodedColumns(bool: Boolean): Flattener.this.type

    When true (the default), the transformation returns flattened and exploded columns containing annotation data, providing a comprehensive view of the annotated information.

    When set to false, the transformation returns exploded columns without flattening.

  65. def setInputCols(value: String*): Flattener.this.type

    Sets the names of input annotation columns for the transformation. If explodeSelectedFields is not set (default), the transformation will return all information for the specified columns.

  66. def setInputCols(value: Array[String]): Flattener.this.type

    Sets the names of input annotation columns for the transformation. If explodeSelectedFields is not set (default), the transformation will return all information for the specified columns.

  67. def setOrderByColumn(value: String): Flattener.this.type

    Sets the column by which the DataFrame should be ordered when transformed.

  68. def setOrderDescending(bool: Boolean): Flattener.this.type

    Sets whether to order the DataFrame in descending order.

  69. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  70. def toString(): String
    Definition Classes
    Identifiable → AnyRef → Any
  71. def transform(dataset: Dataset[_]): Dataset[Row]
    Definition Classes
    Flattener → Transformer
  72. def transform(dataset: Dataset[_], paramMap: ParamMap): DataFrame
    Definition Classes
    Transformer
    Annotations
    @Since( "2.0.0" )
  73. def transform(dataset: Dataset[_], firstParamPair: ParamPair[_], otherParamPairs: ParamPair[_]*): DataFrame
    Definition Classes
    Transformer
    Annotations
    @Since( "2.0.0" ) @varargs()
  74. def transformSchema(schema: StructType): StructType
    Definition Classes
    Flattener → PipelineStage
  75. def transformSchema(schema: StructType, logging: Boolean): StructType
    Attributes
    protected
    Definition Classes
    PipelineStage
    Annotations
    @DeveloperApi()
  76. val uid: String
    Definition Classes
    Flattener → Identifiable
  77. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  78. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  79. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  80. def write: MLWriter
    Definition Classes
    DefaultParamsWritable → MLWritable
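The `write` and `save` members listed above come from Spark ML's `MLWritable`, so a configured Flattener can be persisted like other pipeline stages. The following is a minimal sketch: the `Flattener` import path and the temporary output location are assumptions, and a licensed session may need to be started differently in your environment.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession
// Assumption: the import path may differ in your installation.
import com.johnsnowlabs.nlp.annotators.flattener.Flattener

// Writing ML metadata needs an active SparkSession.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("flattener-save")
  .getOrCreate()

val stage = new Flattener().setInputCols("ner_chunk")

// Hypothetical target directory used only for this illustration.
val path = Files.createTempDirectory("flattener").resolve("stage").toString

// overwrite() replaces any existing content at the target path.
stage.write.overwrite().save(path)
```

This page does not list a companion reader for `Flattener`, so loading the saved stage is left out of the sketch.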

Parameters

A list of (hyper-)parameter keys this annotator can take. Users can set and get the parameter values through setters and getters, respectively.
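As a short sketch of setting and reading parameters, the snippet below uses only members listed on this page (`setInputCols`, `setCleanAnnotations`, `getInputCols`, `getOrDefault`, `isSet`, `explainParams`). The import path is an assumption.

```scala
// Assumption: the import path may differ in your installation.
import com.johnsnowlabs.nlp.annotators.flattener.Flattener

val configured = new Flattener()
  .setInputCols(Array("ner_chunk", "assertion")) // Array overload of setInputCols
  .setCleanAnnotations(false)                    // keep the annotation columns

// Dedicated getter
val cols: Array[String] = configured.getInputCols

// Generic access through the Params API
val keepsClean: Boolean = configured.getOrDefault(configured.cleanAnnotations)
val orderingSet: Boolean = configured.isSet(configured.orderByColumn)

// Human-readable summary of all parameters with their current values
println(configured.explainParams())
```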