Flattener

Companion object Flattener

class Flattener extends Transformer with ParamsAndFeaturesWritable

Converts annotation results into an exploded and flattened format, which makes it easier to extract results from Spark NLP pipelines. The Flattener outputs annotation values as Strings.

Example

// .toDS needs the session implicits; assumes a SparkSession named `spark` is in scope.
import spark.implicits._

val dataSet = Seq(
  "GENERAL: He is an elderly gentleman in no acute distress. He is sitting up in bed eating his breakfast." +
  " He is alert and oriented and answering questions appropriately.\nHEENT: Sclerae showed mild arcus senilis in the right." +
  " Left was clear. Pupils are equally round and reactive to light. Extraocular movements are intact. Oropharynx is clear." +
  "\nNECK: Supple. Trachea is midline. No jugular venous pressure distention is noted. No adenopathy in the cervical, " +
  "supraclavicular, or axillary areas.\nABDOMEN: Soft and not tender. There may be some fullness in the left upper quadrant, " +
  "although I do not appreciate a true spleen with inspiration.\nEXTREMITIES: There is some edema, but no cyanosis and ").toDS.toDF("text")


val documentAssembler = new DocumentAssembler().setInputCol("text").setOutputCol("document")
val sentenceDetector = new SentenceDetector().setInputCols(Array("document")).setOutputCol("sentence")
val tokenizer = new Tokenizer().setInputCols(Array("sentence")).setOutputCol("token")
val wordEmbeddings = WordEmbeddingsModel.pretrained("embeddings_clinical", "en", "clinical/models").setInputCols(Array("sentence", "token")).setOutputCol("embeddings")
val clinicalNer = MedicalNerModel.pretrained("ner_jsl", "en", "clinical/models").setInputCols(Array("sentence", "token", "embeddings")).setOutputCol("ner")
val nerConverter = new NerConverter().setInputCols(Array("sentence", "token", "ner")).setOutputCol("ner_chunk")
val clinicalAssertion = AssertionDLModel.pretrained("assertion_jsl_augmented", "en", "clinical/models").setInputCols(Array("sentence", "ner_chunk", "embeddings")).setOutputCol("assertion").setEntityAssertionCaseSensitive(false)

val flattener = new Flattener()
 .setInputCols("sentence", "ner_chunk", "assertion")
 .setExplodeSelectedFields(Map("ner_chunk" -> Array("result","metadata.entity"),
                               "assertion"->Array("result","metadata.confidence")))

val pipeline = new Pipeline().setStages(
 Array(
   documentAssembler,
   sentenceDetector,
   tokenizer,
   wordEmbeddings,
   clinicalNer,
   nerConverter,
   clinicalAssertion,
   flattener
 ))

val result = pipeline.fit(dataSet).transform(dataSet)
result.show(false)

    +----------------------------------+-------------------------+----------------+-----------------------------+
    |ner_chunk_result                  |ner_chunk_metadata_entity|assertion_result|assertion_metadata_confidence|
    +----------------------------------+-------------------------+----------------+-----------------------------+
    |distress                          |Symptom                  |Absent          |1.0                          |
    |arcus senilis                     |Disease_Syndrome_Disorder|Past            |1.0                          |
    |jugular venous pressure distention|Symptom                  |Absent          |1.0                          |
    |adenopathy                        |Symptom                  |Absent          |1.0                          |
    |tender                            |Symptom                  |Absent          |1.0                          |
    |fullness                          |Symptom                  |Possible        |0.9999                       |
    |edema                             |Symptom                  |Present         |1.0                          |
    |cyanosis                          |VS_Finding               |Absent          |1.0                          |
    +----------------------------------+-------------------------+----------------+-----------------------------+
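
The column names in the table above appear to follow a pattern: the input column name, an underscore, and the field path with dots replaced by underscores. The sketch below reproduces that pattern in plain Scala. It is inferred only from this example's output; `outputColumn` is a hypothetical helper and not a Flattener method, and the library may name columns differently in other cases.

```scala
object FlattenerColumnNames {
  // Hypothetical helper (not part of Spark NLP): mirrors the naming pattern
  // visible in the example output, e.g. ner_chunk + metadata.entity.
  def outputColumn(inputCol: String, field: String): String =
    s"${inputCol}_${field.replace('.', '_')}"

  def main(args: Array[String]): Unit = {
    // Same selection as the example's setExplodeSelectedFields call,
    // kept as a Seq so the printed order is deterministic.
    val selected = Seq(
      "ner_chunk" -> Seq("result", "metadata.entity"),
      "assertion" -> Seq("result", "metadata.confidence"))

    val columns = for {
      (inputCol, fields) <- selected
      field <- fields
    } yield outputColumn(inputCol, field)

    columns.foreach(println)
  }
}
```

Running `main` prints the four column headers shown in the table, in the same order.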

Linear Supertypes

ParamsAndFeaturesWritable, HasFeatures, DefaultParamsWritable, MLWritable, Transformer, PipelineStage, Logging, Params, Serializable, Serializable, Identifiable, AnyRef, Any

Instance Constructors

new Flattener()
new Flattener(uid: String)
uid
Required uid for storing the annotator to disk.

Value Members

final def !=(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def ##(): Int

Definition Classes
AnyRef → Any
final def $[T](param: Param[T]): T

Attributes
protected
Definition Classes
Params
def $$[T](feature: StructFeature[T]): T

Attributes
protected
Definition Classes
HasFeatures
def $$[K, V](feature: MapFeature[K, V]): Map[K, V]

Attributes
protected
Definition Classes
HasFeatures
def $$[T](feature: SetFeature[T]): Set[T]

Attributes
protected
Definition Classes
HasFeatures
def $$[T](feature: ArrayFeature[T]): Array[T]

Attributes
protected
Definition Classes
HasFeatures
final def ==(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def asInstanceOf[T0]: T0

Definition Classes
Any
val cleanAnnotations: BooleanParam
Whether to remove annotation columns (Default: true)
final def clear(param: Param[_]): Flattener.this.type

Definition Classes
Params
def clone(): AnyRef

Attributes
protected[lang]
Definition Classes
AnyRef
Annotations
@throws( ... ) @native()
def copy(extra: ParamMap): Transformer

Definition Classes
Flattener → Transformer → PipelineStage → Params
def copyValues[T <: Params](to: T, extra: ParamMap): T

Attributes
protected
Definition Classes
Params
final def defaultCopy[T <: Params](extra: ParamMap): T

Attributes
protected
Definition Classes
Params
final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
def equals(arg0: Any): Boolean

Definition Classes
AnyRef → Any
def explainParam(param: Param[_]): String

Definition Classes
Params
def explainParams(): String

Definition Classes
Params
val explodeSelectedFields: MapFeature[String, Array[String]]
When set to a map from input column names to arrays of fields, the transformation returns an exploded column for each specified field of the annotation data. This allows you to choose and explode only the desired fields.
If explodeSelectedFields is not set, the transformation returns all information for the specified columns.
An alias can be given with as
(e.g., Map("ner_chunk" -> Array("result","metadata.entity as entity1")))
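
To make the field syntax concrete, the stand-alone sketch below splits an entry such as "metadata.entity as entity1" into a field path and an optional alias. It only illustrates the shape of the strings accepted in the map; `FieldSpec.parse` is a hypothetical helper and says nothing about how Flattener parses them internally or how the alias appears in the output.

```scala
object FieldSpec {
  // Hypothetical helper, not part of Spark NLP: splits "path as alias"
  // into the field path and an optional alias.
  def parse(spec: String): (String, Option[String]) =
    spec.trim.split("\\s+as\\s+", 2) match {
      case Array(path, alias) => (path.trim, Some(alias.trim))
      case _                  => (spec.trim, None)
    }

  def main(args: Array[String]): Unit = {
    // The field list from the example map entry for "ner_chunk".
    val fields = Array("result", "metadata.entity as entity1")
    fields.map(parse).foreach(println)
  }
}
```

Here "result" has no alias, while "metadata.entity as entity1" yields the path metadata.entity with the alias entity1.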
final def extractParamMap(): ParamMap

Definition Classes
Params
final def extractParamMap(extra: ParamMap): ParamMap

Definition Classes
Params
val features: ArrayBuffer[Feature[_, _, _]]

Definition Classes
HasFeatures
def finalize(): Unit

Attributes
protected[lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
val flattenExplodedColumns: BooleanParam
When true (the default), the transformation returns flattened and exploded columns containing the annotation data, providing a comprehensive view of the annotated information.
When set to false, the transformation returns exploded columns without flattening.
def get[T](feature: StructFeature[T]): Option[T]

Attributes
protected
Definition Classes
HasFeatures
def get[K, V](feature: MapFeature[K, V]): Option[Map[K, V]]

Attributes
protected
Definition Classes
HasFeatures
def get[T](feature: SetFeature[T]): Option[Set[T]]

Attributes
protected
Definition Classes
HasFeatures
def get[T](feature: ArrayFeature[T]): Option[Array[T]]

Attributes
protected
Definition Classes
HasFeatures
final def get[T](param: Param[T]): Option[T]

Definition Classes
Params
final def getClass(): Class[_]

Definition Classes
AnyRef → Any
Annotations
@native()
final def getDefault[T](param: Param[T]): Option[T]

Definition Classes
Params
def getExplodeSelectedFields: Map[String, Array[String]]
def getInputCols: Array[String]
Names of the Flattener input columns
final def getOrDefault[T](param: Param[T]): T

Definition Classes
Params
def getParam(paramName: String): Param[Any]

Definition Classes
Params
final def hasDefault[T](param: Param[T]): Boolean

Definition Classes
Params
def hasParam(paramName: String): Boolean

Definition Classes
Params
def hashCode(): Int

Definition Classes
AnyRef → Any
Annotations
@native()
def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean

Attributes
protected
Definition Classes
Logging
def initializeLogIfNecessary(isInterpreter: Boolean): Unit

Attributes
protected
Definition Classes
Logging
val inputCols: StringArrayParam
Names of the input annotation columns for the transformation. If explodeSelectedFields is not set, the transformation returns all information for the specified columns.
final def isDefined(param: Param[_]): Boolean

Definition Classes
Params
final def isInstanceOf[T0]: Boolean

Definition Classes
Any
final def isSet(param: Param[_]): Boolean

Definition Classes
Params
def isTraceEnabled(): Boolean

Attributes
protected
Definition Classes
Logging
val keepOriginalColumns: StringArrayParam
An array of column names that should be kept in the DataFrame after the flattening process. These columns are not affected by the flattening operation and are included in the final output as they are.
def log: Logger

Attributes
protected
Definition Classes
Logging
def logDebug(msg: ⇒ String, throwable: Throwable): Unit

Attributes
protected
Definition Classes
Logging
def logDebug(msg: ⇒ String): Unit

Attributes
protected
Definition Classes
Logging
def logError(msg: ⇒ String, throwable: Throwable): Unit

Attributes
protected
Definition Classes
Logging
def logError(msg: ⇒ String): Unit

Attributes
protected
Definition Classes
Logging
def logInfo(msg: ⇒ String, throwable: Throwable): Unit

Attributes
protected
Definition Classes
Logging
def logInfo(msg: ⇒ String): Unit

Attributes
protected
Definition Classes
Logging
def logName: String

Attributes
protected
Definition Classes
Logging
def logTrace(msg: ⇒ String, throwable: Throwable): Unit

Attributes
protected
Definition Classes
Logging
def logTrace(msg: ⇒ String): Unit

Attributes
protected
Definition Classes
Logging
def logWarning(msg: ⇒ String, throwable: Throwable): Unit

Attributes
protected
Definition Classes
Logging
def logWarning(msg: ⇒ String): Unit

Attributes
protected
Definition Classes
Logging
final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def notify(): Unit

Definition Classes
AnyRef
Annotations
@native()
final def notifyAll(): Unit

Definition Classes
AnyRef
Annotations
@native()
def onWrite(path: String, spark: SparkSession): Unit

Attributes
protected
Definition Classes
ParamsAndFeaturesWritable
val orderByColumn: Param[String]
Param specifying the column by which the DataFrame should be ordered when it is transformed. flattenExplodedColumns must be true for ordering.
val orderDescending: BooleanParam
Whether to order the DataFrame in descending order. If true, the DataFrame is ordered in descending order; if false (the default), it is ordered in ascending order.
flattenExplodedColumns must be true for ordering.
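
As a rough sketch of how the ordering and column-retention parameters might be combined, the configuration below uses only setters listed on this page. Two things are assumptions and not confirmed here: the import path for Flattener, and that orderByColumn refers to a flattened output column such as ner_chunk_result. The "text" column is the raw input column from the example above.

```scala
// Assumption: this package path is a guess; adjust the import to match your
// Spark NLP for Healthcare release.
import com.johnsnowlabs.nlp.annotators.flattener.Flattener

object OrderedFlattenerSketch {
  val flattener: Flattener = new Flattener()
    .setInputCols("ner_chunk", "assertion")
    .setExplodeSelectedFields(Map(
      "ner_chunk" -> Array("result", "metadata.entity"),
      "assertion" -> Array("result", "metadata.confidence")))
    .setKeepOriginalColumns(Array("text")) // keep the raw input column as-is
    .setFlattenExplodedColumns(true)       // must be true for ordering
    .setOrderByColumn("ner_chunk_result")  // assumption: names an output column
    .setOrderDescending(true)              // default is false (ascending)
}
```

This stage would replace the flattener in the pipeline from the example; the resulting row order has not been verified here.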
lazy val params: Array[Param[_]]

Definition Classes
Params
def save(path: String): Unit

Definition Classes
MLWritable
Annotations
@Since( "1.6.0" ) @throws( ... )
def set[T](feature: StructFeature[T], value: T): Flattener.this.type

Attributes
protected
Definition Classes
HasFeatures
def set[K, V](feature: MapFeature[K, V], value: Map[K, V]): Flattener.this.type

Attributes
protected
Definition Classes
HasFeatures
def set[T](feature: SetFeature[T], value: Set[T]): Flattener.this.type

Attributes
protected
Definition Classes
HasFeatures
def set[T](feature: ArrayFeature[T], value: Array[T]): Flattener.this.type

Attributes
protected
Definition Classes
HasFeatures
final def set(paramPair: ParamPair[_]): Flattener.this.type

Attributes
protected
Definition Classes
Params
final def set(param: String, value: Any): Flattener.this.type

Attributes
protected
Definition Classes
Params
final def set[T](param: Param[T], value: T): Flattener.this.type

Definition Classes
Params
def setCleanAnnotations(value: Boolean): Flattener.this.type
Whether to remove annotation columns (Default: true)
def setDefault[T](feature: StructFeature[T], value: () ⇒ T): Flattener.this.type

Attributes
protected
Definition Classes
HasFeatures
def setDefault[K, V](feature: MapFeature[K, V], value: () ⇒ Map[K, V]): Flattener.this.type

Attributes
protected
Definition Classes
HasFeatures
def setDefault[T](feature: SetFeature[T], value: () ⇒ Set[T]): Flattener.this.type

Attributes
protected
Definition Classes
HasFeatures
def setDefault[T](feature: ArrayFeature[T], value: () ⇒ Array[T]): Flattener.this.type

Attributes
protected
Definition Classes
HasFeatures
final def setDefault(paramPairs: ParamPair[_]*): Flattener.this.type

Attributes
protected
Definition Classes
Params
final def setDefault[T](param: Param[T], value: T): Flattener.this.type

Attributes
protected[org.apache.spark.ml]
Definition Classes
Params
def setExplodeSelectedFields(explodeSelectedFields: HashMap[String, List[String]]): Flattener.this.type
def setExplodeSelectedFields(map: Map[String, Array[String]]): Flattener.this.type
When set to a map from input column names to arrays of fields, the transformation returns an exploded column for each specified field of the annotation data. This allows you to choose and explode only the desired fields.
If explodeSelectedFields is not set, the transformation returns all information for the specified columns.
An alias can be given with as
(e.g., Map("ner_chunk" -> Array("result","metadata.entity as entity1")))
def setFlattenExplodedColumns(bool: Boolean): Flattener.this.type
When true (the default), the transformation returns flattened and exploded columns containing the annotation data, providing a comprehensive view of the annotated information.
When set to false, the transformation returns exploded columns without flattening.
def setInputCols(value: String*): Flattener.this.type
Sets the names of the input annotation columns for the transformation. If explodeSelectedFields is not set (the default), the transformation returns all information for the specified columns.
def setInputCols(value: Array[String]): Flattener.this.type
Sets the names of the input annotation columns for the transformation. If explodeSelectedFields is not set (the default), the transformation returns all information for the specified columns.
def setKeepOriginalColumns(value: Array[String]): Flattener.this.type
An array of column names that should be kept in the DataFrame after the flattening process. These columns are not affected by the flattening operation and are included in the final output as they are.
def setOrderByColumn(value: String): Flattener.this.type
Sets the column by which the DataFrame should be ordered when transformed.
def setOrderDescending(bool: Boolean): Flattener.this.type
Sets whether to order the DataFrame in descending order.
final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
def toString(): String

Definition Classes
Identifiable → AnyRef → Any
def transform(dataset: Dataset[_]): Dataset[Row]

Definition Classes
Flattener → Transformer
def transform(dataset: Dataset[_], paramMap: ParamMap): DataFrame

Definition Classes
Transformer
Annotations
@Since( "2.0.0" )
def transform(dataset: Dataset[_], firstParamPair: ParamPair[_], otherParamPairs: ParamPair[_]*): DataFrame

Definition Classes
Transformer
Annotations
@Since( "2.0.0" ) @varargs()
def transformSchema(schema: StructType): StructType

Definition Classes
Flattener → PipelineStage
def transformSchema(schema: StructType, logging: Boolean): StructType

Attributes
protected
Definition Classes
PipelineStage
Annotations
@DeveloperApi()
val uid: String

Definition Classes
Flattener → Identifiable
final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@throws( ... ) @native()
def write: MLWriter

Definition Classes
ParamsAndFeaturesWritable → DefaultParamsWritable → MLWritable

Parameters

A list of (hyper-)parameter keys this annotator can take. Users can set and get the parameter values through setters and getters, respectively.
