Packages

package parser

Type Members

  1. case class ClassificationSchema(classification_column_name: String, sentence_column_name: String) extends Serializable with Product

    ClassificationSchema is a case class that represents the schema of the classification columns.

  2. case class ConverterSchema(document_identifier: Option[String], document_text: Option[String], entities: Option[List[String]], assertions: Option[List[String]], resolutions: Option[List[ResolutionSchema]], relations: Option[List[String]], summaries: Option[List[String]], deidentifications: Option[List[DeIdentificationSchema]], classifications: Option[List[ClassificationSchema]]) extends Serializable with Product

    ConverterSchema is a case class that represents the schema of the input data to the StructuredJsonConverter, a transformer that converts the output of a pipeline to a structured JSON format. It specifies which input columns the converter reads.

    document_identifier

    The identifier of the document.

    document_text

    Document column name.

    entities

    Chunk column names.

    assertions

    Assertion column names.

    resolutions

    Resolution column names.

    relations

    Relation column names.

    summaries

    Summary column names.

    deidentifications

    De-identification column names.
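
A minimal sketch of filling in the fields above. The case classes here are local stand-ins copied from the signatures on this page so that the example compiles without the library; the column names (`doc_id`, `ner_chunk`, and so on), the vocabulary name, and the `referencedColumns` helper are hypothetical and not part of the documented API.

```scala
object ConverterSchemaSketch {
  // Stand-ins mirroring the signatures listed on this page (not the library classes).
  case class ResolutionSchema(vocab: String, resolver_column_name: String)
  case class DeIdentificationSchema(original: String, obfuscated: String, masked: String)
  case class ClassificationSchema(classification_column_name: String, sentence_column_name: String)
  case class ConverterSchema(
      document_identifier: Option[String],
      document_text: Option[String],
      entities: Option[List[String]],
      assertions: Option[List[String]],
      resolutions: Option[List[ResolutionSchema]],
      relations: Option[List[String]],
      summaries: Option[List[String]],
      deidentifications: Option[List[DeIdentificationSchema]],
      classifications: Option[List[ClassificationSchema]])

  // Hypothetical column names; substitute the output columns of your own pipeline.
  val schema: ConverterSchema = ConverterSchema(
    document_identifier = Some("doc_id"),
    document_text = Some("document"),
    entities = Some(List("ner_chunk")),
    assertions = Some(List("assertion")),
    resolutions = Some(List(ResolutionSchema(vocab = "icd10cm", resolver_column_name = "icd10cm_code"))),
    relations = None,
    summaries = None,
    deidentifications = None,
    classifications = None)

  // Hypothetical helper: lists the plain column names a schema refers to.
  // De-identification and classification entries are left out to keep the sketch short.
  def referencedColumns(s: ConverterSchema): List[String] =
    s.document_text.toList ++
      s.entities.getOrElse(Nil) ++
      s.assertions.getOrElse(Nil) ++
      s.resolutions.getOrElse(Nil).map(_.resolver_column_name) ++
      s.relations.getOrElse(Nil) ++
      s.summaries.getOrElse(Nil)
}
```

Every field is an `Option`, so parts of the output that a pipeline does not produce can be left as `None`.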

  3. case class DeIdentificationSchema(original: String, obfuscated: String, masked: String) extends Serializable with Product

    DeIdentificationSchema is a case class that represents the schema of the de-identification columns.

  4. case class JsonConverterDict(document_identifier: String, document_text: Seq[String], entities: Seq[Map[String, String]], assertions: Seq[Map[String, String]], resolutions: Seq[Map[String, String]], relations: Seq[Map[String, String]], summaries: Seq[String], deidentifications: Seq[Map[String, String]], classifications: Seq[Map[String, String]]) extends Serializable with Product
    Attributes
    protected
  5. case class ResolutionSchema(vocab: String, resolver_column_name: String) extends Serializable with Product

    ResolutionSchema is a case class that represents the schema of the resolution columns.

  6. class StructuredJsonConverter extends Transformer with HasOutputAnnotationCol with ParamsAndFeaturesWritable with CheckLicense

    StructuredJsonConverter is a transformer that converts the output of a pipeline into a structured JSON format. The output is either a string or a struct, depending on the value of the outputAsStr parameter.

    The schema of the input columns is defined by the ConverterSchema case class. It includes fields for the document identifier, document text, entities, assertions, resolutions, relations, summaries, deidentifications, and classifications. ConverterSchema also provides methods for parsing the schema from a JSON string and for extracting column names from the input schema.

    The transformer has parameters for setting the schema, returning entities in relations, removing Spark NLP annotation columns, and choosing between string and struct output. It checks the input columns and the document identifier column to ensure that they are compatible with the transformer. The PipelineParser class can be used to extract the schema from a pipeline.

    Note

    If the document_identifier field is empty or not found in the input schema, a random UUID is generated. If the document_identifier field is found in the input schema and is not a column name, its value is used as the identifier. If it is found and is a column name, that column must be of type StringType.
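
The three cases in the note can be sketched as a small decision function. This is one reading of the note, not the library's implementation: the `IdSource` type, the `resolve` and `newRandomId` functions, and the use of a plain `Map` of column name to type name in place of a Spark schema are all assumptions made for illustration.

```scala
import java.util.UUID

object DocumentIdSketch {
  // Hypothetical result type describing where the document identifier comes from.
  sealed trait IdSource
  case object RandomUuid extends IdSource
  final case class Literal(value: String) extends IdSource
  final case class FromColumn(name: String) extends IdSource

  // columnTypes stands in for the input schema: column name -> type name (assumption).
  def resolve(
      documentIdentifier: Option[String],
      columnTypes: Map[String, String]): Either[String, IdSource] =
    documentIdentifier.map(_.trim).filter(_.nonEmpty) match {
      // Empty or missing: fall back to a random UUID.
      case None => Right(RandomUuid)
      case Some(id) =>
        columnTypes.get(id) match {
          // Not a column name: use the value itself.
          case None => Right(Literal(id))
          // A column name: the column must be a string column.
          case Some("StringType") => Right(FromColumn(id))
          case Some(other) => Left(s"column '$id' has type $other, expected StringType")
        }
    }

  // Generates the fallback identifier for the RandomUuid case.
  def newRandomId(): String = UUID.randomUUID().toString
}
```

Returning an `Either` keeps the type check explicit: a non-string identifier column surfaces as an error message rather than being converted silently.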

Value Members

  1. object ConverterSchema extends Serializable
  2. object StructuredJsonConverter extends ParamsAndFeaturesReadable[StructuredJsonConverter] with Serializable

    This is the companion object of StructuredJsonConverter. Please refer to that class for the documentation.

  3. object UniqueIdGenerator

    UniqueIdGenerator is a utility object that generates a unique identifier for a given input.

    Attributes
    protected
