package normalizer

Type Members

  1. class DateNormalizer extends AnnotatorModel[DateNormalizer] with HasSimpleAnnotate[DateNormalizer]

    Tries to normalize dates found in chunk annotations. The expected format for the normalized date is YYYY/MM/DD. If a date is normalized, the field "normalized" in the annotation metadata is set to true; otherwise it is set to false.

    Example

    Build a DataFrame of date strings, turn each document into a CHUNK annotation, and apply a DateNormalizer with the anchor date set to 2000/03/15

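    // Assumes an active SparkSession named spark; DateNormalizer comes from this package.
    import spark.implicits._
    import com.johnsnowlabs.nlp.{Annotation, DocumentAssembler}
    import com.johnsnowlabs.nlp.AnnotatorType.CHUNK
    import com.johnsnowlabs.nlp.functions._

    val df = Seq("08/02/2018", "11/2018", "11/01/2018", "next monday", "today", "next week").toDF("text")

    val documentAssembler = new DocumentAssembler().setInputCol("text").setOutputCol("document")

    // Relabel each document annotation as a CHUNK so that DateNormalizer can consume it.
    val chunksDF = documentAssembler
      .transform(df)
      .mapAnnotationsCol[Seq[Annotation]](
        "document",
        "chunk_date",
        CHUNK,
        (aa: Seq[Annotation]) => aa.map(ann => ann.copy(annotatorType = CHUNK)))

    // Anchor date: day 15, month 3, year 2000.
    val dateNormalizerModel = new DateNormalizer()
      .setInputCols("chunk_date")
      .setOutputCol("date")
      .setAnchorDateDay(15)
      .setAnchorDateMonth(3)
      .setAnchorDateYear(2000)

    val dateDf = dateNormalizerModel.transform(chunksDF)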

    Show the input chunks next to the original text (the normalizer writes its output to the date column)

    dateDf.select("chunk_date.result","text").show()
      +-------------+-----------+
      |       result|       text|
      +-------------+-----------+
      | [08/02/2018]| 08/02/2018|
      |    [11/2018]|    11/2018|
      | [11/01/2018]| 11/01/2018|
      |[next monday]|next monday|
      |      [today]|      today|
      |  [next week]|  next week|
      +-------------+-----------+
  2. case class MyCalendar(year: Try[Int], month: Try[Int], day: Try[Int]) extends Product with Serializable
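
The two type members above can be related with a small, library-independent sketch. It is not the DateNormalizer implementation: the MyCalendar below is a local stand-in that only mirrors the documented signature, and parse, normalize, and the MM/DD/YYYY input order are hypothetical choices made for illustration. It shows how keeping year, month, and day in separate Try values lets a caller emit YYYY/MM/DD when every field parses and fall back to the raw text with a false "normalized" flag otherwise. Partial dates such as "11/2018" and relative dates such as "next monday" are not resolved here.

```scala
import scala.util.{Failure, Success, Try}

// Local stand-in mirroring the documented signature; not the library's class.
final case class MyCalendar(year: Try[Int], month: Try[Int], day: Try[Int])

object MyCalendarSketch {
  // Hypothetical helper: split "MM/DD/YYYY" (assumed order) and parse each
  // field on its own, so a bad or missing field fails only its own Try.
  def parse(text: String): MyCalendar = {
    val parts = text.trim.split("/")
    def field(i: Int): Try[Int] = Try(parts(i).toInt)
    MyCalendar(year = field(2), month = field(0), day = field(1))
  }

  // Hypothetical helper: returns the output text and a "normalized" flag.
  def normalize(text: String): (String, Boolean) = {
    val cal = parse(text)
    // The for-comprehension succeeds only if all three fields succeeded.
    val formatted = for {
      y <- cal.year
      m <- cal.month
      d <- cal.day
    } yield f"$y%04d/$m%02d/$d%02d"
    formatted match {
      case Success(date) => (date, true)
      case Failure(_)    => (text, false) // leave the input unchanged
    }
  }
}
```

Under these assumptions, MyCalendarSketch.normalize("08/02/2018") yields ("2018/08/02", true), while MyCalendarSketch.normalize("next monday") yields ("next monday", false).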

Value Members

  1. object DateHelper
