package classification

Type Members

  1. class DocumentLogRegClassifierApproach extends AnnotatorApproach[DocumentLogRegClassifierModel] with CheckLicense

    Trains a model that classifies documents using logistic regression. The training data must contain a text column and a label column. Fitting produces a trained DocumentLogRegClassifierModel.

    Example

    Define pipeline stages to prepare the data

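    // Imports for the base components and annotators used below
    import com.johnsnowlabs.nlp.base._
    import com.johnsnowlabs.nlp.annotator._

    val document_assembler = new DocumentAssembler()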
      .setInputCol("text")
      .setOutputCol("document")
    
    val tokenizer = new Tokenizer()
      .setInputCols("document")
      .setOutputCol("token")
    
    val normalizer = new Normalizer()
      .setInputCols("token")
      .setOutputCol("normalized")
    
    val stopwords_cleaner = new StopWordsCleaner()
      .setInputCols("normalized")
      .setOutputCol("cleanTokens")
      .setCaseSensitive(false)
    
    val stemmer = new Stemmer()
      .setInputCols("cleanTokens")
      .setOutputCol("stem")
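
    To make the purpose of the preparation stages above more concrete, the following plain-Scala sketch mimics what each stage roughly does to a single string. It is only an illustration: the object PreprocessSketch, its stop word list, and the naive suffix rule are hypothetical stand-ins, not the Spark NLP implementations, and the real annotators handle far more cases.

```scala
// Toy stand-ins for the pipeline stages; NOT the Spark NLP implementations.
object PreprocessSketch {
  // Hypothetical, tiny stop word list chosen for this illustration only.
  val stopWords: Set[String] = Set("the", "a", "is", "of", "and")

  // Tokenizer stand-in: split on whitespace.
  def tokenize(text: String): Seq[String] =
    text.split("\\s+").filter(_.nonEmpty).toSeq

  // Normalizer stand-in: drop every character that is not a letter.
  def normalize(tokens: Seq[String]): Seq[String] =
    tokens.map(_.replaceAll("[^\\p{L}]", "")).filter(_.nonEmpty)

  // StopWordsCleaner stand-in with case-insensitive matching.
  def removeStopWords(tokens: Seq[String]): Seq[String] =
    tokens.filterNot(t => stopWords.contains(t.toLowerCase))

  // Stemmer stand-in: lowercase and strip one trailing "s".
  // A real stemmer applies many more rules than this.
  def stem(tokens: Seq[String]): Seq[String] =
    tokens.map(_.toLowerCase).map { t =>
      if (t.length > 3 && t.endsWith("s")) t.dropRight(1) else t
    }

  // Chains the stand-ins in the same order as the pipeline stages.
  def preprocess(text: String): Seq[String] =
    stem(removeStopWords(normalize(tokenize(text))))
}
```

    For example, PreprocessSketch.preprocess("The patient is stable, and reports of pain decreased.") yields the tokens patient, stable, report, pain, decreased. Tokens of this kind are what the classifier reads from the "stem" column.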

    Define the document classifier and fit training data to it

    import org.apache.spark.ml.Pipeline

    val logreg = new DocumentLogRegClassifierApproach()
      .setInputCols("stem")
      .setLabelCol("category")
      .setOutputCol("prediction")
    
    val pipeline = new Pipeline().setStages(Array(
      document_assembler,
      tokenizer,
      normalizer,
      stopwords_cleaner,
      stemmer,
      logreg
    ))
    
    val model = pipeline.fit(trainingData)

    See also

    DocumentLogRegClassifierModel for instantiated models
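
    The example above uses trainingData without showing how it is built. The sketch below is one possible way to create such a DataFrame and to apply the fitted pipeline afterwards. It assumes a running SparkSession and the column names from the example ("text", "category", "prediction"). The object TrainingDataSketch and the sample rows are invented for illustration.

```scala
import org.apache.spark.ml.PipelineModel
import org.apache.spark.sql.{DataFrame, SparkSession}

object TrainingDataSketch {
  // Builds a tiny DataFrame with the two columns the example expects:
  // "text" (read by DocumentAssembler) and "category" (passed to setLabelCol).
  // The rows are made up for illustration only.
  def buildTrainingData(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq(
      ("The team won the championship game last night.", "sports"),
      ("The central bank raised interest rates again.", "finance"),
      ("The striker scored twice in the second half.", "sports"),
      ("Shares fell after the quarterly earnings report.", "finance")
    ).toDF("text", "category")
  }

  // Applies a fitted pipeline and keeps the input text next to the prediction.
  // "prediction" is the output column set on the classifier in the example;
  // "result" is the annotation field that holds the predicted label.
  def predict(model: PipelineModel, data: DataFrame): DataFrame =
    model.transform(data).selectExpr("text", "prediction.result AS predicted")
}
```

    With these helpers, val trainingData = TrainingDataSketch.buildTrainingData(spark) could feed pipeline.fit, and TrainingDataSketch.predict(model, trainingData).show(false) would display the predicted labels. A real training set needs far more than four rows.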

  2. class DocumentLogRegClassifierModel extends Model[DocumentLogRegClassifierModel] with RawAnnotator[DocumentLogRegClassifierModel] with CanBeLazy with CheckLicense

    Classifies documents using logistic regression. No pretrained models are currently available; see DocumentLogRegClassifierApproach to train your own model.

    Check the Models Hub for models that may become available in the future.
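
    Because no pretrained models are available, a model trained with DocumentLogRegClassifierApproach usually has to be stored and reloaded by the user. The sketch below relies on the standard Spark ML persistence calls for a fitted PipelineModel. The object PersistenceSketch and the path argument are hypothetical names chosen for illustration.

```scala
import org.apache.spark.ml.PipelineModel

object PersistenceSketch {
  // Writes the fitted pipeline (including the trained classifier stage)
  // to the given path, replacing anything already stored there.
  def save(model: PipelineModel, path: String): Unit =
    model.write.overwrite().save(path)

  // Reads a previously saved pipeline back for inference.
  def load(path: String): PipelineModel =
    PipelineModel.load(path)
}
```

    A typical round trip would be PersistenceSketch.save(model, path) after training, followed later by PersistenceSketch.load(path).transform(newData). Saving the whole pipeline rather than only the classifier stage keeps the preprocessing identical between training and inference.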

  3. trait ReadablePretrainedDocumentLogRegClassifierModel extends ParamsAndFeaturesReadable[DocumentLogRegClassifierModel] with HasPretrained[DocumentLogRegClassifierModel]
