sequence_classification

package sequence_classification

Type Members

class FinanceBertForSequenceClassification extends MedicalBertForSequenceClassification

MedicalBertForSequenceClassification can load Bert Models with sequence classification/regression head on top (a linear layer on top of the pooled output) e.g. for multi-class document classification tasks.
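
To make the phrase "a linear layer on top of the pooled output" concrete, the following is a minimal sketch in plain Scala of what such a classification head computes: logits from a weight matrix and bias, a softmax, and an argmax. It is not the library's implementation; the object name ClassificationHeadSketch, its methods, and all numbers are hypothetical and chosen only for illustration.

```scala
// Illustrative sketch only: none of these names belong to Spark NLP.
object ClassificationHeadSketch {

  // Linear layer: logits(k) = sum_j weights(k)(j) * pooled(j) + bias(k)
  def logits(
      pooled: Vector[Double],
      weights: Vector[Vector[Double]],
      bias: Vector[Double]): Vector[Double] =
    weights.zip(bias).map { case (row, b) =>
      row.zip(pooled).map { case (w, x) => w * x }.sum + b
    }

  // Numerically stable softmax: subtract the largest logit before exponentiating.
  def softmax(zs: Vector[Double]): Vector[Double] = {
    val m = zs.max
    val exps = zs.map(z => math.exp(z - m))
    val total = exps.sum
    exps.map(_ / total)
  }

  // Index of the most probable class for one pooled vector.
  def predict(
      pooled: Vector[Double],
      weights: Vector[Vector[Double]],
      bias: Vector[Double]): Int = {
    val probs = softmax(logits(pooled, weights, bias))
    probs.indexOf(probs.max)
  }
}

// Hypothetical 2-dimensional pooled output and a 2-class head.
val pooled = Vector(1.0, 2.0)
val weights = Vector(Vector(1.0, 0.0), Vector(0.0, 1.0))
val bias = Vector(0.0, 0.0)
val predictedClass = ClassificationHeadSketch.predict(pooled, weights, bias)
```

In the annotator itself, the predicted class index would correspond to one of the model's label strings, which is what appears in the result field of the output column.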

Pretrained models can be loaded with pretrained of the companion object:

val sequenceClassifier = MedicalBertForSequenceClassification.pretrained()
  .setInputCols("token", "document")
  .setOutputCol("label")

The default model is "bert_sequence_classifier_ade", if no name is provided.

For available pretrained models please see the Models Hub.

Models from the HuggingFace 🤗 Transformers library are also compatible with Spark NLP 🚀. To see how to import them, refer to https://github.com/JohnSnowLabs/spark-nlp/discussions/5669.

Example

import spark.implicits._
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols("document")
  .setOutputCol("token")

val sequenceClassifier = MedicalBertForSequenceClassification.pretrained()
  .setInputCols("token", "document")
  .setOutputCol("label")
  .setCaseSensitive(true)

val pipeline = new Pipeline().setStages(Array(
  documentAssembler,
  tokenizer,
  sequenceClassifier
))

val data = Seq("John Lenon was born in London and lived in Paris. My name is Sarah and I live in London").toDF("text")
val result = pipeline.fit(data).transform(data)

result.select("label.result").show(false)
+--------------------+
|result              |
+--------------------+
|[True, False]       |
+--------------------+

See also: MedicalBertForSequenceClassification for sequence-level classification
Annotators Main Page for a list of transformer-based classifiers

class FinanceClassifierDLApproach extends ClassifierDLApproach

MedicalBertForSequenceClassification can load Bert Models with sequence classification/regression head on top (a linear layer on top of the pooled output) e.g. for multi-class document classification tasks.

Pretrained models can be loaded with pretrained of the companion object:

val sequenceClassifier = MedicalBertForSequenceClassification.pretrained()
  .setInputCols("token", "document")
  .setOutputCol("label")

The default model is "bert_sequence_classifier_ade", if no name is provided.

For available pretrained models please see the Models Hub.

Example

import spark.implicits._
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols("document")
  .setOutputCol("token")

val sequenceClassifier = MedicalBertForSequenceClassification.pretrained()
  .setInputCols("token", "document")
  .setOutputCol("label")
  .setCaseSensitive(true)

val pipeline = new Pipeline().setStages(Array(
  documentAssembler,
  tokenizer,
  sequenceClassifier
))

val data = Seq("John Lenon was born in London and lived in Paris. My name is Sarah and I live in London").toDF("text")
val result = pipeline.fit(data).transform(data)

result.select("label.result").show(false)
+--------------------+
|result              |
+--------------------+
|[True, False]       |
+--------------------+

See also: MedicalBertForSequenceClassification for sequence-level classification
Annotators Main Page for a list of transformer-based classifiers

class FinanceClassifierDLModel extends ClassifierDLModel

MedicalBertForSequenceClassification can load Bert Models with sequence classification/regression head on top (a linear layer on top of the pooled output) e.g. for multi-class document classification tasks.

Pretrained models can be loaded with pretrained of the companion object:

val sequenceClassifier = MedicalBertForSequenceClassification.pretrained()
  .setInputCols("token", "document")
  .setOutputCol("label")

The default model is "bert_sequence_classifier_ade", if no name is provided.

For available pretrained models please see the Models Hub.

Example

import spark.implicits._
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols("document")
  .setOutputCol("token")

val sequenceClassifier = MedicalBertForSequenceClassification.pretrained()
  .setInputCols("token", "document")
  .setOutputCol("label")
  .setCaseSensitive(true)

val pipeline = new Pipeline().setStages(Array(
  documentAssembler,
  tokenizer,
  sequenceClassifier
))

val data = Seq("John Lenon was born in London and lived in Paris. My name is Sarah and I live in London").toDF("text")
val result = pipeline.fit(data).transform(data)

result.select("label.result").show(false)
+--------------------+
|result              |
+--------------------+
|[True, False]       |
+--------------------+

See also: MedicalBertForSequenceClassification for sequence-level classification
Annotators Main Page for a list of transformer-based classifiers

class FinanceDocumentMLClassifierApproach extends DocumentMLClassifierApproach

Trains a model to classify documents with a Logistic Regression algorithm. Training data requires a column for the text and a column for its label. The result is a trained DocumentLogRegClassifierModel.
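
As a rough illustration of how a logistic regression classifier scores a document once it has been trained, the plain-Scala sketch below computes a bag-of-words score and passes it through the logistic function. It covers the prediction step only, for the binary case, and is not the annotator's actual implementation; the object LogisticRegressionSketch, its methods, and the example weights are all hypothetical.

```scala
// Illustrative sketch only: none of these names belong to Spark NLP.
object LogisticRegressionSketch {

  // Logistic (sigmoid) function: maps a real-valued score into (0, 1).
  def sigmoid(z: Double): Double = 1.0 / (1.0 + math.exp(-z))

  // Bag-of-words term counts for an already tokenized document.
  def termCounts(tokens: Seq[String]): Map[String, Int] =
    tokens.groupBy(identity).map { case (term, occurrences) => term -> occurrences.size }

  // Probability of the positive class given per-term weights and a bias.
  // Terms without a learned weight contribute nothing to the score.
  def probability(tokens: Seq[String], weights: Map[String, Double], bias: Double): Double = {
    val score = termCounts(tokens).toSeq.map { case (term, count) =>
      weights.getOrElse(term, 0.0) * count
    }.sum + bias
    sigmoid(score)
  }
}

// Hypothetical weights, as if learned during training.
val exampleWeights = Map("profit" -> 1.0, "loss" -> -2.0)
val exampleProbability =
  LogisticRegressionSketch.probability(Seq("profit", "profit", "loss"), exampleWeights, bias = 0.0)
```

Here the two occurrences of "profit" and the single "loss" cancel out, so the score is zero and the model is undecided between the two classes.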

Example

Define pipeline stages to prepare the data

val document_assembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols("document")
  .setOutputCol("token")

val normalizer = new Normalizer()
  .setInputCols("token")
  .setOutputCol("normalized")

val stopwords_cleaner = new StopWordsCleaner()
  .setInputCols("normalized")
  .setOutputCol("cleanTokens")
  .setCaseSensitive(false)

val stemmer = new Stemmer()
  .setInputCols("cleanTokens")
  .setOutputCol("stem")

Define the document classifier and fit training data to it

val logreg = new DocumentLogRegClassifierApproach()
  .setInputCols("stem")
  .setLabelCol("category")
  .setOutputCol("prediction")

val pipeline = new Pipeline().setStages(Array(
  document_assembler,
  tokenizer,
  normalizer,
  stopwords_cleaner,
  stemmer,
  logreg
))

// trainingData is assumed to be a DataFrame with a "text" column and a "category" label column
val model = pipeline.fit(trainingData)

See also: DocumentLogRegClassifierModel for instantiated models

class FinanceDocumentMLClassifierModel extends DocumentMLClassifierModel

Classifies documents with a Logistic Regression algorithm. Currently there are no pretrained models available. Please see DocumentLogRegClassifierApproach to train your own model.

Please check out the Models Hub for available models in the future.

class FinanceFewShotClassifierApproach extends FewShotClassifierApproach

class FinanceFewShotClassifierModel extends FewShotClassifierModel

trait ReadFinanceBertForSequenceClassificationTensorflowModel extends ReadTensorflowModel

trait ReadablePretrainedFinanceBertForSequenceModel extends ParamsAndFeaturesReadable[FinanceBertForSequenceClassification] with HasPretrained[FinanceBertForSequenceClassification]

Value Members

object FinanceBertForSequenceClassification extends ReadablePretrainedFinanceBertForSequenceModel with ReadFinanceBertForSequenceClassificationTensorflowModel with Serializable

This is the companion object of FinanceBertForSequenceClassification. Please refer to that class for the documentation.

object FinanceClassifierDLModel extends ReadablePretrainedClassifierDL with ReadClassifierDLTensorflowModel with Serializable

object FinanceDocumentMLClassifierModel extends ReadablePretrainedDocumentMLClassifierModel with Serializable

object FinanceFewShotClassifierApproach extends FinanceFewShotClassifierApproach

object FinanceFewShotClassifierModel extends ReadsGenericClassifierGraph[FinanceFewShotClassifierModel] with ReadablePretrainedGenericClassifier[FinanceFewShotClassifierModel] with Serializable
