Explain Document DL

Description

explain_document_dl is a pretrained pipeline that runs text through a fixed sequence of basic processing steps: sentence segmentation, tokenization, spell checking, lemmatization, stemming, and part-of-speech tagging.

How to use



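import sparknlp
from sparknlp.pretrained import PretrainedPipeline

spark = sparknlp.start()  # a Spark session must be running before the pipeline is loaded
annotations = PretrainedPipeline('explain_document_dl', lang='en').annotate('Hello world!')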

Results

+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+
|                text|            document|            sentence|               token|               spell|              lemmas|               stems|                 pos|
+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+
|French author who...|[[document, 0, 23...|[[document, 0, 57...|[[token, 0, 5, Fr...|[[token, 0, 5, Fr...|[[token, 0, 5, Fr...|[[token, 0, 5, fr...|[[pos, 0, 5, JJ, ...|
+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+
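
The columns above are stored one list per annotator, which is awkward when you want to look at a single token together with its lemma, stem, and tag. The following is a minimal sketch, not part of Spark NLP, that regroups such a column-oriented result into one record per token. It assumes the result is a plain dictionary mapping column names to equal-length lists of strings, as annotate returns for a single input string. The helper name rows_per_token and the default column names (copied from the table header above) are assumptions for illustration and may not match the keys of a given pipeline version.

```python
from typing import Dict, List, Sequence

# Column names copied from the Results table header; this is an assumption,
# since a given pipeline version may name its output columns differently.
TOKEN_COLUMNS = ("token", "spell", "lemmas", "stems", "pos")


def rows_per_token(annotations: Dict[str, List[str]],
                   columns: Sequence[str] = TOKEN_COLUMNS) -> List[Dict[str, str]]:
    """Regroup a column-oriented annotation dict into one record per token."""
    # Only use the requested columns that are actually present in the result.
    present = [c for c in columns if c in annotations]
    if not present:
        return []
    # Token-level columns are expected to line up one-to-one; fail loudly if not.
    lengths = {len(annotations[c]) for c in present}
    if len(lengths) != 1:
        raise ValueError(f"token-level columns differ in length: {sorted(lengths)}")
    return [dict(zip(present, values))
            for values in zip(*(annotations[c] for c in present))]


# Hand-written placeholder values, NOT real model output.
sample = {
    "token": ["Hello", "world", "!"],
    "pos": ["TAG_1", "TAG_2", "TAG_3"],
}
for row in rows_per_token(sample):
    print(row)
```

With a real result, pass the dictionary returned by annotate in place of sample. Raising on mismatched lengths is a deliberate choice here: silently truncating with zip would hide a misaligned column.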

Model Information

Model Name: explain_document_dl
Type: pipeline
Compatibility: Spark NLP 2.5.5+
License: Open Source
Edition: Community
Language: [en]

Included Models

The explain_document_dl pipeline has one Transformer and six annotators:

  • DocumentAssembler - A Transformer that creates a column that contains documents.
  • Sentence Segmenter - An annotator that produces the sentences of the document.
  • Tokenizer - An annotator that produces the tokens of the sentences.
  • SpellChecker - An annotator that produces the spelling-corrected tokens.
  • Stemmer - An annotator that produces the stems of the tokens.
  • Lemmatizer - An annotator that produces the lemmas of the tokens.
  • POS Tagger - An annotator that produces the parts of speech of the associated tokens.