Explain Document DL Pipeline for English

Description

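explain_document_dl is a pretrained pipeline that performs the basic text processing steps and recognizes named entities. Its output columns, as shown in the results, are document, sentence, token, checked, lemma, stem, pos, embeddings, ner, and entities. It covers most common text processing tasks and can be applied to a DataFrame or directly to a string.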


How to use


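import sparknlp
from sparknlp.pretrained import PretrainedPipeline
spark = sparknlp.start()  # start or reuse a Spark session with Spark NLP loaded
pipeline = PretrainedPipeline('explain_document_dl', lang='en')
# fullAnnotate returns one dict per input text, keyed by output column
annotations = pipeline.fullAnnotate("The Mona Lisa is an oil painting from the 16th century.")[0]
annotations.keys()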


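import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
val pipeline = new PretrainedPipeline("explain_document_dl", lang = "en")
// For a single String, fullAnnotate returns one Map keyed by output column, so no indexing is needed
val result = pipeline.fullAnnotate("The Mona Lisa is an oil painting from the 16th century.")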



import nlu
text = ["The Mona Lisa is an oil painting from the 16th century."]
result_df = nlu.load('en.explain.dl').predict(text)
result_df
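
The Python fullAnnotate call above returns, for each input text, a dict that maps every output column to a list of annotation objects whose result attribute holds the string value. The following is a minimal sketch of flattening that dict into plain strings. The helper name results_by_column is hypothetical, and the SimpleNamespace objects are hand-built stand-ins for real annotations so that the snippet runs without a Spark session.

```python
from types import SimpleNamespace


def results_by_column(annotations):
    """Map each output column to the list of its annotations' `result` strings.

    `annotations` is expected to look like one element of the list returned by
    pipeline.fullAnnotate(...): a dict of column name -> list of objects that
    expose a `.result` attribute.
    """
    return {
        column: [item.result for item in items]
        for column, items in annotations.items()
    }


# Hand-built stand-in for a fullAnnotate result (assumption: not real pipeline output).
fake_annotations = {
    "token": [SimpleNamespace(result="Mona"), SimpleNamespace(result="Lisa")],
    "entities": [SimpleNamespace(result="Mona Lisa")],
}

flat = results_by_column(fake_annotations)
print(flat["entities"])
```

With a real pipeline, the same helper could be called as results_by_column(annotations), where annotations is the dict obtained from fullAnnotate.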

Results

+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------+-----------+
|                                              text|                                          document|                                          sentence|                                             token|                                           checked|                                             lemma|                                              stem|                                               pos|                                        embeddings|                                         ner|   entities|
+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------+-----------+
|The Mona Lisa is an oil painting from the 16th ...|[The Mona Lisa is an oil painting from the 16th...|[The Mona Lisa is an oil painting from the 16th...|[The, Mona, Lisa, is, an, oil, painting, from, ...|[The, Mona, Lisa, is, an, oil, painting, from, ...|[The, Mona, Lisa, be, an, oil, painting, from, ...|[the, mona, lisa, i, an, oil, paint, from, the,...|[DT, NNP, NNP, VBZ, DT, NN, NN, IN, DT, JJ, NN, .]|[[-0.038194, -0.24487, 0.72812, -0.39961, 0.083...|[O, B-PER, I-PER, O, O, O, O, O, O, O, O, O]|[Mona Lisa]|
+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------------+--------------------------------------------+-----------+
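
The ner column holds one IOB tag per token, and the entities column holds the chunks obtained by merging each B- tag with the I- tags that follow it. The sketch below illustrates that grouping in plain Python. It is not Spark NLP's own implementation, the function name group_iob is hypothetical, and the token list is written out by hand from the example sentence because the table truncates it.

```python
def group_iob(tokens, tags):
    """Merge tokens tagged B-X / I-X into (chunk_text, label) pairs.

    Simplification: an I- tag that does not continue an open chunk of the
    same label is ignored rather than starting a new chunk.
    """
    chunks = []
    current_tokens, current_label = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts; close any chunk that is still open.
            if current_tokens:
                chunks.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [token], tag[2:]
        elif tag.startswith("I-") and current_tokens and tag[2:] == current_label:
            # Continuation of the open chunk.
            current_tokens.append(token)
        else:
            # "O" (or a stray I- tag) closes any open chunk.
            if current_tokens:
                chunks.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [], None
    if current_tokens:
        chunks.append((" ".join(current_tokens), current_label))
    return chunks


# Tokens typed by hand from the example sentence; tags copied from the ner column.
tokens = ["The", "Mona", "Lisa", "is", "an", "oil", "painting",
          "from", "the", "16th", "century", "."]
tags = ["O", "B-PER", "I-PER", "O", "O", "O", "O", "O", "O", "O", "O", "O"]

chunks = group_iob(tokens, tags)
print(chunks)  # [('Mona Lisa', 'PER')]
```

The single chunk matches the [Mona Lisa] value in the entities column.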

Model Information

Model Name: explain_document_dl
Type: pipeline
Compatibility: Spark NLP 3.0.0+
License: Open Source
Edition: Official
Language: en