XLM-RoBERTa 40-Language NER Pipeline

Description

This pretrained pipeline is built on top of the xlm_roberta_token_classifier_ner_40_lang model.

Predicted Entities

How to use

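# Python; assumes a Spark NLP session is already running (e.g. started with sparknlp.start())
from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("xlm_roberta_token_classifier_ner_40_lang_pipeline", lang = "xx")

pipeline.annotate(["My name is John and I work at John Snow Labs.", "انا اسمي احمد واعمل في ارامكو"])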

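// Scala; assumes a Spark session with Spark NLP on the classpath
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val pipeline = new PretrainedPipeline("xlm_roberta_token_classifier_ner_40_lang_pipeline", lang = "xx")

pipeline.annotate(Array("My name is John and I work at John Snow Labs.", "انا اسمي احمد واعمل في ارامكو"))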

Results

+--------------+---------+
|chunk         |ner_label|
+--------------+---------+
|John          |PER      |
|John Snow Labs|ORG      |
|احمد          |PER      |
|ارامكو        |ORG      |
+--------------+---------+

Model Information

Model Name:	xlm_roberta_token_classifier_ner_40_lang_pipeline
Type:	pipeline
Compatibility:	Spark NLP 4.0.0+
License:	Open Source
Edition:	Official
Language:	xx
Size:	967.7 MB

Included Models

DocumentAssembler
SentenceDetector
TokenizerModel
XlmRoBertaForTokenClassification
NerConverter
Finisher
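
The stages above run in order: the token classifier assigns one tag per token, and NerConverter then merges consecutive tagged tokens into the chunks shown under Results. The sketch below illustrates that merging step in plain Python, without Spark. It is not the NerConverter implementation. It assumes IOB2-style tags such as B-PER and I-PER, which this page does not confirm, and the function name group_iob_tags and the sample tag sequence are hypothetical.

```python
from typing import List, Optional, Tuple


def group_iob_tags(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge token-level IOB2 tags into (chunk_text, label) pairs.

    Hypothetical helper for illustration only; assumes tags look like
    "B-PER", "I-PER", or "O".
    """
    chunks: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_label: Optional[str] = None

    def flush() -> None:
        # Close the chunk being built, if any, and reset the state.
        nonlocal current_tokens, current_label
        if current_tokens:
            chunks.append((" ".join(current_tokens), current_label))
        current_tokens, current_label = [], None

    for token, tag in zip(tokens, tags):
        if tag == "O":
            # Outside any entity: finish whatever chunk was open.
            flush()
            continue
        prefix, _, label = tag.partition("-")
        # A "B-" tag or a change of label starts a new chunk.
        if prefix == "B" or label != current_label:
            flush()
            current_label = label
        current_tokens.append(token)

    flush()
    return chunks


# Assumed tag sequence for the first example sentence on this page.
tokens = ["My", "name", "is", "John", "and", "I", "work", "at",
          "John", "Snow", "Labs", "."]
tags = ["O", "O", "O", "B-PER", "O", "O", "O", "O",
        "B-ORG", "I-ORG", "I-ORG", "O"]

print(group_iob_tags(tokens, tags))
# → [('John', 'PER'), ('John Snow Labs', 'ORG')]
```

Joining tokens with a single space is a simplification; it works for the example sentence but would not preserve the original spacing in every language.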
