XLM-RoBERTa 40-Language NER Pipeline

Description

This pretrained pipeline is built on top of the xlm_roberta_token_classifier_ner_40_lang model.

Predicted Entities


How to use


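from sparknlp.pretrained import PretrainedPipeline

# Download (or load from cache) the multilingual pipeline; "xx" is the multi-language code
pipeline = PretrainedPipeline("xlm_roberta_token_classifier_ner_40_lang_pipeline", lang = "xx")

pipeline.annotate(["My name is John and I work at John Snow Labs.", "انا اسمي احمد واعمل في ارامكو"])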

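import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

// Download (or load from cache) the multilingual pipeline; "xx" is the multi-language code
val pipeline = new PretrainedPipeline("xlm_roberta_token_classifier_ner_40_lang_pipeline", lang = "xx")

pipeline.annotate(Array("My name is John and I work at John Snow Labs.", "انا اسمي احمد واعمل في ارامكو"))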
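In Python, calling annotate on a list of texts returns one dictionary per text, keyed by output column name. This card does not list those column names, so the sketch below discovers them from the result instead of hard-coding them. The helper output_columns is hypothetical, and the sample data is a stand-in with made-up keys, not actual output of this pipeline.

```python
def output_columns(results):
    """Collect the distinct output column names across annotated texts."""
    # A single input string is assumed to yield one dict rather than a list.
    if isinstance(results, dict):
        results = [results]
    return sorted({column for doc in results for column in doc})


# Stand-in data shaped like an annotate() result; key names are hypothetical.
sample = [
    {"token": ["My", "name"], "ner": ["O", "O"]},
    {"token": ["انا"], "ner": ["O"]},
]

print(output_columns(sample))
```

With a real session, passing the value returned by pipeline.annotate(...) in place of sample would show which columns the pipeline actually exposes.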

Results


+--------------+---------+
|chunk         |ner_label|
+--------------+---------+
|John          |PER      |
|John Snow Labs|ORG      |
|احمد          |PER      |
|ارامكو        |ORG      |
+--------------+---------+

Model Information

Model Name: xlm_roberta_token_classifier_ner_40_lang_pipeline
Type: pipeline
Compatibility: Spark NLP 4.0.0+
License: Open Source
Edition: Official
Language: xx
Size: 967.7 MB

Included Models

  • DocumentAssembler
  • SentenceDetector
  • TokenizerModel
  • XlmRoBertaForTokenClassification
  • NerConverter
  • Finisher
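
To illustrate the step between the token classifier and the chunk table in Results, here is a simplified pure-Python sketch of grouping per-token tags into entity chunks, which is the kind of work a converter stage does. It is not Spark NLP's implementation: the function group_entities is hypothetical, the IOB-style tags (B-PER, I-ORG) are an assumption about the tag scheme, and joining tokens with spaces stands in for using the original character offsets.

```python
def group_entities(tokens, tags):
    """Merge consecutive tagged tokens into (chunk, label) pairs."""
    chunks, words, label = [], [], None
    # The trailing ("", "O") sentinel flushes a chunk that ends the sentence.
    for token, tag in list(zip(tokens, tags)) + [("", "O")]:
        inside = tag != "O"
        name = tag.split("-", 1)[-1]   # "B-PER" -> "PER", plain "PER" stays "PER"
        begins = tag.startswith("B-")
        # Close the open chunk when the entity ends, restarts, or changes type.
        if words and (not inside or begins or name != label):
            chunks.append((" ".join(words), label))
            words = []
        if inside:
            words.append(token)
            label = name
    return chunks


tokens = "My name is John and I work at John Snow Labs .".split()
tags = ["O", "O", "O", "B-PER", "O", "O", "O", "O", "B-ORG", "I-ORG", "I-ORG", "O"]
print(group_entities(tokens, tags))
```

The sample tags are hand-written for illustration and were not produced by the model.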