Detect Persons, Locations, Organizations and Misc Entities - PT (WikiNER 840B 300)

Description

WikiNER is a Named Entity Recognition (NER) model: it annotates text to mark spans such as the names of people, places, and organizations. The model does not read words directly. Instead it reads word embeddings, which represent words as points in a vector space so that semantically similar words lie closer together. WikiNER 840B 300 was trained with GloVe 840B 300 word embeddings, so use the same embeddings in the pipeline.
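To make "closer together" concrete, here is a minimal sketch that compares word vectors with cosine similarity. The three-dimensional vectors and the words chosen are invented for illustration only; real GloVe 840B 300 vectors have 300 dimensions, and nothing here reads the actual embedding files.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy vectors, made up for this example.
toy_embeddings = {
    "Lisboa": [0.9, 0.8, 0.1],
    "Porto": [0.8, 0.9, 0.2],
    "banana": [0.1, 0.2, 0.9],
}

city_pair = cosine_similarity(toy_embeddings["Lisboa"], toy_embeddings["Porto"])
mixed_pair = cosine_similarity(toy_embeddings["Lisboa"], toy_embeddings["banana"])

# Two city names score higher than a city name paired with an unrelated word.
print(city_pair > mixed_pair)
```

An NER model that consumes embeddings can benefit from this geometry, since words it never saw during training may still sit near words it did see.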

Predicted Entities

Persons, Locations, Organizations, Misc.
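NER models of this kind typically emit one tag per token rather than ready-made entity spans. The sketch below shows one way to merge per-token tags into entities in plain Python. It assumes IOB-style tags with the labels PER, LOC, ORG and MISC; the exact tag strings are an assumption and should be checked against the model's actual output. The function name group_entities is hypothetical.

```python
def group_entities(tokens, tags):
    """Merge consecutive tokens that share an entity label into (text, label) chunks."""
    chunks = []
    current_tokens, current_label = [], None
    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if tag == "O" or not label:
            # Token is outside any entity.
            starts_new, label = False, None
        else:
            # A B- prefix or a change of label opens a new entity.
            starts_new = prefix == "B" or label != current_label
        if current_label is not None and (label is None or starts_new):
            chunks.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [], None
        if label is not None:
            current_tokens.append(token)
            current_label = label
    if current_label is not None:
        chunks.append((" ".join(current_tokens), current_label))
    return chunks

# Assumed tag strings, for illustration only.
tokens = ["Cristiano", "Ronaldo", "nasceu", "no", "Funchal"]
tags = ["B-PER", "I-PER", "O", "O", "B-LOC"]
entities = group_entities(tokens, tags)
print(entities)
```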


How to use


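from sparknlp.annotator import NerDLModel

# The "embeddings" column must be produced by GloVe 840B 300, the embeddings used in training.
ner = NerDLModel.pretrained("wikiner_840B_300", "pt") \
        .setInputCols(["document", "token", "embeddings"]) \
        .setOutputCol("ner")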

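import com.johnsnowlabs.nlp.annotators.ner.dl.NerDLModel

// The "embeddings" column must be produced by GloVe 840B 300, the embeddings used in training.
val ner = NerDLModel.pretrained("wikiner_840B_300", "pt")
        .setInputCols(Array("document", "token", "embeddings"))
        .setOutputCol("ner")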
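The NER stage expects "document", "token" and "embeddings" columns to exist already, so it has to sit inside a larger pipeline. The following is a sketch of one possible arrangement, not a definitive setup. It assumes the input DataFrame has a column named "text", that the GloVe 840B 300 embeddings are published under the name "glove_840B_300" with language code "xx", and that Spark NLP and PySpark are installed. The helper names build_pipeline and run_example are hypothetical.

```python
# Column names wired through the pipeline; "text" is an assumed input column name.
TEXT_COL = "text"
NER_INPUT_COLS = ["document", "token", "embeddings"]

def build_pipeline():
    """Assemble document -> token -> embeddings -> ner -> ner_chunk stages."""
    # Imported inside the function so this file can be loaded without Spark installed.
    from pyspark.ml import Pipeline
    from sparknlp.base import DocumentAssembler
    from sparknlp.annotator import Tokenizer, WordEmbeddingsModel, NerDLModel, NerConverter

    document = DocumentAssembler().setInputCol(TEXT_COL).setOutputCol("document")
    tokenizer = Tokenizer().setInputCols(["document"]).setOutputCol("token")
    # Assumption: the matching embeddings are available as "glove_840B_300" / "xx".
    embeddings = WordEmbeddingsModel.pretrained("glove_840B_300", "xx") \
        .setInputCols(["document", "token"]) \
        .setOutputCol("embeddings")
    ner = NerDLModel.pretrained("wikiner_840B_300", "pt") \
        .setInputCols(NER_INPUT_COLS) \
        .setOutputCol("ner")
    # NerConverter merges per-token tags into entity chunks.
    converter = NerConverter() \
        .setInputCols(["document", "token", "ner"]) \
        .setOutputCol("ner_chunk")
    return Pipeline(stages=[document, tokenizer, embeddings, ner, converter])

def run_example():
    """Start a Spark session and annotate one Portuguese sentence."""
    import sparknlp
    spark = sparknlp.start()
    data = spark.createDataFrame([["Fernando Pessoa nasceu em Lisboa."]]).toDF(TEXT_COL)
    model = build_pipeline().fit(data)
    model.transform(data).select("ner_chunk.result").show(truncate=False)
```

Calling run_example() in an environment with Spark NLP installed downloads both pretrained models on first use, which can take a while given the size of the embeddings.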

Model Information

Model Name: wikiner_840B_300
Type: ner
Compatibility: Spark NLP 2.5.0+
Edition: Official
License: Open Source
Input Labels: [sentence, token, embeddings]
Output Labels: [ner]
Language: pt
Case sensitive: false

Data Source

The model was trained on data from https://pt.wikipedia.org