Description
A Part of Speech classifier predicts a grammatical label for every token in the input text. It is implemented with an averaged perceptron architecture. This model was trained on additional medical data.
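For readers unfamiliar with the averaged perceptron, the following is a minimal, self-contained sketch of the general technique, not the Spark NLP implementation. The class name `AveragedPerceptron`, the `features` helper, the three toy features, and the tiny training set are all hypothetical and chosen only for illustration. The idea is to update weights only on mistakes, then replace each weight with its average over all training steps so that late updates do not dominate.

```python
from collections import defaultdict


class AveragedPerceptron:
    """Toy multiclass averaged perceptron (illustrative only)."""

    def __init__(self, labels):
        self.labels = list(labels)
        # weights[feature][label] -> current weight
        self.weights = defaultdict(lambda: defaultdict(float))
        # Bookkeeping for lazy averaging over all training steps.
        self._totals = defaultdict(float)
        self._last_update = defaultdict(int)
        self._step = 0

    def predict(self, features):
        scores = {label: 0.0 for label in self.labels}
        for feat in features:
            for label, weight in self.weights.get(feat, {}).items():
                scores[label] += weight
        # Ties are broken deterministically by label name.
        return max(self.labels, key=lambda label: (scores[label], label))

    def _bump(self, feat, label, delta):
        key = (feat, label)
        # Credit the old weight for every step it was in effect.
        self._totals[key] += (self._step - self._last_update[key]) * self.weights[feat][label]
        self._last_update[key] = self._step
        self.weights[feat][label] += delta

    def update(self, features, truth):
        self._step += 1
        guess = self.predict(features)
        if guess != truth:  # weights change only on mistakes
            for feat in features:
                self._bump(feat, truth, 1.0)
                self._bump(feat, guess, -1.0)

    def average(self):
        if self._step == 0:
            return
        for feat, by_label in self.weights.items():
            for label, weight in by_label.items():
                key = (feat, label)
                total = self._totals[key] + (self._step - self._last_update[key]) * weight
                by_label[label] = total / self._step


def features(words, i, prev_tag):
    # Hypothetical feature set: word identity, 2-character suffix, previous tag.
    word = words[i].lower()
    return [f"word={word}", f"suffix={word[-2:]}", f"prev_tag={prev_tag}"]


# Toy training data, invented for this sketch.
train = [
    (["the", "patient", "sleeps"], ["DET", "NOUN", "VERB"]),
    (["a", "nurse", "walks"], ["DET", "NOUN", "VERB"]),
]
model = AveragedPerceptron(["DET", "NOUN", "VERB"])
for _ in range(5):
    for words, tags in train:
        prev = "<START>"
        for i, gold in enumerate(tags):
            model.update(features(words, i, prev), gold)
            prev = gold
model.average()


def tag(words):
    prev, out = "<START>", []
    for i in range(len(words)):
        prev = model.predict(features(words, i, prev))
        out.append(prev)
    return out


print(tag(["a", "patient", "walks"]))
```

A production tagger uses a much richer feature set and far more training data; the sketch only shows the mistake-driven update and the averaging step.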
Predicted Entities
- PROPN
- PUNCT
- ADJ
- NOUN
- VERB
- DET
- ADP
- AUX
- PRON
- PART
- SCONJ
- NUM
- ADV
- CCONJ
- X
- INTJ
- SYM
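The label names above match the Universal Dependencies UPOS inventory. Assuming that correspondence holds for this model, a small lookup table can make tagger output easier to read. The names `UPOS_DESCRIPTIONS` and `describe` are hypothetical helpers, not part of Spark NLP.

```python
# Assumption: the labels follow the Universal Dependencies UPOS inventory.
UPOS_DESCRIPTIONS = {
    "PROPN": "proper noun",
    "PUNCT": "punctuation",
    "ADJ": "adjective",
    "NOUN": "noun",
    "VERB": "verb",
    "DET": "determiner",
    "ADP": "adposition",
    "AUX": "auxiliary",
    "PRON": "pronoun",
    "PART": "particle",
    "SCONJ": "subordinating conjunction",
    "NUM": "numeral",
    "ADV": "adverb",
    "CCONJ": "coordinating conjunction",
    "X": "other",
    "INTJ": "interjection",
    "SYM": "symbol",
}


def describe(tag: str) -> str:
    """Return a human-readable description, or a fallback for unlisted tags."""
    return UPOS_DESCRIPTIONS.get(tag, "unknown tag")


print(describe("ADP"))
```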
How to use
# Python
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, PerceptronModel

# Assumes `spark` is an active Spark NLP session with access to "clinical/models".
document_assembler = DocumentAssembler().setInputCol("text").setOutputCol("document")
tokenizer = Tokenizer().setInputCols(["document"]).setOutputCol("token")
pos = PerceptronModel.pretrained("pos_clinical", "en", "clinical/models").setInputCols(["token", "document"]).setOutputCol("pos")
pipeline = Pipeline(stages=[document_assembler, tokenizer, pos])
df = spark.createDataFrame([["POS assigns each token in a sentence a grammatical label"]], ["text"])
result = pipeline.fit(df).transform(df)
result.select("pos.result").show(truncate=False)
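// Scala
import com.johnsnowlabs.nlp.DocumentAssembler
import com.johnsnowlabs.nlp.annotators.Tokenizer
import com.johnsnowlabs.nlp.annotators.pos.perceptron.PerceptronModel
import org.apache.spark.ml.Pipeline
import spark.implicits._ // required for Seq(...).toDF; assumes an active `spark` session
val document_assembler = new DocumentAssembler().setInputCol("text").setOutputCol("document")
val tokenizer = new Tokenizer().setInputCols(Array("document")).setOutputCol("token")
val pos = PerceptronModel.pretrained("pos_clinical", "en", "clinical/models").setInputCols("token", "document").setOutputCol("pos")
val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, pos))
val df = Seq("POS assigns each token in a sentence a grammatical label").toDF("text")
val result = pipeline.fit(df).transform(df)
result.select("pos.result").show(false)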
# NLU
import nlu
nlu.load("pos.clinical").predict("POS assigns each token in a sentence a grammatical label")
Results
+------------------------------------------+
|result                                    |
+------------------------------------------+
|[NN, NNS, PND, NN, II, DD, NN, DD, JJ, NN]|
+------------------------------------------+
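The result column holds one tag per token, in token order. As a rough illustration in plain Python (no Spark session required), the tags shown above can be paired with the tokens of the example sentence. This sketch assumes that whitespace splitting reproduces the tokenizer output for this particular sentence, which contains no punctuation; the variable names are illustrative.

```python
sentence = "POS assigns each token in a sentence a grammatical label"
# Tags copied from the result table above.
tags = ["NN", "NNS", "PND", "NN", "II", "DD", "NN", "DD", "JJ", "NN"]

# Assumption: whitespace splitting matches the tokenizer for this sentence.
tokens = sentence.split()
if len(tokens) != len(tags):
    raise ValueError("token and tag counts differ")

pairs = list(zip(tokens, tags))
for token, tag in pairs:
    print(f"{token}\t{tag}")
```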
Model Information
| Model Name:    | pos_clinical      |
| Compatibility: | Spark NLP 3.0.0+  |
| License:       | Licensed          |
| Edition:       | Official          |
| Input Labels:  | [document, token] |
| Output Labels: | [pos]             |
| Language:      | en                |