Clean Slang in Texts

Description

clean_slang is a pretrained pipeline that tokenizes input text and normalizes slang terms to their standard English equivalents, for example "yo" to "hey" and "ya" to "you". It is built from a DocumentAssembler, a TokenizerModel, and a NormalizerModel, and it does not perform entity recognition.

How to use


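# Python
from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline('clean_slang', lang='en')

testDoc = '''
yo, what is wrong with ya?
'''

# Run the pipeline on the sample text
result = pipeline.annotate(testDoc)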

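// Scala
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val pipeline = new PretrainedPipeline("clean_slang", lang = "en")
// fullAnnotate on a single String returns a Map of output column name to annotations
val result = pipeline.fullAnnotate("Hello from John Snow Labs ! ")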

# NLU
import nlu

text = ["Hello from John Snow Labs ! "]
result_df = nlu.load('en.clean.slang').predict(text)
result_df

Results


['hey', 'what', 'is', 'wrong', 'with', 'you']

Model Information

Model Name: clean_slang
Type: pipeline
Compatibility: Spark NLP 4.0.0+
License: Open Source
Edition: Official
Language: en
Size: 32.1 KB

Included Models

  • DocumentAssembler
  • TokenizerModel
  • NormalizerModel
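
The three stages listed above can be approximated in plain Python to show the idea behind the pipeline. This is only a minimal sketch and not the Spark NLP implementation: the names `SLANG_TO_STANDARD`, `tokenize`, and `clean_slang_sketch` are hypothetical, the dictionary holds only the two mappings visible in the Results section, and lowercasing plus punctuation stripping are assumptions made for the sketch.

```python
import re

# Hypothetical slang dictionary. Only the two mappings visible in the
# Results section are included; the pipeline's real dictionary is not shown here.
SLANG_TO_STANDARD = {"yo": "hey", "ya": "you"}


def tokenize(text):
    """Lowercase the text and split it into word tokens, dropping punctuation.

    Lowercasing and punctuation removal are assumptions of this sketch.
    """
    return re.findall(r"[a-z0-9']+", text.lower())


def clean_slang_sketch(text, slang=SLANG_TO_STANDARD):
    """Replace each token found in the slang dictionary; keep other tokens unchanged."""
    return [slang.get(token, token) for token in tokenize(text)]


print(clean_slang_sketch("yo, what is wrong with ya?"))
# prints ['hey', 'what', 'is', 'wrong', 'with', 'you']
```

A dictionary lookup per token keeps the sketch easy to follow. Tokens that are not in the dictionary pass through untouched, which matches the unchanged words in the Results section.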