com.johnsnowlabs.nlp.annotators.ner
PretrainedZeroShotNERChunker
Companion object PretrainedZeroShotNERChunker
class PretrainedZeroShotNERChunker extends PretrainedZeroShotNER
A fine-tuned zero-shot named-entity recognition (NER) model. Performs NER on arbitrary text without task-specific labeled training.
In contrast to PretrainedZeroShotNER, this annotator outputs NER chunks directly instead of aligning its predictions to provided tokens.
Example
import com.johnsnowlabs.nlp.DocumentAssembler
import com.johnsnowlabs.nlp.annotators.ner.PretrainedZeroShotNERChunker
import com.johnsnowlabs.nlp.annotators.sbd.pragmatic.SentenceDetector
import org.apache.spark.ml.Pipeline
import spark.implicits._ // assumes an active SparkSession named `spark`

val text = """
  |Cristiano Ronaldo dos Santos Aveiro (Portuguese pronunciation: [kɾiʃˈtjɐnu ʁɔˈnaldu]; born 5 February
  |1985) is a Portuguese professional footballer who plays as a forward for and captains both Saudi Pro
  |League club Al Nassr and the Portugal national team. Widely regarded as one of the greatest players of
  |all time, Ronaldo has won five Ballon d'Or awards,[note 3] a record three UEFA Men's Player of the Year
  |Awards, and four European Golden Shoes, the most by a European player.
  """.stripMargin

val testData = Seq(text).toDF("text")

// Wrap the raw text column in document annotations.
val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

// Split each document into sentences.
val sentenceDetector = new SentenceDetector()
  .setInputCols(Array("document"))
  .setOutputCol("sentence")

// Zero-shot chunker: the labels are supplied at inference time.
val ner = PretrainedZeroShotNERChunker
  .pretrained()
  .setInputCols(Array("sentence"))
  .setOutputCol("ner_chunk")
  .setLabels(Array("person", "award", "date", "competitions", "teams"))

val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDetector, ner))
val results = pipeline.fit(testData).transform(testData)

// Explode the output column set above ("ner_chunk") into one row per chunk.
results.selectExpr("explode(ner_chunk)").show(1000, truncate = false)
Results:
+--------------------------------------------------------------------------------------------------------------------------------------------------+
|col |
+--------------------------------------------------------------------------------------------------------------------------------------------------+
|{chunk, 2, 37, Cristiano Ronaldo dos Santos Aveiro, {sentence -> 0, entity -> person, confidence -> 0.9144007, ner_source -> ner_chunk}, []} |
|{chunk, 93, 109, 5 February\r\n1985, {sentence -> 1, entity -> date, confidence -> 0.99999976, ner_source -> ner_chunk}, []} |
|{chunk, 196, 213, Saudi Pro\r\nLeague, {sentence -> 1, entity -> competitions, confidence -> 0.9926515, ner_source -> ner_chunk}, []} |
|{chunk, 219, 227, Al Nassr, {sentence -> 1, entity -> teams, confidence -> 0.99384415, ner_source -> ner_chunk}, []} |
|{chunk, 321, 328, Ronaldo, {sentence -> 2, entity -> person, confidence -> 0.999997, ner_source -> ner_chunk}, []} |
|{chunk, 342, 353, Ballon d'Or, {sentence -> 2, entity -> award, confidence -> 0.95896983, ner_source -> ner_chunk}, []} |
|{chunk, 385, 422, UEFA Men's Player of the Year\r\nAwards, {sentence -> 2, entity -> award, confidence -> 0.9687164, ner_source -> ner_chunk}, []}|
|{chunk, 433, 454, European Golden Shoes, {sentence -> 2, entity -> award, confidence -> 0.999326, ner_source -> ner_chunk}, []} |
+--------------------------------------------------------------------------------------------------------------------------------------------------+
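The result rows above can also be read as plain records. The following is a minimal sketch in plain Scala, with no Spark dependency, that mirrors the visible fields and groups the chunk texts by label. The `Chunk` case class and the `ChunkSummary` object are hypothetical names introduced for illustration; they are not part of the library. The values are copied from the output shown.

```scala
// Hypothetical plain-Scala mirror of the fields visible in the result rows;
// this is not the library's annotation class.
final case class Chunk(begin: Int, end: Int, text: String, label: String, confidence: Float)

object ChunkSummary {
  // Values copied from the example output above.
  val chunks: Seq[Chunk] = Seq(
    Chunk(2, 37, "Cristiano Ronaldo dos Santos Aveiro", "person", 0.9144007f),
    Chunk(93, 109, "5 February\r\n1985", "date", 0.99999976f),
    Chunk(196, 213, "Saudi Pro\r\nLeague", "competitions", 0.9926515f),
    Chunk(219, 227, "Al Nassr", "teams", 0.99384415f),
    Chunk(321, 328, "Ronaldo", "person", 0.999997f),
    Chunk(342, 353, "Ballon d'Or", "award", 0.95896983f),
    Chunk(385, 422, "UEFA Men's Player of the Year\r\nAwards", "award", 0.9687164f),
    Chunk(433, 454, "European Golden Shoes", "award", 0.999326f)
  )

  // Group chunk texts by predicted label, keeping document order within each label.
  def byLabel(cs: Seq[Chunk]): Map[String, Seq[String]] =
    cs.groupBy(_.label).map { case (label, group) => label -> group.map(_.text) }

  // True when the offsets span exactly as many characters as the chunk text.
  def spanMatchesText(c: Chunk): Boolean = c.end - c.begin == c.text.length
}
```

For the rows shown, `end - begin` equals the length of the chunk text (counting the line break as two characters), so `spanMatchesText` holds for every record in `chunks`. Whether that holds in general is not stated on this page.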
Linear Supertypes
Inherited
- PretrainedZeroShotNERChunker
- PretrainedZeroShotNER
- CheckLicense
- HasEngine
- WriteOpenvinoModel
- WriteSentencePieceModel
- InternalWriteOnnxModel
- HasBatchedAnnotate
- AnnotatorModel
- CanBeLazy
- RawAnnotator
- HasOutputAnnotationCol
- HasInputAnnotationCols
- HasOutputAnnotatorType
- ParamsAndFeaturesWritable
- HasFeatures
- DefaultParamsWritable
- MLWritable
- Model
- Transformer
- PipelineStage
- Logging
- Params
- Serializable
- Serializable
- Identifiable
- AnyRef
- Any
Instance Constructors
Type Members
Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
final
def
$[T](param: Param[T]): T
- Attributes
- protected
- Definition Classes
- Params
-
def
$$[T](feature: StructFeature[T]): T
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
$$[K, V](feature: MapFeature[K, V]): Map[K, V]
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
$$[T](feature: SetFeature[T]): Set[T]
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
$$[T](feature: ArrayFeature[T]): Array[T]
- Attributes
- protected
- Definition Classes
- HasFeatures
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
_transform(dataset: Dataset[_], recursivePipeline: Option[PipelineModel]): DataFrame
- Attributes
- protected
- Definition Classes
- AnnotatorModel
-
def
afterAnnotate(dataset: DataFrame): DataFrame
- Attributes
- protected
- Definition Classes
- AnnotatorModel
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
def
batchAnnotate(batchedAnnotations: Seq[Array[Annotation]]): Seq[Seq[Annotation]]
- Definition Classes
- PretrainedZeroShotNERChunker → PretrainedZeroShotNER → HasBatchedAnnotate
-
def
batchProcess(rows: Iterator[_]): Iterator[Row]
- Definition Classes
- HasBatchedAnnotate
-
val
batchSize: IntParam
- Definition Classes
- HasBatchedAnnotate
-
def
beforeAnnotate(dataset: Dataset[_]): Dataset[_]
- Attributes
- protected
- Definition Classes
- AnnotatorModel
-
final
def
checkSchema(schema: StructType, inputAnnotatorType: String): Boolean
- Attributes
- protected
- Definition Classes
- HasInputAnnotationCols
-
def
checkValidEnvironment(spark: Option[SparkSession], scopes: Seq[String], metadata: Option[Map[String, String]]): Unit
- Definition Classes
- CheckLicense
-
def
checkValidScope(scope: String): Unit
- Definition Classes
- CheckLicense
-
def
checkValidScopeAndEnvironment(scope: String, spark: Option[SparkSession], checkLp: Boolean, metadata: Option[Map[String, String]]): Unit
- Definition Classes
- CheckLicense
-
def
checkValidScopesAndEnvironment(scopes: Seq[String], spark: Option[SparkSession], checkLp: Boolean, metadata: Option[Map[String, String]]): Unit
- Definition Classes
- CheckLicense
-
final
def
clear(param: Param[_]): PretrainedZeroShotNERChunker.this.type
- Definition Classes
- Params
-
def
clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
def
copy(extra: ParamMap): PretrainedZeroShotNER
- Definition Classes
- RawAnnotator → Model → Transformer → PipelineStage → Params
-
def
copyValues[T <: Params](to: T, extra: ParamMap): T
- Attributes
- protected
- Definition Classes
- Params
-
final
def
defaultCopy[T <: Params](extra: ParamMap): T
- Attributes
- protected
- Definition Classes
- Params
-
val
engine: Param[String]
- Definition Classes
- HasEngine
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
equals(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
explainParam(param: Param[_]): String
- Definition Classes
- Params
-
def
explainParams(): String
- Definition Classes
- Params
-
def
extraValidate(structType: StructType): Boolean
- Attributes
- protected
- Definition Classes
- RawAnnotator
-
def
extraValidateMsg: String
- Attributes
- protected
- Definition Classes
- RawAnnotator
-
final
def
extractParamMap(): ParamMap
- Definition Classes
- Params
-
final
def
extractParamMap(extra: ParamMap): ParamMap
- Definition Classes
- Params
-
val
features: ArrayBuffer[Feature[_, _, _]]
- Definition Classes
- HasFeatures
-
def
finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
-
def
get[T](feature: StructFeature[T]): Option[T]
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
get[K, V](feature: MapFeature[K, V]): Option[Map[K, V]]
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
get[T](feature: SetFeature[T]): Option[Set[T]]
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
get[T](feature: ArrayFeature[T]): Option[Array[T]]
- Attributes
- protected
- Definition Classes
- HasFeatures
-
final
def
get[T](param: Param[T]): Option[T]
- Definition Classes
- Params
-
def
getBatchSize: Int
- Definition Classes
- HasBatchedAnnotate
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
final
def
getDefault[T](param: Param[T]): Option[T]
- Definition Classes
- Params
-
def
getEngine: String
- Definition Classes
- HasEngine
-
def
getInputCols: Array[String]
- Definition Classes
- HasInputAnnotationCols
-
def
getLabels: Array[String]
- Definition Classes
- PretrainedZeroShotNER
-
def
getLazyAnnotator: Boolean
- Definition Classes
- CanBeLazy
-
def
getModelIfNotSet: GlinerModel
- Definition Classes
- PretrainedZeroShotNER
-
final
def
getOrDefault[T](param: Param[T]): T
- Definition Classes
- Params
-
final
def
getOutputCol: String
- Definition Classes
- HasOutputAnnotationCol
-
def
getParam(paramName: String): Param[Any]
- Definition Classes
- Params
-
def
getPredictionThreshold: Float
- Definition Classes
- PretrainedZeroShotNER
-
final
def
hasDefault[T](param: Param[T]): Boolean
- Definition Classes
- Params
-
def
hasParam(paramName: String): Boolean
- Definition Classes
- Params
-
def
hasParent: Boolean
- Definition Classes
- Model
-
def
hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
- Attributes
- protected
- Definition Classes
- Logging
-
def
initializeLogIfNecessary(isInterpreter: Boolean): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
val
inputAnnotatorTypes: Array[String]
Input Annotator Types: DOCUMENT
- Definition Classes
- PretrainedZeroShotNERChunker → PretrainedZeroShotNER → HasInputAnnotationCols
-
final
val
inputCols: StringArrayParam
- Attributes
- protected
- Definition Classes
- HasInputAnnotationCols
-
final
def
isDefined(param: Param[_]): Boolean
- Definition Classes
- Params
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
final
def
isSet(param: Param[_]): Boolean
- Definition Classes
- Params
-
def
isTokenInEntity(token: Annotation, sentence: Int, begin: Int, end: Int): Boolean
- Definition Classes
- PretrainedZeroShotNER
-
def
isTraceEnabled(): Boolean
- Attributes
- protected
- Definition Classes
- Logging
-
var
labels: StringArrayParam
List of entity labels
- Definition Classes
- PretrainedZeroShotNER
-
val
lazyAnnotator: BooleanParam
- Definition Classes
- CanBeLazy
-
def
log: Logger
- Attributes
- protected
- Definition Classes
- Logging
-
def
logDebug(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logDebug(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logError(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logError(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logInfo(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logInfo(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logName: String
- Attributes
- protected
- Definition Classes
- Logging
-
def
logTrace(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logTrace(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logWarning(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logWarning(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
msgHelper(schema: StructType): String
- Attributes
- protected
- Definition Classes
- HasInputAnnotationCols
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
def
onWrite(path: String, spark: SparkSession): Unit
- Definition Classes
- PretrainedZeroShotNER → ParamsAndFeaturesWritable
-
val
optionalInputAnnotatorTypes: Array[String]
- Definition Classes
- HasInputAnnotationCols
-
val
outputAnnotatorType: String
Output Annotator Types: chunk
- Definition Classes
- PretrainedZeroShotNERChunker → PretrainedZeroShotNER → HasOutputAnnotatorType
-
final
val
outputCol: Param[String]
- Attributes
- protected
- Definition Classes
- HasOutputAnnotationCol
-
lazy val
params: Array[Param[_]]
- Definition Classes
- Params
-
var
parent: Estimator[PretrainedZeroShotNER]
- Definition Classes
- Model
-
var
predictionThreshold: FloatParam
Entity recognition threshold
- Definition Classes
- PretrainedZeroShotNER
-
def
save(path: String): Unit
- Definition Classes
- MLWritable
- Annotations
- @Since( "1.6.0" ) @throws( ... )
-
def
set[T](feature: StructFeature[T], value: T): PretrainedZeroShotNERChunker.this.type
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
set[K, V](feature: MapFeature[K, V], value: Map[K, V]): PretrainedZeroShotNERChunker.this.type
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
set[T](feature: SetFeature[T], value: Set[T]): PretrainedZeroShotNERChunker.this.type
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
set[T](feature: ArrayFeature[T], value: Array[T]): PretrainedZeroShotNERChunker.this.type
- Attributes
- protected
- Definition Classes
- HasFeatures
-
final
def
set(paramPair: ParamPair[_]): PretrainedZeroShotNERChunker.this.type
- Attributes
- protected
- Definition Classes
- Params
-
final
def
set(param: String, value: Any): PretrainedZeroShotNERChunker.this.type
- Attributes
- protected
- Definition Classes
- Params
-
final
def
set[T](param: Param[T], value: T): PretrainedZeroShotNERChunker.this.type
- Definition Classes
- Params
-
def
setBatchSize(size: Int): PretrainedZeroShotNERChunker.this.type
- Definition Classes
- HasBatchedAnnotate
-
def
setDefault[T](feature: StructFeature[T], value: () ⇒ T): PretrainedZeroShotNERChunker.this.type
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
setDefault[K, V](feature: MapFeature[K, V], value: () ⇒ Map[K, V]): PretrainedZeroShotNERChunker.this.type
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
setDefault[T](feature: SetFeature[T], value: () ⇒ Set[T]): PretrainedZeroShotNERChunker.this.type
- Attributes
- protected
- Definition Classes
- HasFeatures
-
def
setDefault[T](feature: ArrayFeature[T], value: () ⇒ Array[T]): PretrainedZeroShotNERChunker.this.type
- Attributes
- protected
- Definition Classes
- HasFeatures
-
final
def
setDefault(paramPairs: ParamPair[_]*): PretrainedZeroShotNERChunker.this.type
- Attributes
- protected
- Definition Classes
- Params
-
final
def
setDefault[T](param: Param[T], value: T): PretrainedZeroShotNERChunker.this.type
- Attributes
- protected[org.apache.spark.ml]
- Definition Classes
- Params
-
final
def
setInputCols(value: String*): PretrainedZeroShotNERChunker.this.type
- Definition Classes
- HasInputAnnotationCols
-
def
setInputCols(value: Array[String]): PretrainedZeroShotNERChunker.this.type
- Definition Classes
- HasInputAnnotationCols
-
def
setLabels(labels: Array[String]): PretrainedZeroShotNERChunker.this.type
- Definition Classes
- PretrainedZeroShotNER
-
def
setLazyAnnotator(value: Boolean): PretrainedZeroShotNERChunker.this.type
- Definition Classes
- CanBeLazy
-
def
setModelIfNotSet(spark: SparkSession, openvinoWrapper: OpenvinoWrapper, spp: SentencePieceWrapper, glinerConfig: Option[GlinerConfig]): PretrainedZeroShotNER
- Definition Classes
- PretrainedZeroShotNER
-
def
setModelIfNotSet(spark: SparkSession, onnxWrapper: InternalOnnxWrapper, spp: SentencePieceWrapper, glinerConfig: Option[GlinerConfig] = None): PretrainedZeroShotNER
- Definition Classes
- PretrainedZeroShotNER
-
final
def
setOutputCol(value: String): PretrainedZeroShotNERChunker.this.type
- Definition Classes
- HasOutputAnnotationCol
-
def
setParent(parent: Estimator[PretrainedZeroShotNER]): PretrainedZeroShotNER
- Definition Classes
- Model
-
def
setPredictionThreshold(value: Float): PretrainedZeroShotNERChunker.this.type
- Definition Classes
- PretrainedZeroShotNER
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
def
toString(): String
- Definition Classes
- Identifiable → AnyRef → Any
-
final
def
transform(dataset: Dataset[_]): DataFrame
- Definition Classes
- AnnotatorModel → Transformer
-
def
transform(dataset: Dataset[_], paramMap: ParamMap): DataFrame
- Definition Classes
- Transformer
- Annotations
- @Since( "2.0.0" )
-
def
transform(dataset: Dataset[_], firstParamPair: ParamPair[_], otherParamPairs: ParamPair[_]*): DataFrame
- Definition Classes
- Transformer
- Annotations
- @Since( "2.0.0" ) @varargs()
-
final
def
transformSchema(schema: StructType): StructType
- Definition Classes
- RawAnnotator → PipelineStage
-
def
transformSchema(schema: StructType, logging: Boolean): StructType
- Attributes
- protected
- Definition Classes
- PipelineStage
- Annotations
- @DeveloperApi()
-
val
uid: String
- Definition Classes
- PretrainedZeroShotNERChunker → PretrainedZeroShotNER → Identifiable
-
def
validate(schema: StructType): Boolean
- Attributes
- protected
- Definition Classes
- RawAnnotator
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
def
wrapColumnMetadata(col: Column): Column
- Attributes
- protected
- Definition Classes
- RawAnnotator
-
def
write: MLWriter
- Definition Classes
- ParamsAndFeaturesWritable → DefaultParamsWritable → MLWritable
-
def
writeOnnxModel(path: String, spark: SparkSession, onnxWrapper: InternalOnnxWrapper, suffix: String, fileName: String, encrypt: Boolean): Unit
- Definition Classes
- InternalWriteOnnxModel
-
def
writeOnnxModel(path: String, spark: SparkSession, onnxWrapper: InternalOnnxWrapper, suffix: String, fileName: String): Unit
- Definition Classes
- InternalWriteOnnxModel
-
def
writeOnnxModels(path: String, spark: SparkSession, onnxWrappersWithNames: Seq[(InternalOnnxWrapper, String)], suffix: String, encrypt: Boolean = false): Unit
- Definition Classes
- InternalWriteOnnxModel
-
def
writeOnnxModels(path: String, spark: SparkSession, onnxWrappersWithNames: Seq[(InternalOnnxWrapper, String)], suffix: String): Unit
- Definition Classes
- InternalWriteOnnxModel
-
def
writeOpenvinoModel(path: String, spark: SparkSession, openvinoWrapper: OpenvinoWrapper, suffix: String, fileName: String): Unit
- Definition Classes
- WriteOpenvinoModel
-
def
writeOpenvinoModels(path: String, spark: SparkSession, ovWrappersWithNames: Seq[(OpenvinoWrapper, String)], suffix: String): Unit
- Definition Classes
- WriteOpenvinoModel
-
def
writeSentencePieceModel(path: String, spark: SparkSession, spp: SentencePieceWrapper, suffix: String, filename: String): Unit
- Definition Classes
- WriteSentencePieceModel
Parameters
A list of (hyper-)parameter keys this annotator can take. Users can set and get the parameter values through setters and getters, respectively.
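As a configuration sketch, the setters and getters listed under Value Members can be chained as below. The threshold `0.5f` and batch size `8` are illustrative values, not documented defaults. `predictionThreshold` is described on this page only as "Entity recognition threshold", so reading it as a cutoff below which candidate entities are discarded is an assumption.

```scala
import com.johnsnowlabs.nlp.annotators.ner.PretrainedZeroShotNERChunker

// Configure the annotator through its setters; all values are illustrative.
val chunker = PretrainedZeroShotNERChunker
  .pretrained()
  .setInputCols(Array("sentence"))
  .setOutputCol("ner_chunk")
  .setLabels(Array("person", "award"))
  .setPredictionThreshold(0.5f) // assumed to act as a confidence cutoff
  .setBatchSize(8)

// Read the values back through the matching getters.
println(chunker.getLabels.mkString(", "))
println(chunker.getPredictionThreshold)
println(chunker.getBatchSize)

// explainParams() (inherited from Params) prints every parameter with its current value.
println(chunker.explainParams())
```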
Annotator types
Required input and expected output annotator types
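To illustrate how required input and output annotator types constrain stage order, here is a minimal plain-Scala sketch that checks whether each stage's required types were produced by an earlier stage. `Stage` and `AnnotatorTypeCheck` are hypothetical names and this is not how the library validates a pipeline. The type strings follow this page (input DOCUMENT, output chunk); treating SentenceDetector output as document-typed is an assumption, consistent with the example feeding the "sentence" column into this annotator.

```scala
// Hypothetical model of a pipeline stage: the annotator types it needs and the one it emits.
final case class Stage(name: String, requires: Set[String], produces: String)

object AnnotatorTypeCheck {
  // Names of stages whose required types were not produced by any earlier stage.
  def unsatisfied(stages: Seq[Stage]): Seq[String] = {
    val start = (Set.empty[String], Vector.empty[String])
    val (_, failing) = stages.foldLeft(start) {
      case ((available, bad), stage) =>
        val ok = stage.requires.subsetOf(available)
        (available + stage.produces, if (ok) bad else bad :+ stage.name)
    }
    failing
  }

  // Stage order used in the example on this page.
  val examplePipeline: Seq[Stage] = Seq(
    Stage("DocumentAssembler", Set.empty, "document"),
    Stage("SentenceDetector", Set("document"), "document"),
    Stage("PretrainedZeroShotNERChunker", Set("document"), "chunk")
  )
}
```

Calling `AnnotatorTypeCheck.unsatisfied(AnnotatorTypeCheck.examplePipeline)` returns an empty sequence, while reversing the stage order reports the chunker and the sentence detector as missing their document input.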