sparknlp_jsl.annotator.TFGraphBuilder#

class sparknlp_jsl.annotator.TFGraphBuilder[source]#

Bases: Estimator, DefaultParamsWritable, DefaultParamsReadable

Methods

`__init__`()
`clear`(param)	Clears a param from the param map if it has been explicitly set.
`copy`([extra])	Creates a copy of this instance with the same uid and some extra params.
`explainParam`(param)	Explains a single param and returns its name, doc, and optional default value and user-supplied value in a string.
`explainParams`()	Returns the documentation of all params with their optional default values and user-supplied values.
`extractParamMap`([extra])	Extracts the embedded default param values and user-supplied values, and then merges them with extra values from input into a flat param map, where the latter value is used if there exist conflicts, i.e., with ordering: default param values < user-supplied values < extra.
`fit`(dataset[, params])	Fits a model to the input dataset with optional parameters.
`fitMultiple`(dataset, paramMaps)	Fits a model to the input dataset for each param map in paramMaps.
`getBatchNorm`()	Gets the batch normalization setting used in RelationExtractionApproach.
`getGraphFile`()	Gets the graph file name.
`getGraphFolder`()	Gets the graph folder.
`getHiddenAct`()	Gets the activation function for hidden layers, used in RelationExtractionApproach.
`getHiddenActL2`()	Gets the L2 regularization setting for hidden layer activations, used in RelationExtractionApproach.
`getHiddenLayers`()	Gets the list of hidden layer sizes for RelationExtractionApproach.
`getHiddenUnitsNumber`()	Gets the number of hidden units for AssertionDLApproach and MedicalNerApproach.
`getHiddenWeightsL2`()	Gets the L2 regularization setting for hidden layer weights, used in RelationExtractionApproach.
`getInputCols`()	Gets current column names of input annotations.
`getLabelColumn`()	Gets the name of the label column.
`getMaxSequenceLength`()	Gets the maximum sequence length for AssertionDLApproach.
`getModelName`()	Gets the name of the model.
`getOrDefault`(param)	Gets the value of a param in the user-supplied param map or its default value.
`getParam`(paramName)	Gets a param by its name.
`hasDefault`(param)	Checks whether a param has a default value.
`hasParam`(paramName)	Tests whether this instance contains a param with a given (string) name.
`isDefined`(param)	Checks whether a param is explicitly set by user or has a default value.
`isSet`(param)	Checks whether a param is explicitly set by user.
`load`(path)	Reads an ML instance from the input path, a shortcut of read().load(path).
`read`()	Returns a DefaultParamsReader instance for this class.
`save`(path)	Saves this ML instance to the given path, a shortcut of write().save(path).
`set`(param, value)	Sets a parameter in the embedded param map.
`setBatchNorm`(value)	Sets whether batch normalization is used in RelationExtractionApproach.
`setGraphFile`(value)	Sets the graph file name.
`setGraphFolder`(value)	Sets the folder path that contains external graph files.
`setHiddenAct`(value)	Sets the activation function for hidden layers, used in RelationExtractionApproach.
`setHiddenActL2`(value)	Sets L2 regularization of hidden layer activations, used in RelationExtractionApproach.
`setHiddenLayers`(value)	Sets the list of hidden layer sizes for RelationExtractionApproach.
`setHiddenUnitsNumber`(value)	Sets the number of hidden units for AssertionDLApproach and MedicalNerApproach.
`setHiddenWeightsL2`(value)	Sets L2 regularization of hidden layer weights, used in RelationExtractionApproach.
`setInputCols`(*value)	Sets column names of input annotations.
`setLabelColumn`(value)	Sets the name of the column for data labels.
`setMaxSequenceLength`(value)	Sets the maximum sequence length for AssertionDLApproach.
`setModelName`(value)	Sets the model name.
`write`()	Returns a DefaultParamsWriter instance for this class.

Attributes

`batchNorm`
`graphFile`
`graphFolder`
`hiddenAct`
`hiddenActL2`
`hiddenLayers`
`hiddenUnitsNumber`
`hiddenWeightsL2`
`inputCols`
`labelColumn`
`maxSequenceLength`
`modelName`
`params`	Returns all params ordered by name.

clear(param)#: Clears a param from the param map if it has been explicitly set.

copy(extra=None)#

Creates a copy of this instance with the same uid and some extra params. The default implementation creates a shallow copy using copy.copy(), and then copies the embedded and extra parameters over and returns the copy. Subclasses should override this method if the default approach is not sufficient.

Parameters: extra – Extra parameters to copy to the new instance
Returns: Copy of this instance

explainParam(param)#: Explains a single param and returns its name, doc, and optional default value and user-supplied value in a string.

explainParams()#: Returns the documentation of all params with their optional default values and user-supplied values.

extractParamMap(extra=None)#

Extracts the embedded default param values and user-supplied values, and then merges them with extra values from input into a flat param map, where the latter value is used if there exist conflicts, i.e., with ordering: default param values < user-supplied values < extra.

Parameters: extra – extra param values
Returns: merged param map
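
The precedence rule above can be pictured with plain dictionaries. This is only an illustration of the ordering, not the library's implementation; `merge_param_maps` and the sample keys and values are hypothetical.

```python
def merge_param_maps(defaults, user_supplied, extra=None):
    """Merge three param maps; later sources win on conflict."""
    merged = dict(defaults)        # lowest priority: default values
    merged.update(user_supplied)   # user-supplied values override defaults
    merged.update(extra or {})     # extra values override everything else
    return merged


# Made-up values, chosen only to show which source wins.
defaults = {"graphFile": "default.pb", "hiddenUnitsNumber": 10}
user_supplied = {"hiddenUnitsNumber": 20}
extra = {"graphFile": "custom.pb"}

merged = merge_param_maps(defaults, user_supplied, extra)
```

Here `hiddenUnitsNumber` comes from the user-supplied map and `graphFile` from `extra`, since each overrides the default.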

fit(dataset, params=None)#

Fits a model to the input dataset with optional parameters.

Parameters:

dataset – input dataset, which is an instance of pyspark.sql.DataFrame
params – an optional param map that overrides embedded params. If a list/tuple of param maps is given, this calls fit on each param map and returns a list of models.

Returns:

fitted model(s)

New in version 1.3.0.

fitMultiple(dataset, paramMaps)#

Fits a model to the input dataset for each param map in paramMaps.

Parameters:

dataset – input dataset, which is an instance of pyspark.sql.DataFrame.
paramMaps – A Sequence of param maps.

Returns:

A thread safe iterable which contains one model for each param map. Each call to next(modelIterator) will return (index, model) where model was fit using paramMaps[index]. index values may not be sequential.

New in version 2.3.0.
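
Because the index values may arrive out of order, callers usually place each model by its index instead of appending in arrival order. A minimal sketch, using a stand-in iterator in place of a real `fitMultiple` result; `collect_models` is a hypothetical helper, not part of the library.

```python
def collect_models(model_iterator, num_param_maps):
    """Place each fitted model at the position of its param map."""
    models = [None] * num_param_maps
    for index, model in model_iterator:
        models[index] = model  # index refers to paramMaps[index]
    return models


# Stand-in for the iterable returned by fitMultiple: results arrive out of order.
fake_results = iter([(2, "model_c"), (0, "model_a"), (1, "model_b")])
ordered = collect_models(fake_results, 3)
```

With a real estimator, the same helper would be called as `collect_models(estimator.fitMultiple(dataset, paramMaps), len(paramMaps))`.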

getBatchNorm()[source]#: Gets the batch normalization setting used in RelationExtractionApproach.

getGraphFile()[source]#: Gets the graph file name.

getGraphFolder()[source]#: Gets the graph folder.

getHiddenAct()[source]#: Gets the activation function for hidden layers, used in RelationExtractionApproach.

getHiddenActL2()[source]#: Gets the L2 regularization setting for hidden layer activations, used in RelationExtractionApproach.

getHiddenLayers()[source]#: Gets the list of hidden layer sizes for RelationExtractionApproach.

getHiddenUnitsNumber()[source]#: Gets the number of hidden units for AssertionDLApproach and MedicalNerApproach.

getHiddenWeightsL2()[source]#: Gets the L2 regularization setting for hidden layer weights, used in RelationExtractionApproach.

getInputCols()[source]#: Gets current column names of input annotations.

getLabelColumn()[source]#: Gets the name of the label column.

getMaxSequenceLength()[source]#: Gets the maximum sequence length for AssertionDLApproach.

getModelName()[source]#: Gets the name of the model.

getOrDefault(param)#: Gets the value of a param in the user-supplied param map or its default value. Raises an error if neither is set.

getParam(paramName)#: Gets a param by its name.

hasDefault(param)#: Checks whether a param has a default value.

hasParam(paramName)#: Tests whether this instance contains a param with a given (string) name.

isDefined(param)#: Checks whether a param is explicitly set by user or has a default value.

isSet(param)#: Checks whether a param is explicitly set by user.

classmethod load(path)#: Reads an ML instance from the input path, a shortcut of read().load(path).

property params#: Returns all params ordered by name. The default implementation uses dir() to get all attributes of type Param.

classmethod read()#: Returns a DefaultParamsReader instance for this class.

save(path)#: Saves this ML instance to the given path, a shortcut of write().save(path).

set(param, value)#: Sets a parameter in the embedded param map.

setBatchNorm(value)[source]#

Sets whether batch normalization is used in RelationExtractionApproach.

Parameters:

value (boolean) – Batch normalization for RelationExtractionApproach

setGraphFile(value)[source]#

Sets the graph file name.

Parameters:

value (str) – Graph file name. If set to “auto”, the graph builder uses the model-specific default graph file name.
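
As a sketch of how the “auto” file name might be combined with the other setters for MedicalNerApproach: the model name `"ner_dl"`, the input column names, and the label column below are assumptions not stated on this page, the numeric value is illustrative, and the setters are assumed to return the builder so that calls can be chained.

```python
def make_ner_graph_builder(graph_folder):
    """Return a TFGraphBuilder configured for MedicalNerApproach (sketch)."""
    # Imported lazily so this snippet loads without the licensed library installed.
    from sparknlp_jsl.annotator import TFGraphBuilder

    return (
        TFGraphBuilder()
        .setModelName("ner_dl")  # assumed model name for MedicalNerApproach
        .setInputCols("sentence", "token", "embeddings")  # assumed column names
        .setLabelColumn("label")  # assumed label column
        .setGraphFolder(graph_folder)
        .setGraphFile("auto")  # use the model-specific default file name
        .setHiddenUnitsNumber(20)  # illustrative value
    )
```

The returned object is an Estimator, so it would typically be placed in a pipeline ahead of the training annotator that reads the generated graph from `graph_folder`.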

setGraphFolder(value)[source]#

Sets the folder path that contains external graph files.

Parameters:

value (str) – Folder path that contains external graph files.

setHiddenAct(value)[source]#

Sets the activation function for hidden layers, used in RelationExtractionApproach.

Parameters:

value (string) – Activation function for hidden layers, used in RelationExtractionApproach. Possible values are: relu, sigmoid, tanh, linear

setHiddenActL2(value)[source]#

Sets L2 regularization of hidden layer activations, used in RelationExtractionApproach.

Parameters:

value (boolean) – L2 regularization of hidden layer activations, used in RelationExtractionApproach

setHiddenLayers(value)[source]#

Sets the list of hidden layer sizes for RelationExtractionApproach.

Parameters:

value (list of int) – A list of hidden layer sizes for RelationExtractionApproach

setHiddenUnitsNumber(value)[source]#

Sets the number of hidden units for AssertionDLApproach and MedicalNerApproach.

Parameters:

value (int) – Number of hidden units for AssertionDLApproach and MedicalNerApproach

setHiddenWeightsL2(value)[source]#

Sets L2 regularization of hidden layer weights, used in RelationExtractionApproach.

Parameters:

value (boolean) – L2 regularization of hidden layer weights, used in RelationExtractionApproach
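
The hidden-layer setters apply to RelationExtractionApproach and are usually set together. A minimal sketch under stated assumptions: the model name `"relation_extraction"`, the input and label column names, and the graph file name are not given on this page and are illustrative; the layer sizes are made up; the setters are assumed to return the builder for chaining.

```python
def make_re_graph_builder(graph_folder):
    """Return a TFGraphBuilder configured for RelationExtractionApproach (sketch)."""
    # Imported lazily so this snippet loads without the licensed library installed.
    from sparknlp_jsl.annotator import TFGraphBuilder

    return (
        TFGraphBuilder()
        .setModelName("relation_extraction")  # assumed model name
        .setInputCols("embeddings", "pos_tags", "train_ner_chunks", "dependencies")  # assumed
        .setLabelColumn("rel")  # assumed label column
        .setGraphFolder(graph_folder)
        .setGraphFile("re_graph.pb")  # hypothetical file name
        .setHiddenLayers([300, 200])  # illustrative layer sizes
        .setHiddenAct("relu")  # one of: relu, sigmoid, tanh, linear
        .setHiddenActL2(True)  # regularize hidden layer activations
        .setHiddenWeightsL2(False)  # do not regularize hidden layer weights
        .setBatchNorm(False)  # no batch normalization
    )
```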

setInputCols(*value)[source]#

Sets column names of input annotations.

Parameters:

*value (str) – Input columns for the annotator

setLabelColumn(value)[source]#

Sets the name of the column for data labels.

Parameters:

value (str) – Column for data labels

setMaxSequenceLength(value)[source]#

Sets the maximum sequence length for AssertionDLApproach.

Parameters:

value (int) – Maximum sequence length for AssertionDLApproach
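
For AssertionDLApproach, the maximum sequence length is typically set alongside the number of hidden units. A minimal sketch, assuming the model name `"assertion_dl"`, the column names shown, and a hypothetical graph file name; none of these are stated on this page, the numeric values are illustrative, and the setters are assumed to return the builder for chaining.

```python
def make_assertion_graph_builder(graph_folder):
    """Return a TFGraphBuilder configured for AssertionDLApproach (sketch)."""
    # Imported lazily so this snippet loads without the licensed library installed.
    from sparknlp_jsl.annotator import TFGraphBuilder

    return (
        TFGraphBuilder()
        .setModelName("assertion_dl")  # assumed model name
        .setInputCols("sentence", "token", "embeddings")  # assumed column names
        .setLabelColumn("label")  # assumed label column
        .setGraphFolder(graph_folder)
        .setGraphFile("assertion_graph.pb")  # hypothetical file name
        .setMaxSequenceLength(100)  # illustrative value
        .setHiddenUnitsNumber(25)  # illustrative value
    )
```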

setModelName(value)[source]#

Sets the model name.

Parameters:

value (str) – Model name

uid#: A unique id for the object.

write()#: Returns a DefaultParamsWriter instance for this class.