sparknlp_jsl.annotator.TFGraphBuilder#

class sparknlp_jsl.annotator.TFGraphBuilder[source]#

Bases: Estimator, DefaultParamsWritable, DefaultParamsReadable

Methods

__init__()

clear(param)

Clears a param from the param map if it has been explicitly set.

copy([extra])

Creates a copy of this instance with the same uid and some extra params.

explainParam(param)

Explains a single param and returns its name, doc, and optional default value and user-supplied value in a string.

explainParams()

Returns the documentation of all params with their optional default values and user-supplied values.

extractParamMap([extra])

Extracts the embedded default param values and user-supplied values, and then merges them with extra values from input into a flat param map, where the latter value is used if there exist conflicts, i.e., with ordering: default param values < user-supplied values < extra.

fit(dataset[, params])

Fits a model to the input dataset with optional parameters.

fitMultiple(dataset, paramMaps)

Fits a model to the input dataset for each param map in paramMaps.

getBatchNorm()

Gets the batch normalization setting used in RelationExtractionApproach.

getGraphFile()

Gets the graph file name.

getGraphFolder()

Gets the graph folder.

getHiddenAct()

Gets the activation function for hidden layers, used in RelationExtractionApproach.

getHiddenActL2()

Gets the L2 regularization of hidden layer activations, used in RelationExtractionApproach.

getHiddenLayers()

Gets the list of hidden layer sizes for RelationExtractionApproach.

getHiddenUnitsNumber()

Gets the number of hidden units for AssertionDLApproach and MedicalNerApproach.

getHiddenWeightsL2()

Gets the L2 regularization of hidden layer weights, used in RelationExtractionApproach.

getInputCols()

Gets current column names of input annotations.

getLabelColumn()

Gets the name of the label column.

getMaxSequenceLength()

Gets the maximum sequence length for AssertionDLApproach.

getModelName()

Gets the name of the model.

getOrDefault(param)

Gets the value of a param in the user-supplied param map or its default value.

getParam(paramName)

Gets a param by its name.

hasDefault(param)

Checks whether a param has a default value.

hasParam(paramName)

Tests whether this instance contains a param with a given (string) name.

isDefined(param)

Checks whether a param is explicitly set by user or has a default value.

isSet(param)

Checks whether a param is explicitly set by user.

load(path)

Reads an ML instance from the input path, a shortcut of read().load(path).

read()

Returns a DefaultParamsReader instance for this class.

save(path)

Save this ML instance to the given path, a shortcut of 'write().save(path)'.

set(param, value)

Sets a parameter in the embedded param map.

setBatchNorm(value)

Sets batch normalization, used in RelationExtractionApproach.

setGraphFile(value)

Sets the graph file name.

setGraphFolder(value)

Sets the folder path that contains external graph files.

setHiddenAct(value)

Sets the activation function for hidden layers, used in RelationExtractionApproach.

setHiddenActL2(value)

Sets the L2 regularization of hidden layer activations, used in RelationExtractionApproach.

setHiddenLayers(value)

Sets the list of hidden layer sizes for RelationExtractionApproach.

setHiddenUnitsNumber(value)

Sets the number of hidden units for AssertionDLApproach and MedicalNerApproach.

setHiddenWeightsL2(value)

Sets the L2 regularization of hidden layer weights, used in RelationExtractionApproach.

setInputCols(*value)

Sets column names of input annotations.

setLabelColumn(value)

Sets the name of the column for data labels.

setMaxSequenceLength(value)

Sets the maximum sequence length for AssertionDLApproach.

setModelName(value)

Sets the model name.

write()

Returns a DefaultParamsWriter instance for this class.

Attributes

batchNorm

graphFile

graphFolder

hiddenAct

hiddenActL2

hiddenLayers

hiddenUnitsNumber

hiddenWeightsL2

inputCols

labelColumn

maxSequenceLength

modelName

params

Returns all params ordered by name.

clear(param)#

Clears a param from the param map if it has been explicitly set.

copy(extra=None)#

Creates a copy of this instance with the same uid and some extra params. The default implementation creates a shallow copy using copy.copy(), and then copies the embedded and extra parameters over and returns the copy. Subclasses should override this method if the default approach is not sufficient.

Parameters:

extra – Extra parameters to copy to the new instance

Returns:

Copy of this instance

explainParam(param)#

Explains a single param and returns its name, doc, and optional default value and user-supplied value in a string.

explainParams()#

Returns the documentation of all params with their optional default values and user-supplied values.

extractParamMap(extra=None)#

Extracts the embedded default param values and user-supplied values, and then merges them with extra values from input into a flat param map, where the latter value is used if there exist conflicts, i.e., with ordering: default param values < user-supplied values < extra.

Parameters:

extra – extra param values

Returns:

merged param map
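The precedence described for extractParamMap can be pictured with plain dictionaries. This is a minimal sketch, not the pyspark implementation: `merge_param_maps` and every example value are hypothetical, and real param maps are keyed by Param objects rather than strings.

```python
def merge_param_maps(defaults, user_supplied, extra=None):
    """Merge three layers with ordering: defaults < user-supplied < extra."""
    merged = dict(defaults)         # lowest priority
    merged.update(user_supplied)    # overrides defaults
    merged.update(extra or {})      # highest priority, wins every conflict
    return merged


# Hypothetical values, chosen only to show the ordering.
defaults = {"graphFile": "auto", "hiddenUnitsNumber": 20}
user_supplied = {"hiddenUnitsNumber": 32}
extra = {"hiddenUnitsNumber": 64, "modelName": "ner_dl"}

merged = merge_param_maps(defaults, user_supplied, extra)
print(merged)
```

A key that appears in several layers takes its value from the highest-priority layer that defines it, while keys unique to one layer pass through unchanged.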

fit(dataset, params=None)#

Fits a model to the input dataset with optional parameters.

Parameters:
  • dataset – input dataset, which is an instance of pyspark.sql.DataFrame

  • params – an optional param map that overrides embedded params. If a list/tuple of param maps is given, this calls fit on each param map and returns a list of models.

Returns:

fitted model(s)

New in version 1.3.0.

fitMultiple(dataset, paramMaps)#

Fits a model to the input dataset for each param map in paramMaps.

Parameters:
  • dataset – input dataset, which is an instance of pyspark.sql.DataFrame.

  • paramMaps – A Sequence of param maps.

Returns:

A thread safe iterable which contains one model for each param map. Each call to next(modelIterator) will return (index, model) where model was fit using paramMaps[index]. index values may not be sequential.

New in version 2.3.0.
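Because the index values returned by fitMultiple may arrive out of order, a caller that wants models aligned with paramMaps has to place each model by its index. The sketch below illustrates that pattern only; `fake_fit_multiple` is a hypothetical stand-in for the real iterator, and the "models" are plain strings.

```python
import random


def fake_fit_multiple(param_maps):
    """Hypothetical stand-in: yields (index, model) pairs in arbitrary order."""
    order = list(range(len(param_maps)))
    random.shuffle(order)  # simulate models finishing out of order
    for index in order:
        yield index, f"model fitted with {param_maps[index]}"


def collect_models(model_iterator, count):
    """Place each model at the position of the param map it was fitted with."""
    models = [None] * count
    for index, model in model_iterator:
        models[index] = model
    return models


# Hypothetical param maps, keyed by name for readability.
param_maps = [
    {"hiddenUnitsNumber": 10},
    {"hiddenUnitsNumber": 20},
    {"hiddenUnitsNumber": 30},
]
models = collect_models(fake_fit_multiple(param_maps), len(param_maps))
```

Indexing into a preallocated list keeps `models[i]` paired with `param_maps[i]` regardless of the order in which the iterator produces results.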

getBatchNorm()[source]#

Gets the batch normalization setting used in RelationExtractionApproach.

getGraphFile()[source]#

Gets the graph file name.

getGraphFolder()[source]#

Gets the graph folder.

getHiddenAct()[source]#

Gets the activation function for hidden layers, used in RelationExtractionApproach.

getHiddenActL2()[source]#

Gets the L2 regularization of hidden layer activations, used in RelationExtractionApproach.

getHiddenLayers()[source]#

Gets the list of hidden layer sizes for RelationExtractionApproach.

getHiddenUnitsNumber()[source]#

Gets the number of hidden units for AssertionDLApproach and MedicalNerApproach.

getHiddenWeightsL2()[source]#

Gets the L2 regularization of hidden layer weights, used in RelationExtractionApproach.

getInputCols()[source]#

Gets current column names of input annotations.

getLabelColumn()[source]#

Gets the name of the label column.

getMaxSequenceLength()[source]#

Gets the maximum sequence length for AssertionDLApproach.

getModelName()[source]#

Gets the name of the model.

getOrDefault(param)#

Gets the value of a param in the user-supplied param map or its default value. Raises an error if neither is set.

getParam(paramName)#

Gets a param by its name.

hasDefault(param)#

Checks whether a param has a default value.

hasParam(paramName)#

Tests whether this instance contains a param with a given (string) name.

isDefined(param)#

Checks whether a param is explicitly set by user or has a default value.

isSet(param)#

Checks whether a param is explicitly set by user.
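The relationship between hasDefault, isSet, isDefined, getOrDefault, and clear can be modelled with two dictionaries. This is a simplified sketch under that assumption, not the pyspark implementation; the `ParamState` class and its values are hypothetical.

```python
class ParamState:
    """Toy model of a param map with a default layer and a user-supplied layer."""

    def __init__(self, defaults):
        self._defaults = dict(defaults)
        self._user = {}

    def set(self, name, value):
        self._user[name] = value

    def clear(self, name):
        # Removes only the user-supplied value; defaults are untouched.
        self._user.pop(name, None)

    def has_default(self, name):
        return name in self._defaults

    def is_set(self, name):
        return name in self._user

    def is_defined(self, name):
        # Defined means: explicitly set by the user OR has a default value.
        return self.is_set(name) or self.has_default(name)

    def get_or_default(self, name):
        if name in self._user:
            return self._user[name]
        if name in self._defaults:
            return self._defaults[name]
        raise KeyError(f"param {name!r} is neither set nor has a default")


state = ParamState(defaults={"graphFile": "auto"})  # hypothetical default
state.set("hiddenUnitsNumber", 20)
```

In this model a param with only a default is defined but not set, and clearing a user-supplied value makes lookups fall back to the default if one exists.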

classmethod load(path)#

Reads an ML instance from the input path, a shortcut of read().load(path).

property params#

Returns all params ordered by name. The default implementation uses dir() to get all attributes of type Param.

classmethod read()#

Returns a DefaultParamsReader instance for this class.

save(path)#

Save this ML instance to the given path, a shortcut of ‘write().save(path)’.

set(param, value)#

Sets a parameter in the embedded param map.

setBatchNorm(value)[source]#

Sets batch normalization, used in RelationExtractionApproach.

Parameters:
value (boolean) – Batch normalization for RelationExtractionApproach.

setGraphFile(value)[source]#

Sets the graph file name.

Parameters:
value (str) – Graph file name. If set to “auto”, the graph builder uses the model-specific default graph file name.
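As an illustration of where setGraphFile("auto") fits, the sketch below configures a builder using only setters listed on this page. The helper name `make_ner_graph_builder`, the model name "ner_dl", and the column names are assumptions made for the example, and the import is deferred because sparknlp_jsl is a licensed package that must be installed before the function is called.

```python
def make_ner_graph_builder(graph_folder):
    """Return a TFGraphBuilder configured for an NER graph (illustrative only)."""
    # Deferred import: requires a licensed sparknlp_jsl installation at call time.
    from sparknlp_jsl.annotator import TFGraphBuilder

    builder = TFGraphBuilder()
    builder.setModelName("ner_dl")  # assumed model name
    builder.setInputCols("sentence", "token", "embeddings")  # assumed column names
    builder.setLabelColumn("label")  # assumed label column
    builder.setGraphFolder(graph_folder)
    builder.setGraphFile("auto")  # let the builder pick the model-specific file name
    builder.setHiddenUnitsNumber(20)  # arbitrary example value
    return builder
```

The returned estimator would typically be placed in a Spark ML pipeline ahead of the training annotator, which then reads graphs from the same folder; that wiring is not shown here.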

setGraphFolder(value)[source]#

Sets the folder path that contains external graph files.

Parameters:
value (str) – Folder path that contains external graph files.

setHiddenAct(value)[source]#

Sets the activation function for hidden layers, used in RelationExtractionApproach.

Parameters:
value (string) – Activation function for hidden layers, used in RelationExtractionApproach. Possible values are: relu, sigmoid, tanh, linear.

setHiddenActL2(value)[source]#

Sets the L2 regularization of hidden layer activations, used in RelationExtractionApproach.

Parameters:
value (boolean) – L2 regularization of hidden layer activations, used in RelationExtractionApproach.

setHiddenLayers(value)[source]#

Sets the list of hidden layer sizes for RelationExtractionApproach.

Parameters:
*value (int) – A list of hidden layer sizes for RelationExtractionApproach.

setHiddenUnitsNumber(value)[source]#

Sets the number of hidden units for AssertionDLApproach and MedicalNerApproach.

Parameters:
value (int) – Number of hidden units for AssertionDLApproach and MedicalNerApproach.

setHiddenWeightsL2(value)[source]#

Sets the L2 regularization of hidden layer weights, used in RelationExtractionApproach.

Parameters:
value (boolean) – L2 regularization of hidden layer weights, used in RelationExtractionApproach.

setInputCols(*value)[source]#

Sets column names of input annotations.

Parameters:
*value (str) – Input columns for the annotator.

setLabelColumn(value)[source]#

Sets the name of the column for data labels.

Parameters:
value (str) – Column for data labels.

setMaxSequenceLength(value)[source]#

Sets the maximum sequence length for AssertionDLApproach.

Parameters:
value (int) – Maximum sequence length for AssertionDLApproach.

setModelName(value)[source]#

Sets the model name.

Parameters:
value (str) – Model name.

uid#

A unique id for the object.

write()#

Returns a DefaultParamsWriter instance for this class.