sparknlp_jsl.custom_transformer
Module Contents
Classes
CustomTransformer | Helper class to create custom transformers.
- class CustomTransformer(f)
Bases: pyspark.ml.Transformer, pyspark.ml.param.shared.HasInputCol, pyspark.ml.param.shared.HasOutputCol, pyspark.ml.util.DefaultParamsReadable, pyspark.ml.util.DefaultParamsWritable
Helper class to create custom transformers.
- f
- inputCol: Param[str]
- outputCol: Param[str]
- uid = ''
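A minimal usage sketch (hypothetical, not taken from the library's own docs). It assumes the wrapped function f maps an input DataFrame to an output DataFrame and that transform() applies it; the function uppercase_text and the column names are illustrative only.

    from pyspark.sql import SparkSession
    import pyspark.sql.functions as F
    from sparknlp_jsl.custom_transformer import CustomTransformer

    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame([("hello",), ("world",)], ["text"])

    # Hypothetical: a plain Python function mapping DataFrame -> DataFrame.
    def uppercase_text(dataset):
        return dataset.withColumn("text_upper", F.upper(F.col("text")))

    t = CustomTransformer(uppercase_text)
    t.transform(df).show()  # assumes transform() delegates to the wrapped f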
- clear(param: Param) → None
Clears a param from the param map if it has been explicitly set.
- copy(extra: pyspark.ml._typing.ParamMap | None = None) → P
Creates a copy of this instance with the same uid and some extra params. The default implementation creates a shallow copy using copy.copy(), and then copies the embedded and extra parameters over and returns the copy. Subclasses should override this method if the default approach is not sufficient.
- Parameters:
extra (dict, optional) – Extra parameters to copy to the new instance
- Returns:
Copy of this instance
- Return type:
Params
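A short sketch of the copy semantics (standard pyspark.ml behavior; t is the hypothetical transformer from the earlier sketch):

    t2 = t.copy({t.inputCol: "other_text"})  # extra param applied to the copy
    assert t2.uid == t.uid                   # same uid, as documented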
- explainParam(param: str | Param) → str
Explains a single param and returns its name, doc, and optional default value and user-supplied value in a string.
- explainParams() → str
Returns the documentation of all params with their optional default values and user-supplied values.
- extractParamMap(extra: pyspark.ml._typing.ParamMap | None = None) → pyspark.ml._typing.ParamMap
Extracts the embedded default param values and user-supplied values, and then merges them with extra values from input into a flat param map, where the latter value is used if there exist conflicts, i.e., with ordering: default param values < user-supplied values < extra.
- Parameters:
extra (dict, optional) – extra param values
- Returns:
merged param map
- Return type:
dict
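To illustrate the ordering (defaults < user-supplied < extra), a small sketch reusing the hypothetical instance t:

    t.set(t.inputCol, "text")                     # user-supplied value
    pm = t.extractParamMap({t.outputCol: "out"})  # extra values win on conflict
    pm[t.inputCol]    # 'text' (user-supplied beats any default)
    pm[t.outputCol]   # 'out'  (extra beats both)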
- getInputCol() → str
Gets the value of inputCol or its default value.
- getOrDefault(param: str) → Any
- getOrDefault(param: Param[T]) → T
Gets the value of a param in the user-supplied param map or its default value. Raises an error if neither is set.
- getOutputCol() → str
Gets the value of outputCol or its default value.
- getParam(paramName: str) → Param
Gets a param by its name.
- hasDefault(param: str | Param[Any]) → bool
Checks whether a param has a default value.
- hasParam(paramName: str) → bool
Tests whether this instance contains a param with a given (string) name.
- isDefined(param: str | Param[Any]) → bool
Checks whether a param is explicitly set by user or has a default value.
- isSet(param: str | Param[Any]) → bool
Checks whether a param is explicitly set by user.
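These introspection methods compose as follows (a sketch; whether inputCol carries a default is an assumption based on the stock pyspark HasInputCol mixin):

    t.set(t.inputCol, "text")
    t.isSet(t.inputCol)         # True: explicitly set by the user
    t.isDefined(t.inputCol)     # True: set or defaulted
    t.hasDefault(t.inputCol)    # False if HasInputCol defines no default (assumption)
    t.getOrDefault(t.inputCol)  # 'text'
    t.clear(t.inputCol)         # unset again; getOrDefault would then raise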
- classmethod load(path: str) → RL
Reads an ML instance from the input path, a shortcut of read().load(path).
- classmethod read() → DefaultParamsReader[RL]
Returns a DefaultParamsReader instance for this class.
- save(path: str) → None
Save this ML instance to the given path, a shortcut of write().save(path).
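A persistence round-trip sketch. DefaultParamsWritable serializes params only, so whether the wrapped function f survives the round trip depends on the library's implementation; treat that as an assumption.

    t.write().overwrite().save("/tmp/custom_transformer")
    loaded = CustomTransformer.load("/tmp/custom_transformer")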
- set(param: Param, value: Any) → None
Sets a parameter in the embedded param map.
- transform(dataset: pyspark.sql.dataframe.DataFrame, params: pyspark.ml._typing.ParamMap | None = None) → pyspark.sql.dataframe.DataFrame
Transforms the input dataset with optional parameters.
New in version 1.3.0.
- Parameters:
dataset (pyspark.sql.DataFrame) – input dataset
params (dict, optional) – an optional param map that overrides embedded params.
- Returns:
transformed dataset
- Return type:
pyspark.sql.DataFrame
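Overriding an embedded param for a single call (standard Transformer mechanics; whether CustomTransformer itself honors outputCol is an assumption):

    result = t.transform(df, {t.outputCol: "tmp_output"})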
- write() → MLWriter
Returns a DefaultParamsWriter instance for this class.