sparknlp_jsl

Functions

get_credentials(spark)

Gets John Snow Labs credentials

library_settings(spark)

Gets the library settings

load_license_validator()

pub_version()

Gets the public version of Spark NLP

start(secret[, gpu, apple_silicon, aarch64, ...])

Starts a SparkSession with default parameters for Spark NLP Licensed

version()

Gets the version of Spark NLP

get_credentials(spark)

Gets John Snow Labs credentials

Parameters:

spark (SparkSession) – SparkSession

Returns:

(secretKey, keyId, token)

Return type:

tuple
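
A minimal sketch of how the returned tuple might be handled, assuming the three elements arrive in the documented order (secretKey, keyId, token). The helpers `mask` and `describe_credentials` are hypothetical names introduced for illustration and are not part of sparknlp_jsl; they only show unpacking the tuple without printing the raw values.

```python
def mask(value, visible=4):
    """Hide all but the last `visible` characters of a credential."""
    text = "" if value is None else str(value)
    if len(text) <= visible:
        return "*" * len(text)
    return "*" * (len(text) - visible) + text[-visible:]


def describe_credentials(credentials):
    """Unpack the (secretKey, keyId, token) tuple into a log-safe dict."""
    secret_key, key_id, token = credentials
    return {
        "secretKey": mask(secret_key),
        "keyId": mask(key_id),
        "token": mask(token),
    }


# Intended use with a running session (not executed here):
#   import sparknlp_jsl
#   print(describe_credentials(sparknlp_jsl.get_credentials(spark)))
```

Masking before logging is a design choice of this sketch, not something the library is documented to do.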

library_settings(spark)

Gets the library settings

Parameters:

spark (SparkSession) – SparkSession

Returns:

Library settings

Return type:

str

pub_version()

Gets the public version of Spark NLP

Returns:

Public version of Spark NLP

Return type:

str

start(secret: str, gpu: bool = False, apple_silicon: bool = False, aarch64=False, public: str = '', params: dict | None = None)

Starts a SparkSession with default parameters for Spark NLP Licensed

With the default parameters, the call is equivalent to:

SparkSession.builder \
    .appName("Spark NLP Licensed") \
    .master("local[*]") \
    .config("spark.driver.memory", "{{available memory}}") \
    .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer") \
    .config("spark.kryoserializer.buffer.max", "2000M") \
    .config("spark.driver.maxResultSize", "0") \
    .config("spark.files.overwrite", "true") \
    .config("spark.extraListeners", "com.johnsnowlabs.license.LicenseLifeCycleManager") \
    .config("spark.jars", "https://pypi.johnsnowlabs.com/|secret|/spark-nlp-jsl-|release|.jar") \
    .config("spark.jars.packages", "com.johnsnowlabs.nlp:spark-nlp_2.12:|release|") \
    .getOrCreate()

Parameters:
  • secret (str) – Your secret key

  • gpu (bool) – Whether to use GPU or not

  • apple_silicon (bool) – Whether to run on Apple Silicon (for example, an M1 chip) or not

  • aarch64 (bool) – Whether to use aarch64 or not

  • public (str) – Version of the public Spark NLP library

  • params (dict) – SparkSession params

Notes:

spark.driver.memory is set to the available memory.

Returns:

SparkSession with Spark NLP Licensed

Return type:

SparkSession
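
A minimal sketch of calling start with a secret read from the environment. The environment variable names `SECRET` and `USE_GPU`, the helpers `build_start_kwargs` and `start_session`, and the `16G` value are all hypothetical choices made for illustration. The sketch also assumes that the keys of params are Spark configuration names such as spark.driver.memory; the parameter list above only describes them as "SparkSession params".

```python
import os


def build_start_kwargs(env, driver_memory="16G"):
    """Collect keyword arguments for sparknlp_jsl.start from a mapping."""
    secret = env.get("SECRET")  # hypothetical variable name
    if not secret:
        raise ValueError("SECRET is not set")
    return {
        "secret": secret,
        "gpu": env.get("USE_GPU", "0") == "1",  # hypothetical flag
        # Assumption: params keys are Spark configuration names.
        "params": {"spark.driver.memory": driver_memory},
    }


def start_session(env=os.environ):
    """Start a licensed SparkSession using the arguments built above."""
    import sparknlp_jsl  # imported lazily so the helper above has no dependency

    return sparknlp_jsl.start(**build_start_kwargs(env))
```

Keeping the argument construction separate from the call makes it possible to check the configuration without a license or a running JVM.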

version()

Gets the version of Spark NLP

Returns:

Version of Spark NLP

Return type:

str
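
A minimal sketch for reporting both version strings side by side, for example when filing a bug report. The helpers `version_report` and `installed_versions` are hypothetical names introduced here; the sketch relies only on version() and pub_version() each returning a str, as documented above.

```python
def version_report(version_value, pub_version_value):
    """Format the two version strings on one line."""
    return f"version()={version_value!r}, pub_version()={pub_version_value!r}"


def installed_versions():
    """Query the installed library and return the formatted report."""
    import sparknlp_jsl  # imported lazily; requires the licensed package

    return version_report(sparknlp_jsl.version(), sparknlp_jsl.pub_version())
```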