Third Party Projects

There are third party projects that can integrate with Spark NLP. These packages need to be installed separately to be used.

If you’d like to integrate your application with Spark NLP, please send us a message!

Logging

Comet

Comet is a meta machine learning platform designed to help AI practitioners and teams build reliable machine learning models for real-world applications by streamlining the machine learning model lifecycle. By leveraging Comet, users can track, compare, explain and reproduce their machine learning experiments.

Comet can easily be integrated into the Spark NLP workflow with the dedicated logging class CometLogger, which logs training and evaluation metrics, pipeline parameters, and NER visualizations made with sparknlp-display.

For more information, see the User Guide; for more examples, see the Spark NLP Workshop.

Python API: CometLogger
# Metrics produced while training an annotator can be logged, for example:

import sparknlp
from sparknlp.base import *
from sparknlp.annotator import *
from sparknlp.logging.comet import CometLogger
from pyspark.ml import Pipeline

spark = sparknlp.start()

OUTPUT_LOG_PATH = "./run"  # directory where the annotator writes its training logs
logger = CometLogger()

document = DocumentAssembler().setInputCol("text").setOutputCol("document")
embds = (
    UniversalSentenceEncoder.pretrained()
    .setInputCols("document")
    .setOutputCol("sentence_embeddings")
)
multiClassifier = (
    MultiClassifierDLApproach()
    .setInputCols("sentence_embeddings")
    .setOutputCol("category")
    .setLabelColumn("labels")
    .setBatchSize(128)
    .setLr(1e-3)
    .setThreshold(0.5)
    .setShufflePerEpoch(False)
    .setEnableOutputLogs(True)
    .setOutputLogsPath(OUTPUT_LOG_PATH)
    .setMaxEpochs(1)
)

# Stream metrics from the training log directory to Comet while the model trains
logger.monitor(logdir=OUTPUT_LOG_PATH, model=multiClassifier)
trainDataset = spark.createDataFrame(
    [("Nice.", ["positive"]), ("That's bad.", ["negative"])],
    schema=["text", "labels"],
)

pipeline = Pipeline(stages=[document, embds, multiClassifier])
pipeline.fit(trainDataset)
logger.end()

# If you are using a Jupyter notebook, you can display the live web
# interface with:

logger.experiment.display(tab='charts')
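
Pipeline parameters and NER visualizations made with sparknlp-display can be logged in a similar way. The sketch below is illustrative rather than definitive: it assumes a fitted PipelineModel named ner_pipeline_model whose stages produce a "ner_chunk" column (that name is a placeholder, not part of the example above), and it uses the CometLogger methods log_pipeline_parameters and log_visualization together with NerVisualizer from the separate spark-nlp-display package.

# A minimal sketch, assuming `ner_pipeline_model` is a fitted NER pipeline
from sparknlp.base import LightPipeline
from sparknlp.logging.comet import CometLogger
from sparknlp_display import NerVisualizer

logger = CometLogger()

# Log the parameters of every stage of the fitted pipeline
logger.log_pipeline_parameters(ner_pipeline_model)

# Annotate a sample text and render the recognized entities as HTML
light_model = LightPipeline(ner_pipeline_model)
annotation = light_model.fullAnnotate("Google was founded in 1998 by Larry Page.")

viz_html = NerVisualizer().display(
    annotation[0],
    label_col="ner_chunk",
    document_col="document",
    return_html=True,
)

# Upload the rendered visualization as an asset of the Comet experiment
logger.log_visualization(viz_html, name="ner_visualization.html")
logger.end()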