High Performance NLP with Apache Spark

John Snow Labs’ NLP is a text processing library built on top of Apache Spark and its Spark ML library. Its goal is to provide an easy-to-use API for NLP annotations that scales in large distributed environments.

Questions? Join our Slack

2018 Nov 11th - Update! 1.7.3 released! Word embeddings decoupled from annotators, better Windows support, and improved cluster support

Apache Spark 2.4.x not yet supported

Get Started

Quick start guide to set up spark-nlp and get going


Pretrained models, pipelines and other concepts reference

Commercial Support

Go to production with enterprise-grade reliability, security & scale


Jupyter notebooks and Scala examples


Benchmarks, articles, blog posts and FAQ


Conference talks, tutorials and podcasts