High Performance NLP with Apache Spark

John Snow Labs’ NLP is a text processing library built on top of Apache Spark and its Spark ML library. Its goal is to provide an easy-to-use API for NLP annotations that scales within a large, distributed environment.

Questions? Join our Slack

2018 Sep 17th - Update! 1.6.3 released! New DeIdentification annotator, better OCR, and multiple bugfixes! See the changes HERE and check out the updated documentation below.

Get started

Quick start guide to set up spark-nlp and get going


Pretrained models, pipelines and other concepts reference


Sample notebooks and guidelines for using SparkNLP


Ways to contribute to the spark-nlp repository

Resources & FAQs

Videos, Podcasts, Whitepapers and other questions

License & Credits

Licensing / Acknowledgements