High Performance NLP with Apache Spark

John Snow Labs’ NLP is a text processing library built on top of Apache Spark and its Spark ML library. Its goal is to provide an easy-to-use API for NLP annotations that scales in large distributed environments.

Questions? Join our Slack

2018 Jul 7th - Update! 1.6.0 Released! OCR PDF to Spark-NLP capabilities, new Chunker annotator, fixed AWS compatibility, better performance and much more. See the changes HERE and check out the updated documentation below.

Get started

Quick start guide to set up spark-nlp and get going


Pretrained models, pipelines and other concepts reference


Sample notebooks and guidelines for using Spark NLP


Ways to contribute to the spark-nlp repository

Resources & FAQs

Videos, podcasts, whitepapers and answers to common questions

License & Credits

Licensing / Acknowledgements