Getting Started#
Spark NLP for Healthcare is a commercial extension of Spark NLP for clinical and biomedical text mining. If you don’t have a Spark NLP for Healthcare subscription yet, you can ask for a free trial by clicking on the button below.
[Try Free](https://www.johnsnowlabs.com/spark-nlp-try-free/)
Spark NLP for Healthcare provides healthcare-specific annotators, pipelines, models, and embeddings for:
- Clinical entity recognition
- Clinical Entity Linking
- Entity normalization
- Assertion Status Detection
- De-identification
- Relation Extraction
- Spell checking & correction
- Entity Resolver
- Rule Based Contextual Parser
- Text Generator
- Summarizer
- Risk Adjustment Module
The library offers access to several clinical and biomedical transformers: JSL-BERT-Clinical, BioBERT, ClinicalBERT, GloVe-Med, GloVe-ICD-O. It also includes more than 2,000 pre-trained healthcare models that can recognize the following entities (and many more):
- Clinical - supports Signs, Symptoms, Treatments, Procedures, Tests, Labs, Sections
- Drugs - supports Name, Dosage, Strength, Route, Duration, Frequency
- Risk Factors - supports Smoking, Obesity, Diabetes, Hypertension, Substance Abuse
- Anatomy - supports Organ, Subdivision, Cell, Structure Organism, Tissue, Gene, Chemical
- Demographics - supports Age, Gender, Height, Weight, Race, Ethnicity, Marital Status, Vital Signs
- Sensitive Data - supports Patient Name, Address, Phone, Email, Dates, Providers, Identifiers
For more information visit our [models](https://nlp.johnsnowlabs.com/models/) site.
Requirements#
Spark NLP is built on top of Apache Spark 3.x.x. To use Spark NLP you need:
- Java 8 or Java 11
- Apache Spark 3.x.x
- Python 3.7.x, 3.8.x, 3.9.x, or 3.10.x
It is recommended to have basic knowledge of Apache Spark and a working environment before using Spark NLP. Please refer to the Spark documentation to get started with Spark.
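As a quick sanity check on the Python requirement, the snippet below is a minimal sketch that compares the running interpreter against the versions listed above. The names `SUPPORTED_PYTHON` and `python_is_supported` are hypothetical helpers introduced for illustration; they are not part of Spark NLP.

```python
import sys

# Minor versions listed in the requirements above (3.7.x through 3.10.x).
SUPPORTED_PYTHON = {(3, 7), (3, 8), (3, 9), (3, 10)}


def python_is_supported(version_info=sys.version_info):
    """Return True if the (major, minor) pair is one of the listed versions.

    Hypothetical helper for illustration; not part of the Spark NLP API.
    """
    return (version_info[0], version_info[1]) in SUPPORTED_PYTHON


# Report on the interpreter that is running this snippet.
current = "{}.{}".format(sys.version_info[0], sys.version_info[1])
if python_is_supported():
    print("Python " + current + " is in the documented list.")
else:
    print("Python " + current + " is not in the documented list.")
```

Running this inside the environment where you plan to install the package shows whether the interpreter matches the documented list before you go any further.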
Installation#
First, let’s make sure the installed Java version is Java 8 or 11 (Oracle or OpenJDK):
java -version
# openjdk version "1.8.0_292"
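If you want to automate this check, the following is a small sketch that extracts the major version from the text printed by `java -version`. It assumes the usual convention that Java 8 and earlier report themselves as `1.x` while later releases report the major number directly. The function names are hypothetical and only serve the example.

```python
import re


def java_major_version(version_output):
    """Extract the major Java version from `java -version` output.

    Hypothetical helper: assumes a quoted version string such as
    "1.8.0_292" (Java 8) or "11.0.11" (Java 11).
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no quoted version string found")
    first, second = match.group(1), match.group(2)
    # Legacy scheme: "1.8" means Java 8.
    if first == "1" and second is not None:
        return int(second)
    return int(first)


def java_is_supported(version_output):
    """Return True for the two versions named above (Java 8 or Java 11)."""
    return java_major_version(version_output) in (8, 11)


print(java_major_version('openjdk version "1.8.0_292"'))
```

Note that `java -version` writes to standard error, so when capturing it from Python you would read the `stderr` attribute of the result of `subprocess.run(["java", "-version"], capture_output=True, text=True)`.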
You can install the Spark NLP for Healthcare package by using:
pip install spark-nlp-jsl==${version} --extra-index-url https://pypi.johnsnowlabs.com/${secret.code} --upgrade
{version} is the version part of the {secret.code}, that is, the text before the first hyphen ({secret.code}.split('-')[0]), for example 2.x.x.
The {secret.code} is a secret code that is only available to users with a valid or trial license. If you have not received it yet, please contact us at [info@johnsnowlabs.com](mailto:info@johnsnowlabs.com).
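The relationship between the secret code and the version can be sketched in Python as follows. The helper names and the sample secret `2.7.3-abcdef` are made up for illustration; only the `split('-')[0]` rule and the shape of the `pip` command come from the text above.

```python
def version_from_secret(secret):
    """Return the version part of the secret: the text before the first '-'."""
    if "-" not in secret:
        raise ValueError("expected a secret of the form <version>-<key>")
    return secret.split("-")[0]


def pip_install_args(secret):
    """Build the pip argument list shown above for a given secret.

    Hypothetical helper; it only assembles strings and runs nothing.
    """
    version = version_from_secret(secret)
    return [
        "pip", "install", "spark-nlp-jsl==" + version,
        "--extra-index-url", "https://pypi.johnsnowlabs.com/" + secret,
        "--upgrade",
    ]


# "2.7.3-abcdef" is a placeholder, not a real secret.
print(" ".join(pip_install_args("2.7.3-abcdef")))
```

Building the argument list separately keeps the secret in one place, which makes it easier to read it from an environment variable instead of hard-coding it.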
Starting a Spark NLP Session from Python#
You can start the Spark session with this short snippet:
import sparknlp_jsl
spark = sparknlp_jsl.start(secret = "{secret.code}")
Or use the SparkSession builder for more flexibility:
from pyspark.sql import SparkSession
spark = SparkSession.builder \
    .appName("Spark NLP Enterprise") \
    .master("local[*]") \
    .config("spark.driver.memory", "16G") \
    .config("spark.driver.maxResultSize", "2G") \
    .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer") \
    .config("spark.kryoserializer.buffer.max", "2000M") \
    .config("spark.jars.packages", "com.johnsnowlabs.nlp:spark-nlp_2.12:${version_public}") \
    .config("spark.jars", "https://pypi.johnsnowlabs.com/${secret.code}/spark-nlp-jsl-${version}.jar") \
    .getOrCreate()
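To avoid editing the placeholders by hand, the same settings can be assembled from the secret code and the public Spark NLP version. The following is a sketch under stated assumptions: `build_spark_conf` and `start_session` are hypothetical helpers, the driver memory is assumed to be intended as 16 GB, and the sample values in the usage line are placeholders.

```python
def build_spark_conf(secret, version_public):
    """Return the Spark settings from the manual setup above as a dict.

    Hypothetical helper: `secret` is the licensed secret code and
    `version_public` is the open-source Spark NLP version to pair with it.
    """
    version = secret.split("-")[0]  # version part of the secret
    return {
        "spark.driver.memory": "16G",  # assumption: 16 GB was intended
        "spark.driver.maxResultSize": "2G",
        "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
        "spark.kryoserializer.buffer.max": "2000M",
        "spark.jars.packages":
            "com.johnsnowlabs.nlp:spark-nlp_2.12:" + version_public,
        "spark.jars":
            "https://pypi.johnsnowlabs.com/" + secret
            + "/spark-nlp-jsl-" + version + ".jar",
    }


def start_session(secret, version_public):
    """Apply the settings to a SparkSession builder (requires pyspark)."""
    from pyspark.sql import SparkSession  # imported lazily on purpose

    builder = SparkSession.builder.appName("Spark NLP Enterprise").master("local[*]")
    for key, value in build_spark_conf(secret, version_public).items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


# Placeholder values; substitute your own secret and public version.
conf = build_spark_conf("2.7.3-abcdef", "3.0.0")
print(conf["spark.jars"])
```

Keeping the settings in a plain dictionary lets you inspect or log them (without the secret) before a session is created, and `start_session` only needs `pyspark` at the moment it is called.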