Healthcare NLP v2.5.0 Release Notes

2.5.0

Overview

We are happy to bring you Spark NLP for Healthcare 2.5.0 with new Annotators, Models and Data Readers. Model composition and iteration are now faster with readers and annotators designed for real-world tasks. We introduce the ChunkMerge annotator to combine all CHUNKs extracted by different Entity Extraction Annotators. We also introduce an Annotation Reader for JSL AI Platform’s Annotation Tool. This release is also the first to support the following models: ner_large_clinical, ner_events_clinical, assertion_dl_large, chunkresolve_loinc_clinical, and deidentify_large. And of course we have fixed some bugs.

New Features

AnnotationToolJsonReader is a new class that imports a JSON file from AI Platform’s Annotation Tool and generates NER and Assertion training datasets
ChunkMerge Annotator is a new annotator that merges two columns of CHUNKs, handling overlaps with a very straightforward logic: max coverage, max number of entities
ChunkMerge Annotator handles inputs from NerDLModel, RegexMatcher, ContextualParser, TextMatcher
A DeIdentification pretrained model can now work in ‘mask’ or ‘obfuscate’ mode
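
To illustrate the ChunkMerge overlap handling listed above, here is a minimal pure-Python sketch. It does not use the Spark NLP API. The `Chunk` type, the `merge_chunks` function, and the greedy "longest span wins, ties go to the first column" rule are assumptions made for illustration; the release notes only state the goal (max coverage, max number of entities), not the exact algorithm.

```python
from typing import List, NamedTuple


class Chunk(NamedTuple):
    """Hypothetical stand-in for a CHUNK annotation."""
    begin: int  # inclusive character offset
    end: int    # inclusive character offset
    label: str
    text: str


def overlaps(a: Chunk, b: Chunk) -> bool:
    """True when the two inclusive spans share at least one character."""
    return a.begin <= b.end and b.begin <= a.end


def merge_chunks(first: List[Chunk], second: List[Chunk]) -> List[Chunk]:
    """Greedy merge of two chunk columns (an assumed rule, not the library's)."""
    # Tag each chunk with its source column so ties resolve deterministically.
    candidates = [(c, 0) for c in first] + [(c, 1) for c in second]
    # Longest span first (coverage), then first column, then position.
    candidates.sort(key=lambda p: (-(p[0].end - p[0].begin), p[1], p[0].begin))

    kept: List[Chunk] = []
    for chunk, _ in candidates:
        # Keep a chunk only if it does not collide with one already kept.
        if not any(overlaps(chunk, k) for k in kept):
            kept.append(chunk)
    return sorted(kept, key=lambda c: c.begin)


# Offsets refer to the sentence "aspirin 81 mg for 3 days".
ner_chunks = [Chunk(0, 6, "DRUG", "aspirin"), Chunk(18, 23, "DURATION", "3 days")]
regex_chunks = [Chunk(8, 12, "DOSE", "81 mg"), Chunk(14, 23, "DURATION", "for 3 days")]

merged = merge_chunks(ner_chunks, regex_chunks)
print([c.text for c in merged])
```

In this example the NER chunk "3 days" overlaps the longer regex chunk "for 3 days", so only the longer one survives, while the non-overlapping chunks from both columns are kept.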

Enhancements

DeIdentification Annotator has a more consistent API:
- mode param with values (‘mask’ | ‘obfuscate’) to drive its behavior
- dateFormats param, a list of string values that selects which date formats to obfuscate (and which to just mask)
DeIdentification Annotator no longer automatically obfuscates dates. Obfuscation is now driven by mode and dateFormats params
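
The interaction between the mode and dateFormats params described above can be sketched in plain Python. This is not the DeIdentification annotator's implementation: the function name `deidentify_date`, the `<DATE>` mask token, the fixed day shift used as the obfuscation step, and the use of Python `strptime` patterns (rather than whatever pattern syntax the annotator accepts) are all assumptions for illustration.

```python
from datetime import datetime, timedelta
from typing import List


def deidentify_date(value: str, mode: str, date_formats: List[str],
                    shift_days: int = 10) -> str:
    """Mask a date, or obfuscate it when its format is in date_formats."""
    if mode not in ("mask", "obfuscate"):
        raise ValueError("mode must be 'mask' or 'obfuscate'")

    if mode == "obfuscate":
        for fmt in date_formats:
            try:
                parsed = datetime.strptime(value, fmt)
            except ValueError:
                continue  # this format does not match, try the next one
            # Assumed obfuscation step: shift the date and keep its format.
            return (parsed + timedelta(days=shift_days)).strftime(fmt)

    # 'mask' mode, or a date whose format was not selected for obfuscation.
    return "<DATE>"


print(deidentify_date("2020-05-01", "mask", ["%Y-%m-%d"]))
print(deidentify_date("2020-05-01", "obfuscate", ["%Y-%m-%d"]))
print(deidentify_date("05/01/2020", "obfuscate", ["%Y-%m-%d"]))
```

The first and third calls fall back to masking: the first because of the mode, the third because the date does not match any listed format. Only the second call produces a shifted date.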

Bugfixes

DeIdentification Annotator now correctly deduplicates protected entities coming from NER / Regex
DeIdentification Annotator now indexes chunks correctly after merging them
AssertionDLApproach Annotator can now be trained with the graph in any folder, specified by setting the graphFolder param
AssertionDLApproach now has the setClasses param setter in the Python wrapper
JVM Memory and Kryo Max Buffer size increased to 32G and 2000M respectively in the sparknlp_jsl.start(secret) function
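
For readers who build their own Spark session instead of calling sparknlp_jsl.start(secret), the memory values from the last item can be expressed as ordinary Spark settings. The sketch below is an assumption about how those values map onto Spark configuration keys (`spark.driver.memory` for JVM memory, `spark.kryoserializer.buffer.max` for the Kryo buffer); the helper names `jsl_start_settings` and `apply_settings` are hypothetical and not part of the library.

```python
from typing import Any, Dict


def jsl_start_settings() -> Dict[str, str]:
    """Memory values named in these release notes, keyed by assumed Spark config names."""
    return {
        "spark.driver.memory": "32G",                # JVM memory
        "spark.kryoserializer.buffer.max": "2000M",  # Kryo max buffer size
    }


def apply_settings(builder: Any, settings: Dict[str, str]) -> Any:
    """Apply settings to any builder exposing .config(key, value)."""
    for key, value in settings.items():
        builder = builder.config(key, value)
    return builder


print(jsl_start_settings())
```

With PySpark installed, `apply_settings(SparkSession.builder, jsl_start_settings()).getOrCreate()` would create a session with these values, provided no session is already running.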
