3.5.3
Highlights
- New rxnorm_mapper model
- New ChunkMapperFilterer annotator to filter ChunkMapperModel results
- New features
  - Add the setReplaceLabels parameter that allows replacing the non-conventional labels without using an external source file in the NerConverterInternal().
  - Case sensitivity can be set in ChunkMapperApproach and ChunkMapperModel through the setLowerCase() parameter.
  - Return multiple relations at a time in ChunkMapperModel models via the setRels() parameter.
  - Filter the multi-token chunks separated with whitespace in ChunkMapperApproach by the setAllowMultiTokenChunk() parameter.
- New license validation policy in License Validator.
- Bug fixes
- Updated notebooks
- List of recently updated or added models
New rxnorm_mapper Model
We are releasing the rxnorm_mapper model, which maps clinical entities and concepts to their corresponding RxNorm codes.
See Model Hub Page for details.
Example :
...
chunkerMapper = ChunkMapperModel.pretrained("rxnorm_mapper", "en", "clinical/models")\
.setInputCols(["ner_chunk"])\
.setOutputCol("mappings")\
.setRel("rxnorm_code")
...
sample_text = "The patient was given Zyrtec 10 MG, Adapin 10 MG Oral Capsule, Septi-Soothe 0.5 Topical Spray"
Results :
+------------------------------+---------------+
|chunk |rxnorm_mappings|
+------------------------------+---------------+
|Zyrtec 10 MG |1011483 |
|Adapin 10 MG Oral Capsule |1000050 |
|Septi-Soothe 0.5 Topical Spray|1000046 |
+------------------------------+---------------+
New ChunkMapperFilterer Annotator to Filter ChunkMapperModel Results
The ChunkMapperFilterer annotator filters the chunks that were passed through the ChunkMapperModel.
If setReturnCriteria() is set to "success", only the chunks that were mapped by ChunkMapperModel are returned. If it is set to "fail", only the chunks that were not mapped by ChunkMapperModel are returned.
Example :
...
cfModel = ChunkMapperFilterer() \
.setInputCols(["ner_chunk","mappings"]) \
.setOutputCol("chunks_filtered")\
.setReturnCriteria("success") #or "fail"
...
sample_text = "The patient was given Warfarina Lusa and amlodipine 10 mg. Also, he was given Aspagin, coumadin 5 mg and metformin"
.setReturnCriteria("success")
Results :
+-----+---+--------------+--------------+
|begin|end| entity| mappings|
+-----+---+--------------+--------------+
| 22| 35| DRUG|Warfarina Lusa|
+-----+---+--------------+--------------+
.setReturnCriteria("fail")
Results :
+-----+---+--------+------------+
|begin|end| entity| not mapped|
+-----+---+--------+------------+
| 41| 50| DRUG| amlodipine|
| 80| 86| DRUG| Aspagin|
| 89| 96| DRUG| coumadin|
| 115|123| DRUG| metformin|
+-----+---+--------+------------+
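The two return criteria can be pictured with a small plain-Python sketch. This is only an illustration of the behaviour described above, not the annotator's implementation: the filter_chunks helper and the set of mapped chunks are hypothetical, and only the chunk texts come from the example.

```python
# Plain-Python illustration of the "success" / "fail" return criteria.
# filter_chunks is a hypothetical helper, not part of Spark NLP for Healthcare.

def filter_chunks(chunks, mapped, criteria="success"):
    """Keep mapped chunks for "success", unmapped chunks for "fail"."""
    if criteria not in ("success", "fail"):
        raise ValueError("criteria must be 'success' or 'fail'")
    keep_mapped = criteria == "success"
    return [chunk for chunk in chunks if (chunk in mapped) == keep_mapped]


chunks = ["Warfarina Lusa", "amlodipine", "Aspagin", "coumadin", "metformin"]
mapped = {"Warfarina Lusa"}  # assumption: the only chunk the mapper resolved

succeeded = filter_chunks(chunks, mapped, "success")
failed = filter_chunks(chunks, mapped, "fail")
print(succeeded)
print(failed)
```

Every chunk lands in exactly one of the two lists, so running the filterer once per criterion splits the NER output into resolved and unresolved chunks.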
New Features:
Add setReplaceLabels Parameter That Allows Replacing the Non-Conventional Labels Without Using an External Source File in the NerConverterInternal()
You can now replace the labels produced by NER models with custom labels by using the setReplaceLabels parameter of the NerConverterInternal annotator. This removes the need for an external source file to replace the labels with custom ones.
Example :
...
clinical_ner = MedicalNerModel.pretrained("ner_jsl", "en", "clinical/models")\
.setInputCols(["sentence","token", "word_embeddings"])\
.setOutputCol("ner")
ner_converter_original = NerConverterInternal()\
.setInputCols(["sentence", "token", "ner"]) \
.setOutputCol("original_label")
ner_converter_replaced = NerConverterInternal()\
.setInputCols(["sentence", "token", "ner"]) \
.setOutputCol("replaced_label")\
.setReplaceLabels({"Drug_Ingredient": "Drug", "Drug_BrandName": "Drug"})
...
sample_text = "The patient was given Warfarina Lusa and amlodipine 10 mg. Also, he was given Aspagin, coumadin 5 mg, and metformin"
Results :
+--------------+-----+---+---------------+--------------+
|chunk |begin|end|original_label |replaced_label|
+--------------+-----+---+---------------+--------------+
|Warfarina Lusa|22 |35 |Drug_BrandName |Drug |
|amlodipine |41 |50 |Drug_Ingredient|Drug |
|10 mg |52 |56 |Strength |Strength |
|he |65 |66 |Gender |Gender |
|Aspagin |78 |84 |Drug_BrandName |Drug |
|coumadin |87 |94 |Drug_Ingredient|Drug |
|5 mg |96 |99 |Strength |Strength |
|metformin |106 |114|Drug_Ingredient|Drug |
+--------------+-----+---+---------------+--------------+
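The effect on the labels amounts to a dictionary lookup that leaves unlisted labels unchanged. The sketch below is plain Python for illustration only; replace_labels is a hypothetical helper and says nothing about how NerConverterInternal does this internally.

```python
# Plain-Python illustration of label replacement.
# replace_labels is a hypothetical helper, not a Spark NLP for Healthcare API.

def replace_labels(labels, replacements):
    """Swap each label that has a replacement; keep the others as they are."""
    return [replacements.get(label, label) for label in labels]


replacements = {"Drug_Ingredient": "Drug", "Drug_BrandName": "Drug"}
original = ["Drug_BrandName", "Drug_Ingredient", "Strength", "Gender"]

replaced = replace_labels(original, replacements)
print(list(zip(original, replaced)))
```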
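Case Sensitivity in ChunkMapperApproach and ChunkMapperModel Through setLowerCase() Parameter
Case sensitivity in ChunkMapperApproach and ChunkMapperModel can be set with the setLowerCase() parameter. As the results below show, with setLowerCase(True) a chunk is mapped regardless of its casing, while with setLowerCase(False) only chunks whose casing matches are mapped.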
Example :
...
chunkerMapperapproach = ChunkMapperApproach() \
.setInputCols(["ner_chunk"]) \
.setOutputCol("mappings") \
.setDictionary("mappings.json") \
.setRel("action") \
.setLowerCase(True) #or False
...
sentences = [["""The patient was given Warfarina lusa and amlodipine 10 mg, coumadin 5 mg.
The patient was given Coumadin"""]]
setLowerCase(True)
Results :
+------------------------+-----------+
|chunk |mapped |
+------------------------+-----------+
|Warfarina lusa |540228 |
|amlodipine |329526 |
|coumadin |202421 |
|Coumadin |202421 |
+------------------------+-----------+
setLowerCase(False)
Results :
+------------------------+-----------+
|chunk |mapped |
+------------------------+-----------+
|Warfarina lusa |NONE |
|amlodipine |329526 |
|coumadin |NONE |
|Coumadin |202421 |
+------------------------+-----------+
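One way to picture the difference between the two settings is a lookup that lowercases both sides before comparing. The sketch below is plain Python, not the library's code: map_chunk is a hypothetical helper, and the casing of the dictionary keys is an assumption chosen to be consistent with the two result tables above.

```python
# Plain-Python illustration of case-sensitive vs. case-insensitive lookup.
# map_chunk is a hypothetical helper; the key casing below is an assumption.

NOT_FOUND = "NONE"


def map_chunk(chunk, dictionary, lower_case):
    """Look up a chunk, optionally ignoring case on both sides."""
    if lower_case:
        lowered = {key.lower(): value for key, value in dictionary.items()}
        return lowered.get(chunk.lower(), NOT_FOUND)
    return dictionary.get(chunk, NOT_FOUND)


dictionary = {
    "Warfarina Lusa": "540228",
    "amlodipine": "329526",
    "Coumadin": "202421",
}
chunks = ["Warfarina lusa", "amlodipine", "coumadin", "Coumadin"]

insensitive = [map_chunk(chunk, dictionary, True) for chunk in chunks]
sensitive = [map_chunk(chunk, dictionary, False) for chunk in chunks]
print(insensitive)
print(sensitive)
```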
Return Multiple Relations At a Time In ChunkMapper Models Via setRels() Parameter
Multiple relations for the same chunk can be requested with the setRels() parameter in both ChunkMapperApproach and ChunkMapperModel.
Example :
...
chunkerMapperapproach = ChunkMapperApproach() \
.setInputCols(["ner_chunk"]) \
.setOutputCol("mappings") \
.setDictionary("mappings.json") \
.setRels(["action","treatment"]) \
.setLowerCase(True)
...
sample_text = "The patient was given Warfarina Lusa."
Results :
+-----+---+--------------+-------------+---------+
|begin|end| entity| mappings| relation|
+-----+---+--------------+-------------+---------+
| 22| 35|Warfarina Lusa|Anticoagulant| action|
| 22| 35|Warfarina Lusa|Heart Disease|treatment|
+-----+---+--------------+-------------+---------+
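The contents of mappings.json are not shown in this release note. The sketch below assumes a layout in which each key carries a list of relations, and walks it in plain Python to produce one row per requested relation. Both the JSON shape and the lookup helper are assumptions for illustration; check the Chunk Mapping Notebook for the format the annotator actually expects.

```python
import json

# Assumed dictionary layout (not confirmed by this release note).
raw_dictionary = """
{
  "mappings": [
    {
      "key": "Warfarina Lusa",
      "relations": [
        {"key": "action", "values": ["Anticoagulant"]},
        {"key": "treatment", "values": ["Heart Disease"]}
      ]
    }
  ]
}
"""


def lookup(chunk, rels, dictionary):
    """Hypothetical helper: one (chunk, mapping, relation) row per match."""
    rows = []
    for entry in dictionary["mappings"]:
        if entry["key"] != chunk:
            continue
        for relation in entry["relations"]:
            if relation["key"] in rels:
                for value in relation["values"]:
                    rows.append((chunk, value, relation["key"]))
    return rows


dictionary = json.loads(raw_dictionary)
rows = lookup("Warfarina Lusa", ["action", "treatment"], dictionary)
for row in rows:
    print(row)
```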
Filter the Multi-Token Chunks Separated With Whitespace in ChunkMapperApproach and ChunkMapperModel by setAllowMultiTokenChunk() Parameter
Chunks that consist of multiple tokens separated by whitespace can be filtered by using the setAllowMultiTokenChunk() parameter.
Example :
...
chunkerMapper = ChunkMapperApproach() \
.setInputCols(["ner_chunk"]) \
.setOutputCol("mappings") \
.setDictionary("mappings.json") \
.setLowerCase(True) \
.setRels(["action", "treatment"]) \
.setAllowMultiTokenChunk(False)
...
sample_text = "The patient was given Warfarina Lusa"
setAllowMultiTokenChunk(False)
Results :
+-----+---+--------------+--------+--------+
|begin|end| chunk|mappings|relation|
+-----+---+--------------+--------+--------+
| 22| 35|Warfarina Lusa| NONE| null|
+-----+---+--------------+--------+--------+
setAllowMultiTokenChunk(True)
Results :
+-----+---+--------------+-------------+---------+
|begin|end| chunk| mappings| relation|
+-----+---+--------------+-------------+---------+
| 22| 35|Warfarina Lusa|Anticoagulant| action|
| 22| 35|Warfarina Lusa|Heart Disease|treatment|
+-----+---+--------------+-------------+---------+
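The behaviour in the two tables above can be read as a guard in front of the lookup: when multi-token chunks are not allowed, a chunk containing whitespace is never mapped. The following plain-Python sketch illustrates that reading only; map_with_token_policy and the in-memory dictionary are hypothetical.

```python
# Plain-Python illustration of the multi-token guard.
# map_with_token_policy is a hypothetical helper, not a library function.

def map_with_token_policy(chunk, dictionary, allow_multi_token):
    """Return the mappings for a chunk, or [] if multi-token chunks are blocked."""
    is_multi_token = len(chunk.split()) > 1
    if is_multi_token and not allow_multi_token:
        return []
    return dictionary.get(chunk, [])


dictionary = {
    "Warfarina Lusa": [("Anticoagulant", "action"), ("Heart Disease", "treatment")],
}

blocked = map_with_token_policy("Warfarina Lusa", dictionary, False)
allowed = map_with_token_policy("Warfarina Lusa", dictionary, True)
print(blocked)
print(allowed)
```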
New License Validation Policies in License Validator
A new version of the License Validator has been included in Spark NLP for Healthcare. This License Validator checks the compatibility between the type of your license and the environment you are using, allowing the license to be used only in the environment it was requested for (single-node, cluster, Databricks, etc.) and with the number of concurrent sessions it covers (floating or non-floating). You can check which type of license you have at my.johnsnowlabs.com -> My Subscriptions.
If your license has stopped working, please contact support@johnsnowlabs.com so that the difference between the environment your license was requested for and the one it is currently being used in can be checked.
Bug Fixes
We fixed some issues in the AnnotationToolJsonReader tool and in the DrugNormalizer and ContextualParserApproach annotators.
- DrugNormalizer: Fixed some issues that affect the performance.
- ContextualParserApproach: Fixed the issue in the computation of indices for documents with more than one sentence while defining the rule-scope field as a document.
- AnnotationToolJsonReader: Fixed an issue where relation labels were not being extracted from the Annotation Lab json file export.
Updated Notebooks
- Clinical Named Entity Recognition Notebook: setReplaceLabels parameter example was added.
- Chunk Mapping Notebook: New case sensitivity, selecting multiple relations, filtering multi-token chunks and ChunkMapperFilterer features were added.
List of Recently Updated Models
- sbiobertresolve_icdo_augmented
- rxnorm_mapper
For all Spark NLP for healthcare models, please check: Models Hub Page