Rule based annotation is supported by Healthcare NLP, Finance NLP, and Legal NLP via the
Users in the
There are two types of rules supported:
- Regex Based: Users can define a regex that is used to find all matching chunks and label them as the target entity. For example, to label a height entity, the following regex can be used: [0-7]'((0?[0-9])|(1(0|1)))''. All hits found in the task’s text content that match the regex are pre-annotated as height.
- Dictionary-Based: Users can define and upload a CSV dictionary of keywords covering the chunks that should be annotated as a target entity. For example, for the label female, all occurrences of the strings woman, lady, and girl within the text content of a given task will be pre-annotated as female.
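The two rule types can be illustrated outside the tool with a minimal Python sketch. It is not how Generative AI Lab implements pre-annotation; it only shows what the height regex and the female keyword list would match in a sample sentence. The helper names `regex_hits` and `dictionary_hits` are hypothetical, and the case-insensitive, whole-word matching used for the dictionary is an assumption rather than documented behavior.

```python
import re

# The height regex from the example above: feet, a single quote,
# inches from 0 to 11, then two single quotes.
HEIGHT_REGEX = r"[0-7]'((0?[0-9])|(1(0|1)))''"


def regex_hits(text, pattern, label):
    """Return (start, end, chunk, label) for every regex match in the text."""
    return [(m.start(), m.end(), m.group(0), label)
            for m in re.finditer(pattern, text)]


def dictionary_hits(text, keywords, label):
    """Return (start, end, chunk, label) for every keyword occurrence.

    Assumption: keywords match case-insensitively and only as whole words.
    """
    pattern = r"\b(?:" + "|".join(re.escape(k) for k in keywords) + r")\b"
    return [(m.start(), m.end(), m.group(0), label)
            for m in re.finditer(pattern, text, flags=re.IGNORECASE)]


text = "The lady is 5'11'' tall; the girl next to her is 4'09''."
height = regex_hits(text, HEIGHT_REGEX, "height")
female = dictionary_hits(text, ["woman", "lady", "girl"], "female")

for hit in height + female:
    print(hit)
```

Each hit carries character offsets, the matched chunk, and the target label, which is the information a pre-annotation needs.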
After adding a rule, the
The user is notified every time a rule in use is edited with the message “Redeploy preannotation server to apply these changes” on the
Import and Export Rules
Generative AI Lab allows importing and exporting Rules from the Rules page.
Import Rules
Users can import rules from the Rules page. Imported rules can be either dictionary-based or regex-based. Rules can be imported in the following formats:
- JSON file or content.
- Zip archive of one or more JSON files.
Export Rules
To export rules, the user needs to select them from the available rules and click the Export Rules button. The rules are then downloaded as a zip file that contains one JSON file per rule. These exported rules can be imported back into Generative AI Lab.
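Since an export is a zip archive with one JSON file per rule, its contents can be inspected with a few lines of Python before re-importing. The sketch below is only an illustration: the function name `load_rules_from_zip` is hypothetical, and the demo archive uses a made-up `name` field because the actual JSON schema of a rule is not described here.

```python
import io
import json
import zipfile


def load_rules_from_zip(zip_source):
    """Return {member name: parsed JSON} for every .json file in the archive."""
    rules = {}
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if name.lower().endswith(".json"):
                with archive.open(name) as handle:
                    rules[name] = json.load(handle)
    return rules


# Build a small in-memory archive standing in for an exported zip file.
# The "name" field is a placeholder, not the real rule schema.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("height.json", json.dumps({"name": "height"}))
    archive.writestr("female.json", json.dumps({"name": "female"}))
buffer.seek(0)

loaded = load_rules_from_zip(buffer)
print(sorted(loaded))
```

With a real export, a path to the downloaded zip file can be passed in place of the in-memory buffer.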
The following blog posts explain how to create and use rules for jump starting your annotation projects: