The following tables give an overview of the tutorials and the 1-liners they use.
The tables are grouped by category.
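As a rough illustration of how the entries in the "1-liners used" column are typically consumed: each entry is a reference string handed to the library's loader, and a space-separated entry such as `sentiment pos albert emotion` combines several components in one call. The sketch below assumes the `nlu` Python package is installed and that `nlu.load(...).predict(...)` is the entry point; `split_one_liner` and `run_one_liner` are hypothetical helper names used only for this example, not part of the library.

```python
from typing import List


def split_one_liner(spec: str) -> List[str]:
    """Split a space-separated 1-liner into its individual component references."""
    return spec.split()


def run_one_liner(spec: str, text: str):
    """Load the components named in `spec` and run them on `text`.

    Assumption: the `nlu` package is installed. The import is kept inside the
    function so the helper above can be used without it.
    """
    import nlu  # imported lazily; loading typically downloads models on first use

    return nlu.load(spec).predict(text)


# Inspect which components a combined 1-liner refers to (no models needed).
components = split_one_liner("sentiment pos albert emotion")
print(components)
```

Calling `run_one_liner("sentiment pos albert emotion", "NLU is fun")` would then be expected to return the predictions for all four components, provided the models can be downloaded in your environment.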
Embeddings Tutorials Overview
| Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
|---|---|---|---|
| Albert Word Embeddings | albert, sentiment pos albert emotion | | Albert-Paper, Albert on Github, Albert on TensorFlow, T-SNE, T-SNE-Albert, Albert_Embedding |
| Bert Word Embeddings | bert, pos sentiment emotion bert | | Bert-Paper, Bert Github, T-SNE, T-SNE-Bert, Bert_Embedding |
| BIOBERT Word Embeddings | biobert, sentiment pos biobert emotion | | BioBert-Paper, Bert Github, BERT: Deep Bidirectional Transformers, Bert Github, T-SNE, T-SNE-Biobert, Biobert_Embedding |
| COVIDBERT Word Embeddings | covidbert, sentiment covidbert pos | | CovidBert-Paper, Bert Github, T-SNE, T-SNE-CovidBert, Covidbert_Embedding |
| ELECTRA Word Embeddings | electra, sentiment pos en.embed.electra emotion | | Electra-Paper, T-SNE, T-SNE-Electra, Electra_Embedding |
| ELMO Word Embeddings | elmo, sentiment pos elmo emotion | | ELMO-Paper, Elmo-TensorFlow, T-SNE, T-SNE-Elmo, Elmo-Embedding |
| GLOVE Word Embeddings | glove, sentiment pos glove emotion | | Glove-Paper, T-SNE, T-SNE-Glove, Glove_Embedding |
| XLNET Word Embeddings | xlnet, sentiment pos xlnet emotion | | XLNet-Paper, Bert Github, T-SNE, T-SNE-XLNet, Xlnet_Embedding |
| Multiple Word-Embeddings and Part of Speech in 1 Line of code | bert electra elmo glove xlnet albert pos | | Bert-Paper, Albert-Paper, ELMO-Paper, Electra-Paper, XLNet-Paper, Glove-Paper |
Text Preprocessing and Cleaning
| Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
|---|---|---|---|
| Normalizing | norm | | - |
| Detect sentences | sentence_detector.deep, sentence_detector.pragmatic, xx.sentence_detector | | Sentence Detector |
| Spellchecking | n.a. | n.a. | - |
| Stemming | en.stem, de.stem | | - |
| Stopwords removal | stopwords | | Stopwords |
| Tokenization | tokenize | | - |
| Normalization of Documents | norm_document | | - |
Sequence to Sequence
| Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
|---|---|---|---|
| Open and Closed book question answering with Google’s T5 | en.t5, answer_question | | T5-Paper, T5-Model |
| Overview of every task available with T5 | en.t5.base | | T5-Paper, T5-Model |
| Translate between more than 200 Languages in 1 line of code with Marian Models | tr.translate_to.fr, en.translate_to.fr, fr.translate_to.he, en.translate_to.de | | Marian-Papers, Translation-Pipeline (En to Fr), Translation-Pipeline (En to Ger) |
| Text Generation with Google’s T5 | en.text_generator.biomedical_biogpt_base, en.text_generator.generic_flan_base, en.text_generator.generic_jsl_base, en.text_generator.generic_flan_t5_large, en.text_generator.biogpt_chat_jsl, en.text_generator.biogpt_chat_jsl_conversational, en.text_generator.biogpt_chat_jsl_conditions | | T5-Paper, T5-Model |
| Bart Transformer | en.seq2seq.distilbart_xsum_12_6, en.seq2seq.bart_large_cnn, en.seq2seq.distilbart_cnn_6_6, en.seq2seq.distilbart_cnn_12_6, en.seq2seq.distilbart_xsum_6_6 | | Bart-Paper |
Sentence Embeddings
| Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
|---|---|---|---|
| BERT Sentence Embeddings | embed_sentence.bert, pos sentiment embed_sentence.bert | | Bert-Paper, Bert Github, Bert-Sentence_Embedding |
| ELECTRA Sentence Embeddings | embed_sentence.electra, pos sentiment embed_sentence.electra | | Electra Paper, Sentence-Electra-Embedding |
| USE Sentence Embeddings | use, pos sentiment use emotion | | Universal Sentence Encoder, USE-TensorFlow, Sentence-USE-Embedding |
| Sentence similarity using BERT embeddings | embed_sentence.bert, use en.embed_sentence.electra embed_sentence.bert | | Bert-Paper, Bert Github, Bert-Sentence_Embedding |
Part of Speech
| Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
|---|---|---|---|
| Part of Speech tagging | pos | | Part of Speech |
Named Entity Recognition (NER)
| Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
|---|---|---|---|
| NER Aspect Airline ATIS | en.ner.aspect.airline | | NER Airline Model, Atis intent Dataset |
| NLU-NER_CONLL_2003_5class_example | ner | | NER-Piple |
| Named-entity recognition with Deep Learning ONTO NOTES | ner.onto | | NER_Onto |
| Aspect based NER-Sentiment-Restaurants | en.ner.aspect_sentiment | | - |
Multilingual Tasks
| Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
|---|---|---|---|
| Detect Named Entities (NER), Part of Speech Tags (POS) and Tokenize in Chinese | zh.segment_words, zh.pos, zh.ner, zh.translate_to.en | | Translation-Pipeline (Zh to En) |
| Detect Named Entities (NER), Part of Speech Tags (POS) and Tokenize in Japanese | ja.segment_words, ja.pos, ja.ner, ja.translate_to.en | | Translation-Pipeline (Ja to En) |
| Detect Named Entities (NER), Part of Speech Tags (POS) and Tokenize in Korean | ko.segment_words, ko.pos, ko.ner.kmou.glove_840B_300d, ko.translate_to.en | | - |
Matchers
| Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
|---|---|---|---|
| Date Matching | match.datetime | | - |
Dependency Parsing
| Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
|---|---|---|---|
| Typed Dependency Parsing | dep | | Dependency Parsing |
| Untyped Dependency Parsing | dep.untyped | | - |
Classifiers
| Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
|---|---|---|---|
| E2E Classification | e2e | | e2e-Model |
| Language Classification | lang | | - |
| Cyberbullying Classification | classify.cyberbullying | | Cyberbullying-Classifier |
| Sentiment Classification for Twitter | emotion | | Emotion detection |
| Fake News Classification | en.classify.fakenews | | Fakenews-Classifier |
| Intent Classification | en.classify.intent.airline | | Airline-Intention classifier, Atis-Dataset |
| Question classification based on the TREC dataset | en.classify.questions | | Question-Classifier |
| Sarcasm Classification | en.classify.sarcasm | | Sarcasm-Classifier |
| Sentiment Classification for Twitter | en.sentiment.twitter | | Sentiment_Twitter-Classifier |
| Sentiment Classification for Movies | en.sentiment.imdb | | Sentiment_imdb-Classifier |
| Spam Classification | en.classify.spam | | Spam-Classifier |
| Toxic text classification | en.classify.toxic | | Toxic-Classifier |
| Unsupervised keyword extraction using the YAKE algorithm | yake | | - |
| Notebook for Classification of Banking Queries | en.classify.distilbert_sequence.banking77 | | DistilBERT Sequence Classification - Banking77 |
| Notebook for Classification of Intent in Texts | en.ner.snips | | Identify intent in general text - SNIPS dataset |
| Notebook for classification of Similar Questions | en.classify.questionpair | | Question Pair Classifier |
| Notebook for Classification of Questions vs Statements | en.classify.question_vs_statement | | Bert for Sequence Classification (Question vs Statement) |
| Notebook for Classification of News into 4 classes | en.classify.distilbert_sequence.ag_news | | DistilBERT Sequence Classification Base - AG News (distilbert_base_sequence_classifier_ag_news) |
| ConvNext Image Classification | en.classify_image.convnext.tiny | | A ConvNet for the 2020s |
Chunkers
| Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
|---|---|---|---|
| Grammatical Chunk Matching | match.chunks | | - |
| Getting n-Grams | ngram | | - |
Healthcare
| Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
|---|---|---|---|
| Assertion | en.med_ner.clinical en.assert, en.med_ner.clinical.biobert en.assert.biobert, … | | Healthcare-NER, NER_Clinical-Classifier, Toxic-Classifier |
| De-Identification Model overview | med_ner.jsl.wip.clinical en.de_identify, med_ner.jsl.wip.clinical en.de_identify.clinical, … | | NER-Clinical |
| Drug Normalization | norm_drugs | | - |
| Entity Resolution | med_ner.jsl.wip.clinical en.resolve_chunk.cpt_clinical, med_ner.jsl.wip.clinical en.resolve.icd10cm, … | | NER-Clinical, Entity-Resolver clinical |
| Medical Named Entity Recognition | en.med_ner.ade.clinical, en.med_ner.ade.clinical_bert, en.med_ner.anatomy, en.med_ner.anatomy.biobert, … | | - |
| Relation Extraction | en.med_ner.jsl.wip.clinical.greedy en.relation, en.med_ner.jsl.wip.clinical.greedy en.relation.bodypart.problem, … | | - |
Visualization
| Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
|---|---|---|---|
| Visualization of NLP-Models with Spark-NLP and NLU | ner, dep.typed, med_ner.jsl.wip.clinical resolve_chunk.rxnorm.in, med_ner.jsl.wip.clinical resolve.icd10cm | | NER-Piple, Dependency Parsing, NER-Clinical, Entity-Resolver (Chunks) clinical |
Example Notebooks on Kaggle: Examination of Real-Life Problems
| Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
|---|---|---|---|
| NLU Covid-19 Emotion Showcase | emotion | | Emotion detection |
| NLU Covid-19 Sentiment Showcase | sentiment | | Sentiment classification |
| NLU Airline Emotion Demo | emotion | | Emotion detection |
| NLU Airline Sentiment Demo | sentiment | | Sentiment classification |
Release Notebooks
| Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
|---|---|---|---|
| Bengali NER Hindi Embeddings for 30 Models | bn.ner, bn.lemma, ja.lemma, am.lemma, bh.lemma, en.ner.onto.bert.small_l2_128, … | | Bengali-NER, Bengali-Lemmatizer, Japanese-Lemmatizer, Amharic-Lemmatizer |
| Entity Resolution | med_ner.jsl.wip.clinical en.resolve.umls, med_ner.jsl.wip.clinical en.resolve.loinc, med_ner.jsl.wip.clinical en.resolve.loinc.biobert | | - |
Crash-Course
| Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
|---|---|---|---|
| NLU 20 Minutes Crashcourse - the fast Data Science route | spell, sentiment, pos, ner, yake, en.t5, emotion, answer_question, en.t5.base, … | | T5-Model, Part of Speech, NER-Piple, Emotion detection, Spellchecker, Sentiment classification |
Natural Language Processing (NLP)
| Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
|---|---|---|---|
| Chapter 0: Intro: 1-liners | sentiment, pos, ner, bert, elmo, embed_sentence.bert | | Part of Speech, NER-Piple, Sentiment classification, Elmo-Embedding, Bert-Sentence_Embedding |
| Chapter 1: NLU base-features with some classifiers on testdata | emotion, yake, stem | | Emotion detection |
| Chapter 2: Translation between 300+ languages with Marian | tr.translate_to.en, en.translate_to.fr, en.translate_to.he | | Translation-Pipeline (En to Fr), Translation (En to He) |
| Chapter 3: Answer questions and summarize Texts with T5 | answer_question, en.t5, en.t5.base | | T5-Model |
| Chapter 4: Overview of T5-Tasks | en.t5.base | | T5-Model |
NLU-Crashcourse Graph AI
| Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
|---|---|---|---|
| Graph NLU 20 Minutes Crashcourse - State of the Art Text Mining for Graphs | spell, sentiment, pos, ner, yake, emotion, med_ner.jsl.wip.clinical, … | | Part of Speech, NER-Piple, Emotion detection, Spellchecker, Sentiment classification |
Healthcare-Training
| Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
|---|---|---|---|
| Healthcare | med_ner.human_phenotype.gene_biobert, med_ner.ade_biobert, med_ner.anatomy, med_ner.bacterial_species, … | | - |
Multilingual-Training
| Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
|---|---|---|---|
| Part 0: Intro: 1-liners | spell, sentiment, pos, ner, bert, elmo, embed_sentence.bert | | Bert-Paper, Bert Github, T-SNE, T-SNE-Bert, Part of Speech, NER-Piple, Spellchecker, Sentiment classification, Elmo-Embedding, Bert-Sentence_Embedding |
| Part 1: Quick Start, base-features with some classifiers on Testdata | yake, stem, ner, emotion | | NER-Piple, Emotion detection |
| Part 2: Translate between 200+ Languages in 1 line of code with Marian-Models | en.translate_to.de, en.translate_to.fr, en.translate_to.he | | Translation-Pipeline (En to Fr), Translation-Pipeline (En to Ger), Translation (En to He) |
| Part 3: More Multilingual NLP-translations for Asian Languages with Marian | en.translate_to.hi, en.translate_to.ru, en.translate_to.zh | | Translation (En to Hi), Translation (En to Ru), Translation (En to Zh) |
| Part 4: Unsupervised Chinese Keyword Extraction, NER and Translation from Chinese news | zh.translate_to.en, zh.segment_words, yake, zh.lemma, zh.ner | | Translation-Pipeline (Zh to En), Zh-Lemmatizer |
| Part 5: Multilingual sentiment classifier training for 100+ languages | train.sentiment, xx.embed_sentence.labse train.sentiment | n.a. | Sentence_Embedding.Labse |
| Part 6: Question-answering and Text-summarization with T5-Model | answer_question, en.t5, en.t5.base | | T5-Paper |
| Part 7: Overview of all tasks available with T5 | en.t5.base | | T5-Paper |
| Part 8: Overview of some of the Multilingual models with State Of the Art accuracy (1-liner) | bn.lemma, ja.lemma, am.lemma, bh.lemma, zh.segment_words, … | | Bengali-Lemmatizer, Japanese-Lemmatizer, Amharic-Lemmatizer |
Multilingual-Examples
| Tutorial Description | 1-liners used | Open In Colab | Dataset and Paper References |
|---|---|---|---|
| Overview of some Multilingual models available with State Of the Art accuracy (1-liner) | bn.ner.cc_300d, ja.ner, zh.ner, th.ner.lst20.glove_840B_300D, ar.ner | | Bengali-NER |
| NLU 20 Minutes Crashcourse - the fast Data Science route |