whether or not to return all distance values in the metadata.
number of results to return in the metadata after sorting by the last distance calculated
what function to use to calculate confidence: INVERSE or SOFTMAX
what distance function to use for KNN: 'EUCLIDEAN' or 'COSINE'
distance weights to apply before pooling: [WMD, TFIDF, Jaccard, SorensenDice, JaroWinkler, Levenshtein]
whether or not to use Jaccard token distance.
whether or not to use Jaro-Winkler character distance.
whether or not to use Levenshtein character distance.
whether or not to use Sorensen-Dice token distance.
whether or not to use TFIDF token distance.
whether or not to use WMD token distance.
penalty for extra words in the knowledge base match during WMD calculation
column name for the value we are trying to resolve
whether or not to return an empty annotation on unmatched chunks
number of neighbours to consider in the KNN query to calculate WMD
column name for the original, normalized description
pooling strategy to aggregate distances: AVERAGE or SUM
threshold value for the aggregated distance
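
To make the interplay of the distance weights, the pooling strategy, the threshold and the confidence function more concrete, here is a minimal Python sketch. It is not the library's implementation. The function names (pool_distances, confidences, resolve) are hypothetical, only the Jaccard and Sorensen-Dice token distances are computed, and the exact formulas are assumptions: weights multiply each enabled distance, AVERAGE divides by the number of enabled distances, candidates whose pooled distance exceeds the threshold are dropped, INVERSE is a normalized 1/distance, and SOFTMAX is a softmax over negated distances.

```python
import math

# Order of the six distances, as listed in the weights description above.
DISTANCE_ORDER = ["WMD", "TFIDF", "Jaccard", "SorensenDice", "JaroWinkler", "Levenshtein"]


def jaccard_distance(a, b):
    """1 - |A & B| / |A | B| over lowercased whitespace tokens."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return 1.0 - len(sa & sb) / len(union) if union else 0.0


def sorensen_dice_distance(a, b):
    """1 - 2|A & B| / (|A| + |B|) over lowercased whitespace tokens."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    total = len(sa) + len(sb)
    return 1.0 - 2.0 * len(sa & sb) / total if total else 0.0


def pool_distances(distances, weights, enabled, strategy="AVERAGE"):
    """Weight each enabled distance, then aggregate (assumed semantics)."""
    weighted = [
        distances[name] * weight
        for name, weight in zip(DISTANCE_ORDER, weights)
        if name in enabled
    ]
    if not weighted:
        raise ValueError("at least one distance must be enabled")
    if strategy == "SUM":
        return sum(weighted)
    if strategy == "AVERAGE":
        return sum(weighted) / len(weighted)
    raise ValueError(f"unknown pooling strategy: {strategy}")


def confidences(pooled, function="INVERSE"):
    """Map pooled distances to scores that sum to 1 (assumed formulas)."""
    if function == "INVERSE":
        raw = [1.0 / (d + 1e-9) for d in pooled]  # epsilon avoids division by zero
    elif function == "SOFTMAX":
        raw = [math.exp(-d) for d in pooled]  # smaller distance, larger score
    else:
        raise ValueError(f"unknown confidence function: {function}")
    norm = sum(raw)
    return [r / norm for r in raw]


def resolve(chunk, descriptions, weights, enabled, strategy, threshold, function):
    """Rank knowledge-base descriptions for a chunk, closest first."""
    kept = []
    for description in descriptions:
        # Only two of the six distances are computed in this sketch.
        distances = {
            "Jaccard": jaccard_distance(chunk, description),
            "SorensenDice": sorensen_dice_distance(chunk, description),
        }
        pooled = pool_distances(distances, weights, enabled, strategy)
        if pooled <= threshold:  # assumption: the threshold is an upper bound
            kept.append((pooled, description))
    kept.sort()
    if not kept:
        return []
    scores = confidences([dist for dist, _ in kept], function)
    return [(desc, dist, score) for (dist, desc), score in zip(kept, scores)]


ranked = resolve(
    "chest pain",
    ["acute chest pain", "chest pain", "abdominal pain"],
    weights=[1.0] * 6,
    enabled={"Jaccard", "SorensenDice"},
    strategy="AVERAGE",
    threshold=0.5,
    function="SOFTMAX",
)
for description, distance, confidence in ranked:
    print(f"{description}: distance={distance:.3f} confidence={confidence:.3f}")
```

In this sketch, raising the weight of one distance makes it dominate the pooled value, and switching from AVERAGE to SUM increases the pooled value whenever more than one distance is enabled, so the threshold would typically need to be retuned together with those two settings.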
Returns the ChunkEntityResolverModel Transformer, which can be used to transform input datasets
a Dataset containing ChunkTokens, ChunkEmbeddings, ClassifierLabel, ResolverLabel, [ResolverNormalized]
a trained ChunkEntityResolverModel