(Optional) Supply a prior

A prior is a probability distribution over completions indicating how likely you think each completion is regardless of the prompt. It nudges language model probabilities towards the domain-specific probabilities which are needed to make optimal predictions.
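To make the "nudge" concrete, here is a minimal sketch of one common way to combine a prior with per-completion probabilities: multiply them elementwise and renormalize. The function name `apply_prior` and the example likelihoods are hypothetical, and this is an assumption about the general technique rather than a description of what the library does internally.

```python
def apply_prior(likelihoods, prior):
    """Multiply each completion's likelihood by its prior, then renormalize.

    Hypothetical helper for illustration only; it is not part of any library.
    """
    if len(likelihoods) != len(prior):
        raise ValueError("likelihoods and prior must have the same length.")
    unnormalized = [lik * p for lik, p in zip(likelihoods, prior)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("At least one completion needs nonzero probability.")
    return [value / total for value in unnormalized]


# Made-up language model probabilities for 3 completions given some prompt
likelihoods = [0.2, 0.2, 0.6]

# Prior over the same 3 completions
prior = [3 / 9, 5 / 9, 1 / 9]

posterior = apply_prior(likelihoods, prior)
```

In this made-up case the model alone favors the third completion, but it is rare under the prior, so the second completion ends up with the highest probability after the prior is applied.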

If you have a handful of examples whose correct class/choice is known, then you may simply compute the fraction of examples belonging to each class, e.g.,

# class_labels[i] is the index of the class which example i belongs to
# There are 3 possible classes, indexed as 0, 1, and 2
class_labels = [0, 0, 0, 1, 1, 1, 1, 1, 2]

# prior[k] is the observed fraction of examples which belong to class k
prior = [3/9, 5/9, 1/9]

There are better but slightly more complicated ways to estimate a prior, e.g., additive smoothing. A prior may also be guessed based on domain knowledge.
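As an illustration of additive smoothing, the sketch below adds a pseudocount `alpha` to every class count before normalizing, so that no class gets a prior of exactly zero even if it never appears in the labeled examples. The helper name `smoothed_prior` and the default `alpha=1.0` are assumptions made for this example; they are not part of the library.

```python
from collections import Counter


def smoothed_prior(class_labels, num_classes, alpha=1.0):
    """Estimate a prior with additive smoothing.

    Hypothetical helper for illustration only. Setting alpha=0 recovers the
    plain observed fractions.
    """
    counts = Counter(class_labels)
    total = len(class_labels) + alpha * num_classes
    return [(counts[k] + alpha) / total for k in range(num_classes)]


# class_labels[i] is the index of the class which example i belongs to
class_labels = [0, 0, 0, 1, 1, 1, 1, 1, 2]

# With alpha=1, each of the 3 classes gets one extra pseudo-example
prior = smoothed_prior(class_labels, num_classes=3)
```

With these labels and `alpha=1.0`, the estimate is `[4/12, 6/12, 2/12]`, which is slightly closer to uniform than the raw fractions `[3/9, 5/9, 1/9]`.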

If you have absolutely no idea what a reasonable prior could be, then leave out the prior keyword argument for the predict and predict_proba functions.

Examples

See the Banking 77 demo.

For a minimal example of using a prior, see the Example section for this function:

cappr.huggingface.classify.predict()