Why predict probabilities?
Every module has a predict_proba function and a predict function. predict_proba
returns the probability of each completion given the prompt, i.e., it returns an array
of floats from 0 to 1 indicating confidence. predict returns the most likely completion,
i.e., it returns a string. You might be wondering why you’d ever use predict_proba when
predict seemingly gives you what you need: a single choice.
In high-stakes applications, probability scores can be thresholded to decide whether to bypass manual review. For example, if a model is highly confident that a social media post contains hate speech, then your system can bypass manual review of that post. If it isn’t confident enough, then manual review is needed.
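The thresholding rule above can be sketched in a few lines. This is only an illustration: the function name `route_post`, the action labels, and the 0.95 threshold are assumptions made for the example, not part of any library. In practice the threshold would be tuned on held-out data.

```python
def route_post(prob_hate_speech: float, threshold: float = 0.95) -> str:
    """Decide whether a post can skip manual review.

    prob_hate_speech: the predicted probability that the post contains
    hate speech (hypothetical input, e.g., one entry of a predict_proba output).
    threshold: assumed confidence cutoff; not a recommended value.
    """
    if not 0.0 <= prob_hate_speech <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if prob_hate_speech >= threshold:
        # Confident enough: bypass manual review.
        return "auto_flag"
    # Not confident enough: send to a human.
    return "manual_review"


# Made-up probabilities for three posts.
example_probs = [0.99, 0.60, 0.03]
decisions = [route_post(p) for p in example_probs]
```

Only the first post clears the cutoff. The other two go to manual review, because the rule here is one-sided: it only bypasses review when the model is confident that hate speech is present.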
More generally, probabilities are useful when making cost-sensitive decisions, because they let you weigh how likely each outcome is against the cost of getting it wrong.[1]
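One common way to make a cost-sensitive decision is to pick the action with the lowest expected cost. The sketch below assumes a made-up cost table and made-up probabilities; the names `cheapest_action`, `remove`, and `keep` are hypothetical and chosen only for illustration.

```python
def cheapest_action(probs, costs):
    """Return the action with the lowest expected cost, plus all expected costs.

    probs: dict mapping each class to its predicted probability.
    costs: dict mapping each action to a dict of {class: cost of taking
           that action when the class is the true one}.
    """
    expected = {
        action: sum(probs[label] * cost for label, cost in per_class.items())
        for action, per_class in costs.items()
    }
    best = min(expected, key=expected.get)
    return best, expected


# Made-up numbers: leaving hate speech up is assumed to be 10x as costly
# as wrongly removing a benign post.
class_probs = {"hate_speech": 0.2, "benign": 0.8}
action_costs = {
    "remove": {"hate_speech": 0.0, "benign": 5.0},
    "keep": {"hate_speech": 50.0, "benign": 0.0},
}
action, expected_costs = cheapest_action(class_probs, action_costs)
```

With these numbers the expected cost of removing is 0.8 × 5 = 4, and the expected cost of keeping is 0.2 × 50 = 10, so the cheaper action is to remove the post even though "benign" is the most likely class. A single predicted label would not expose that trade-off.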
Another application where predicting probabilities turns out to be useful is in “multilabel” tasks. In these tasks, a single piece of text can be labeled or tagged with multiple categories. For example, a tweet can express multiple emotions at the same time. A simple way to have an LLM tag a tweet’s negative emotions is to predict the probability of each one, and then threshold each probability. All negative emotions are processed in parallel to save time, which is also how I power through most days.
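As a minimal sketch of the multilabel approach described above, the snippet below thresholds one probability per emotion. It assumes each emotion was scored independently (for example, as a separate yes/no question), so the probabilities need not sum to 1. The function name `tag_emotions`, the emotion names, the probabilities, and the 0.5 threshold are all made up for illustration.

```python
def tag_emotions(prob_by_emotion, threshold=0.5):
    """Return every emotion whose predicted probability meets the threshold.

    prob_by_emotion: dict mapping an emotion to the predicted probability
    that the text expresses it (hypothetical values).
    threshold: assumed cutoff, applied to each emotion separately.
    """
    return [
        emotion
        for emotion, prob in prob_by_emotion.items()
        if prob >= threshold
    ]


# Made-up probabilities for a single tweet.
tweet_probs = {"anger": 0.81, "fear": 0.12, "sadness": 0.64}
tags = tag_emotions(tweet_probs)
```

Here the tweet is tagged with both anger and sadness. A different threshold per emotion is also possible if some tags are more costly to get wrong than others.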
Examples
See this demo for calibration curves. Calibration curves visualize the accuracy of predicted probabilities.
See this demo for an example of solving a multilabel classification task.