Why predict probabilities?

Every module has a predict_proba function and a predict function. predict_proba returns the probability of each completion given the prompt, i.e., it returns an array of floats from 0 to 1, one per completion, indicating the model's confidence in that completion. predict returns the most likely completion, i.e., it returns a string. You might be wondering why you’d ever use predict_proba when predict seemingly gives you what you need: a single choice.
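
To make the relationship between the two concrete, here is a minimal sketch in plain Python. It does not call the library; the completions and probabilities are made-up stand-ins for what a predict_proba call might return, and most_likely is a hypothetical helper.

```python
# Hypothetical completions and probabilities (illustrative values only),
# standing in for the output of a predict_proba call.
completions = ["positive", "neutral", "negative"]
pred_probs = [0.72, 0.20, 0.08]


def most_likely(completions, probs):
    """Return the completion with the highest probability."""
    best_index = max(range(len(probs)), key=lambda i: probs[i])
    return completions[best_index]


# The single choice keeps the winner but throws away how confident it was.
prediction = most_likely(completions, pred_probs)
print(prediction)
```

The string alone cannot tell you whether the winner had a probability of 0.72 or 0.34, which is the information the rest of this page relies on.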

In high-stakes applications, probability scores can be thresholded to determine whether or not to bypass manual systems. For example, if a model is highly confident that a social media post contains hate speech, then your system can bypass manual review of that post. If it isn’t confident enough, then manual review is needed.
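
The routing rule described above can be sketched as follows. The function name route_post, the return labels, and the 0.95 threshold are all assumptions for illustration; in practice the threshold would be chosen from validation data.

```python
def route_post(prob_hate_speech, threshold=0.95):
    """Decide whether a post can skip manual review.

    prob_hate_speech: predicted probability that the post contains hate speech.
    threshold: hypothetical confidence level required to bypass review.
    """
    if prob_hate_speech >= threshold:
        # Confident enough: act automatically, no human in the loop.
        return "bypass_manual_review"
    # Not confident enough: send the post to a human reviewer.
    return "manual_review"


print(route_post(0.99))
print(route_post(0.60))
```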

At a higher level, probabilities are useful when making cost-sensitive decisions.[1]
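
One way to read "cost-sensitive" is: pick the action with the lowest expected cost under the predicted probability. The sketch below assumes a made-up cost table for a moderation setting; the action names, labels, and cost values are hypothetical.

```python
# Hypothetical costs, indexed as COSTS[action][true_label].
# Keeping a hateful post is assumed to be 4x as costly as removing a benign one.
COSTS = {
    "remove": {"hate": 0.0, "benign": 5.0},
    "keep": {"hate": 20.0, "benign": 0.0},
}


def expected_cost(action, prob_hate):
    """Average cost of an action, weighted by the predicted probability."""
    cost = COSTS[action]
    return prob_hate * cost["hate"] + (1 - prob_hate) * cost["benign"]


def cheapest_action(prob_hate):
    """Return the action with the lowest expected cost."""
    return min(COSTS, key=lambda action: expected_cost(action, prob_hate))


print(cheapest_action(0.3))
print(cheapest_action(0.1))
```

With these costs the break-even probability is 0.2, so a post can be worth removing even when "benign" is the most likely label. A single predicted string can't express that; a probability can.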

Predicting probabilities also turns out to be useful in “multilabel” tasks. In these tasks, a single piece of text can be labeled or tagged with multiple categories. For example, a tweet can express multiple emotions at the same time. A simple way to have an LLM tag a tweet’s negative emotions is to predict the probability of each one, and then threshold each probability. All negative emotions are processed in parallel to save time, which is also how I power through most days.
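
The thresholding step can be sketched like this. The emotion names, probabilities, and the 0.5 threshold are made up for illustration, and each probability is assumed to have been predicted separately for its own emotion, so the values don't need to sum to 1.

```python
# Hypothetical per-emotion probabilities for one tweet. Each value is assumed
# to be an independent "is this emotion present?" probability.
emotion_probs = {"anger": 0.81, "sadness": 0.64, "fear": 0.12, "disgust": 0.47}


def tag_emotions(probs, threshold=0.5):
    """Return every emotion whose probability meets the threshold."""
    return [emotion for emotion, prob in probs.items() if prob >= threshold]


tags = tag_emotions(emotion_probs)
print(tags)
```

Lowering the threshold tags more emotions at the cost of more false positives, and nothing stops you from using a different threshold per emotion.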

Examples

See this demo for calibration curves. Calibration curves visualize the accuracy of predicted probabilities.
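
For intuition, the points on a calibration curve can be computed by hand: bin the predicted probabilities, then compare each bin's average prediction to the fraction of examples in it that were actually positive. This is a minimal sketch with equal-width bins and made-up data; it is not the code from the demo.

```python
def calibration_bins(probs, labels, n_bins=5):
    """Return (mean predicted probability, observed positive rate) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for prob, label in zip(probs, labels):
        # Equal-width bins over [0, 1]; a probability of exactly 1 goes in the last bin.
        index = min(int(prob * n_bins), n_bins - 1)
        bins[index].append((prob, label))
    curve = []
    for members in bins:
        if not members:
            continue
        mean_prob = sum(prob for prob, _ in members) / len(members)
        positive_rate = sum(label for _, label in members) / len(members)
        curve.append((mean_prob, positive_rate))
    return curve


# Made-up predictions and true binary labels (1 = positive).
example_probs = [0.1, 0.1, 0.9, 0.9]
example_labels = [0, 0, 1, 1]
curve = calibration_bins(example_probs, example_labels, n_bins=2)
print(curve)
```

For a well-calibrated model, the two numbers in each pair are close to each other, so the plotted points hug the diagonal.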

See this demo for an example of solving a multilabel classification task.

References