cappr.huggingface.classify
Perform prompt-completion classification using a model which can be loaded via
transformers.AutoModelForCausalLM.from_pretrained or auto_gptq.AutoGPTQForCausalLM.from_quantized.
You probably just want the predict() or predict_examples() functions :-)
In the implementation, attention block keys and values for prompts are automatically cached and shared across completions.
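To give a rough sense of why sharing the prompt's cached keys and values helps, the sketch below counts how many tokens would need to be processed with and without sharing. It is only an accounting illustration: the helper `tokens_processed` is hypothetical, and whitespace splitting stands in for a real tokenizer.

```python
def tokens_processed(prompt, completions, share_prompt):
    """Count tokens a model would process (whitespace split is a stand-in tokenizer)."""
    n_prompt = len(prompt.split())
    n_completions = [len(completion.split()) for completion in completions]
    if share_prompt:
        # The prompt is processed once; each completion reuses its cached state.
        return n_prompt + sum(n_completions)
    # The prompt is re-processed for every prompt-completion pair.
    return sum(n_prompt + n for n in n_completions)


prompt = "The tacos are cooking"
completions = ("on the stove", "in the freezer", "in the fridge")

without_sharing = tokens_processed(prompt, completions, share_prompt=False)
with_sharing = tokens_processed(prompt, completions, share_prompt=True)
print(without_sharing, with_sharing)
# 21 13
```

The saving grows with the length of the prompt and the number of completions.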
- cappr.huggingface.classify.cache(model_and_tokenizer: tuple[ModelForCausalLM, PreTrainedTokenizerBase], prefixes: str | Sequence[str], clear_cache_on_exit: bool = True, logits_all: bool = True)
In this context, every prompt processed by model_and_tokenizer starts with a fixed prefix. As a result, computations in this context are faster.
- Parameters:
model_and_tokenizer (tuple[ModelForCausalLM, PreTrainedTokenizerBase]) – a model and its tokenizer
prefixes (str | Sequence[str]) – prefix(es) for all strings that will be processed in this context, e.g., a string containing shared prompt instructions, or a string containing instructions and exemplars for few-shot prompting. prefixes and future strings are assumed to be separated by a whitespace.
clear_cache_on_exit (bool, optional) – whether or not to clear the cache and render the returned model and tokenizer unusable when we exit the context. This is important because it saves memory, and makes code more explicit about the model’s state. By default, True
logits_all (bool, optional) – whether or not to have the cached model include logits for all tokens (including the past). By default, past token logits are included
Example
Usage with predict_proba():

import numpy as np
from transformers import AutoModelForCausalLM, AutoTokenizer
from cappr.huggingface.classify import cache, predict_proba

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model_and_tokenizer = (model, tokenizer)

# Create data
prompt_prefix = '''Instructions: complete the sequence.
Here are examples:
A, B, C => D
1, 2, 3 => 4
Complete this sequence:'''

prompts = ["a, b, c =>", "X, Y =>"]
completions = ["d", "Z", "Hi"]

# Compute
with cache(
    model_and_tokenizer, prompt_prefix
) as cached_model_and_tokenizer:
    # prompt_prefix and each prompt are separated by a whitespace
    pred_probs = predict_proba(
        prompts, completions, cached_model_and_tokenizer
    )

# The above computation is equivalent to this one:
prompts_full = [prompt_prefix + " " + prompt for prompt in prompts]
pred_probs_wo_cache = predict_proba(
    prompts_full, completions, model_and_tokenizer
)
assert np.allclose(pred_probs, pred_probs_wo_cache, atol=1e-5)

print(pred_probs.round(1))
# [[1. 0. 0.]
#  [0. 1. 0.]]

Here’s a more complicated example, which might help in explaining usage:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from cappr.huggingface.classify import cache
from cappr.huggingface._utils import (
    does_tokenizer_need_prepended_space,
    logits_texts,
)

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model_and_tokenizer = (model, tokenizer)

# Assume that all strings will be separated by a whitespace
delim = " "
if not does_tokenizer_need_prepended_space(tokenizer):
    # for SentencePiece tokenizers like Llama's
    delim = ""

# Returns next-token logits for each token in an inputted text.
logits = lambda *args, **kwargs: logits_texts(*args, **kwargs)[0]

with cache(model_and_tokenizer, "a") as cached_a:
    with cache(cached_a, delim + "b c") as cached_a_b_c:
        with cache(cached_a_b_c, delim + "d") as cached_a_b_c_d:
            logits1 = logits([delim + "e f"], cached_a_b_c_d)
            logits2 = logits([delim + "x"], cached_a_b_c_d)
        logits3 = logits([delim + "1 2 3"], cached_a_b_c)
    logits4 = logits([delim + "b c d"], cached_a)

logits_correct = lambda texts, **kwargs: logits(
    texts, model_and_tokenizer, drop_bos_token=False
)

atol = 1e-4
assert torch.allclose(logits1, logits_correct(["a b c d e f"]), atol=atol)
assert torch.allclose(logits2, logits_correct(["a b c d x"]), atol=atol)
assert torch.allclose(logits3, logits_correct(["a b c 1 2 3"]), atol=atol)
assert torch.allclose(logits4, logits_correct(["a b c d"]), atol=atol)

- cappr.huggingface.classify.cache_model(model_and_tokenizer: tuple[ModelForCausalLM, PreTrainedTokenizerBase], prefixes: str | Sequence[str], logits_all: bool = True) → tuple[ModelForCausalLM, PreTrainedTokenizerBase]
Caches the model so that every future computation with it starts with prefixes. As a result, computations with this model are faster.
Use this function instead of the context manager cache() to keep the cache for future computations, including those outside of a context.
- Parameters:
model_and_tokenizer (tuple[ModelForCausalLM, PreTrainedTokenizerBase]) – a model and its tokenizer
prefixes (str | Sequence[str]) – prefix(es) for all future strings that will be processed, e.g., a string containing shared prompt instructions, or a string containing instructions and exemplars for few-shot prompting. prefixes and future strings are assumed to be separated by a whitespace.
logits_all (bool, optional) – whether or not to have the cached model include logits for all tokens (including the past). By default, past token logits are included
- Returns:
cached model and the (unmodified) tokenizer
- Return type:
tuple[ModelForCausalLM, PreTrainedTokenizerBase]
Example
Usage with predict_proba():

import numpy as np
from transformers import AutoModelForCausalLM, AutoTokenizer
from cappr.huggingface.classify import cache_model, predict_proba

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model_and_tokenizer = (model, tokenizer)

# Create data
prompt_prefix = '''Instructions: complete the sequence.
Here are examples:
A, B, C => D
1, 2, 3 => 4
Complete this sequence:'''

prompts = ["a, b, c =>", "X, Y =>"]
completions = ["d", "Z", "Hi"]

# Cache
cached_model_and_tokenizer = cache_model(
    model_and_tokenizer, prompt_prefix
)

# Compute
pred_probs = predict_proba(
    prompts, completions, cached_model_and_tokenizer
)

# The above computation is equivalent to this one:
prompts_full = [prompt_prefix + " " + prompt for prompt in prompts]
pred_probs_wo_cache = predict_proba(
    prompts_full, completions, model_and_tokenizer
)
assert np.allclose(pred_probs, pred_probs_wo_cache, atol=1e-5)

print(pred_probs.round(1))
# [[1. 0. 0.]
#  [0. 1. 0.]]

- cappr.huggingface.classify.log_probs_conditional(prompts: str | Sequence[str], completions: Sequence[str], model_and_tokenizer: tuple[ModelForCausalLM, PreTrainedTokenizerBase], end_of_prompt: Literal[' ', ''] = ' ', show_progress_bar: bool | None = None, batch_size: int = 2, batch_size_completions: int | None = None, **kwargs) → list[list[float]] | list[list[list[float]]]
Log-probabilities of each completion token conditional on each prompt and previous completion tokens.
- Parameters:
prompts (str | Sequence[str]) – string(s), where, e.g., each contains the text you want to classify
completions (Sequence[str]) – strings, where, e.g., each one is the name of a class which could come after a prompt
model_and_tokenizer (tuple[ModelForCausalLM, PreTrainedTokenizerBase]) – a model and its tokenizer
end_of_prompt (Literal[' ', ''], optional) – whitespace or empty string to join prompt and completion, by default whitespace
show_progress_bar (bool | None, optional) – whether or not to show a progress bar. By default, it will be shown only if there are at least 5 prompts
batch_size (int, optional) – the maximum number of prompts that the model will process in parallel, by default 2
batch_size_completions (int, optional) – the maximum number of completions that the model will process in parallel. By default, all completions are processed in parallel
- Returns:
log_probs_completions – If prompts is a string, then a 2-D list is returned: log_probs_completions[completion_idx][completion_token_idx] is the log-probability of the completion token in completions[completion_idx], conditional on prompt + end_of_prompt and previous completion tokens.
If prompts is a sequence of strings, then a 3-D list is returned: log_probs_completions[prompt_idx][completion_idx][completion_token_idx] is the log-probability of the completion token in completions[completion_idx], conditional on prompts[prompt_idx] + end_of_prompt and previous completion tokens.
- Return type:
list[list[float]] | list[list[list[float]]]
Note
To efficiently aggregate log_probs_completions, use cappr.utils.classify.agg_log_probs().
Example
Here we’ll use single characters (which are single tokens) to more clearly demonstrate what this function does:

from transformers import AutoModelForCausalLM, AutoTokenizer
from cappr.huggingface.classify import log_probs_conditional

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Create data
prompts = ["x y", "a b c"]
completions = ["z", "d e"]

# Compute
log_probs_completions = log_probs_conditional(
    prompts, completions, model_and_tokenizer=(model, tokenizer)
)

# Outputs (rounded) next to their symbolic representation

print(log_probs_completions[0])
# [[-4.5],        [[log Pr(z | x, y)],
#  [-5.6, -3.2]]   [log Pr(d | x, y),    log Pr(e | x, y, d)]]

print(log_probs_completions[1])
# [[-9.7],        [[log Pr(z | a, b, c)],
#  [-0.2, -0.03]]  [log Pr(d | a, b, c), log Pr(e | a, b, c, d)]]

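As a rough illustration of what aggregating these token log-probabilities can look like, the sketch below averages each completion's token log-probabilities and exponentiates the result, then normalizes across completions. This is one plausible aggregation, assumed here for illustration; it is not necessarily what cappr.utils.classify.agg_log_probs() does. The helper `avg_then_exp` is hypothetical, and the numbers are the rounded outputs for the prompt "x y" from the example above.

```python
import math

# Rounded token log-probabilities for the prompt "x y" (one inner list per completion).
log_probs_completions = [[-4.5], [-5.6, -3.2]]


def avg_then_exp(token_log_probs):
    """Length-normalized likelihood: exp of the mean token log-probability."""
    return math.exp(sum(token_log_probs) / len(token_log_probs))


# One score per completion, then normalize the scores so they sum to 1.
likelihoods = [avg_then_exp(log_probs) for log_probs in log_probs_completions]
total = sum(likelihoods)
probs = [likelihood / total for likelihood in likelihoods]
print(probs)
```

Averaging before exponentiating keeps longer completions from being penalized merely for having more tokens.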
- cappr.huggingface.classify.log_probs_conditional_examples(examples: Example | Sequence[Example], model_and_tokenizer: tuple[ModelForCausalLM, PreTrainedTokenizerBase], show_progress_bar: bool | None = None, batch_size: int = 2, batch_size_completions: int | None = None) → list[list[float]] | list[list[list[float]]]
Log-probabilities of each completion token conditional on each prompt and previous completion tokens.
- Parameters:
examples (Example | Sequence[Example]) – Example object(s), where each contains a prompt and its set of possible completions
model_and_tokenizer (tuple[ModelForCausalLM, PreTrainedTokenizerBase]) – a model and its tokenizer
show_progress_bar (bool | None, optional) – whether or not to show a progress bar. By default, it will be shown only if there are at least 5 examples
batch_size (int, optional) – the maximum number of examples that the model will process in parallel, by default 2
batch_size_completions (int, optional) – the maximum number of completions that the model will process in parallel. By default, all completions are processed in parallel
- Returns:
log_probs_completions – If examples is a cappr.Example, then a 2-D list is returned: log_probs_completions[completion_idx][completion_token_idx] is the log-probability of the completion token in example.completions[completion_idx], conditional on example.prompt + example.end_of_prompt and previous completion tokens.
If examples is a sequence of cappr.Example objects, then a 3-D list is returned: log_probs_completions[example_idx][completion_idx][completion_token_idx] is the log-probability of the completion token in examples[example_idx].completions[completion_idx], conditional on examples[example_idx].prompt + examples[example_idx].end_of_prompt and previous completion tokens.
- Return type:
list[list[float]] | list[list[list[float]]]
Note
To aggregate log_probs_completions, use cappr.utils.classify.agg_log_probs().
Note
The attribute cappr.Example.prior is unused.
Example
Here we’ll use single characters (which are single tokens) to more clearly demonstrate what this function does:

from transformers import AutoModelForCausalLM, AutoTokenizer
from cappr import Example
from cappr.huggingface.classify import log_probs_conditional_examples

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Create examples
examples = [
    Example(prompt="x y", completions=("z", "d e")),
    Example(prompt="a b c", completions=("1 2",), normalize=False),
]

# Compute
log_probs_completions = log_probs_conditional_examples(
    examples, model_and_tokenizer=(model, tokenizer)
)

# Outputs (rounded) next to their symbolic representation

print(log_probs_completions[0])  # corresponds to examples[0]
# [[-4.5],        [[log Pr(z | x, y)],
#  [-5.6, -3.2]]   [log Pr(d | x, y), log Pr(e | x, y, d)]]

print(log_probs_completions[1])  # corresponds to examples[1]
# [[-5.0, -1.7]]  [[log Pr(1 | a, b, c), log Pr(2 | a, b, c, 1)]]

- cappr.huggingface.classify.predict(prompts: str | Sequence[str], completions: Sequence[str], model_and_tokenizer: tuple[ModelForCausalLM, PreTrainedTokenizerBase], prior: Sequence[float] | None = None, end_of_prompt: Literal[' ', ''] = ' ', discount_completions: float = 0.0, log_marg_probs_completions: Sequence[Sequence[float]] | None = None, show_progress_bar: bool | None = None, batch_size: int = 2, batch_size_completions: int | None = None) → str | list[str]
Predict which completion is most likely to follow each prompt.
- Parameters:
prompts (str | Sequence[str]) – string(s), where, e.g., each contains the text you want to classify
completions (Sequence[str]) – strings, where, e.g., each one is the name of a class which could come after a prompt
model_and_tokenizer (tuple[ModelForCausalLM, PreTrainedTokenizerBase]) – a model and its tokenizer
prior (Sequence[float] | None, optional) – a probability distribution over completions, representing a belief about their likelihoods regardless of the prompt. By default, each completion in completions is assumed to be equally likely
end_of_prompt (Literal[' ', ''], optional) – whitespace or empty string to join prompt and completion, by default whitespace
discount_completions (float, optional) – experimental feature: set it to >0.0 (e.g., 1.0 may work well) if a completion is consistently getting over-predicted. You could instead fudge the prior, but this hyperparameter may be easier to tune than the prior. By default 0.0
log_marg_probs_completions (Sequence[Sequence[float]] | None, optional) – experimental feature: pre-computed log probabilities of completion tokens conditional on previous completion tokens (not prompt tokens). Only used if not discount_completions. Pre-compute them by passing completions, model, and end_of_prompt to token_logprobs(). By default, if not discount_completions, they are (re-)computed
show_progress_bar (bool | None, optional) – whether or not to show a progress bar. By default, it will be shown only if there are at least 5 prompts
batch_size (int, optional) – the maximum number of prompts that the model will process in parallel, by default 2
batch_size_completions (int, optional) – the maximum number of completions that the model will process in parallel. By default, all completions are processed in parallel
- Returns:
preds – If prompts is a string, then the completion from completions which is predicted to most likely follow prompt + end_of_prompt is returned.
If prompts is a sequence of strings, then a list with length len(prompts) is returned. preds[prompt_idx] is the completion in completions which is predicted to follow prompts[prompt_idx] + end_of_prompt.
- Return type:
str | list[str]
Note
In this function, the set of possible completions which could follow each prompt is the same for every prompt. If instead, each prompt could be followed by a different set of completions, then construct a sequence of cappr.Example objects and pass them to predict_examples().
Example
Let’s have GPT-2 (small) predict where stuff is in the kitchen:

from transformers import AutoModelForCausalLM, AutoTokenizer
from cappr.huggingface.classify import predict

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Define a classification task
prompts = ["The tacos are cooking", "Ice cream is"]
class_names = ("on the stove", "in the freezer", "in the fridge")
prior = (1 / 5, 2 / 5, 2 / 5)

preds = predict(
    prompts,
    completions=class_names,
    model_and_tokenizer=(model, tokenizer),
    prior=prior,
)
print(preds)
# ['on the stove',
#  'in the freezer']

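Conceptually, a prediction is the completion with the highest predicted probability for a prompt. The sketch below shows that selection step on its own, without loading a model. It assumes this argmax relationship rather than reproducing the library's implementation; the helper `most_likely` and the probability values are hypothetical.

```python
class_names = ("on the stove", "in the freezer", "in the fridge")

# Hypothetical probabilities: one row per prompt, one column per class name.
pred_probs = [
    [0.4, 0.3, 0.3],
    [0.1, 0.5, 0.4],
]


def most_likely(probs_row, completions):
    """Return the completion whose probability is the largest in the row."""
    best_idx = max(range(len(completions)), key=lambda idx: probs_row[idx])
    return completions[best_idx]


preds = [most_likely(row, class_names) for row in pred_probs]
print(preds)
# ['on the stove', 'in the freezer']
```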
- cappr.huggingface.classify.predict_examples(examples: Example | Sequence[Example], model_and_tokenizer: tuple[ModelForCausalLM, PreTrainedTokenizerBase], show_progress_bar: bool | None = None, batch_size: int = 2, batch_size_completions: int | None = None) → str | list[str]
Predict which completion is most likely to follow each prompt.
- Parameters:
examples (Example | Sequence[Example]) – Example object(s), where each contains a prompt and its set of possible completions
model_and_tokenizer (tuple[ModelForCausalLM, PreTrainedTokenizerBase]) – a model and its tokenizer
show_progress_bar (bool | None, optional) – whether or not to show a progress bar. By default, it will be shown only if there are at least 5 examples
batch_size (int, optional) – the maximum number of examples that the model will process in parallel, by default 2
batch_size_completions (int, optional) – the maximum number of completions that the model will process in parallel. By default, all completions are processed in parallel
- Returns:
preds – If examples is a cappr.Example, then the completion from example.completions which is predicted to most likely follow example.prompt + example.end_of_prompt is returned.
If examples is a sequence of cappr.Example objects, then a list with length len(examples) is returned: preds[example_idx] is the completion in examples[example_idx].completions which is predicted to most likely follow examples[example_idx].prompt + examples[example_idx].end_of_prompt.
- Return type:
str | list[str]
Example
GPT-2 (small) doing media trivia:

from transformers import AutoModelForCausalLM, AutoTokenizer
from cappr import Example
from cappr.huggingface.classify import predict_examples

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Create examples
examples = [
    Example(
        prompt="Jodie Foster played",
        completions=("Clarice Starling", "Trinity in The Matrix"),
    ),
    Example(
        prompt="Batman, from Batman: The Animated Series, was played by",
        completions=("Pete Holmes", "Kevin Conroy", "Spongebob!"),
        prior=(1 / 3, 2 / 3, 0),
    ),
]

preds = predict_examples(
    examples, model_and_tokenizer=(model, tokenizer)
)
print(preds)
# ['Clarice Starling',
#  'Kevin Conroy']

- cappr.huggingface.classify.predict_proba(prompts: str | Sequence[str], completions: Sequence[str], model_and_tokenizer: tuple[ModelForCausalLM, PreTrainedTokenizerBase], prior: Sequence[float] | None = None, end_of_prompt: Literal[' ', ''] = ' ', normalize: bool = True, discount_completions: float = 0.0, log_marg_probs_completions: Sequence[Sequence[float]] | None = None, show_progress_bar: bool | None = None, batch_size: int = 2, batch_size_completions: int | None = None) → npt.NDArray[np.floating]
Predict probabilities of each completion coming after each prompt.
- Parameters:
prompts (str | Sequence[str]) – string(s), where, e.g., each contains the text you want to classify
completions (Sequence[str]) – strings, where, e.g., each one is the name of a class which could come after a prompt
model_and_tokenizer (tuple[ModelForCausalLM, PreTrainedTokenizerBase]) – a model and its tokenizer
prior (Sequence[float] | None, optional) – a probability distribution over completions, representing a belief about their likelihoods regardless of the prompt. By default, each completion in completions is assumed to be equally likely
end_of_prompt (Literal[' ', ''], optional) – whitespace or empty string to join prompt and completion, by default whitespace
normalize (bool, optional) – whether or not to normalize completion-after-prompt probabilities into a probability distribution over completions. Set this to False if you’d like the raw completion-after-prompt probability, or you’re solving a multi-label prediction problem. By default, True
discount_completions (float, optional) – experimental feature: set it to >0.0 (e.g., 1.0 may work well) if a completion is consistently getting predicted probabilities that are too high. You could instead fudge the prior, but this hyperparameter may be easier to tune than the prior. By default 0.0
log_marg_probs_completions (Sequence[Sequence[float]] | None, optional) – experimental feature: pre-computed log probabilities of completion tokens conditional on previous completion tokens (not prompt tokens). Only used if not discount_completions. Pre-compute them by passing completions, model, and end_of_prompt to token_logprobs(). By default, if not discount_completions, they are (re-)computed
show_progress_bar (bool | None, optional) – whether or not to show a progress bar. By default, it will be shown only if there are at least 5 prompts
batch_size (int, optional) – the maximum number of prompts that the model will process in parallel, by default 2
batch_size_completions (int, optional) – the maximum number of completions that the model will process in parallel. By default, all completions are processed in parallel
- Returns:
pred_probs – If prompts is a string, then an array with shape (len(completions),) is returned: pred_probs[completion_idx] is the model’s estimate of the probability that completions[completion_idx] comes after prompt + end_of_prompt.
If prompts is a sequence of strings, then an array with shape (len(prompts), len(completions)) is returned: pred_probs[prompt_idx, completion_idx] is the model’s estimate of the probability that completions[completion_idx] comes after prompts[prompt_idx] + end_of_prompt.
- Return type:
npt.NDArray[np.floating]
Note
In this function, the set of possible completions which could follow each prompt is the same for every prompt. If instead, each prompt could be followed by a different set of completions, then construct a sequence of cappr.Example objects and pass them to predict_proba_examples().
Example
Let’s have GPT-2 (small) predict where stuff is in the kitchen. This example also conveys that it’s not the greatest model out there:

from transformers import AutoModelForCausalLM, AutoTokenizer
from cappr.huggingface.classify import predict_proba

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Define a classification task
prompts = ["The tacos are cooking", "Ice cream is"]
class_names = ("on the stove", "in the freezer", "in the fridge")
prior = (1 / 5, 2 / 5, 2 / 5)

pred_probs = predict_proba(
    prompts,
    completions=class_names,
    model_and_tokenizer=(model, tokenizer),
    prior=prior,
)
pred_probs_rounded = pred_probs.round(1)  # just for cleaner output

# predicted probability that tacos cook on the stove
print(pred_probs_rounded[0, 0])
# 0.4

# predicted probability that ice cream is in the freezer
print(pred_probs_rounded[1, 1])
# 0.5

# predicted probability that ice cream is in the fridge
print(pred_probs_rounded[1, 2])
# 0.4

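To make the roles of prior and normalize more concrete, here is a minimal sketch that weights completion-after-prompt likelihoods by a prior and optionally normalizes them into a distribution. How the library combines these quantities internally is not shown in this reference, so the Bayes-style product is an assumption; the helper `posterior` and the likelihood values are hypothetical.

```python
def posterior(likelihoods, prior=None, normalize=True):
    """Weight likelihoods by a prior; optionally normalize them to sum to 1."""
    if prior is None:
        # Assumed default: every completion is equally likely a priori.
        prior = [1 / len(likelihoods)] * len(likelihoods)
    weighted = [lik * p for lik, p in zip(likelihoods, prior)]
    if not normalize:
        return weighted
    total = sum(weighted)
    return [value / total for value in weighted]


likelihoods = [0.04, 0.01, 0.01]  # hypothetical completion-after-prompt likelihoods
prior = (1 / 5, 2 / 5, 2 / 5)

probs = posterior(likelihoods, prior)
print([round(p, 2) for p in probs])
# [0.5, 0.25, 0.25]
```

With normalize=False, the sketch returns the weighted values as they are, which is the shape of output one would want for a multi-label problem where the classes are not mutually exclusive.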
- cappr.huggingface.classify.predict_proba_examples(examples: Example | Sequence[Example], model_and_tokenizer: tuple[ModelForCausalLM, PreTrainedTokenizerBase], show_progress_bar: bool | None = None, batch_size: int = 2, batch_size_completions: int | None = None) → npt.NDArray[np.floating] | list[npt.NDArray[np.floating]]
Predict probabilities of each completion coming after each prompt.
- Parameters:
examples (Example | Sequence[Example]) – Example object(s), where each contains a prompt and its set of possible completions
model_and_tokenizer (tuple[ModelForCausalLM, PreTrainedTokenizerBase]) – a model and its tokenizer
show_progress_bar (bool | None, optional) – whether or not to show a progress bar. By default, it will be shown only if there are at least 5 examples
batch_size (int, optional) – the maximum number of examples that the model will process in parallel, by default 2
batch_size_completions (int, optional) – the maximum number of completions that the model will process in parallel. By default, all completions are processed in parallel
- Returns:
pred_probs – If examples is a cappr.Example, then an array with shape (len(example.completions),) is returned: pred_probs[completion_idx] is the model’s estimate of the probability that example.completions[completion_idx] comes after example.prompt + example.end_of_prompt.
If examples is a sequence of cappr.Example objects, then a list with length len(examples) is returned: pred_probs[example_idx][completion_idx] is the model’s estimate of the probability that examples[example_idx].completions[completion_idx] comes after examples[example_idx].prompt + examples[example_idx].end_of_prompt. If the number of completions per example is a constant k, then an array with shape (len(examples), k) is returned instead of a list of 1-D arrays.
- Return type:
npt.NDArray[np.floating] | list[npt.NDArray[np.floating]]
Example
GPT-2 (small) doing media trivia:

from transformers import AutoModelForCausalLM, AutoTokenizer
from cappr import Example
from cappr.huggingface.classify import predict_proba_examples

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Create examples
examples = [
    Example(
        prompt="Jodie Foster played",
        completions=("Clarice Starling", "Trinity in The Matrix"),
    ),
    Example(
        prompt="Batman, from Batman: The Animated Series, was played by",
        completions=("Pete Holmes", "Kevin Conroy", "Spongebob!"),
        prior=(1 / 3, 2 / 3, 0),
    ),
]

pred_probs = predict_proba_examples(
    examples, model_and_tokenizer=(model, tokenizer)
)

# predicted probability that Jodie Foster played Clarice Starling, not Trinity
print(pred_probs[0][0].round(2))
# 0.7

# predicted probability that Batman was played by Kevin Conroy
print(pred_probs[1][1].round(2))
# 0.97

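Because the return value is either a 2-D array or a list of 1-D arrays, downstream code is simplest when it iterates row by row and avoids assuming a rectangular shape. The sketch below illustrates that pattern with plain lists standing in for arrays. The helper `argmax_per_example` is hypothetical; 0.7 and 0.97 come from the example above, and the remaining values are made up.

```python
def argmax_per_example(pred_probs):
    """Index of the most probable completion per example.

    Iterating row by row works for a 2-D array-like and for a list of
    1-D array-likes with different lengths.
    """
    best = []
    for probs in pred_probs:
        probs = list(probs)
        best.append(max(range(len(probs)), key=probs.__getitem__))
    return best


# Two completions for the first example, three for the second.
ragged_pred_probs = [[0.7, 0.3], [0.02, 0.97, 0.01]]
print(argmax_per_example(ragged_pred_probs))
# [0, 1]
```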
- cappr.huggingface.classify.token_logprobs(texts: str | Sequence[str], model_and_tokenizer: tuple[ModelForCausalLM, PreTrainedTokenizerBase], end_of_prompt: Literal[' ', ''] = ' ', show_progress_bar: bool | None = None, add_bos: bool = False, batch_size: int = 16, **kwargs) → list[float] | list[list[float]]
For each text, compute each token’s log-probability conditional on all previous tokens in the text.
- Parameters:
texts (str | Sequence[str]) – input text(s)
model_and_tokenizer (tuple[ModelForCausalLM, PreTrainedTokenizerBase]) – a model and its tokenizer
end_of_prompt (Literal[' ', ''], optional) – This string gets added to the beginning of each text. It’s important to set this if you’re using the discount feature. Otherwise, set it to "". By default " "
show_progress_bar (bool | None, optional) – whether or not to show a progress bar. By default, it will be shown only if there are at least 5 texts
add_bos (bool, optional) – whether or not to add a beginning-of-sentence token to each text in texts if the tokenizer has a beginning-of-sentence token, by default False
batch_size (int, optional) – the maximum number of texts that the model will process in parallel, by default 16
- Returns:
log_probs – If texts is a string, then a 1-D list is returned: log_probs[token_idx] is the log-probability of the token at token_idx of texts conditional on all previous tokens in texts.
If texts is a sequence of strings, then a 2-D list is returned: log_probs[text_idx][token_idx] is the log-probability of the token at token_idx of texts[text_idx] conditional on all previous tokens in texts[text_idx].
- Return type:
list[float] | list[list[float]]
Warning
Set end_of_prompt="", add_bos=True unless you’re using the discount feature.
Note
For each text, the first token’s log-probability is always None because no autoregressive LM directly estimates the marginal probability of a token.
- Raises:
TypeError – if texts is not a sequence
ValueError – if texts is empty
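Since the first token's log-probability may be None, code that sums or averages the returned values needs to skip it. The sketch below shows one way to do that for a single text, computing a sequence log-probability and a perplexity over the scored tokens only. The helpers `sequence_log_prob` and `perplexity` are hypothetical and not part of the library, and the log-probabilities are made-up values.

```python
import math


def sequence_log_prob(token_log_probs):
    """Sum of token log-probabilities, skipping unscored (None) tokens."""
    return sum(lp for lp in token_log_probs if lp is not None)


def perplexity(token_log_probs):
    """exp of the negative mean log-probability over scored tokens."""
    scored = [lp for lp in token_log_probs if lp is not None]
    if not scored:
        raise ValueError("no scored tokens")
    return math.exp(-sum(scored) / len(scored))


# Hypothetical output for one text: the first token has no estimate.
log_probs = [None, -2.0, -1.0, -3.0]
print(sequence_log_prob(log_probs))
# -6.0
print(perplexity(log_probs))
```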