You can pass a map where each key is a token id and each value is a bias that controls the likelihood of that token appearing in the generated response. Example:
{
19045: -10,
58234: 10
}
For example, here 19045 is the tokenized id for "good" and 58234 is the tokenized id for "better". The above logit_bias reduces the chance of the model generating the word "good" in the completion, since its bias is -10, and increases the chance of it generating "better", since its bias is +10.
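As a minimal sketch, passing this map through the OpenAI Python package (v1+) might look like the following; the model name and prompt are placeholders, and the logit_bias values are the ones from the example above:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # placeholder model; token ids depend on the model's encoding
    messages=[{"role": "user", "content": "How was the movie?"}],
    # Discourage "good" (bias -10) and encourage "better" (bias +10)
    logit_bias={19045: -10, 58234: 10},
)
print(response.choices[0].message.content)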
Reference to a simple article that explains it well:
https://help.openai.com/en/articles/5247780-using-logit-bias-to-define-token-probability

You can use this to generate tokenized ids for words (for OpenAI models):
https://platform.openai.com/tokenizer
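If you would rather look up token ids programmatically, the tiktoken package can do the same thing; a small sketch (the model name is again a placeholder, and note that ids differ depending on leading whitespace):

import tiktoken

# Get the encoding used by the target model
enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

# Token ids depend on the exact string: " good" (with a leading
# space) and "good" tokenize to different ids.
print(enc.encode("good"))
print(enc.encode("better"))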