Penalizes tokens proportionally to how often they appeared, reducing repetition.
Frequency penalty reduces the likelihood of repeating the same token based on how many times it already appeared. A token used 5 times gets penalized 5x more than one used once. This directly combats repetitive loops.
Before sampling, the model subtracts from each token's logit a penalty proportional to that token's count in the output so far. Positive values discourage repetition. The effect is cumulative: the more a token has been used, the harder it becomes to use again.
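The logit adjustment above can be sketched in a few lines. This is a minimal illustration, not any provider's actual implementation; the function name, token strings, and logit values are made up for the example.

```python
def apply_frequency_penalty(logits, token_counts, penalty=0.5):
    """Subtract penalty * count from each token's logit.

    logits: dict mapping token -> raw logit
    token_counts: dict mapping token -> times it has appeared in the output so far
    """
    return {
        tok: logit - penalty * token_counts.get(tok, 0)
        for tok, logit in logits.items()
    }

logits = {"the": 2.0, "cat": 1.5, "sat": 1.0}
counts = {"the": 4, "cat": 1}  # "the" has already been used 4 times
adjusted = apply_frequency_penalty(logits, counts, penalty=0.5)
# "the": 2.0 - 0.5*4 = 0.0, "cat": 1.5 - 0.5*1 = 1.0, "sat": unchanged at 1.0
```

Note the cumulative effect: "the", used four times, loses four times as much logit mass as "cat", used once.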
Use it when the model gets stuck in repetitive loops, or for creative writing where you want vocabulary diversity. Not recommended for code generation, where repeating variable names and syntax is necessary.
Setting it too high for code generation. Confusing frequency penalty (proportional to count) with presence penalty (a flat penalty for any occurrence). Combining it with high temperature, where the joint effect on output diversity is hard to predict.
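The frequency-vs-presence distinction is easiest to see side by side. A minimal sketch, with invented logit values and helper names:

```python
def frequency_adjust(logit, count, penalty):
    # penalty scales with how many times the token has appeared
    return logit - penalty * count

def presence_adjust(logit, count, penalty):
    # flat penalty applied once the token has appeared at all
    return logit - penalty * (1 if count > 0 else 0)

# A token already seen 5 times, penalty 0.5:
freq = frequency_adjust(2.0, 5, 0.5)   # 2.0 - 0.5*5 = -0.5
pres = presence_adjust(2.0, 5, 0.5)    # 2.0 - 0.5   =  1.5
```

Frequency penalty keeps growing with each repetition, while presence penalty stops at one unit no matter how often the token recurs.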
OpenAI: 0 (disabled), range -2 to 2. Anthropic: not exposed. Google: frequencyPenalty 0-1. Typical useful range: 0.1-0.5.
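As a sketch of where the parameter sits in practice, here is an OpenAI-style request body. The model name and prompt are placeholders; the `frequency_penalty` field name and -2 to 2 range follow the OpenAI API as described above.

```python
# Hypothetical request payload for a chat completion; only the
# frequency_penalty line matters here.
payload = {
    "model": "gpt-4o-mini",  # placeholder model name
    "messages": [{"role": "user", "content": "Write a short poem about rain."}],
    "frequency_penalty": 0.3,  # within the typical useful range of 0.1-0.5
}

assert -2 <= payload["frequency_penalty"] <= 2  # OpenAI's allowed range
```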