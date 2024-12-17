In artificial intelligence (AI) and machine learning, temperature is a parameter that adjusts the output of large language models (LLMs) by controlling how random the generated text is during inference. LLMs generate text by predicting the next word (or rather, the next token) according to a probability distribution. The model assigns each candidate token a logit (a raw numerical score), and the softmax function normalizes the full set of logits into a probability distribution: each token receives a probability between zero and one, and the probabilities of all tokens sum to one. The temperature parameter modifies this distribution by dividing each logit by the temperature before softmax is applied. A temperature below 1 sharpens the distribution, so the tokens that already have the highest probability become even more likely to be selected; a temperature above 1 flattens it, which increases the model's likelihood of selecting less probable tokens. Different temperature settings therefore introduce different levels of randomness into the model's text output.
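
The effect of temperature on the softmax distribution can be illustrated with a minimal Python sketch. The function name `softmax_with_temperature` and the three example logits are hypothetical, chosen only for illustration; real LLM libraries apply the same scaling inside their own sampling code, over a vocabulary of many thousands of tokens.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw logits into probabilities, dividing by temperature first."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtracting the maximum avoids overflow and does not change the result.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three candidate tokens.
logits = [2.0, 1.0, 0.1]

for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")
```

Running the loop should show the most likely token taking a larger share of the probability at temperature 0.5 and a smaller share at temperature 2.0, while each row still sums to one.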

Temperature is a key setting for controlling randomness in model output. It allows users to adjust LLM output to better suit different real-world applications of text generation. More specifically, it lets users balance coherence against creativity for a specific use case. For instance, a low temperature might be preferable for tasks requiring precision and factual accuracy, such as technical documentation or chatbot replies. The lower temperature value helps the LLM produce more coherent and consistent text and avoid irrelevant responses. By contrast, a high temperature is preferable for creative tasks such as creative writing or concept brainstorming. The temperature setting effectively allows users to tune a model's output toward their desired outcome without retraining the model.
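
To make the consistency trade-off concrete, the following sketch repeatedly samples one token at different temperatures and reports how often the single most likely token is chosen. The helper names (`sample_token`, `top_token_share`) and the logits are hypothetical. Treating a temperature of 0 as greedy decoding (always picking the highest-scoring token) is an assumption made here for illustration; individual LLM APIs may handle that value differently.

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Sample a token index; temperature 0 is treated as greedy (assumption)."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    # Unnormalized softmax weights; random.choices normalizes them internally.
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

def top_token_share(logits, temperature, draws=2000, seed=0):
    """Fraction of draws that select the highest-logit token."""
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    top = max(range(len(logits)), key=lambda i: logits[i])
    hits = sum(sample_token(logits, temperature, rng) == top for _ in range(draws))
    return hits / draws

# Hypothetical logits for four candidate tokens.
logits = [2.0, 1.0, 0.1, -1.0]

for t in (0, 0.2, 1.0, 2.0):
    print(f"temperature={t}: top token chosen in {top_token_share(logits, t):.1%} of draws")
```

At low temperatures nearly every draw returns the same token, which is the behavior wanted for precise, repeatable answers; at higher temperatures the draws spread across the alternatives, which is what produces more varied text.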

Temperature is often equated with ‘creativity’, but the two are not the same. It’s more helpful to think of temperature as controlling how broadly the model samples from the token probabilities it learned from its training data. Max Peeperkorn et al.¹ conducted an empirical analysis of LLM output for different temperature values and wrote:

“We find that temperature is weakly correlated with novelty, and unsurprisingly, moderately correlated with incoherence, but there is no relationship with either cohesion or typicality. However, the influence of temperature on creativity is far more nuanced and weak than suggested by the "creativity parameter" claim; overall results suggest that the LLM generates slightly more novel outputs as temperatures get higher.”

A high temperature value can make model outputs seem more creative, but it's more accurate to think of them as being less tightly determined by the patterns in the training data.