The LLM temperature hyperparameter acts as a randomness or creativity dial. Raising the temperature spreads probability across a wider range of candidate next tokens, making the words in the model's generated text less predictable.
A temperature setting of 1 uses the model's standard probability distribution. Temperatures higher than 1 flatten the distribution, encouraging the model to select from a wider range of tokens. Conversely, temperatures lower than 1 sharpen the distribution, making the model more likely to select the most probable next token.
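The mechanics are straightforward to sketch: the model's raw scores, or logits, are divided by the temperature before the softmax converts them into probabilities. The short Python sketch below, using made-up logits for four candidate tokens rather than output from any real model, shows the same scores sharpening at a temperature below 1 and flattening above 1:

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    """Convert raw logits to probabilities, scaled by temperature."""
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max()  # subtract the max for numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum()

# Illustrative logits for four candidate next tokens.
logits = [4.0, 2.5, 1.0, 0.5]

for t in (0.2, 1.0, 2.0):
    print(f"T={t}: {np.round(softmax_with_temperature(logits, t), 3)}")
# T=0.2: probability mass piles onto the top token (near-deterministic)
# T=1.0: the model's standard distribution
# T=2.0: the distribution flattens, so less likely tokens gain probability
```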
A higher temperature value, such as 0.8, makes the LLM more creative in its responses, at the cost of predictability. A lower temperature of 0.2 yields more deterministic responses: a low-temperature model delivers predictable, if staid, output. At the top of the range, temperatures approaching 2.0 can begin to produce nonsensical output.
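Repeatedly sampling from those temperature-scaled distributions makes the effect concrete. In this sketch, which reuses the illustrative logits above along with hypothetical token labels, a low temperature collapses onto one token while a high temperature spreads picks across the candidates:

```python
import numpy as np

rng = np.random.default_rng(0)
tokens = ["the", "a", "one", "zebra"]  # hypothetical candidate tokens
logits = [4.0, 2.5, 1.0, 0.5]

def sample_counts(temperature, n=1000):
    """Sample n next tokens and tally how often each one is chosen."""
    scaled = np.array(logits) / temperature
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    picks = rng.choice(tokens, size=n, p=probs)
    return {t: int((picks == t).sum()) for t in tokens}

print(sample_counts(0.2))  # almost always "the": predictable, staid output
print(sample_counts(2.0))  # picks spread out, even "zebra": creative to nonsensical
```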
The use case informs the ideal temperature value for an LLM. A chatbot designed to be entertaining and creative, such as ChatGPT, needs a higher temperature to generate humanlike text. A text summarization app in a highly regulated field such as law, health or finance calls for the inverse: a low temperature, because its generated summaries must adhere to strict accuracy requirements.
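In practice, temperature is usually a single request parameter. As a minimal sketch, assuming the official OpenAI Python SDK, with the model name and prompts purely illustrative, the two use cases differ only in the temperature value passed:

```python
from openai import OpenAI  # assumes the official OpenAI Python SDK is installed

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

# Creative chatbot reply: a higher temperature invites varied wording.
chat = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[{"role": "user", "content": "Write a playful limerick about coffee."}],
    temperature=0.8,
)

# Regulated-domain summary: a low temperature keeps output conservative.
summary = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarize this contract clause: ..."}],
    temperature=0.2,
)
```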