What is Temperature?

Temperature is a sampling hyperparameter in large language models that controls the randomness and creativity of generated output: lower values produce more deterministic, focused responses, while higher values increase diversity and creativity.

Quick Facts

  • Full Name: LLM Temperature Parameter
  • Created: Concept originated from statistical mechanics; applied to NLP in the 2010s
  • Specification: Official Specification

How It Works

Temperature is one of the most important parameters controlling LLM behavior. It works by dividing the logits (raw prediction scores) by the temperature before applying softmax to compute token probabilities. At temperature 0, the model falls back to greedy decoding and always selects the most probable token, while higher temperatures flatten the probability distribution, giving less likely tokens a better chance of being selected. Most APIs accept a range of 0 to 2, with defaults typically between 0.7 and 1.0.
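As a sketch, the scaling step can be written in a few lines of plain Python. The logits below are made-up values; real inference stacks do this on the accelerator, but the math is the same:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Divide logits by temperature, then apply softmax."""
    if temperature <= 0:
        raise ValueError("use greedy decoding (argmax) for temperature 0")
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
print(softmax_with_temperature(logits, 1.0))  # unmodified softmax
print(softmax_with_temperature(logits, 0.2))  # sharper: top token dominates
print(softmax_with_temperature(logits, 2.0))  # flatter: closer to uniform
```

Note that temperature 1 leaves the distribution unchanged; values below 1 sharpen it and values above 1 flatten it.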

Key Characteristics

  • Controls randomness in token selection during generation
  • Lower values (0-0.3) produce consistent, focused outputs
  • Higher values (0.8-1.5) increase creativity and diversity
  • Temperature 0 enables deterministic, reproducible outputs
  • Works in conjunction with top_p and top_k parameters
  • Different tasks require different optimal temperature settings

Common Use Cases

  1. Code generation (low temperature for accuracy)
  2. Creative writing (high temperature for diversity)
  3. Factual Q&A (low temperature for consistency)
  4. Brainstorming (high temperature for diverse ideas)
  5. Translation (low temperature for accuracy)

Example

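A minimal, self-contained sketch (the tokens and logits are hypothetical) of how the same logits yield different next-token distributions at different temperatures:

```python
import math

def token_probs(logits, temperature):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

tokens = ["the", "a", "an", "this"]  # hypothetical next-token candidates
logits = [4.0, 2.5, 1.0, 0.5]        # hypothetical raw scores

for t in (0.2, 0.7, 1.0, 1.5):
    probs = token_probs(logits, t)
    row = ", ".join(f"{tok}: {p:.2f}" for tok, p in zip(tokens, probs))
    print(f"T={t}: {row}")
```

At T=0.2 nearly all probability mass sits on the top candidate; at T=1.5 the tail tokens gain a meaningful share.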

Frequently Asked Questions

What is the difference between temperature 0 and temperature 1?

At temperature 0, the model uses greedy decoding, always selecting the highest probability token, producing deterministic and reproducible outputs ideal for tasks requiring consistency like code generation. At temperature 1, the model samples according to the original probability distribution, producing more diverse and creative outputs suitable for creative writing. Higher temperatures give lower probability tokens a greater chance of being selected.
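The difference can be sketched in plain Python with toy logits: greedy decoding at temperature 0 versus sampling from the temperature-scaled distribution.

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Greedy argmax at temperature 0; otherwise sample from the scaled softmax."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(z - m) for z in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

rng = random.Random(0)
logits = [3.0, 2.0, 1.0]
greedy = [sample_token(logits, 0, rng) for _ in range(5)]
sampled = [sample_token(logits, 1.0, rng) for _ in range(5)]
print(greedy)   # always index 0, the highest-logit token
print(sampled)  # samples weighted toward index 0, but not guaranteed
```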

How do temperature and top_p parameters relate? How should they be used together?

Both temperature and top_p control output randomness, but through different mechanisms. Temperature rescales the logits, changing the shape of the probability distribution, while top_p (nucleus sampling) restricts sampling to the smallest set of most likely tokens whose cumulative probability reaches p. It's generally recommended to adjust only one parameter at a time, as adjusting both can produce unpredictable effects; OpenAI suggests setting top_p to 1 if adjusting temperature, and vice versa.
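A rough sketch of the nucleus-filtering step (the logits are made up, and a real implementation would also renormalize and sample from the kept set):

```python
import math

def nucleus_filter(logits, top_p):
    """Keep the smallest set of tokens whose cumulative probability >= top_p."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    # Token indices sorted by probability, highest first.
    ranked = sorted(((e / total, i) for i, e in enumerate(exps)), reverse=True)
    kept, cum = [], 0.0
    for p, i in ranked:
        kept.append(i)
        cum += p
        if cum >= top_p:
            break
    return kept  # indices eligible for sampling

logits = [4.0, 2.0, 1.0, -1.0]
print(nucleus_filter(logits, 0.9))  # top token alone is ~0.84, so two survive
print(nucleus_filter(logits, 0.5))  # the top token alone already covers 0.5
```

Temperature reweights every token's probability; top_p instead cuts the tail off entirely, which is why combining aggressive settings of both is hard to reason about.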

What temperature values should be used for different tasks?

For tasks requiring accuracy like code generation, math, and factual Q&A, use low temperature (0-0.3). For general conversation and text summarization, use medium temperature (0.5-0.7). For creative writing, brainstorming, and poetry, use higher temperature (0.8-1.2). The optimal value should be adjusted based on actual results.
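One way to encode these starting points is a simple lookup table. The task names and values below are illustrative defaults drawn from the ranges above, not a standard:

```python
# Illustrative per-task starting temperatures; tune per model and use case.
TASK_TEMPERATURE = {
    "code_generation": 0.2,
    "factual_qa": 0.2,
    "translation": 0.3,
    "summarization": 0.5,
    "conversation": 0.7,
    "creative_writing": 1.0,
    "brainstorming": 1.2,
}

def temperature_for(task, default=0.7):
    """Look up a starting temperature for a task; fall back to a safe default."""
    return TASK_TEMPERATURE.get(task, default)

print(temperature_for("code_generation"))  # → 0.2
print(temperature_for("poetry"))           # unknown task → 0.7
```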

Why is the parameter called 'temperature'? What is the origin of this name?

The term temperature is borrowed from the Boltzmann distribution in statistical mechanics. In physics, temperature controls the distribution of particle energy states: at low temperatures, particles concentrate in low energy states; at high temperatures, the distribution is more uniform. Similarly, in language models, low temperature concentrates output on high-probability tokens while high temperature flattens the distribution. This mathematical similarity led to the same naming.
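Side by side, with the negated logits playing the role of energies, the two distributions have the same form:

```latex
p_i = \frac{e^{-E_i/(k_B T)}}{\sum_j e^{-E_j/(k_B T)}}
\qquad\text{vs.}\qquad
p_i = \frac{e^{z_i/T}}{\sum_j e^{z_j/T}}
```

where $E_i$ is the energy of state $i$, $k_B$ is the Boltzmann constant, and $z_i$ is the logit for token $i$; in both cases, raising $T$ flattens the distribution.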

What problems occur when temperature is set too high?

Excessively high temperature (e.g., above 1.5) causes output quality degradation: text may become incoherent, contain grammatical errors, produce meaningless content, or drift off-topic. This happens because high temperature gives low-probability (usually inappropriate) tokens more chances to be selected. In extreme cases, output may be completely random gibberish. Therefore, a balance between creativity and coherence must be found.
