Have you ever hoped for fresh ideas or diverse answers from Large Language Models (LLMs), only to get similar, predictable results every time? Ask an AI for a joke, and you often get the same familiar response. This phenomenon is what AI researchers call ‘mode collapse.’
Is this really a technical limitation of AI? Recent research by Zhang et al. revealed a surprising cause of this mystery and a remarkably simple solution: ‘Verbalized Sampling.’ In this post, we’ll dive deep into the principles of this powerful ‘Distribution-Level Prompt’ strategy for unlocking AI's creativity.
1. The Real Culprit Isn't the AI, It's Our ‘Bias for Familiarity’
The core reason LLMs default to repetitive answers is, ironically, a human bias embedded in the training data: what researchers call ‘Typicality Bias.’
Because of this human bias, during the fine-tuning process (RLHF), human evaluators subconsciously rate predictable, ‘safe’ answers higher than novel, creative ones. As this feedback accumulates, the model suffers mode collapse, concentrating its probability mass onto the most typical answer—the ‘Mode.’ That’s why you get the same joke five times.
It’s like a chef repeatedly recommending only steak, the dish customers order most. Although the model can create diverse dishes (candidate responses), it focuses only on the most typical one, losing diversity (creativity).
2. How to Awaken Dormant Creativity: Demand a ‘Menu with Probabilities’
Verbalized Sampling (VS) is a prompt strategy designed to fix this mode collapse by asking the LLM to “explicitly verbalize the response distribution and corresponding probabilities.” Researchers term this a ‘Distribution-Level Prompt.’
Probability Meaning: A ‘Relative Distribution Ratio,’ Not a Correctness Probability
The probability value VS presents (e.g., 0.45) is not the objective probability of being correct (which should be near 1.0). Instead, this value represents the Relative Ratio (Distributional Likelihood) of that response being selected among the candidates the model generated, quantifying how plausible and natural the model considers the answer internally.
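A tiny sketch of this interpretation (the numbers are invented for illustration): the verbalized values only make sense relative to one another, and because a model's raw numbers may not sum exactly to 1, normalizing them recovers the relative distribution ratio.

```python
# Verbalized probabilities are relative ratios, not correctness scores.
# A model's raw numbers may not sum exactly to 1, so we normalize them
# before treating them as a distribution. (Illustrative values only.)
verbalized = {"typical joke": 0.45, "wordplay": 0.3, "anti-joke": 0.2, "haiku joke": 0.1}

total = sum(verbalized.values())  # 1.05 here, not 1.0
normalized = {text: p / total for text, p in verbalized.items()}
# 'typical joke' still dominates, but its weight is a ratio among
# candidates, not the model's confidence of being "correct".
```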
Chef Analogy: Applying VS is like asking the chef to show you the full expected order distribution:
LLM Response Style and Probability Meaning Comparison
| Category | Standard LLM (Direct Prompting) | Verbalized Sampling (VS Approach) | 
|---|---|---|
| Probability Distribution State | Probability mass concentrated on the Mode (Mode Collapse) | Probability mass distributed among various candidates (Distribution Restoration) | 
| Meaning of the Probability Value | (For multi-choice, etc.) Approaching the probability of being correct (∼ 0.99) | The most dominant Relative Ratio of the Distribution among diverse candidates (≪ 1.0) | 
| Primary Use Case | Fact-based QA | Creative Writing, Open-Ended QA | 
Verbalized Sampling (VS) Prompt Instruction Example
When applying VS, you must include a structural instruction telling the AI to explicitly list the ‘candidate ideas and their probabilities’ before generating the final answer.
<instructions>
Generate 5 responses to the user query, each within a separate <response> tag.
Each <response> must include a <text> and a numeric <probability> (within the range [0.0, 1.0]).
Randomly sample the final response from these 5 options, considering the probability.
</instructions> 
- Key: Use "instructions" tags or similar methods to enforce the AI’s thought process.
- Effect: The AI is forced to consider diverse answers (low probability) in addition to the most typical one (high probability).
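The workflow these instructions imply can be sketched in Python: parse the verbalized <response> blocks out of the model's raw output, then sample one answer weighted by its probability. The raw string below is an invented example of what a model might return, not output from any specific model, and the flat tag layout is an assumption for the sake of the regex.

```python
import random
import re

# Hypothetical raw model output following the VS instructions above.
raw = """
<response><text>Why did the chicken cross the road?</text><probability>0.45</probability></response>
<response><text>A pun about quantum physics</text><probability>0.25</probability></response>
<response><text>An absurdist anti-joke</text><probability>0.15</probability></response>
<response><text>A knock-knock joke about AI</text><probability>0.10</probability></response>
<response><text>A haiku-shaped joke</text><probability>0.05</probability></response>
"""

def parse_candidates(raw: str) -> list[tuple[str, float]]:
    """Extract (text, probability) pairs from <response> blocks."""
    pattern = r"<response><text>(.*?)</text><probability>([\d.]+)</probability></response>"
    return [(text, float(p)) for text, p in re.findall(pattern, raw)]

def sample_response(candidates: list[tuple[str, float]], seed=None) -> str:
    """Pick one candidate, weighted by its verbalized probability."""
    rng = random.Random(seed)
    texts, probs = zip(*candidates)
    return rng.choices(texts, weights=probs, k=1)[0]

candidates = parse_candidates(raw)
print(sample_response(candidates))
```

Sampling client-side (rather than letting the model pick) keeps the full candidate list available for inspection or reuse.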
 
3. The Smarter the AI, the More Explosive the Effect: Diversity Control via Probability Thresholds
The most surprising discovery of the VS technique is an emergent trend: the larger and more capable the model, the more dramatic the effect. Research shows that cutting-edge models like GPT-4 saw a diversity improvement 1.5 to 2 times greater than smaller models. This suggests VS can be the ‘key’ to fully unlocking the hidden creativity in the most powerful AI models.
A major advantage of VS is the ability to directly control the output diversity level by setting a probability threshold.
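One way to implement such a threshold, sketched here on the assumption that the model's candidates have already been parsed into (text, probability) pairs: discard every candidate above the threshold (the ‘typical’ mass) and renormalize the rest before sampling. The candidate list and threshold value are illustrative, not from the paper.

```python
# Hypothetical verbalized candidates: (text, probability) pairs.
candidates = [
    ("the most typical answer", 0.50),
    ("a common alternative", 0.25),
    ("a less usual idea", 0.15),
    ("a rare, creative idea", 0.07),
    ("a very unusual idea", 0.03),
]

def filter_by_threshold(candidates, threshold):
    """Keep only candidates whose verbalized probability is at or below
    the threshold, then renormalize so the weights sum to 1.
    Lowering the threshold removes the typical 'mode' answers, so
    sampling from what remains yields more diverse output."""
    kept = [(t, p) for t, p in candidates if p <= threshold]
    total = sum(p for _, p in kept)
    return [(t, p / total) for t, p in kept]

# threshold=0.20 drops the two most typical candidates.
diverse = filter_by_threshold(candidates, threshold=0.20)
```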
Conclusion: Explore AI’s Potential with ‘Distribution-Level Prompts’
‘Verbalized Sampling’ is a powerful, yet simple, solution that addresses mode collapse stemming not from AI limitations, but from the human ‘Typicality Bias.’ This technique is applicable to models without additional training and maximizes the creativity of high-performance models.
This discovery represents a fundamental paradigm shift in how we interact with AI. We are moving past the era of ‘commanding’ a single answer from AI, into one where we collectively ‘explore’ the vast possibilities of its knowledge.
In your next prompt, try applying this powerful Verbalized Sampling technique to unleash your AI’s hidden creativity! If you have any questions or VS tips of your own, please share them in the comments!