Self-Consistency
Self-Consistency is an advanced prompting technique designed to improve the reliability and accuracy of language model outputs, especially for complex reasoning tasks.
In short: instead of relying on a single response, generate multiple answers and pick the most frequent one.
Concretely, the model is prompted multiple times with the same input (using temperature sampling or different seeds so the runs can differ), and the most consistent answer among the outputs is selected as the final result.
This approach was introduced by Wang et al. (2022) to address the variability and occasional errors in chain-of-thought (CoT) reasoning, particularly for arithmetic and commonsense reasoning tasks.
Use When
- The task is ambiguous or has multiple possible answers.
- You want to improve reliability by aggregating outputs.
- The model's responses vary significantly between runs.
Pattern
- Prompt the model with the same question multiple times (using temperature sampling or different seeds).
- Collect all generated answers.
- Select the most frequent or consistent answer as the final output, as sketched below.
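A minimal sketch of this loop in Python, assuming a `sample_model` callable that returns one sampled completion per call (any LLM client sampling at temperature > 0 can fill that role; the stub sampler here is only a stand-in for demonstration):

```python
import random
from collections import Counter
from typing import Callable

def self_consistency(sample_model: Callable[[str], str],
                     prompt: str,
                     n_samples: int = 5) -> str:
    """Sample the model n_samples times and return the majority answer."""
    answers = [sample_model(prompt) for _ in range(n_samples)]
    # most_common(1) returns [(answer, count)] for the top-voted answer.
    answer, _count = Counter(answers).most_common(1)[0]
    return answer

# Stand-in sampler: a real implementation would call an LLM with
# temperature > 0 so that repeated calls can produce different answers.
def stub_sampler(prompt: str) -> str:
    return random.choice(["67", "67", "67", "35"])

print(self_consistency(stub_sampler, "When I was 6 my sister was half my age..."))
```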
Examples
Example 1: Arithmetic Reasoning (from Wang et al., 2022)
Q: When I was 6 my sister was half my age. Now I'm 70 how old is my sister?
Model outputs (sampled multiple times):
- When I was 6, my sister was 3. Now I am 70, so she is 70 - 3 = 67. The answer is 67.
- When I was 6, my sister was half my age, so she was 3. Now that I am 70, she is 70 - 3 = 67. The answer is 67.
- When I was 6, my sister was half my age, so she was 3. Now I am 70, so she is 70/2 = 35. The answer is 35.
Final answer (by majority):
67
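In practice each sampled output is a free-form chain of thought, so the final answer must be parsed out before voting. A small sketch of that step, assuming (as in the outputs above) that each sample ends with the phrase "The answer is N.":

```python
import re
from collections import Counter

outputs = [
    "When I was 6, my sister was 3. Now I am 70, so she is 70 - 3 = 67. The answer is 67.",
    "When I was 6, my sister was half my age, so she was 3. Now that I am 70, she is 70 - 3 = 67. The answer is 67.",
    "When I was 6, my sister was half my age, so she was 3. Now I am 70, so she is 70/2 = 35. The answer is 35.",
]

def extract_answer(text: str) -> str | None:
    # Pull the number that follows the marker phrase "The answer is".
    match = re.search(r"The answer is\s+(-?\d+(?:\.\d+)?)", text)
    return match.group(1) if match else None

votes = Counter(a for a in map(extract_answer, outputs) if a is not None)
print(votes.most_common(1)[0][0])  # -> "67" (two votes against one for "35")
```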
Example 2: Sentiment Classification (not from the paper)
Classify the sentiment of the following review: "The product exceeded my expectations."
Model outputs (sampled multiple times):
- Positive
- Positive
- Positive
- Neutral
Final answer (by majority):
Positive
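For classification, the vote operates directly on the labels, and the vote share doubles as a rough confidence signal. A sketch (the 0.75 agreement threshold is an arbitrary illustrative choice, not from the paper):

```python
from collections import Counter

labels = ["Positive", "Positive", "Positive", "Neutral"]  # sampled outputs

votes = Counter(labels)
label, count = votes.most_common(1)[0]
agreement = count / len(labels)

print(f"{label} (agreement: {agreement:.0%})")  # Positive (agreement: 75%)
if agreement < 0.75:
    # Low consensus suggests the input is genuinely ambiguous; flag it
    # rather than trusting the majority label blindly.
    print("Low consensus - consider more samples or human review.")
```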
Benefits
- Reliability: Reduces the impact of random errors or outlier responses.
- Accuracy: Aggregates multiple reasoning paths to find the most likely answer.
- Robustness: Especially useful for tasks with inherent ambiguity or multiple valid solutions.
Pitfalls
- Increases computational cost (requires multiple model runs).
- Majority voting cannot correct systematic errors: if the model is consistently wrong or biased, the most frequent answer will be too.
- Not always necessary for simple or deterministic tasks.
Self-Consistency Process
The following diagram illustrates how self-consistency works:
Input Problem: "If a train travels 60 miles in 45 minutes,
                what's its speed in mph?"
                               │
                    Multiple Sampling Paths
                               │
        ┌───────────────┬──────┴────────┬───────────────┐
        │               │               │               │
        ▼               ▼               ▼               ▼
    Sample 1        Sample 2        Sample 3        Sample 4

 "60 miles in    "45 min =       "Speed =        "45 min is
  45 min = 3/4    0.75 hr         distance/       nearly an hour;
  hour. Speed =   Speed =         time =          60 mi plus the
  60 ÷ 0.75 =     60/0.75 =       60/0.75 =       last 15 min
  80 mph"         80 mph"         80 mph"         gives ~75 mph"
        │               │               │               │
        ▼               ▼               ▼               ▼
     80 mph          80 mph          80 mph          75 mph
                               │
                               ▼
                  ┌─────────────────────────────┐
                  │        VOTE & SELECT        │
                  │                             │
                  │  80 mph: ███ ← most frequent│
                  │  75 mph: █   ← outlier      │
                  └─────────────────────────────┘
                               │
                               ▼
                     Final Answer: 80 mph
This approach reduces errors by generating multiple reasoning paths and selecting the most consistent result.
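The whole pipeline can be run against a hosted model in a few lines. A sketch using the OpenAI Python SDK (v1+), whose `n` parameter draws several independent samples in one request; the model name, temperature, and sample count are illustrative choices, and `extract_answer` is the same kind of parser sketched earlier:

```python
import re
from collections import Counter
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = (
    "If a train travels 60 miles in 45 minutes, what's its speed in mph? "
    "Think step by step, then end with 'The answer is <number>.'"
)

response = client.chat.completions.create(
    model="gpt-4o-mini",   # illustrative model choice
    messages=[{"role": "user", "content": PROMPT}],
    temperature=0.7,       # > 0 so the reasoning paths can diverge
    n=5,                   # five independent samples in one request
)

def extract_answer(text: str) -> str | None:
    match = re.search(r"The answer is\s+(-?\d+(?:\.\d+)?)", text)
    return match.group(1) if match else None

answers = (extract_answer(c.message.content or "") for c in response.choices)
votes = Counter(a for a in answers if a is not None)
print(votes.most_common(1)[0][0])  # expected: "80"
```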
References
- Wang, X., Wei, J., Schuurmans, D., Le, Q. V., Chi, E., Narang, S., ... & Zhou, D. (2022). Self-Consistency Improves Chain of Thought Reasoning in Language Models. arXiv:2203.11171.