Answer Convergence as a Signal for Early Stopping in Reasoning
Authors: Xin Liu, Lu Wang (University of Michigan)
Code & Resources: GitHub Repository | ArXiv Paper
Demo Description
This interactive demo illustrates the core concept of our Early Stopping strategy.
- Left Panel: Shows the model's full Chain-of-Thought (CoT) reasoning process.
- Right Panel: Shows the reasoning process truncated by our method.
Key Insight: Models often reach Answer Convergence (the point where their predicted answer stops changing) well before completing the full reasoning chain. The subsequent steps are mostly redundant self-verification and can be safely skipped to reduce inference cost.
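To make this concrete, below is a minimal sketch (not the paper's implementation) of convergence-based truncation applied to an already-generated CoT trace: the trace is cut off once the extracted answer has repeated for a fixed number of steps. The `extract_answer` regex, the toy trace, and the `patience` threshold are illustrative assumptions.

```python
import re

def extract_answer(step: str):
    """Toy parser: grab the text after 'answer is' (real answer extraction is task-specific)."""
    match = re.search(r"answer is\s+([^.\n]+)", step, flags=re.IGNORECASE)
    return match.group(1).strip() if match else None

def truncate_on_convergence(cot_steps, patience=2):
    """Keep reasoning steps only until the extracted answer repeats `patience` times in a row."""
    kept, answers = [], []
    for step in cot_steps:
        kept.append(step)
        ans = extract_answer(step)
        if ans is not None:
            answers.append(ans)
            # Answer convergence: the last `patience` answers agree, so the remaining
            # steps are likely redundant self-verification and can be skipped.
            if len(answers) >= patience and len(set(answers[-patience:])) == 1:
                break
    return kept

# Toy GSM8K-style trace: the final re-check adds nothing once the answer has stabilized.
full_cot = [
    "She starts with 16 eggs and eats 3, leaving 16 - 3 = 13 eggs.",
    "Baking uses 4 more, so 13 - 4 = 9 eggs remain; the answer is 9.",
    "Selling 9 eggs at $2 each gives 9 * 2 = 18, so the answer is 18.",
    "Verify: 16 - 3 - 4 = 9 and 9 * 2 = 18, so the answer is 18.",
    "One more check of the multiplication: 9 * 2 = 18, so the answer is 18.",
]
print(truncate_on_convergence(full_cot))  # keeps the first 4 steps; the last re-check is dropped
```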
In the demo, select a test case to compare the Original (Full CoT) output side by side with Our Method (Early Stopping).
Key Results (from the Paper)
Our experiments across five benchmarks (including NQ, GSM8K, GPQA) reveal substantial redundancy in standard CoT:
- NaturalQuestions (NQ): Token reduction of over 40% with improved accuracy using Learn-to-Stop.
- GSM8K: Token reduction of ~45% with minimal or no accuracy drop.
- Methods: We propose three strategies: Answer Consistency (Unsupervised), Think Token Adjustment (Unsupervised), and Learn-to-Stop (Supervised); a rough sketch of the Think Token Adjustment idea appears below.
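The sketch below illustrates one way a Think Token Adjustment–style intervention could look, assuming it amounts to biasing generation toward the end-of-reasoning token (e.g. `</think>`) so the model closes its chain of thought sooner. It uses the `LogitsProcessor` hook from Hugging Face `transformers`; the class name, bias value, and warm-up threshold are illustrative assumptions, not the paper's settings.

```python
import torch
from transformers import LogitsProcessor, LogitsProcessorList

class EndThinkBoost(LogitsProcessor):
    """Add a constant bias to the end-of-reasoning token's logit after a warm-up period,
    nudging the model to close its chain of thought sooner (illustrative values only)."""

    def __init__(self, end_think_token_id: int, bias: float = 2.0, min_new_tokens: int = 64):
        self.end_think_token_id = end_think_token_id
        self.bias = bias
        self.min_new_tokens = min_new_tokens
        self._prompt_len = None  # set on the first call

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
        if self._prompt_len is None:
            self._prompt_len = input_ids.shape[1]
        # Only intervene after a minimum number of new tokens have been generated.
        if input_ids.shape[1] - self._prompt_len >= self.min_new_tokens:
            scores[:, self.end_think_token_id] += self.bias  # favor ending the "think" phase
        return scores

# Hypothetical usage with a reasoning model whose tokenizer defines an end-of-think token:
# end_id = tokenizer.convert_tokens_to_ids("</think>")
# outputs = model.generate(**inputs, logits_processor=LogitsProcessorList([EndThinkBoost(end_id)]))
```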