🛑 Answer Convergence as a Signal for Early Stopping in Reasoning

Authors: Xin Liu, Lu Wang (University of Michigan)

Code & Resources: GitHub Repository | arXiv Paper

💡 Demo Description

This interactive demo illustrates the core concept of our Early Stopping strategy.

  • Left Panel: Shows the model's full Chain-of-Thought (CoT) reasoning process.
  • Right Panel: Shows the reasoning process truncated by our method.

Key Insight: Models often reach Answer Convergence, i.e., they arrive at the correct answer well before the full reasoning chain is complete. The remaining steps are largely redundant self-verification and can be skipped to reduce inference cost.
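The simplest form of this stopping rule can be written as an unsupervised loop: after each reasoning step, probe the model for its current best answer and stop once that answer stops changing. The sketch below is illustrative only; `generate_step` and `probe_answer` are hypothetical callables standing in for a step-wise decoding loop and an answer-extraction prompt, not the paper's implementation.

```python
from typing import Callable, List, Tuple

def early_stop_reasoning(
    generate_step: Callable[[str, List[str]], str],  # hypothetical: decode one more CoT step
    probe_answer: Callable[[str, List[str]], str],   # hypothetical: extract the current answer
    question: str,
    max_steps: int = 32,
    patience: int = 2,
) -> Tuple[str, List[str]]:
    """Stop decoding once the probed answer is unchanged for `patience` consecutive steps."""
    steps: List[str] = []
    answers: List[str] = []
    for _ in range(max_steps):
        steps.append(generate_step(question, steps))   # generate one more reasoning step
        answers.append(probe_answer(question, steps))  # answer given the reasoning so far
        # Answer convergence: the last `patience` probed answers all agree.
        if len(answers) >= patience and len(set(answers[-patience:])) == 1:
            break
    return answers[-1], steps
```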

Select a Test Case

๐Ÿข Original (Full CoT)

๐Ÿ‡ Our Method (Early Stopping)


📊 Key Results (from Paper)

Our experiments across five benchmarks (including NQ, GSM8K, GPQA) reveal substantial redundancy in standard CoT:

  • NaturalQuestions (NQ): Token reduction of over 40% with improved accuracy using Learn-to-Stop.
  • GSM8K: Token reduction of ~45% with minimal or no accuracy drop.
  • Methods: We propose three stopping strategies: Answer Consistency (unsupervised), Think Token Adjustment (unsupervised), and Learn-to-Stop (supervised). A rough sketch of a Learn-to-Stop-style probe follows this list.
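As an illustration of the supervised variant, Learn-to-Stop can be thought of as a small probe over the model's per-step hidden states that predicts whether it is already safe to stop. The sketch below shows one plausible shape of such a probe; the layer sizes, input representation, and stopping threshold are assumptions for illustration, not the paper's architecture or training recipe.

```python
import torch
import torch.nn as nn

class StopProbe(nn.Module):
    """Tiny classifier over a reasoning step's hidden state: P(safe to stop now)."""

    def __init__(self, hidden_size: int):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(hidden_size, 256),
            nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, step_hidden: torch.Tensor) -> torch.Tensor:
        # step_hidden: (batch, hidden_size) representation of the latest reasoning step
        return torch.sigmoid(self.classifier(step_hidden)).squeeze(-1)

# At inference time, decoding would halt once the probe's stop probability
# crosses a threshold (e.g. 0.5), and the model is prompted to emit its answer.
```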