AutoL2S: Auto Long-Short Reasoning for Efficient Large Language Models — Paper • 2505.22662 • Published May 28, 2025
70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float — Paper • 2504.11651 • Published Apr 15, 2025
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models — Paper • 2503.16419 • Published Mar 20, 2025