Late-to-Early Training: LET LLMs Learn Earlier, So Faster and Better Paper • 2602.05393 • Published 11 days ago • 7
Mano: Restriking Manifold Optimization for LLM Training Paper • 2601.23000 • Published 17 days ago • 2
PISA: Piecewise Sparse Attention Is Wiser for Efficient Diffusion Transformers Paper • 2602.01077 • Published 15 days ago • 3
Article Backbone-Optimizer Coupling Bias: The Hidden Co-Design Principle Dec 20, 2025 • 4
CoRe^2: Collect, Reflect and Refine to Generate Better and Faster Paper • 2503.09662 • Published Mar 12, 2025 • 33