When Does RL Help Medical VLMs? Disentangling Vision, SFT, and RL Gains Paper • 2603.01301 • Published 16 days ago • 8
KeDiff: Key Similarity-Based KV Cache Eviction for Long-Context LLM Inference in Resource-Constrained Environments Paper • 2504.15364 • Published Apr 21, 2025 • 4
LoopFormer: Elastic-Depth Looped Transformers for Latent Reasoning via Shortcut Modulation Paper • 2602.11451 • Published Feb 11 • 15
LoopFormer Collection Models trained in the ICLR 2026 paper: LoopFormer: Elastic-Depth Looped Transformers for Latent Reasoning via Shortcut Modulation • 17 items • Updated 27 days ago • 2
PC-GRPO Collection Qwen2.5-VL-3B & 7B models trained with PC-GRPO in the paper: Puzzle Curriculum GRPO for Vision-Centric Reasoning • 9 items • Updated Feb 12 • 3
TokSuite: Measuring the Impact of Tokenizer Choice on Language Model Behavior Paper • 2512.20757 • Published Dec 23, 2025 • 18
EasyV2V: A High-quality Instruction-based Video Editing Framework Paper • 2512.16920 • Published Dec 18, 2025 • 18