SkinFlow: Efficient Information Transmission for Open Dermatological Diagnosis via Dynamic Visual Encoding and Staged RL Paper • 2601.09136 • Published about 1 month ago • 39
NarraScore: Bridging Visual Narrative and Musical Dynamics via Hierarchical Affective Control Paper • 2602.09070 • Published 4 days ago • 11
Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters Paper • 2602.10604 • Published 2 days ago • 168
Vision-DeepResearch Benchmark: Rethinking Visual and Textual Search for Multimodal Large Language Models Paper • 2602.02185 • Published 11 days ago • 125
F-GRPO: Don't Let Your Policy Learn the Obvious and Forget the Rare Paper • 2602.06717 • Published 7 days ago • 68
OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration Paper • 2602.05400 • Published 8 days ago • 301
Baichuan-M3: Modeling Clinical Inquiry for Reliable Medical Decision-Making Paper • 2602.06570 • Published 7 days ago • 59
Everything in Its Place: Benchmarking Spatial Intelligence of Text-to-Image Models Paper • 2601.20354 • Published 16 days ago • 110
Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text Paper • 2601.22975 • Published 14 days ago • 98
PaperBanana: Automating Academic Illustration for AI Scientists Paper • 2601.23265 • Published 14 days ago • 178
Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models Paper • 2601.19834 • Published 17 days ago • 25
The Script is All You Need: An Agentic Framework for Long-Horizon Dialogue-to-Cinematic Video Generation Paper • 2601.17737 • Published 19 days ago • 55