Fine-T2I: An Open, Large-Scale, and Diverse Dataset for High-Quality T2I Fine-Tuning Paper • 2602.09439 • Published 1 day ago • 10
VIDEOP2R: Video Understanding from Perception to Reasoning Paper • 2511.11113 • Published Nov 14, 2025 • 111
Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations Paper • 2510.23607 • Published Oct 27, 2025 • 179
Token Reduction Should Go Beyond Efficiency in Generative Models -- From Vision, Language to Multimodality Paper • 2505.18227 • Published May 23, 2025 • 15
DeepCritic: Deliberate Critique with Large Language Models Paper • 2505.00662 • Published May 1, 2025 • 54