Video Streaming Thinking: VideoLLMs Can Watch and Think Simultaneously Paper • 2603.12262 • Published 22 days ago • 30
Flash-KMeans: Fast and Memory-Efficient Exact K-Means Paper • 2603.09229 • Published 24 days ago • 82
TimeChat-Captioner: Scripting Multi-Scene Videos with Time-Aware and Structural Audio-Visual Captions Paper • 2602.08711 • Published Feb 9 • 28
HySparse: A Hybrid Sparse Attention Architecture with Oracle Token Selection and KV Cache Sharing Paper • 2602.03560 • Published Feb 3 • 49
OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models Paper • 2602.04804 • Published Feb 4 • 49
Elastic Attention: Test-time Adaptive Sparsity Ratios for Efficient Transformers Paper • 2601.17367 • Published Jan 24 • 34
HERMES: KV Cache as Hierarchical Memory for Efficient Streaming Video Understanding Paper • 2601.14724 • Published Jan 21 • 75
Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning Paper • 2601.06943 • Published Jan 11 • 215
VideoAuto-R1: Video Auto Reasoning via Thinking Once, Answering Twice Paper • 2601.05175 • Published Jan 8 • 36
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization Paper • 2601.05242 • Published Jan 8 • 229
Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning Paper • 2512.20848 • Published Dec 23, 2025 • 42
QwenLong-L1.5: Post-Training Recipe for Long-Context Reasoning and Memory Management Paper • 2512.12967 • Published Dec 15, 2025 • 111
ReVSeg: Incentivizing the Reasoning Chain for Video Segmentation with Reinforcement Learning Paper • 2512.02835 • Published Dec 2, 2025 • 10