Towards Pixel-Level VLM Perception via Simple Points Prediction • Paper • 2601.19228 • Published 2 days ago • 5
Selective Steering: Norm-Preserving Control Through Discriminative Layer Selection • Paper • 2601.19375 • Published 1 day ago • 5
Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models • Paper • 2601.19834 • Published 1 day ago • 19
AVMeme Exam: A Multimodal Multilingual Multicultural Benchmark for LLMs' Contextual and Cultural Knowledge and Thinking • Paper • 2601.17645 • Published 4 days ago • 19
AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security • Paper • 2601.18491 • Published 2 days ago • 60
One Adapts to Any: Meta Reward Modeling for Personalized LLM Alignment • Paper • 2601.18731 • Published 2 days ago • 6
IVRA: Improving Visual-Token Relations for Robot Action Policy with Training-Free Hint-Based Guidance • Paper • 2601.16207 • Published 6 days ago • 7
Least-Loaded Expert Parallelism: Load Balancing An Imbalanced Mixture-of-Experts • Paper • 2601.17111 • Published 5 days ago • 5
DRPG (Decompose, Retrieve, Plan, Generate): An Agentic Framework for Academic Rebuttal • Paper • 2601.18081 • Published 3 days ago • 7
AR-Omni: A Unified Autoregressive Model for Any-to-Any Generation • Paper • 2601.17761 • Published 4 days ago • 9
CGPT: Cluster-Guided Partial Tables with LLM-Generated Supervision for Table Retrieval • Paper • 2601.15849 • Published 7 days ago • 12
iFSQ: Improving FSQ for Image Generation with 1 Line of Code • Paper • 2601.17124 • Published 5 days ago • 29
Elastic Attention: Test-time Adaptive Sparsity Ratios for Efficient Transformers • Paper • 2601.17367 • Published 5 days ago • 29
The Script is All You Need: An Agentic Framework for Long-Horizon Dialogue-to-Cinematic Video Generation • Paper • 2601.17737 • Published 4 days ago • 48
daVinci-Dev: Agent-native Mid-training for Software Engineering • Paper • 2601.18418 • Published 3 days ago • 114