ArcMemo: Abstract Reasoning Composition with Lifelong LLM Memory Paper • 2509.04439 • Published Sep 4, 2025
KVCOMM: Online Cross-context KV-cache Communication for Efficient LLM-based Multi-agent Systems Paper • 2510.12872 • Published Oct 14, 2025
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM Paper • 2510.15870 • Published Oct 17, 2025
Alpamayo-R1: Bridging Reasoning and Action Prediction for Generalizable Autonomous Driving in the Long Tail Paper • 2511.00088 • Published Oct 30, 2025
SparseVILA: Decoupling Visual Sparsity for Efficient VLM Inference Paper • 2510.17777 • Published Oct 20, 2025
VLASH: Real-Time VLAs via Future-State-Aware Asynchronous Inference Paper • 2512.01031 • Published Nov 30, 2025
ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference Paper • 2511.10645 • Published Nov 13, 2025
SparseLoRA: Accelerating LLM Fine-Tuning with Contextual Sparsity Paper • 2506.16500 • Published Jun 19, 2025
SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision Transformer Paper • 2303.17605 • Published Mar 30, 2023
HAT: Hardware-Aware Transformers for Efficient Natural Language Processing Paper • 2005.14187 • Published May 28, 2020
MapPrior: Bird's-Eye View Map Layout Estimation with Generative Models Paper • 2308.12963 • Published Aug 24, 2023
FlatFormer: Flattened Window Attention for Efficient Point Cloud Transformer Paper • 2301.08739 • Published Jan 20, 2023
LongVILA: Scaling Long-Context Visual Language Models for Long Videos Paper • 2408.10188 • Published Aug 19, 2024
AMC: AutoML for Model Compression and Acceleration on Mobile Devices Paper • 1802.03494 • Published Feb 10, 2018