Coreset Sampling from Open-Set for Fine-Grained Self-Supervised Learning • Paper • 2303.11101 • Published Mar 20, 2023
Fast and Robust Early-Exiting Framework for Autoregressive Language Models with Synchronized Parallel Decoding • Paper • 2310.05424 • Published Oct 9, 2023
Relaxed Recursive Transformers: Effective Parameter Sharing with Layer-wise LoRA • Paper • 2410.20672 • Published Oct 28, 2024
Why In-Context Learning Transformers are Tabular Data Classifiers • Paper • 2405.13396 • Published May 22, 2024
Automated Filtering of Human Feedback Data for Aligning Text-to-Image Diffusion Models • Paper • 2410.10166 • Published Oct 14, 2024
Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation • Paper • 2507.10524 • Published Jul 14, 2025
Hybrid Architectures for Language Models: Systematic Analysis and Design Insights • Paper • 2510.04800 • Published Oct 6, 2025
Block Transformer: Global-to-Local Language Modeling for Fast Inference • Paper • 2406.02657 • Published Jun 4, 2024