StereoWorld: Geometry-Aware Monocular-to-Stereo Video Generation Paper • 2512.09363 • Published 23 days ago • 71
Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length Paper • 2512.04677 • Published 29 days ago • 167
Discrete Diffusion VLA: Bringing Discrete Diffusion to Action Decoding in Vision-Language-Action Policies Paper • 2508.20072 • Published Aug 27, 2025 • 31
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models Paper • 2508.06471 • Published Aug 8, 2025 • 195
HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels Paper • 2507.21809 • Published Jul 29, 2025 • 136
Running on CPU Upgrade Featured 1.18k Open ASR Leaderboard 🏆 1.18k View and request speech models benchmark data
CSD-VAR: Content-Style Decomposition in Visual Autoregressive Models Paper • 2507.13984 • Published Jul 18, 2025 • 25
AnimaX: Animating the Inanimate in 3D with Joint Video-Pose Diffusion Models Paper • 2506.19851 • Published Jun 24, 2025 • 60
Discrete Diffusion in Large Language and Multimodal Models: A Survey Paper • 2506.13759 • Published Jun 16, 2025 • 43
Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models Paper • 2506.05176 • Published Jun 5, 2025 • 77
CPGD: Toward Stable Rule-based Reinforcement Learning for Language Models Paper • 2505.12504 • Published May 18, 2025 • 24