GroupRank: A Groupwise Reranking Paradigm Driven by Reinforcement Learning Paper • 2511.11653 • Published Nov 10, 2025 • 57
microsoft/Phi-4-multimodal-instruct Automatic Speech Recognition • Updated Dec 10, 2025 • 343k • 1.57k
SimpleRL-Zoo Collection The collection for the Paper "SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild" • 13 items • Updated May 5, 2025 • 8
nishadsinghi/math7500_train_solutions_DeepSeek-R1-Distill-Qwen-7B_32K_tokens Updated Feb 13, 2025 • 7.45k • 8 • 2