Collection of models and datasets for Beyond Binary Rewards: Training LMs to Reason about their Uncertainty
Mehul Damani
mehuldamani
AI & ML interests
Reinforcement Learning, Large Language Models
Recent Activity
Published a model about 9 hours ago: mehuldamani/fromRLVR_qwen3_8b_medical_rlcr_multiple
Published a model about 17 hours ago: mehuldamani/format_train_rlvr_qwen3_8b_medical_rlcr_multiple
Published a model about 21 hours ago: mehuldamani/qwen3_8b_medical_rlcr_multiple_zeroBrierWeight
Organizations
None yet