DCAgent2/swebench_verified_random_100_folders_rl_r2egym_nl2bash_stack_bugsseq_fixthink_a7e78a3c7 Updated about 1 hour ago
DCAgent2/terminal_bench_2_exp_psu_stackoverflow_10K_glm_4_7_traces_20260311_170344 Updated about 2 hours ago
DCAgent2/swebench_verified_random_100_folders_Kimi_K2T_neulab_agenttuning_webshop_sandbod07c3d59 Updated about 3 hours ago
DCAgent2/swebench_verified_random_100_folders_Kimi_K2T_neulab_agenttuning_kg_sandboxes_me5f27cd1 Updated about 3 hours ago
DCAgent2/terminal_bench_2_exp_psu_stackoverflow_316_glm_4_7_traces_20260311_170339 Updated about 3 hours ago
DCAgent2/medagentbench_laion_r2egym-nl2bash-stack-bugsseq Updated about 5 hours ago • 899 • 9
DCAgent2/terminal_bench_2_exp_tas_timeout_multiplier_1_0_traces_20260311_010108 Updated about 9 hours ago