Thinking-Free Policy Initialization Makes Distilled Reasoning Models More Effective and Efficient Reasoners
xuxin
xx18
AI & ML interests: None yet
Recent Activity
Authored a paper about 24 hours ago: EntroPIC: Towards Stable Long-Term Training of LLMs via Entropy Stabilization with Proportional-Integral Control
Upvoted a paper 29 days ago: ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration