Arctic-AWM-8B

Agent World Model: Infinity Synthetic Environments for Agentic Reinforcement Learning

Zhaoyang Wang¹, Canwen Xu², Boyi Liu², Yite Wang², Siwei Han¹,
Zhewei Yao², Huaxiu Yao¹, Yuxiong He²

¹UNC-Chapel Hill ²Snowflake AI Research

Overview

Arctic-AWM-8B is a multi-turn tool-use agent model trained with agentic reinforcement learning on Qwen3-8B, using the fully synthetic environments from AgentWorldModel-1K.

The model is trained to interact with tool-use environments exposed via a unified MCP (Model Context Protocol) interface, enabling strong multi-turn agentic capabilities.

For detailed usage of the model, please visit https://github.com/Snowflake-Labs/agent-world-model.

Resources

Related resources are also available, please check:

Resource	Link
📄 Paper	📄 arxiv.org/abs/2602.10090
💻 Code	💻 Snowflake-Labs/agent-world-model
📦 AgentWorldModel-1K	🤗 Snowflake/AgentWorldModel-1K
🤖 Arctic-AWM-4B	🤗 Snowflake/Arctic-AWM-4B
🤖 Arctic-AWM-8B	🤗 Snowflake/Arctic-AWM-8B
🤖 Arctic-AWM-14B	🤗 Snowflake/Arctic-AWM-14B

Citation

If you find this resource useful, please kindly cite:

@article{wang2026agentworldmodelinfinity,
      title={Agent World Model: Infinity Synthetic Environments for Agentic Reinforcement Learning}, 
      author={Zhaoyang Wang and Canwen Xu and Boyi Liu and Yite Wang and Siwei Han and Zhewei Yao and Huaxiu Yao and Yuxiong He},
      year={2026},
      eprint={2602.10090},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2602.10090}, 
}