- deepseek-ai/DeepSeek-R1
  Text Generation • 685B • 486k • 13k
- Congliu/Chinese-DeepSeek-R1-Distill-data-110k
  110k • 462 • 725
- The Ultra-Scale Playbook
  🌌 3.68k • The ultimate guide to training LLMs on large GPU clusters
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • 627
Sunny Ratnani
SunnyRatnaniMD
AI & ML interests
None yet