Arcee Nova – FP8
FP8-quantized version of Arcee Nova, a 72B-parameter Qwen2-based model from Arcee AI. FP8 quantization cuts the model's memory footprint roughly in half compared with FP16 while largely preserving output quality, enabling inference on fewer GPUs.
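The ~50% figure follows directly from the per-weight storage cost: FP16 uses 2 bytes per parameter, FP8 uses 1. A back-of-the-envelope sketch (illustrative arithmetic only; a real checkpoint also stores embeddings, metadata, and some tensors kept in higher precision, which is why the actual size is ~75 GB rather than exactly 72 GB):

```python
# Rough size comparison for a 72B-parameter model.
# Real checkpoints are slightly larger than this estimate.
params = 72e9
fp16_gb = params * 2 / 1e9  # FP16: 2 bytes per weight
fp8_gb = params * 1 / 1e9   # FP8: 1 byte per weight
print(f"FP16: ~{fp16_gb:.0f} GB, FP8: ~{fp8_gb:.0f} GB")
```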
Video walkthrough: Unlock the Future of Creative Writing with Arcee Nova!
Model Details
| Detail | Value |
|---|---|
| Base model | arcee-ai/Arcee-Nova |
| Architecture | Qwen2 (72B parameters) |
| Quantization | FP8 (8-bit floating point) |
| Model size | ~75 GB (16 shards) |
| Format | Safetensors |
Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# device_map="auto" spreads the weights across available GPUs;
# torch_dtype="auto" uses the dtypes stored in the checkpoint.
model = AutoModelForCausalLM.from_pretrained(
    "juliensimon/Arcee-Nova-fp8", torch_dtype="auto", device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("juliensimon/Arcee-Nova-fp8")

inputs = tokenizer("Write a short story about", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```