Arcee Nova – FP8

FP8-quantized version of Arcee Nova, a 72B-parameter Qwen2-based model from Arcee AI. FP8 quantization roughly halves the memory footprint relative to BF16 while largely preserving output quality, enabling inference on fewer GPUs.
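As a rough sanity check on the ~50% figure: FP8 stores one byte per parameter versus two bytes for BF16, so 72B parameters drop from about 144 GB of weights to about 72 GB (the checkpoint here is ~75 GB because some tensors stay in BF16). A quick back-of-the-envelope calculation:

```python
params = 72e9  # 72B parameters

bf16_gb = params * 2 / 1e9  # BF16: 2 bytes per parameter
fp8_gb = params * 1 / 1e9   # FP8: 1 byte per parameter

print(f"BF16 weights: ~{bf16_gb:.0f} GB")   # ~144 GB
print(f"FP8 weights:  ~{fp8_gb:.0f} GB")    # ~72 GB
print(f"Reduction: {1 - fp8_gb / bf16_gb:.0%}")  # 50%
```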

Video walkthrough: Unlock the Future of Creative Writing with Arcee Nova!

Model Details

| Detail | Value |
|---|---|
| Base model | arcee-ai/Arcee-Nova |
| Architecture | Qwen2 (72B parameters) |
| Quantization | FP8 (8-bit floating point) |
| Tensor types | BF16, F8_E4M3 |
| Model size | ~75 GB (16 shards) |
| Format | Safetensors |

Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the FP8 checkpoint; device_map="auto" spreads layers across available GPUs
model = AutoModelForCausalLM.from_pretrained(
    "juliensimon/Arcee-Nova-fp8", torch_dtype="auto", device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("juliensimon/Arcee-Nova-fp8")

# Generate a short completion
inputs = tokenizer("Write a haiku about the sea.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
