Arcee Nova – FP8
FP8-quantized version of Arcee Nova, a 72B-parameter Qwen2-based model from Arcee AI. FP8 quantization cuts the model's memory footprint roughly in half compared with FP16 while largely preserving output quality, enabling inference on fewer GPUs.
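The ~50% figure follows directly from the per-weight storage cost: FP16 uses 2 bytes per parameter, FP8 uses 1. A back-of-the-envelope sketch (illustrative arithmetic only; a real checkpoint also stores embeddings, metadata, and some tensors kept in higher precision, which is why the actual size is ~75 GB rather than exactly 72 GB):

```python
# Rough size comparison for a 72B-parameter model.
# Real checkpoints are slightly larger than this estimate.
params = 72e9
fp16_gb = params * 2 / 1e9  # FP16: 2 bytes per weight
fp8_gb = params * 1 / 1e9   # FP8: 1 byte per weight
print(f"FP16: ~{fp16_gb:.0f} GB, FP8: ~{fp8_gb:.0f} GB")
```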
Video walkthrough: Unlock the Future of Creative Writing with Arcee Nova!
Model Details
| Detail | Value |
|---|---|
| Base model | arcee-ai/Arcee-Nova |
| Architecture | Qwen2 (72B parameters) |
| Quantization | FP8 (8-bit floating point) |
| Model size | ~75 GB (16 shards) |
| Format | Safetensors |
Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# device_map="auto" spreads the weights across available GPUs;
# torch_dtype="auto" uses the dtypes stored in the checkpoint.
model = AutoModelForCausalLM.from_pretrained(
    "juliensimon/Arcee-Nova-fp8", torch_dtype="auto", device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("juliensimon/Arcee-Nova-fp8")

inputs = tokenizer("Write a short story about", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```