---
license: mit
tags:
  - bitnet
  - ternary
  - trillim
  - cpu-inference
base_model: microsoft/bitnet-b1.58-2B-4T-bf16
---

# BitNet-TRNQ

Ternary-quantized version of `microsoft/bitnet-b1.58-2B-4T-bf16`, packaged for the Trillim DarkNet inference engine.

This model runs entirely on CPU — no GPU required.

## Model Details

| | |
|---|---|
| Architecture | BitNet (`BitNetForCausalLM`) |
| Parameters | ~2B |
| Hidden size | 2560 |
| Layers | 30 |
| Attention heads | 20 (5 KV heads) |
| Context length | 4096 |
| Quantization | Ternary ({-1, 0, 1}); see the sketch below |
| Source model | `microsoft/bitnet-b1.58-2B-4T-bf16` |
| License | MIT |
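
The quantization row above refers to weights restricted to {-1, 0, 1}. As a rough illustration, here is a minimal NumPy sketch of an absmean-style ternary quantizer in the spirit of the BitNet b1.58 recipe; the per-tensor scale handling shown here is an assumption, and the actual quantizer and on-disk packing used by `qmodel.tensors` are Trillim internals that may differ.

```python
import numpy as np

def ternary_quantize(w: np.ndarray, eps: float = 1e-6):
    """Quantize a weight matrix to {-1, 0, 1} plus one per-tensor scale.

    Absmean-style recipe: divide by the mean absolute value,
    then round and clip every entry into [-1, 1].
    """
    scale = float(np.abs(w).mean()) + eps     # per-tensor absmean scale
    q = np.clip(np.round(w / scale), -1, 1)   # ternary codes in {-1, 0, 1}
    return q.astype(np.int8), scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    # Approximate reconstruction of the original weights.
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, s = ternary_quantize(w)
print(q)                                      # entries are -1, 0, or 1
print(np.abs(w - dequantize(q, s)).mean())    # mean absolute error
```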

## Usage

```bash
pip install trillim
trillim pull Trillim/BitNet-TRNQ
trillim serve Trillim/BitNet-TRNQ
```

This starts an OpenAI-compatible API server at http://127.0.0.1:8000.
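
Once the server is running, you can query it with any OpenAI-compatible client. The snippet below uses plain `requests` against the standard `/v1/chat/completions` route; the model name in the payload and the absence of an API key are assumptions about Trillim's defaults.

```python
import requests

payload = {
    # Model identifier is assumed to match the repo ID served by Trillim.
    "model": "Trillim/BitNet-TRNQ",
    "messages": [
        {"role": "user", "content": "Summarize ternary quantization in one sentence."}
    ],
    "max_tokens": 128,
}

# Standard OpenAI-style chat completions endpoint on the local server.
resp = requests.post("http://127.0.0.1:8000/v1/chat/completions", json=payload, timeout=120)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```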

For interactive CLI chat:

```bash
trillim chat Trillim/BitNet-TRNQ
```

## What's in this repo

| File | Description |
|---|---|
| `qmodel.tensors` | Ternary-quantized weights in Trillim format |
| `rope.cache` | Precomputed RoPE embeddings |
| `config.json` | Model configuration (see the example below) |
| `tokenizer.json` | Tokenizer |
| `tokenizer_config.json` | Tokenizer configuration |
| `trillim_config.json` | Trillim metadata |
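
Individual files can also be fetched straight from the Hugging Face Hub without pulling the whole model. The example below downloads `config.json` and prints a few fields; the field names are the standard Hugging Face ones and are assumed to match what this repo's configuration uses.

```python
import json
from huggingface_hub import hf_hub_download

# Fetch only the model configuration from this repo.
path = hf_hub_download(repo_id="Trillim/BitNet-TRNQ", filename="config.json")
with open(path) as f:
    cfg = json.load(f)

# Assumed standard Hugging Face config field names.
for key in ("hidden_size", "num_hidden_layers", "num_attention_heads",
            "num_key_value_heads", "max_position_embeddings"):
    print(key, cfg.get(key))
```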

## License

This model is released under the MIT License, following the license of the source model.