GLiClass models converted to ONNX format, as well as 8-bit quantization
Carlo Moro (cnmoro)
AI & ML interests: None yet
Recent Activity
Reacted to robtacconelli's post about 4 hours ago:
Nacrith: a 135M model that out-compresses everything on natural language
What if a tiny LM could compress English text better than _every_ compressor out there, classical or neural, small or large?
Nacrith pairs SmolLM2-135M with an ensemble of online predictors and high-precision arithmetic coding.
What's inside
The standard LLM + arithmetic coding approach wastes ~75% of CDF precision on large vocabularies; our CDF-24 fix alone recovers 0.5 bpb. On top of that: a token N-gram that skips the GPU on predictable tokens, an adaptive bias head, a llama.cpp backend (7× faster than PyTorch), multi-GPU parallel compression, and a binary file format (NC06), the first LLM-based binary compressor we know of.
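The precision argument is easy to see in code. Here is a minimal sketch of the widening idea; the function name and rounding details are my own illustration, and the repo's actual CDF-24 implementation may differ:

```python
import numpy as np

CDF_BITS = 24            # widen the usual 16-bit table ("CDF-24")
TOTAL = 1 << CDF_BITS    # 16,777,216 integer probability slots

def quantize_cdf(probs: np.ndarray) -> np.ndarray:
    """Quantize a float next-token distribution into an integer CDF.

    Every token needs a count of at least 1 to stay decodable. With a
    ~49k vocabulary (SmolLM2's is 49,152) that floor alone consumes
    ~75% of a 16-bit table's 65,536 slots, but under 0.3% of a 24-bit
    one, so almost all precision goes to the model's real probabilities.
    """
    counts = np.maximum((probs * TOTAL).astype(np.int64), 1)
    counts[np.argmax(counts)] += TOTAL - counts.sum()  # absorb rounding drift
    cdf = np.zeros(len(counts) + 1, dtype=np.int64)
    np.cumsum(counts, out=cdf[1:])
    return cdf  # token t owns the interval [cdf[t], cdf[t+1]) of [0, TOTAL)
```

The arithmetic coder then narrows its range to each token's interval; the same quantization has to run identically on the decode side to stay in sync.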
Runs on a GTX 1050 Ti. ~500 MB weights, ~1.2 GB VRAM per worker.
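The GPU-skip mentioned above can be sketched as a cheap online predictor that only defers to the LLM when unsure. The order, confidence threshold, and smoothing below are assumptions of mine, not Nacrith's actual settings:

```python
from collections import Counter, defaultdict

class NgramPredictor:
    """Online order-2 token n-gram used as a cheap stand-in for the LLM."""

    def __init__(self, vocab_size: int, confidence: float = 0.95):
        self.vocab_size = vocab_size
        self.confidence = confidence
        self.counts = defaultdict(Counter)  # (t-2, t-1) -> next-token counts

    def distribution(self, context):
        """Return a smoothed distribution if the context is predictable,
        else None, signalling that the LLM (the GPU call) is needed."""
        ctx = self.counts[tuple(context[-2:])]
        total = sum(ctx.values())
        if total == 0 or ctx.most_common(1)[0][1] / total < self.confidence:
            return None
        # Laplace smoothing keeps every token decodable.
        return [(ctx[t] + 1) / (total + self.vocab_size)
                for t in range(self.vocab_size)]

    def update(self, context, token):
        self.counts[tuple(context[-2:])][token] += 1
```

Per token the encoder would pick `ngram.distribution(ctx) or llm_probs(ctx)`, so predictable stretches never touch the GPU; the decoder runs the identical rule, which keeps both sides synchronized.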
Code: https://github.com/robtacconelli/Nacrith-GPU
Space: https://huggingface.co/spaces/robtacconelli/Nacrith-GPU
Paper: https://huggingface.co/papers/2602.19626
Try it, break it, share your results. All feedback is welcome, and a star on the repo is appreciated!
Results across all systems we tested:
- alice29.txt: 0.918 bpb (−44% vs CMIX, −20% vs ts_zip), below the 2nd-order Shannon entropy bound
- enwik8 (100 MB): 0.9389 bpb (−8% vs FineZip/LLMZip's 8B model, −15% vs ts_zip)
- Unseen text: 0.723 bpb on a doc published after the training cutoff (no memorization), 26% better than FineZip/LLMZip on the same model
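For readers new to the metric: bpb is compressed bits divided by original bytes. A quick back-of-the-envelope check of the enwik8 figure above (the compressed size here is derived from the reported bpb, not measured from the repo):

```python
original_bytes = 100_000_000                  # enwik8 is exactly 100 MB
bpb = 0.9389                                  # figure reported above
compressed_bytes = bpb * original_bytes / 8   # ~11.74 MB on disk
print(f"{8 * compressed_bytes / original_bytes:.4f} bpb")  # 0.9389
```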
SmolLM2-135M by https://huggingface.co/HuggingFaceTB
Liked a Space about 4 hours ago: robtacconelli/Nacrith-GPU
Liked a model 1 day ago: badaramoni/wave-field-v4-825m