Nacrith GPU

Neural Arithmetic Compression -- Advanced Lossless Compression

Website | GitHub | SmolLM2-135M + Arithmetic Coding | Supports text & binary files

Information is Already There

Compress text using neural arithmetic coding (NC05 format).


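How it works: A 135M-parameter language model predicts a probability distribution over the next token at each step. Those predictions feed an arithmetic coder, where a token predicted with probability p costs about -log2(p) bits, so high-confidence predictions cost nearly zero bits. The decompressor runs the same model and reproduces the same predictions, which guarantees lossless reconstruction.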
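The idea above can be illustrated with a toy arithmetic coder. This is a minimal sketch, not Nacrith's implementation: it assumes a hypothetical `AdaptiveModel` (simple symbol counts) in place of the 135M-parameter language model, works on characters rather than tokens, and uses exact fractions instead of a fixed-precision coder. It shows that well-predicted input narrows the interval very little (few bits), and that a decoder running the same model recovers the input exactly.

```python
from fractions import Fraction
import math


class AdaptiveModel:
    """Toy predictor (hypothetical stand-in for the language model)."""

    def __init__(self, alphabet):
        self.alphabet = sorted(set(alphabet))
        self.counts = {s: 1 for s in self.alphabet}

    def intervals(self):
        """Map each symbol to its slice [low, high) of the unit interval."""
        total = sum(self.counts.values())
        low = Fraction(0)
        slices = {}
        for s in self.alphabet:
            high = low + Fraction(self.counts[s], total)
            slices[s] = (low, high)
            low = high
        return slices

    def update(self, symbol):
        self.counts[symbol] += 1


def encode(text, alphabet):
    """Narrow [low, low + width) once per symbol, using the model's prediction."""
    model = AdaptiveModel(alphabet)
    low, width = Fraction(0), Fraction(1)
    for ch in text:
        s_low, s_high = model.intervals()[ch]
        low, width = low + width * s_low, width * (s_high - s_low)
        model.update(ch)
    return low, width


def decode(value, length, alphabet):
    """Replay the same model and pick the slice that contains the encoded value."""
    model = AdaptiveModel(alphabet)
    low, width = Fraction(0), Fraction(1)
    out = []
    for _ in range(length):
        target = (value - low) / width
        for s, (s_low, s_high) in model.intervals().items():
            if s_low <= target < s_high:
                out.append(s)
                low, width = low + width * s_low, width * (s_high - s_low)
                model.update(s)
                break
    return "".join(out)


def code_length_bits(width):
    """Ideal cost in bits: -log2 of the final interval width."""
    return math.log2(width.denominator) - math.log2(width.numerator)


predictable = "a" * 20
mixed = "abbabaabbbabaababbab"
p_low, p_width = encode(predictable, "ab")
m_low, m_width = encode(mixed, "ab")

print(f"predictable: {code_length_bits(p_width):.2f} bits")
print(f"mixed:       {code_length_bits(m_width):.2f} bits")
print(decode(p_low, len(predictable), "ab") == predictable)
print(decode(m_low, len(mixed), "ab") == mixed)
```

Both strings are 20 characters long, but the repetitive one ends with a much wider interval and so needs far fewer bits. A stronger predictor, such as a language model, plays the same role as `AdaptiveModel` here: the better its predictions, the shorter the output.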

Text (NC05): Text is tokenized and neural-compressed directly. On English text, the compressed output is about 15% of the original size, roughly 2.5x smaller than gzip output.

Binary (NC06): Files are segmented into text-like and binary regions. Text regions are neural-compressed; binary regions are compressed with gzip or lzma. This hybrid approach produces smaller output than gzip alone on files with significant text content.
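The segmentation step can be sketched as follows. This is an illustrative assumption, not the NC06 algorithm: the chunk size, the printable-byte threshold, and the helper names (`is_text_like`, `segment`, `compress_binary`) are all hypothetical, and choosing whichever of gzip or lzma gives the smaller output is only one plausible reading of "gzip or lzma". Text regions are left untouched here; in the real tool they would go to the neural coder.

```python
import gzip
import lzma

CHUNK = 64  # hypothetical chunk size, chosen only for this example


def is_text_like(chunk, threshold=0.9):
    """Treat a chunk as text when most bytes are printable ASCII or whitespace."""
    printable = sum(1 for b in chunk if 32 <= b < 127 or b in (9, 10, 13))
    return printable >= threshold * len(chunk)


def segment(data, chunk_size=CHUNK):
    """Split data into (kind, bytes) regions, merging adjacent chunks of one kind."""
    regions = []
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        kind = "text" if is_text_like(chunk) else "binary"
        if regions and regions[-1][0] == kind:
            regions[-1] = (kind, regions[-1][1] + chunk)
        else:
            regions.append((kind, chunk))
    return regions


def compress_binary(region):
    """Try both general-purpose codecs and keep the smaller result."""
    candidates = {"gzip": gzip.compress(region), "lzma": lzma.compress(region)}
    name = min(candidates, key=lambda k: len(candidates[k]))
    return name, candidates[name]


def decompress_binary(name, payload):
    return gzip.decompress(payload) if name == "gzip" else lzma.decompress(payload)


data = (
    b"Hello, this is plain English text. " * 8
    + bytes([0x00, 0xFF, 0x80, 0x07]) * 64
    + b"More readable text follows here. " * 8
)
regions = segment(data)
for kind, region in regions:
    if kind == "binary":
        codec, payload = compress_binary(region)
        print(kind, len(region), "->", codec, len(payload))
    else:
        print(kind, len(region), "-> (would be neural-compressed)")
```

Because regions are decided per chunk, a chunk that straddles a boundary is classified by its majority content, so region edges are only approximate. Concatenating the regions in order always reproduces the original bytes.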

Apache 2.0 | Made by Roberto Tacconelli | arxiv.org/abs/2602.19626 | tacconelli.rob@gmail.com | roberto@elizetaplus.com