Alessandro Ercolani (giux78) · PRO
67 followers · 37 following
https://alessandroercolani.webflow.io/
AI & ML interests
NLP, Reinforcement Learning, Semantics, Computational Neuroscience
Recent Activity
liked a dataset 3 days ago: togethercomputer/CoderForge-Preview
reacted to their post with 🔥 5 days ago:
Together with @mferraretto and @efederici we released #Nesso-4B, a new model specialized for agentic workflows: https://huggingface.co/mii-llm/nesso-4B

#Nesso-4B is a fine-tuned version of Qwen-4B, trained on a highly curated and balanced dataset designed specifically for multilingual agentic workflows and conversational use cases. As shown in the video below, we simulate the new "cowork" experience from #Anthropic, with no data sharing and everything running on a consumer device. The model can be used to build agentic behavior in #privateAI environments. Not every problem requires superintelligence: in many cases, intelligence at the edge is more than enough.

#Nesso4B #AgenticAI #PrivateAI #EdgeAI #OnDeviceAI
reacted to robtacconelli's post with 🚀 5 days ago:
🏆 Nacrith: a 135M model that out-compresses everything on natural language

What if a tiny LM could compress English text better than _every_ compressor out there, classical or neural, small or large? Nacrith pairs SmolLM2-135M with an ensemble of online predictors and high-precision arithmetic coding.

What's inside
The standard LLM + arithmetic coding approach wastes ~75% of CDF precision on large vocabularies. Our CDF-24 fix alone recovers 0.5 bpb. On top of that: a token N-gram that skips the GPU on predictable tokens, an adaptive bias head, a llama.cpp backend (7× faster than PyTorch), multi-GPU parallel compression, and a binary file format (NC06), the first LLM-based binary compressor we know of. Runs on a GTX 1050 Ti: ~500 MB weights, ~1.2 GB VRAM per worker.

💻 Code: https://github.com/robtacconelli/Nacrith-GPU
⭐ Space: https://huggingface.co/spaces/robtacconelli/Nacrith-GPU
📄 Paper: https://huggingface.co/papers/2602.19626

Try it, break it, share your results: all feedback welcome. A ⭐ on the repo is appreciated!

Results across all systems we tested:
- alice29.txt → 0.918 bpb (−44% vs CMIX, −20% vs ts_zip), below the 2nd-order Shannon entropy bound
- enwik8 (100 MB) → 0.9389 bpb (−8% vs FineZip/LLMZip's 8B model, −15% vs ts_zip)
- Unseen text → 0.723 bpb on a document published after the training cutoff, with no memorization, 26% better than FineZip/LLMZip on the same model

SmolLM2-135M by https://huggingface.co/HuggingFaceTB
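The CDF-precision point in the post can be illustrated numerically. An arithmetic coder works from an integer-quantized CDF: each symbol's probability is rounded to a whole number of bins out of 2^bits, and every symbol must get at least one bin. On a 50k-token vocabulary, a 16-bit CDF forces every rare token up to one bin, stealing mass from the likely tokens and inflating the coded size; a 24-bit CDF leaves the distribution nearly intact. The sketch below is illustrative only and is not taken from the Nacrith codebase; the function names and the toy distribution are mine.

```python
import math

def quantize_cdf(probs, bits):
    """Quantize a probability vector into integer bins of `bits` total
    precision, giving every symbol at least one bin (as real coders must)."""
    total = 1 << bits
    counts = [max(1, round(p * total)) for p in probs]
    # Renormalize so the bins sum exactly to `total`, adjusting the largest bin.
    counts[counts.index(max(counts))] += total - sum(counts)
    return counts

def cross_entropy_bits(probs, counts, bits):
    """Expected bits/symbol when coding the true distribution `probs`
    with the quantized model `counts`."""
    total = 1 << bits
    return sum(p * -math.log2(c / total) for p, c in zip(probs, counts))

# Toy skewed distribution over a large vocabulary: one likely token and
# many rare ones (a rough stand-in for an LM's next-token distribution).
vocab = 50_000
probs = [0.5] + [0.5 / (vocab - 1)] * (vocab - 1)

h = sum(-p * math.log2(p) for p in probs)  # true entropy (ideal bits/symbol)
for bits in (16, 24):
    counts = quantize_cdf(probs, bits)
    overhead = cross_entropy_bits(probs, counts, bits) - h
    print(f"{bits}-bit CDF: +{overhead:.4f} bits/symbol over entropy")
```

Running this shows the 16-bit CDF paying a substantial per-symbol penalty while the 24-bit CDF's overhead is negligible, which is the kind of loss a higher-precision CDF recovers.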
giux78's models (52, sorted by recently updated)
giux78/test1B_76000 • 2B • Updated Aug 12, 2025
giux78/test1B_32000 • 2B • Updated Aug 11, 2025
giux78/test_544000 • 0.2B • Updated Aug 10, 2025
giux78/test_480000 • 0.2B • Updated Aug 6, 2025
giux78/test_pre • 0.2B • Updated Aug 4, 2025 • 1
giux78/test_minerva_checkpoint-3189 • 7B • Updated Dec 16, 2024
giux78/test_minerva_checkpoint • Updated Dec 15, 2024
giux78/llama3-8B-usenet-merged • Text Generation • 8B • Updated Apr 29, 2024 • 1.39k • 1
giux78/llama3-usenet • Updated Apr 27, 2024 • 2
giux78/zefiro-funcioncalling-v0.3-merged • Text Generation • 7B • Updated Apr 22, 2024 • 5 • 1
giux78/zefiro-functioncalling-v0.3 • Updated Apr 17, 2024
giux78/zefiro-funcioncalling-v0.2-merged • Text Generation • 7B • Updated Apr 15, 2024 • 1
giux78/zefiro-functioncalling-v0.2 • Updated Apr 15, 2024
giux78/zefiro-funcioncalling-merged • Text Generation • 7B • Updated Apr 14, 2024
giux78/zefiro-functioncalling • Updated Apr 13, 2024
giux78/gemma-2b-sft-ita • Text Generation • 3B • Updated Feb 28, 2024 • 3
giux78/zefiro-7b-dpo-qlora-ITA-v0.7 • Text Generation • 7B • Updated Feb 14, 2024 • 1.31k
giux78/zefiro-7b-dpo-qlora-ITA-v0.5 • Text Generation • 7B • Updated Feb 6, 2024 • 1
giux78/zefiro-7b-sft-qlora-ITA-v0.5-GGUF • Updated Feb 2, 2024
giux78/zefiro-7b-sft-qlora-ITA-v0.5 • Text Generation • 7B • Updated Feb 1, 2024 • 1.31k
giux78/zefiro-7b-beta-ITA-v0.1-GGUF • Text Generation • 7B • Updated Jan 13, 2024 • 11 • 3
giux78/zefiro-7b-beta-ITA-v0.1 • Text Generation • 7B • Updated Jan 12, 2024 • 1.31k • 10