Building on HF

Adarsh Zolekar

adarshzolekar

AI & ML interests

Passionate about AI, machine learning, deep learning, and related domains. Exploring models, datasets, and applications while contributing to the Hugging Face community.

Recent Activity

reacted to mike-ravkine's post with ❤️ 5 days ago
Happy 2026 everyone! I've been busy working on some new ranking/position methodologies and am excited to start sharing some results.

Plot legend:
- X = truncation rate (low = good)
- ? = confusion rate (low = good)
- blue bars = average completion tokens (low = good)
- black diamonds = CI-banded performance (high = good)
- cluster squares = models inside this group are equivalent

https://huggingface.co/openai/gpt-oss-120b remains the king in all dimensions of interest: truncation rates, completion lengths, and performance. If I had but one complaint, it's that reason_effort does not seem to actually work; more on this soon.

Second is a 3-way tie in performance between the Qwen3-235B-2507 we all know and love and an unexpected entrant: https://huggingface.co/ByteDance-Seed/Seed-OSS-36B-Instruct. This is a very capable model and its reasoning effort control actually works, but you should absolutely not leave it on the default "unlimited": enable a sensible limit (4k works well for an 8k context length).

Third place is another 3-way tie, this one between Seed-OSS-36B (it straddles the CI boundary between 2nd and 3rd place), https://huggingface.co/Qwen/Qwen3-Next-80B-A3B-Instruct (demonstrating that full attention may be overrated after all and gated is the way to go), and the newly released https://huggingface.co/zai-org/GLM-4.7, which offers excellent across-the-board performance with some of the shortest reasoning traces I've seen so far.
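The legend metrics above can be sketched as simple fractions over a batch of completions. This is a minimal illustration, not the post's actual methodology: the record fields (`tokens`, `finish_reason`) and the 4k cap are assumptions, with a run counted as truncated when it stops for hitting the completion-token limit.

```python
# Sketch of two legend metrics: truncation rate (low = good) and
# average completion tokens (the blue bars, low = good).
# Assumption: each completion records a token count and a finish reason,
# and finish_reason == "length" means it was cut off at the token limit.

COMPLETION_LIMIT = 4096  # assumed cap, e.g. 4k for an 8k context length


def truncation_rate(completions):
    """Fraction of completions cut off at the token limit."""
    truncated = sum(1 for c in completions if c["finish_reason"] == "length")
    return truncated / len(completions)


def avg_completion_tokens(completions):
    """Average completion length in tokens."""
    return sum(c["tokens"] for c in completions) / len(completions)


runs = [
    {"tokens": 812, "finish_reason": "stop"},
    {"tokens": COMPLETION_LIMIT, "finish_reason": "length"},
    {"tokens": 1530, "finish_reason": "stop"},
    {"tokens": COMPLETION_LIMIT, "finish_reason": "length"},
]
print(truncation_rate(runs))        # 0.5
print(avg_completion_tokens(runs))  # 2633.5
```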

Organizations

MLX Community · Hugging Face MCP Course