richardlian (Richard Lian)

upvoted an article 3 months ago

Transformers v5: Simple model definitions powering the AI ecosystem

Article • Dec 1, 2025 • 302

upvoted an article 4 months ago

Sentence Transformers is joining Hugging Face!

Article • Oct 22, 2025 • 87

upvoted an article 5 months ago

Introducing RTEB: A New Standard for Retrieval Evaluation

Article • Oct 1, 2025 • 137

upvoted a collection 5 months ago

The Big Benchmarks Collection

Collection

Gathering benchmark spaces on the hub (beyond the Open LLM Leaderboard) • 13 items • Updated Nov 18, 2024 • 261

upvoted a paper 7 months ago

Inverse Scaling in Test-Time Compute

Paper • 2507.14417 • Published Jul 19, 2025 • 28

upvoted an article 9 months ago

KV Cache from scratch in nanoVLM

Article • Jun 4, 2025 • 112

upvoted a paper 9 months ago

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Paper • 2506.01939 • Published Jun 2, 2025 • 188

upvoted a paper 10 months ago

Parallel Scaling Law for Language Models

Paper • 2505.10475 • Published May 15, 2025 • 83

upvoted 2 articles 10 months ago

The Transformers Library: standardizing model definitions

Article • May 15, 2025 • 121

Vision Language Models (Better, faster, stronger)

Article • May 12, 2025 • 598

upvoted a collection 10 months ago

Unsloth Dynamic 2.0 Quants

Collection

New 2.0 version of our Dynamic GGUF + Quants. Dynamic 2.0 achieves superior accuracy & SOTA quantization performance. • 76 items • Updated 4 days ago • 396

upvoted 3 articles 11 months ago

Introducing HELMET: Holistically Evaluating Long-context Language Models

Article • Apr 16, 2025 • 42

🦸🏻#14: What Is MCP, and Why Is Everyone – Suddenly!– Talking About It?

Article • Mar 17, 2025 • 353

Rearchitecting Hugging Face Uploads and Downloads

Article • Nov 26, 2024 • 50

upvoted 3 articles 12 months ago

From Files to Chunks: Improving HF Storage Efficiency

Article • Nov 20, 2024 • 70

Xet is on the Hub

Article • Mar 18, 2025 • 79

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

Article • Feb 7, 2025 • 276

upvoted 2 papers about 1 year ago

Evolving Deeper LLM Thinking

Paper • 2501.09891 • Published Jan 17, 2025 • 115

Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models

Paper • 2501.09686 • Published Jan 16, 2025 • 41

upvoted an article about 1 year ago

Train 400x faster Static Embedding Models with Sentence Transformers

Article • Jan 15, 2025 • 226

