aayush garg PRO

garg-aayush

https://aayushgarg.dev/

Recent Activity

published an article 7 days ago

Understanding GRPO: PPO without the critic

upvoted an article 8 days ago

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

published an article 9 days ago

Deriving the DPO Loss from First Principles

published an article 7 days ago

Article

Understanding GRPO: PPO without the critic

7 days ago

•

1

published an article 9 days ago

Article

Deriving the DPO Loss from First Principles

9 days ago

•

6

published an article 14 days ago

Article

Deriving the PPO Loss from First Principles

14 days ago

•

33

published an article about 1 month ago

Article

What I Learned Building SFT from the Ground Up

Dec 3, 2025

•

1