
In a Training Loop 🔄

146 772

Michael Feldman

mfeldman143


AI & ML interests

None yet

Recent Activity

liked a model 1 day ago

nvidia/C-RADIOv4-H

liked a dataset 5 days ago

moonworks/lunara-aesthetic-image-variations

liked a model 6 days ago

ACE-Step/Ace-Step1.5


Organizations

upvoted 2 papers 6 days ago

MemOCR: Layout-Aware Visual Memory for Efficient Long-Horizon Reasoning

Paper • 2601.21468 • Published 14 days ago • 20

Generative Visual Code Mobile World Models

Paper • 2602.01576 • Published 10 days ago • 39

upvoted an article 6 days ago

Introducing NVIDIA Cosmos Policy for Advanced Robot Control

Article • 13 days ago • 38

upvoted a paper 8 days ago

DocReward: A Document Reward Model for Structuring and Stylizing

Paper • 2510.11391 • Published Oct 13, 2025 • 27

upvoted a paper 11 days ago

TwinBrainVLA: Unleashing the Potential of Generalist VLMs for Embodied Tasks via Asymmetric Mixture-of-Transformers

Paper • 2601.14133 • Published 22 days ago • 60

upvoted a paper 15 days ago

Endless Terminals: Scaling RL Environments for Terminal Agents

Paper • 2601.16443 • Published 20 days ago • 16

upvoted a paper 18 days ago

Behavior Knowledge Merge in Reinforced Agentic Models

Paper • 2601.13572 • Published 23 days ago • 24

upvoted a collection 19 days ago

Qwen3-TTS

7 items • Updated 21 days ago • 286

upvoted a paper 19 days ago

Cosmos Policy: Fine-Tuning Video Models for Visuomotor Control and Planning

Paper • 2601.16163 • Published 20 days ago • 13

upvoted a collection 20 days ago

VibeVoice

Frontier Text-to-Speech Models https://microsoft.github.io/VibeVoice/ • 9 items • Updated 21 days ago • 207

upvoted 2 papers 20 days ago

VibeVoice Technical Report

Paper • 2508.19205 • Published Aug 26, 2025 • 143

LightOnOCR: A 1B End-to-End Multilingual Vision-Language Model for State-of-the-Art OCR

Paper • 2601.14251 • Published 22 days ago • 24

upvoted a paper 21 days ago

Language of Thought Shapes Output Diversity in Large Language Models

Paper • 2601.11227 • Published 27 days ago • 9

upvoted a collection 22 days ago

BigVGAN

BigVGAN is a universal neural vocoder that generates audio waveform using mel spectrogram as input. • 11 items • Updated 7 days ago • 16

upvoted an article 23 days ago

LightOnOCR-2-1B: a lightweight high-performance end-to-end OCR model family

Article • 23 days ago • 80

upvoted a paper 25 days ago

MIRIAD: Augmenting LLMs with millions of medical query-response pairs

Paper • 2506.06091 • Published Jun 6, 2025 • 11

upvoted an article 25 days ago

How We Built a Semantic Highlight Model To Save Token Cost for RAG

Article • 28 days ago • 65

upvoted a paper 26 days ago

Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents with Citation-Aware Rubric Rewards

Paper • 2601.06021 • Published Jan 9 • 45

upvoted a collection 26 days ago

TranslateGemma

3 items • Updated 27 days ago • 207

upvoted an article 26 days ago

Open Responses: What you need to know

Article • 28 days ago • 105