Nemotron-Pre-Training-Datasets Collection Large-scale pre-training datasets used in the Nemotron family of models. • 11 items • Updated 11 days ago • 87
nvidia/Llama-3_3-Nemotron-Super-49B-v1_5-FP8 Text Generation • 50B • Updated Oct 15, 2025 • 1.1k • 23
Llama Nemotron Collection Open, Production-ready Enterprise Models • 12 items • Updated 11 days ago • 75
R-KV: Redundancy-aware KV Cache Compression for Reasoning Models Paper • 2505.24133 • Published May 30, 2025 • 1
Efficient Deweather Mixture-of-Experts with Uncertainty-aware Feature-wise Linear Modulation Paper • 2312.16610 • Published Dec 27, 2023
NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers Paper • 2211.16056 • Published Nov 29, 2022 • 4
DrafterBench: Benchmarking Large Language Models for Tasks Automation in Civil Engineering Paper • 2507.11527 • Published Jul 15, 2025 • 32