Papers - Pre-training
Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning
Paper
• 2310.20587
• Published
• 18
Chain-of-Thought Reasoning Without Prompting
Paper
• 2402.10200
• Published
• 109
LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement
Paper
• 2403.15042
• Published
• 27
LIMA: Less Is More for Alignment
Paper
• 2305.11206
• Published
• 27
PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization
Paper
• 1912.08777
• Published
• 2
Leveraging Pre-trained Checkpoints for Sequence Generation Tasks
Paper
• 1907.12461
• Published
• 1
Multi-Head Mixture-of-Experts
Paper
• 2404.15045
• Published
• 60
Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models
Paper
• 2411.12580
• Published
• 2
Studying Large Language Model Generalization with Influence Functions
Paper
• 2308.03296
• Published
• 14
Multimodal Autoregressive Pre-training of Large Vision Encoders
Paper
• 2411.14402
• Published
• 47
CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation
Paper
• 2103.06874
• Published
• 2