Understanding LLMs: A Comprehensive Overview from Training to Inference
Paper
• 2401.02038
• Published
• 65
DocLLM: A layout-aware generative language model for multimodal document understanding
Paper
• 2401.00908
• Published
• 189
LLaMA Beyond English: An Empirical Study on Language Capability Transfer
Paper
• 2401.01055
• Published
• 55
LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
Paper
• 2401.01325
• Published
• 27
A Comprehensive Study of Knowledge Editing for Large Language Models
Paper
• 2401.01286
• Published
• 21
Improving Text Embeddings with Large Language Models
Paper
• 2401.00368
• Published
• 82
Astraios: Parameter-Efficient Instruction Tuning Code Large Language Models
Paper
• 2401.00788
• Published
• 23
PanGu-π: Enhancing Language Model Architectures via Nonlinearity Compensation
Paper
• 2312.17276
• Published
• 16
Unicron: Economizing Self-Healing LLM Training at Scale
Paper
• 2401.00134
• Published
• 13
Multilingual Instruction Tuning With Just a Pinch of Multilinguality
Paper
• 2401.01854
• Published
• 11
TinyLlama: An Open-Source Small Language Model
Paper
• 2401.02385
• Published
• 95
LLaMA Pro: Progressive LLaMA with Block Expansion
Paper
• 2401.02415
• Published
• 54
LLM Augmented LLMs: Expanding Capabilities through Composition
Paper
• 2401.02412
• Published
• 38
ICE-GRT: Instruction Context Enhancement by Generative Reinforcement based Transformers
Paper
• 2401.02072
• Published
• 11
DocGraphLM: Documental Graph Language Model for Information Extraction
Paper
• 2401.02823
• Published
• 36
TrustLLM: Trustworthiness in Large Language Models
Paper
• 2401.05561
• Published
• 69
Transformers are Multi-State RNNs
Paper
• 2401.06104
• Published
• 39
TOFU: A Task of Fictitious Unlearning for LLMs
Paper
• 2401.06121
• Published
• 20
Patchscope: A Unifying Framework for Inspecting Hidden Representations of Language Models
Paper
• 2401.06102
• Published
• 22
Secrets of RLHF in Large Language Models Part II: Reward Modeling
Paper
• 2401.06080
• Published
• 28
Efficient LLM inference solution on Intel GPU
Paper
• 2401.05391
• Published
• 11
Tuning LLMs with Contrastive Alignment Instructions for Machine Translation in Unseen, Low-resource Languages
Paper
• 2401.05811
• Published
• 8
A Shocking Amount of the Web is Machine Translated: Insights from Multi-Way Parallelism
Paper
• 2401.05749
• Published
• 9
The Impact of Reasoning Step Length on Large Language Models
Paper
• 2401.04925
• Published
• 18
Bootstrapping LLM-based Task-Oriented Dialogue Agents via Self-Talk
Paper
• 2401.05033
• Published
• 18