Danish NER XLM-RoBERTa (v8)
State-of-the-art Named Entity Recognition model for Danish, fine-tuned from XLM-RoBERTa.
Updated 2026-02-03: Now v8 with 91.02% F1 (previously 84.6%)
Performance
| Benchmark | F1 Score |
|---|---|
| DaNE (validation) | 91.02% |
| Previous version | 84.6% |
| nbailab baseline | 87.09% |
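The card does not say how the F1 scores were computed. As a point of reference, NER results are conventionally reported as entity-level F1, where a prediction counts only if both its span and its label match the gold annotation. The sketch below illustrates that convention under this assumption; `span_f1` and the example spans are hypothetical and are not taken from the model's evaluation code.

```python
def span_f1(gold, pred):
    """Entity-level precision, recall and F1 over (start, end, label) spans."""
    gold, pred = set(gold), set(pred)
    true_pos = len(gold & pred)  # exact span AND label match
    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical annotations for:
# "Anders Jensen arbejder hos Novo Nordisk i København."
gold = [(0, 13, "PER"), (27, 39, "ORG"), (42, 51, "LOC")]
# The last prediction has the right span but the wrong label, so it is an error.
pred = [(0, 13, "PER"), (27, 39, "ORG"), (42, 51, "ORG")]

precision, recall, f1 = span_f1(gold, pred)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

With two of three predictions correct, precision, recall and F1 all come out at two thirds.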
Quick Start
from transformers import pipeline

# Load the model; "simple" aggregation merges sub-word tokens into whole entities
ner = pipeline("ner", model="thomasbeste/danish-ner-xlmr-base", aggregation_strategy="simple")
result = ner("Anders Jensen arbejder hos Novo Nordisk i København.")
# Print each entity with its label and confidence score
for entity in result:
    print(f"{entity['word']}: {entity['entity_group']} ({entity['score']:.2f})")
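With an aggregation strategy set, the pipeline returns a list of dictionaries with the keys `entity_group`, `score`, `word`, `start` and `end`. A minimal sketch of post-processing that output follows: it drops low-confidence entities and groups the rest by label. The helper `group_entities`, the `min_score` threshold and the sample scores are illustrative assumptions, not real model output.

```python
from collections import defaultdict


def group_entities(entities, min_score=0.5):
    """Group entity texts by label, skipping entities scored below min_score."""
    grouped = defaultdict(list)
    for ent in entities:
        if ent["score"] >= min_score:
            grouped[ent["entity_group"]].append(ent["word"])
    return dict(grouped)


# Hand-written sample in the shape the pipeline returns; scores are made up.
sample = [
    {"entity_group": "PER", "score": 0.99, "word": "Anders Jensen", "start": 0, "end": 13},
    {"entity_group": "ORG", "score": 0.97, "word": "Novo Nordisk", "start": 27, "end": 39},
    {"entity_group": "LOC", "score": 0.40, "word": "København", "start": 42, "end": 51},
]

print(group_entities(sample, min_score=0.9))
```

In real use, `sample` would be replaced by the `result` list from the pipeline call, and the threshold tuned to the application.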
Entity Types
| Label | Description | Example |
|---|---|---|
| PER | Person names | Anders Jensen |
| ORG | Organizations | Novo Nordisk A/S |
| LOC | Locations | København |
| MISC | Miscellaneous | Dansk |
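One possible use of these labels is masking selected entity types in text. The sketch below assumes entities in the pipeline's aggregated output shape (with `start` and `end` character offsets) and replaces each selected entity with its label in brackets. The `redact` helper and the sample entities are hypothetical and not part of the model or the `transformers` library.

```python
def redact(text, entities, labels=("PER",)):
    """Replace entities whose label is in `labels` with a [LABEL] placeholder."""
    out = text
    # Work from the end of the text so earlier character offsets stay valid.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        if ent["entity_group"] in labels:
            out = out[: ent["start"]] + f"[{ent['entity_group']}]" + out[ent["end"] :]
    return out


text = "Anders Jensen arbejder hos Novo Nordisk i København."
# Hand-written entities with offsets into `text`; not real model output.
entities = [
    {"entity_group": "PER", "word": "Anders Jensen", "start": 0, "end": 13},
    {"entity_group": "ORG", "word": "Novo Nordisk", "start": 27, "end": 39},
    {"entity_group": "LOC", "word": "København", "start": 42, "end": 51},
]

print(redact(text, entities, labels=("PER", "LOC")))
# prints: [PER] arbejder hos Novo Nordisk i [LOC].
```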
Training Data
- DaNE (4.4k samples)
- WikiANN Danish (20k samples)
- NorNE Norwegian (30k samples)
- High-quality synthetic data (60k samples)
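To make the mix above easier to read, the short calculation below turns the listed sample counts into shares of the combined training set. The counts are the rounded "k" figures from the list, so the shares are approximate. The list does not say whether any source was up- or down-sampled during training, so this reflects raw sizes only.

```python
# Approximate sample counts, taken from the rounded figures in the list above.
sizes = {
    "DaNE": 4_400,
    "WikiANN Danish": 20_000,
    "NorNE Norwegian": 30_000,
    "Synthetic": 60_000,
}

total = sum(sizes.values())
shares = {name: count / total for name, count in sizes.items()}

print(f"Total: {total:,} samples")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

By these figures the synthetic data makes up roughly half of the combined set, while DaNE itself contributes under 5%.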
License
MIT
Evaluation results
- F1 on DaNE validation set (self-reported): 0.910