Legal Contract Ensemble Classifier
State-of-the-art 2-model ensemble for automated contract clause risk classification
Model Description
This ensemble combines two specialized transformer models to achieve 97.74% accuracy in classifying legal contract clauses into risk categories. The model helps legal professionals quickly identify potentially problematic clauses in contracts.
Architecture
2-Model Ensemble with Probability Averaging:
Legal-BERT-Base (nlpaueb/legal-bert-base-uncased)
- Fine-tuned on legal domain text
- 110M parameters
- Validation F1: 91.84%
DeBERTa-v3-Base (microsoft/deberta-v3-base)
- Advanced disentangled attention mechanism
- 184M parameters
- Validation F1: 91.71%
Ensemble Method: Simple probability averaging
Total Size: ~1.1 GB
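The averaging step itself is simple: each model produces a softmax distribution over the four classes, and the ensemble prediction is the argmax of their element-wise mean. Below is a minimal sketch of that step, assuming two already-loaded `transformers` sequence-classification models and their tokenizers; the function name and arguments are illustrative and are not the internals of `SimpleLegalEnsemble`.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def averaged_probabilities(models_and_tokenizers, text, device="cpu"):
    """Average the softmax outputs of several sequence classifiers (illustrative sketch)."""
    all_probs = []
    for model, tokenizer in models_and_tokenizers:
        inputs = tokenizer(text, truncation=True, max_length=512,
                           return_tensors="pt").to(device)
        logits = model(**inputs).logits              # shape: (1, num_classes)
        all_probs.append(F.softmax(logits, dim=-1))
    # Element-wise mean over the models, then drop the batch dimension
    return torch.stack(all_probs).mean(dim=0).squeeze(0)

# Usage (pairs = [(legal_bert_model, legal_bert_tokenizer), (deberta_model, deberta_tokenizer)]):
# prediction = int(averaged_probabilities(pairs, clause).argmax())
```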
Performance Metrics
| Metric | Score |
|---|---|
| Accuracy | 97.74% |
| Macro F1 | 97.84% |
| Weighted F1 | 97.74% |
| Error Rate | 2.26% |
Per-Class Performance
| Class | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| Safe/Standard | 98.39% | 95.31% | 96.83% | 64 |
| Unilateral Termination | 97.78% | 100.00% | 98.88% | 44 |
| Unlimited Liability | 94.29% | 97.06% | 95.65% | 34 |
| Non-Compete | 100.00% | 100.00% | 100.00% | 35 |
Confusion Matrix
```text
                 Predicted
                 Safe   Unilat   Unlim   NonComp
Actual Safe        61        1       2         0
       Unilat       0       44       0         0
       Unlim        1        0      33         0
       NonComp      0        0       0        35
```
Classification Categories
| Label | Category | Description | Risk Level |
|---|---|---|---|
| 0 | Safe/Standard | Standard legal clauses with reasonable, balanced terms | 🟢 Low |
| 1 | Unilateral Termination | Clauses allowing one-sided contract termination without cause | 🟡 Medium |
| 2 | Unlimited Liability | Clauses with uncapped liability exposure | 🔴 High |
| 3 | Non-Compete | Restrictive non-compete agreements limiting future employment | 🟠 Medium-High |
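When working with raw label ids from the sub-models' outputs rather than the ensemble's string labels, a small mapping can translate predictions back into the categories and risk levels above. The dictionary below simply restates the table; the variable and function names are illustrative.

```python
# Label ids and risk levels as documented in the table above (names are illustrative)
ID2CATEGORY = {
    0: ("Safe/Standard", "Low"),
    1: ("Unilateral Termination", "Medium"),
    2: ("Unlimited Liability", "High"),
    3: ("Non-Compete", "Medium-High"),
}

def describe(label_id: int) -> str:
    category, risk = ID2CATEGORY[label_id]
    return f"{category} (risk: {risk})"

print(describe(2))  # -> "Unlimited Liability (risk: High)"
```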
Quick Start
Installation
```bash
pip install transformers torch numpy
```
Usage
```python
import sys
import os

# Add model directory to path
sys.path.insert(0, "path/to/model/directory")

from ensemble_model import SimpleLegalEnsemble

# Load ensemble
ensemble = SimpleLegalEnsemble(
    model_dir=".",    # Current directory
    device='auto'     # Automatically use CUDA if available
)

# Single prediction
clause = "The Company shall be liable for all damages without any limitation whatsoever."
result = ensemble.predict(clause)

print(f"Category: {result['label']}")
print(f"Confidence: {result['confidence']:.2%}")
print(f"All Scores: {result['all_scores']}")
```

Output:

```python
{
    'label': 'Unlimited Liability',
    'label_id': 2,
    'confidence': 0.9825,
    'all_scores': {
        'Safe/Standard': 0.0045,
        'Unilateral Termination': 0.0089,
        'Unlimited Liability': 0.9825,
        'Non-Compete': 0.0041
    },
    'individual_models': {
        'legal_bert': {'prediction': 'Unlimited Liability', 'confidence': 0.9756},
        'deberta': {'prediction': 'Unlimited Liability', 'confidence': 0.9894}
    }
}
```
Batch Prediction
```python
clauses = [
    "Liability is limited to $100,000.",
    "Either party may terminate at any time.",
    "Company accepts unlimited liability.",
    "Employee shall not compete for 2 years."
]

results = ensemble.predict_batch(clauses, batch_size=8, show_progress=True)

for clause, result in zip(clauses, results):
    print(f"{clause[:50]}... → {result['label']} ({result['confidence']:.2%})")
```
Repository Structure
```text
.
├── ensemble_model.py        # Main ensemble class
├── model_metadata.json      # Model configuration and metrics
├── README.md                # This file
├── requirements.txt         # Python dependencies
├── example_usage.py         # Usage examples
├── legal_bert_base/         # Legal-BERT model files
│   ├── config.json
│   ├── model.safetensors    # 440 MB
│   └── tokenizer files
└── deberta_v3/              # DeBERTa model files
    ├── config.json
    ├── model.safetensors    # 371 MB
    └── tokenizer files
```
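If you prefer not to go through `ensemble_model.py`, the two sub-directories can also be loaded directly with Hugging Face Transformers and averaged by hand. This is a hedged sketch that assumes the local folder layout above and that each sub-directory contains standard `config.json`, weights, and tokenizer files.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModelForSequenceClassification, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

# Assumed local paths, matching the repository layout shown above
pairs = []
for subdir in ("legal_bert_base", "deberta_v3"):
    tokenizer = AutoTokenizer.from_pretrained(subdir)
    model = AutoModelForSequenceClassification.from_pretrained(subdir).to(device).eval()
    pairs.append((model, tokenizer))

clause = "Either party may terminate this agreement at any time without notice."
with torch.no_grad():
    probs = torch.stack([
        F.softmax(model(**tokenizer(clause, truncation=True, max_length=512,
                                    return_tensors="pt").to(device)).logits, dim=-1)
        for model, tokenizer in pairs
    ]).mean(dim=0)

print(probs.squeeze(0))  # averaged class probabilities over the four categories
```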
Training Details
Dataset
- Training Samples: 1,398 (with augmentation)
- Validation Samples: 177
- Original Samples: 827
- Augmentation Techniques (a sketch of the simpler perturbations follows this list):
- Synonym replacement
- Contextual word substitution
- Back-translation
- Random deletion
- Random word swapping
- Sentence shuffling
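Several of these techniques need external tooling (for example, translation models for back-translation), but the simpler perturbations are easy to reproduce. Below is a minimal sketch of random deletion and random word swapping in plain Python, as an illustration of the idea rather than the exact augmentation pipeline used for this dataset.

```python
import random

def random_deletion(text: str, p: float = 0.1) -> str:
    """Drop each word with probability p (always keep at least one word)."""
    words = text.split()
    kept = [w for w in words if random.random() > p]
    return " ".join(kept) if kept else random.choice(words)

def random_swap(text: str, n_swaps: int = 1) -> str:
    """Swap n_swaps randomly chosen pairs of word positions."""
    words = text.split()
    for _ in range(n_swaps):
        if len(words) < 2:
            break
        i, j = random.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return " ".join(words)

clause = "The Company shall be liable for all damages without limitation."
print(random_deletion(clause))
print(random_swap(clause, n_swaps=2))
```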
Training Configuration
- Loss Function: Focal Loss + Label Smoothing (0.1); see the sketch after this list
- Optimizer: AdamW
- Learning Rate: 1.18e-5
- Batch Size: 8
- Epochs: 15 (with early stopping)
- Warmup Ratio: 0.109
- Weight Decay: 0.0086
- Dropout: 0.173
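For reference, focal loss on top of a label-smoothed target distribution can be written as below. This uses the documented smoothing of 0.1 and the common focal default gamma = 2.0; it is a generic sketch of the loss named above, not the project's exact implementation.

```python
import torch
import torch.nn.functional as F

def focal_loss_with_smoothing(logits, targets, gamma=2.0, smoothing=0.1):
    """Focal loss over a label-smoothed cross-entropy (illustrative sketch).

    logits:  (batch, num_classes) raw model outputs
    targets: (batch,) integer class labels
    """
    num_classes = logits.size(-1)
    log_probs = F.log_softmax(logits, dim=-1)

    # Smoothed targets: 1 - smoothing on the true class, the rest spread uniformly
    smooth_targets = torch.full_like(log_probs, smoothing / (num_classes - 1))
    smooth_targets.scatter_(1, targets.unsqueeze(1), 1.0 - smoothing)

    # Per-example cross-entropy against the smoothed distribution
    ce = -(smooth_targets * log_probs).sum(dim=-1)

    # Focal modulation: down-weight examples the model already classifies confidently
    pt = log_probs.gather(1, targets.unsqueeze(1)).squeeze(1).exp()
    return ((1.0 - pt) ** gamma * ce).mean()

# Example usage: loss = focal_loss_with_smoothing(model(**batch).logits, batch_labels)
```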
Hardware
- GPU: NVIDIA Tesla T4/V100
- Training Time: ~2 hours (all models)
- Inference Speed: ~12 samples/second (batch size 8)
Use Cases
Contract Review Automation
- Automatically flag risky clauses in vendor contracts
- Prioritize contracts for legal review
Due Diligence
- Rapid analysis of large contract volumes during M&A
- Risk assessment for contract portfolios
Legal Tech Applications
- Contract management platforms
- Legal research tools
- Compliance monitoring systems
Educational Tools
- Teaching contract law principles
- Training paralegals and legal assistants
Limitations
- Domain Specificity: Trained on English legal contracts; may not generalize to other languages or legal systems
- Edge Cases: Performance may vary on highly specialized or ambiguous clauses
- Context Length: Limited to 512 tokens (~300-400 words per clause); see the token-count sketch after this list
- Not Legal Advice: This model is a tool for analysis, not a replacement for professional legal review
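To check whether a clause fits in the 512-token window before running a prediction, you can count tokens with one of the bundled tokenizers. A small sketch, assuming the `legal_bert_base/` directory from the repository layout is available locally:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("legal_bert_base")  # assumed local path

def fits_in_window(clause: str, max_length: int = 512) -> bool:
    """Return True if the clause fits within the model's 512-token window."""
    n_tokens = len(tokenizer(clause, add_special_tokens=True)["input_ids"])
    return n_tokens <= max_length

print(fits_in_window("Either party may terminate this agreement at any time."))
```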
Citation
```bibtex
@software{legal_contract_ensemble_2025,
  title   = {Legal Contract Ensemble Classifier},
  author  = {Nikhil-AI-Labs},
  year    = {2025},
  version = {1.0.0},
  url     = {https://huggingface.co/Nikhil-AI-Labs/legal-contract-classifier-best},
  note    = {97.74% accuracy ensemble model for contract clause classification}
}
```
License
Apache 2.0 License - See LICENSE file for details
Acknowledgments
- Base Models: nlpaueb/legal-bert-base-uncased, microsoft/deberta-v3-base
- Frameworks: Hugging Face Transformers, PyTorch
Contact
For questions, issues, or collaboration:
- Hugging Face: @Nikhil-AI-Labs
- Repository Issues: Open an issue
Developed with ❤️ for the legal AI community