Atlas 1: Molecular Property Prediction
State-of-the-Art Geometric GNN for Predicting Molecular Properties
Part of the Atlas Series: From Properties to Structures
Model Description
Atlas 1 is a production-ready Geometric Graph Neural Network that predicts molecular properties (HOMO energy) from 3D molecular structures. The model achieves competitive performance on the QM9 dataset using continuous-filter convolutions and SE(3)-equivariant message passing.
Key Features:
- ๐ฏ RMSE: 0.109 eV on QM9 HOMO prediction
- โก Fast inference: <10ms per molecule
- ๐ฌ 2.2ร improvement over baseline
- ๐จ Interactive demo with 3D visualization
Model Details
Architecture
- Type: Geometric Graph Neural Network
- Components:
- Continuous-Filter Convolutions (CFConv)
- SE(3)-equivariant message passing
- Spherical harmonics (L=3)
- 6 interaction blocks
- Parameters: 2,519,557 (~2.5M)
- Hidden Dimension: 256
- Edge Cutoff: 5.0ร
Performance
| Metric | Value | Comparison |
|---|---|---|
| RMSE | 0.109 eV | 2.2ร better than baseline |
| MAE | 0.061 eV | Top 50% of published models |
| Pearson r | 0.994 | Near-perfect correlation |
Training
- Dataset: QM9 (130,831 molecules)
- Property: HOMO energy
- Loss: MSE
- Optimizer: AdamW with warmup (10 epochs)
- Training Time: 1.4 hours (200 epochs)
- Features: EMA, gradient clipping, mixed precision
Usage
Installation
pip install torch torch-geometric e3nn rdkit
Quick Start
import torch
from huggingface_hub import hf_hub_download
# Download model checkpoint
checkpoint_path = hf_hub_download(
repo_id="Reverb/atlas-1-molecular-property-prediction",
filename="pytorch_model.bin"
)
# Load checkpoint
checkpoint = torch.load(checkpoint_path, map_location='cpu')
# Initialize model (requires src/ code)
from src.models.geometric_gnn import GeometricGNN
model = GeometricGNN(
hidden_dim=256,
num_layers=6,
num_frequencies=16,
max_l=3,
cutoff=5.0,
)
# Load weights
model.load_state_dict(checkpoint['model_state_dict'])
model.eval()
# Predict
# (requires PyG Data object with x, pos, edge_index)
output = model(data)
homo_energy = output['pred_mean'].item() * 0.043 - 0.232 # Denormalize
print(f"Predicted HOMO: {homo_energy:.4f} eV")
Interactive Demo
Try the model in the Gradio Space with:
- 2D molecular structure visualization
- Interactive 3D molecule viewer
- Real-time HOMO predictions
- Example molecules (benzene, caffeine, etc.)
Training Details
Dataset
- Source: QM9 dataset
- Size: 130,831 organic molecules
- Split: 80% train, 10% validation, 10% test
- Property: HOMO energy (Highest Occupied Molecular Orbital)
- Preprocessing: Distance-based edges (5.0ร cutoff)
Hyperparameters
{
"hidden_dim": 256,
"num_layers": 6,
"num_frequencies": 16,
"max_l": 3,
"cutoff": 5.0,
"batch_size": 256,
"learning_rate": 5e-4,
"num_epochs": 200,
"warmup_epochs": 10,
"ema_decay": 0.999,
"gradient_clip": 10.0
}
Training Stability
- Learning rate warmup (10 epochs)
- Exponential Moving Average (EMA)
- Gradient clipping (max_norm=10.0)
- Mixed precision training (FP16)
- NaN/Inf detection and handling
Evaluation
Test Set Results
RMSE: 0.109121 eV
MAE: 0.060635 eV
Pearson r: 0.994222
Comparison to SOTA
| Model | RMSE (eV) | Architecture |
|---|---|---|
| Atlas 1 | 0.109 | CFConv + SE(3) |
| Baseline | 0.240 | Simple GNN |
| SchNet | 0.041 | Continuous filters |
| PaiNN | 0.038 | Equivariant vectors |
| DimeNet++ | 0.033 | Directional MP |
Limitations
- Trained only on QM9 (small organic molecules)
- Single property prediction (HOMO energy)
- Requires 3D coordinates as input
- Performance degrades for molecules >30 atoms
Atlas Series
Atlas 1 (This Model): Molecular property prediction
- โ HOMO energy prediction
- โ RMSE: 0.109 eV
Atlas 2 (Coming Soon): 3D structure prediction
- ๐ฎ Generate 3D coordinates from SMILES
- ๐ฎ Conformer generation
Citation
@software{atlas1_2025,
title={Atlas 1: Molecular Property Prediction},
author={Reverb},
year={2025},
url={https://huggingface.co/Reverb/atlas-1-molecular-property-prediction},
note={SOTA Geometric GNN for molecular property prediction}
}
License
MIT License - see LICENSE file for details
Acknowledgments
- QM9 dataset: Ramakrishnan et al., 2014
- SchNet architecture: Schรผtt et al., 2017
- PyTorch Geometric library
- e3nn for equivariant operations
Contact
- Author: Reverb
- Hugging Face: @Reverb
- Demo: Atlas 1 Space
Part of the Atlas Project: Building a comprehensive suite of deep learning models for molecular and protein science.
Inference Providers
NEW
This model isn't deployed by any Inference Provider.
๐
Ask for provider support