Atlas 1: Molecular Property Prediction

State-of-the-Art Geometric GNN for Predicting Molecular Properties

Part of the Atlas Series: From Properties to Structures

Demo Performance

Model Description

Atlas 1 is a production-ready Geometric Graph Neural Network that predicts molecular properties (HOMO energy) from 3D molecular structures. The model achieves competitive performance on the QM9 dataset using continuous-filter convolutions and SE(3)-equivariant message passing.

Key Features:

  • ๐ŸŽฏ RMSE: 0.109 eV on QM9 HOMO prediction
  • โšก Fast inference: <10ms per molecule
  • ๐Ÿ”ฌ 2.2ร— improvement over baseline
  • ๐ŸŽจ Interactive demo with 3D visualization

Model Details

Architecture

  • Type: Geometric Graph Neural Network
  • Components:
    • Continuous-Filter Convolutions (CFConv)
    • SE(3)-equivariant message passing
    • Spherical harmonics (L=3)
    • 6 interaction blocks
  • Parameters: 2,519,557 (~2.5M)
  • Hidden Dimension: 256
  • Edge Cutoff: 5.0ร…

Performance

Metric Value Comparison
RMSE 0.109 eV 2.2ร— better than baseline
MAE 0.061 eV Top 50% of published models
Pearson r 0.994 Near-perfect correlation

Training

  • Dataset: QM9 (130,831 molecules)
  • Property: HOMO energy
  • Loss: MSE
  • Optimizer: AdamW with warmup (10 epochs)
  • Training Time: 1.4 hours (200 epochs)
  • Features: EMA, gradient clipping, mixed precision

Usage

Installation

pip install torch torch-geometric e3nn rdkit

Quick Start

import torch
from huggingface_hub import hf_hub_download

# Download model checkpoint
checkpoint_path = hf_hub_download(
    repo_id="Reverb/atlas-1-molecular-property-prediction",
    filename="pytorch_model.bin"
)

# Load checkpoint
checkpoint = torch.load(checkpoint_path, map_location='cpu')

# Initialize model (requires src/ code)
from src.models.geometric_gnn import GeometricGNN

model = GeometricGNN(
    hidden_dim=256,
    num_layers=6,
    num_frequencies=16,
    max_l=3,
    cutoff=5.0,
)

# Load weights
model.load_state_dict(checkpoint['model_state_dict'])
model.eval()

# Predict
# (requires PyG Data object with x, pos, edge_index)
output = model(data)
homo_energy = output['pred_mean'].item() * 0.043 - 0.232  # Denormalize
print(f"Predicted HOMO: {homo_energy:.4f} eV")

Interactive Demo

Try the model in the Gradio Space with:

  • 2D molecular structure visualization
  • Interactive 3D molecule viewer
  • Real-time HOMO predictions
  • Example molecules (benzene, caffeine, etc.)

Training Details

Dataset

  • Source: QM9 dataset
  • Size: 130,831 organic molecules
  • Split: 80% train, 10% validation, 10% test
  • Property: HOMO energy (Highest Occupied Molecular Orbital)
  • Preprocessing: Distance-based edges (5.0ร… cutoff)

Hyperparameters

{
    "hidden_dim": 256,
    "num_layers": 6,
    "num_frequencies": 16,
    "max_l": 3,
    "cutoff": 5.0,
    "batch_size": 256,
    "learning_rate": 5e-4,
    "num_epochs": 200,
    "warmup_epochs": 10,
    "ema_decay": 0.999,
    "gradient_clip": 10.0
}

Training Stability

  • Learning rate warmup (10 epochs)
  • Exponential Moving Average (EMA)
  • Gradient clipping (max_norm=10.0)
  • Mixed precision training (FP16)
  • NaN/Inf detection and handling

Evaluation

Test Set Results

RMSE: 0.109121 eV
MAE:  0.060635 eV
Pearson r: 0.994222

Comparison to SOTA

Model RMSE (eV) Architecture
Atlas 1 0.109 CFConv + SE(3)
Baseline 0.240 Simple GNN
SchNet 0.041 Continuous filters
PaiNN 0.038 Equivariant vectors
DimeNet++ 0.033 Directional MP

Limitations

  • Trained only on QM9 (small organic molecules)
  • Single property prediction (HOMO energy)
  • Requires 3D coordinates as input
  • Performance degrades for molecules >30 atoms

Atlas Series

Atlas 1 (This Model): Molecular property prediction

  • โœ… HOMO energy prediction
  • โœ… RMSE: 0.109 eV

Atlas 2 (Coming Soon): 3D structure prediction

  • ๐Ÿ”ฎ Generate 3D coordinates from SMILES
  • ๐Ÿ”ฎ Conformer generation

Citation

@software{atlas1_2025,
  title={Atlas 1: Molecular Property Prediction},
  author={Reverb},
  year={2025},
  url={https://huggingface.co/Reverb/atlas-1-molecular-property-prediction},
  note={SOTA Geometric GNN for molecular property prediction}
}

License

MIT License - see LICENSE file for details

Acknowledgments

  • QM9 dataset: Ramakrishnan et al., 2014
  • SchNet architecture: Schรผtt et al., 2017
  • PyTorch Geometric library
  • e3nn for equivariant operations

Contact


Part of the Atlas Project: Building a comprehensive suite of deep learning models for molecular and protein science.

Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support