title: ACE-Step 1.5 Custom Edition
emoji: 🎵
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.9.1
app_file: app.py
pinned: false
license: mit
python_version: '3.11'
hardware: zero-gpu-medium

ACE-Step 1.5 Custom Edition

A fully featured implementation of ACE-Step 1.5 with a custom GUI and workflow tools, for local use and HuggingFace Space deployment.

Features

🎵 Three Main Interfaces

  1. Standard ACE-Step GUI: The standard ACE-Step 1.5 interface with all original capabilities
  2. Custom Timeline Workflow: Advanced timeline-based generation with:
    • 32-second clip generation (2s lead-in + 28s main + 2s lead-out)
    • Seamless clip blending for continuous music
    • Context Length slider (0-120 seconds) for style guidance
    • Master timeline with extend, inpaint, and remix capabilities
  3. LoRA Training Studio: Complete LoRA training interface with:
    • Audio file upload and preprocessing
    • Custom training configuration
    • Model download/upload for continued training
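
The 32-second clip layout in the Custom Timeline Workflow comes down to simple sample arithmetic. The sketch below is illustrative only: the `ClipLayout` class and its field names are hypothetical and not taken from the project's code, and it assumes the 48kHz sample rate listed under Technical Details.

```python
from dataclasses import dataclass

SAMPLE_RATE = 48_000  # Hz, as listed under Technical Details


@dataclass(frozen=True)
class ClipLayout:
    """Hypothetical description of one generated clip, in seconds."""

    lead_in: float = 2.0
    main: float = 28.0
    lead_out: float = 2.0

    @property
    def total_seconds(self) -> float:
        return self.lead_in + self.main + self.lead_out

    def boundaries(self, sample_rate: int = SAMPLE_RATE) -> dict[str, tuple[int, int]]:
        """Return (start, end) sample indices for each region of the clip."""
        a = int(self.lead_in * sample_rate)
        b = a + int(self.main * sample_rate)
        c = b + int(self.lead_out * sample_rate)
        return {"lead_in": (0, a), "main": (a, b), "lead_out": (b, c)}


layout = ClipLayout()
print(layout.total_seconds)  # 32.0
print(layout.boundaries())
```

At 48kHz the main region spans samples 96,000 to 1,440,000, and the whole clip is 1,536,000 samples per channel.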

Architecture

  • Base Model: ACE-Step v1.5 Turbo
  • Framework: Gradio 5.9.1, PyTorch
  • Deployment: Local execution + HuggingFace Spaces
  • Audio Processing: DiT + VAE + 5Hz Language Model

Installation

Local Setup

# Clone the repository
git clone https://github.com/yourusername/ace-step-custom.git
cd ace-step-custom

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Download ACE-Step model
python scripts/download_model.py

# Run the application
python app.py

HuggingFace Space Deployment

  1. Create a new Space on HuggingFace
  2. Upload all files to the Space
  3. Set the Space to use a GPU (recommended: H200 or A100)
  4. The app downloads the models and starts automatically
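
Steps 1 and 2 can also be scripted instead of done through the web UI. The following is a sketch, not the project's own tooling: it assumes the `huggingface_hub` package is installed and that you are already logged in. The function name `deploy_space`, the placeholder repo id, and the ignore patterns are all illustrative. GPU selection (step 3) is left to the Space settings page.

```python
def deploy_space(repo_id: str, folder: str = ".") -> str:
    """Create a Gradio Space (if needed) and upload a local folder to it.

    `repo_id` is a placeholder such as "your-username/ace-step-custom".
    Requires `pip install huggingface_hub` and a stored access token.
    """
    # Imported lazily so this sketch can be loaded without the package.
    from huggingface_hub import HfApi

    api = HfApi()
    # Step 1: create the Space; exist_ok avoids an error on re-runs.
    api.create_repo(
        repo_id=repo_id, repo_type="space", space_sdk="gradio", exist_ok=True
    )
    # Step 2: upload the project files, skipping the local virtual environment.
    api.upload_folder(
        folder_path=folder,
        repo_id=repo_id,
        repo_type="space",
        ignore_patterns=["venv/*", ".git/*"],
    )
    return f"https://huggingface.co/spaces/{repo_id}"


# Example (not executed here; it needs network access and credentials):
# deploy_space("your-username/ace-step-custom")
```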

Usage

Standard Mode

Use the first tab for standard ACE-Step generation with all original features.

Timeline Mode

  1. Enter your prompt/lyrics
  2. Adjust Context Length (how far back to reference previous clips)
  3. Click "Generate" to create 32-second clips
  4. Clips are blended automatically and appended to the timeline
  5. Use "Extend" to continue the song, or use the inpaint and remix options to create variations
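
One way to picture step 2: the Context Length setting selects the most recent stretch of the timeline as style reference for the next clip. The helper below is a sketch under that assumption. `context_window` is a hypothetical name, and the project may select its context differently.

```python
MAX_CONTEXT_SECONDS = 120  # upper bound of the Context Length slider


def context_window(timeline_seconds: float, context_length: float) -> tuple[float, float]:
    """Return (start, end) in seconds of the timeline span used as context.

    Assumption: context is taken from the end of the existing timeline.
    """
    if timeline_seconds < 0:
        raise ValueError("timeline_seconds must be non-negative")
    # Clamp the request to the slider range and to the audio that exists so far.
    length = max(0.0, min(context_length, MAX_CONTEXT_SECONDS, timeline_seconds))
    return (timeline_seconds - length, timeline_seconds)


print(context_window(32.0, 60))    # (0.0, 32.0)
print(context_window(300.0, 60))   # (240.0, 300.0)
print(context_window(300.0, 500))  # (180.0, 300.0)
```

With only one 32-second clip on the timeline, a 60-second setting can reference at most those 32 seconds. A setting of 0 yields an empty window, meaning no prior audio is referenced.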

LoRA Training

  1. Upload audio files for training
  2. Configure training parameters
  3. Train custom LoRA models
  4. Download the trained LoRA, and upload it again later to continue training

System Requirements

Minimum

  • GPU: 8GB VRAM (with optimizations)
  • RAM: 16GB
  • Storage: 20GB

Recommended

  • GPU: 16GB+ VRAM (A100, H200, or a consumer GPU with at least 16GB)
  • RAM: 32GB
  • Storage: 50GB
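
The VRAM thresholds above can be turned into a quick pre-flight check. This is a sketch and not part of the project: `classify_vram` and `detect_vram_gb` are hypothetical helpers, and the detection step assumes PyTorch is installed with CUDA support (it returns `None` otherwise).

```python
from typing import Optional

MIN_VRAM_GB = 8           # minimum listed above (with optimizations)
RECOMMENDED_VRAM_GB = 16  # recommended listed above


def classify_vram(vram_gb: float) -> str:
    """Map a VRAM size in GB to the tiers listed in this section."""
    if vram_gb >= RECOMMENDED_VRAM_GB:
        return "recommended"
    if vram_gb >= MIN_VRAM_GB:
        return "minimum"
    return "insufficient"


def detect_vram_gb() -> Optional[float]:
    """Return the first CUDA device's total memory in GiB, or None if unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(0).total_memory / 1024**3


vram = detect_vram_gb()
if vram is None:
    print("No CUDA GPU detected")
else:
    print(f"{vram:.1f} GB VRAM: {classify_vram(vram)}")
```

Drivers often report slightly less memory than the advertised size, so a card sold as 8GB may land just under the threshold. Allowing a small tolerance is reasonable if you adapt this check.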

Technical Details

  • Audio Format: 48kHz, stereo
  • Generation Speed: ~8 inference steps per generation (turbo model)
  • Context Window: Up to 120 seconds for style guidance
  • Blend Regions: 2-second crossfade between clips
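
To make the blend region concrete, here is a minimal linear crossfade over mono sample lists. It is a sketch only. The source does not say which fade curve the project uses, so the linear ramp is an assumption, as are the function name `crossfade` and the idea that one clip's 2-second lead-out overlaps the next clip's 2-second lead-in. Real audio here is stereo; the same logic would apply per channel.

```python
def crossfade(a: list[float], b: list[float], overlap: int) -> list[float]:
    """Join two clips, linearly blending the last `overlap` samples of `a`
    with the first `overlap` samples of `b`."""
    if not 0 < overlap <= min(len(a), len(b)):
        raise ValueError("overlap must be positive and fit inside both clips")
    blended = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)  # weight of clip b rises across the overlap
        blended.append((1 - w) * a[len(a) - overlap + i] + w * b[i])
    return a[: len(a) - overlap] + blended + b[overlap:]


SAMPLE_RATE = 48_000       # Hz, from the list above
OVERLAP = 2 * SAMPLE_RATE  # 2-second blend region, in samples

clip_a = [1.0] * (32 * SAMPLE_RATE)  # stand-in for a generated clip
clip_b = [0.0] * (32 * SAMPLE_RATE)
joined = crossfade(clip_a, clip_b, OVERLAP)
print(len(joined) / SAMPLE_RATE)  # 62.0
```

Under this assumption, each additional 32-second clip extends the timeline by 30 seconds, because 2 of its seconds overlap the previous clip.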

Credits

Based on ACE-Step 1.5 by ACE Studio

License

MIT License (see LICENSE file)