Self-Host ACE-Step 1.5: Replace Suno With Free Local AI Music

Suno charges $30/month for its Premier plan. Udio has similar pricing. Both keep your prompts, generated audio, and usage patterns on their servers. If you want to generate AI music without the subscription fees or privacy tradeoffs, ACE-Step 1.5 runs entirely on your own machine.

Released in late January 2026 by ACE Studio and StepFun, ACE-Step 1.5 is an open-source music generation model that matches or exceeds commercial alternatives on standard benchmarks. It generates full songs in under 10 seconds on consumer GPUs, needs less than 4GB of VRAM for basic operation, and supports 50+ languages.

What You’ll Need

Minimum requirements:

4GB VRAM (GPU with CUDA, Apple Silicon, AMD ROCm, or Intel XPU)
8GB system RAM
Python 3.11 or 3.12
~10GB disk space for models

Recommended for best results:

8-16GB VRAM for better quality
16GB+ VRAM for the full 4B language model
SSD for faster model loading

ACE-Step runs on Windows, macOS, and Linux. Apple Silicon Macs use the MLX backend, AMD cards use ROCm, and Intel GPUs use XPU drivers.
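If you want to check the basics before installing, a minimal preflight sketch along these lines can help. It only covers the Python version and free disk space from the minimum requirements list; the function names are illustrative and are not part of ACE-Step. VRAM detection is left out because it depends on which GPU backend you use.

```python
import shutil
import sys

SUPPORTED_PYTHON = {(3, 11), (3, 12)}  # versions from the requirements list
REQUIRED_DISK_GB = 10                  # approximate model footprint from the list


def python_ok(version=sys.version_info):
    """Return True if the interpreter is Python 3.11 or 3.12."""
    return (version[0], version[1]) in SUPPORTED_PYTHON


def disk_ok(path=".", required_gb=REQUIRED_DISK_GB):
    """Return True if the filesystem holding `path` has enough free space."""
    free_gb = shutil.disk_usage(path).free / 1024**3
    return free_gb >= required_gb


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    print("Disk space OK:", disk_ok())
```

Run it from the folder where you plan to clone the repository so the disk check looks at the right drive.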

Quick Installation (3 Minutes)

The project uses uv for dependency management, so a single command sets up the virtual environment and installs every dependency.

Step 1: Install uv

On macOS or Linux:

curl -LsSf https://astral.sh/uv/install.sh | sh

On Windows (PowerShell):

powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"

Step 2: Clone and install

git clone https://github.com/ACE-Step/ACE-Step-1.5.git
cd ACE-Step-1.5
uv sync

Step 3: Launch

uv run acestep

Open http://localhost:7860 in your browser. Models download automatically on first run (~3-8GB depending on your VRAM tier).

Alternative: Portable Packages

If you’d rather avoid the command line, ACE-Step offers portable packages for Windows and macOS with pre-installed dependencies. Download from the releases page, extract, and run the startup script.

Using ACE-Step

The Gradio interface has two main modes: Generation and Editing.

Generating New Songs

Enter a text prompt describing the song style, mood, and genre
Add lyrics if you want vocals (the model supports 50+ languages)
Set duration (up to 10 minutes)
Click Generate

Example prompt:

Ambient electronic, dreamy synthesizers, 120 BPM,
minor key, layered pads, subtle percussion
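If you plan to generate many songs, it can help to assemble prompts like the one above from parts rather than typing them by hand. The helper below is a small sketch of that idea; `build_prompt` is a hypothetical name, and the comma-separated tag format simply mirrors the example prompt rather than any documented requirement.

```python
def build_prompt(genre, descriptors, bpm=None, key=None):
    """Join style tags into one comma-separated prompt string."""
    parts = [genre, *descriptors]
    if bpm is not None:
        parts.append(f"{bpm} BPM")
    if key is not None:
        parts.append(f"{key} key")
    # Drop empty tags and stray whitespace before joining.
    return ", ".join(part.strip() for part in parts if part.strip())


prompt = build_prompt(
    "Ambient electronic",
    ["dreamy synthesizers", "layered pads", "subtle percussion"],
    bpm=120,
    key="minor",
)
print(prompt)
```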

The interface auto-selects the optimal model configuration for your GPU. On an RTX 3090, expect generation in about 8-10 seconds. On an RTX 4090, it’s closer to 5 seconds.

Editing and Variations

ACE-Step includes tools for:

Cover generation: Create variations of existing songs
Repainting: Modify specific sections while keeping the rest
Track separation: Extract vocals, drums, bass, and other stems
Vocal-to-BGM conversion: Strip vocals and generate background music

Training Custom Styles (LoRA)

If you have a specific sound you want to replicate, ACE-Step supports LoRA fine-tuning from just a few reference songs. This is useful for matching a particular artist’s style or creating consistent output for a project.

Model Tiers by VRAM

The language model component affects output quality. ACE-Step automatically selects one based on your VRAM:

VRAM	LM Model	Quality
≤6GB	DiT only (no LM)	Basic
6-8GB	0.6B parameter	Good
8-16GB	0.6B-1.7B parameter	Better
16-24GB	1.7B parameter	Very good
≥24GB	4B parameter	Optimal

Even the smallest tier produces usable music. The larger models improve lyric adherence, style consistency, and overall coherence.
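To see which tier you are likely to land in, the table can be written as a simple lookup. This is only a sketch of the table in this article, not ACE-Step's actual selection code. The table's ranges share their endpoints, so the code assumes that exactly 6GB counts as the lowest tier (following "≤6GB") and that 8GB, 16GB, and 24GB each count toward the higher tier.

```python
# (exclusive upper VRAM bound in GB, language model, quality), from the table.
# ASSUMPTION: 8GB, 16GB, and 24GB fall into the higher of their two tiers.
MID_TIERS = [
    (8, "0.6B", "Good"),
    (16, "0.6B-1.7B", "Better"),
    (24, "1.7B", "Very good"),
]


def select_lm_tier(vram_gb):
    """Return (language model, quality) for a given amount of VRAM in GB."""
    if vram_gb <= 0:
        raise ValueError("vram_gb must be positive")
    if vram_gb <= 6:  # the table lists this tier as "≤6GB"
        return ("DiT only (no LM)", "Basic")
    for upper_bound, model, quality in MID_TIERS:
        if vram_gb < upper_bound:
            return (model, quality)
    return ("4B", "Optimal")  # 24GB and above


for vram in (4, 8, 12, 24):
    print(vram, "GB ->", select_lm_tier(vram))
```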

How It Compares to Suno

On SongEval benchmarks, ACE-Step 1.5 outperforms Suno v5 on overall music quality scores. Independent human evaluations place it between Suno v4.5 and v5 in subjective quality.

Where Suno still leads:

Slightly better style and lyric alignment
More polished web interface
No setup required

Where ACE-Step wins:

Free forever (MIT license)
Complete privacy—nothing leaves your machine
Unlimited generations
No watermarks or usage restrictions
Full commercial rights
LoRA customization

At Suno’s Premier pricing ($30/month), you’d spend $360/year for a capped 2,000 songs per month. ACE-Step costs nothing beyond your electricity bill.
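The arithmetic behind that comparison is easy to reproduce. In the sketch below, the subscription figures come from this article, while the GPU power draw and electricity price are assumptions you should replace with your own numbers.

```python
SUNO_MONTHLY_USD = 30      # Premier price quoted above
SONGS_PER_MONTH = 2_000    # monthly allowance quoted above

# ASSUMPTIONS (not from the article): adjust these to your own setup.
GPU_WATTS = 350            # assumed GPU power draw under load
SECONDS_PER_SONG = 10      # upper end of the RTX 3090 estimate above
USD_PER_KWH = 0.15         # assumed electricity price


def suno_annual_cost(monthly_usd=SUNO_MONTHLY_USD):
    """Yearly subscription cost in USD."""
    return monthly_usd * 12


def suno_cost_per_song(monthly_usd=SUNO_MONTHLY_USD, songs=SONGS_PER_MONTH):
    """Cost per song if the whole monthly allowance is used."""
    return monthly_usd / songs


def local_cost_per_song(watts=GPU_WATTS, seconds=SECONDS_PER_SONG,
                        usd_per_kwh=USD_PER_KWH):
    """Electricity cost of one local generation in USD."""
    kwh = (watts / 1000) * (seconds / 3600)
    return kwh * usd_per_kwh


print("Suno per year:", suno_annual_cost())
print("Suno per song:", suno_cost_per_song())
print("Local per song:", local_cost_per_song())
```

This counts only the electricity used while generating. It ignores the cost of the GPU itself and idle power draw, so treat it as a lower bound on the local cost.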

API Access for Automation

For programmatic use, start the REST API:

uv run acestep-api

The API runs at http://localhost:8001 with endpoints for generation, editing, and batch processing. You can generate up to 8 songs simultaneously if your VRAM allows.
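As a rough sketch of what automation could look like, the snippet below builds a JSON POST request against that address using only the Python standard library. The endpoint path and the payload field names are hypothetical placeholders, since this article does not list them; check the project's API documentation for the real ones before sending anything.

```python
import json
import urllib.request

API_BASE = "http://localhost:8001"  # address given above
GENERATE_PATH = "/generate"         # HYPOTHETICAL path, not confirmed
MAX_BATCH = 8                       # simultaneous songs mentioned above


def build_generation_request(prompt, lyrics="", duration_s=60, batch_size=1):
    """Build (but do not send) a JSON POST request for song generation."""
    if not 1 <= batch_size <= MAX_BATCH:
        raise ValueError(f"batch_size must be between 1 and {MAX_BATCH}")
    payload = {  # HYPOTHETICAL field names
        "prompt": prompt,
        "lyrics": lyrics,
        "duration": duration_s,
        "batch_size": batch_size,
    }
    return urllib.request.Request(
        API_BASE + GENERATE_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_generation_request(
    "Ambient electronic, dreamy synthesizers", batch_size=2
)
print(request.get_method(), request.full_url)

# With the server running and the real path and fields filled in:
# with urllib.request.urlopen(request, timeout=300) as response:
#     result = json.load(response)
```

Building the request separately from sending it lets you inspect the payload first, which is handy while you are still matching it to the real API.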

Privacy Considerations

Running locally means:

No prompts sent to external servers
No generated audio stored in the cloud
No usage analytics or tracking
No training on your outputs

If you’re creating music for commercial projects or simply prefer not to share your creative process with AI companies, this matters.

Troubleshooting

“CUDA out of memory”: Lower the batch size or switch to a smaller LM tier in the settings.

Slow generation on AMD: ROCm support varies by card. Check the compatibility list for known issues.

macOS MLX errors: Make sure you’re on macOS 14+ with Apple Silicon. Intel Macs aren’t supported.

Models not downloading: Check your internet connection and disk space. Models range from 3 to 8GB.

What This Means

Commercial AI music services solved a genuine problem—making music generation accessible without technical expertise. But they also introduced subscription fatigue and privacy concerns that weren’t inherent to the technology.

ACE-Step 1.5 demonstrates that commercial-grade music generation can run locally on hardware many creators already own. The 4GB VRAM floor puts it within reach of most discrete GPUs from the last five years.

The tradeoff is setup complexity. Suno’s advantage is that you don’t need to install anything. If you’re comfortable with the command line and have a capable GPU, that tradeoff likely favors ACE-Step.

What You Can Do

Check if your GPU meets the minimum 4GB VRAM requirement
Install using the quick setup above
Start with simple prompts to understand the model’s capabilities
Explore LoRA training if you need consistent output styles
Consider the musician’s guide for production-quality tips

For more self-hosting guides that replace paid cloud services, check our local-ai category.