Run Hedy's AI fully on your Windows PC.
  • Powered by llama.cpp with Vulkan GPU acceleration and hybrid GPU/CPU offload, so models that exceed your VRAM still run, just more slowly
  • A live "Memory available for AI" indicator shows how each model fits your hardware before you download
  • Choose from compact 2 GB models for smaller systems up to 20 GB+ models for high-end GPUs
  • Models that need to fall back to CPU are clearly marked with a "+ Slow" badge
  • Your conversation data stays on your machine for anything a local model handles; there is no round trip to the cloud
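The indicator and "+ Slow" badge described above boil down to a simple fit check: compare a model's size against GPU memory, then against GPU plus system memory. A minimal sketch of that logic, with hypothetical function names and thresholds (the app's actual accounting of "memory available for AI" is more involved):

```python
def fit_label(model_gb: float, vram_gb: float, ram_gb: float) -> str:
    """Classify how a model fits the machine (hypothetical thresholds)."""
    if model_gb <= vram_gb:
        return "GPU"          # fits entirely in VRAM: full speed
    if model_gb <= vram_gb + ram_gb:
        return "GPU + Slow"   # spills into system RAM: partial CPU offload
    return "Too large"        # exceeds combined memory: not offered

# Example: an 8 GB GPU with 16 GB of system RAM
print(fit_label(2.0, 8.0, 16.0))   # → GPU
print(fit_label(12.0, 8.0, 16.0))  # → GPU + Slow
print(fit_label(30.0, 8.0, 16.0))  # → Too large
```

The middle case is what hybrid offload enables: layers that fit go to the GPU and the remainder runs on the CPU, which is why those models carry the "+ Slow" badge rather than being unavailable.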
Currently in active development; expect quality and model selection to keep improving.