Run Hedy's AI fully on your Windows PC.
  • Powered by llama.cpp with Vulkan GPU acceleration and hybrid GPU/CPU offload, so models that exceed your VRAM still run, just more slowly
  • A live "Memory available for AI" indicator shows how each model fits your hardware before you download
  • Choose from compact 2 GB models for smaller systems up to 20 GB+ models for high-end GPUs
  • Models that need to fall back to CPU are clearly marked with a "+ Slow" badge
  • Your conversation data stays on your machine for anything a local model handles; there is no round trip to the cloud
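The indicator and "+ Slow" badge described above boil down to a simple fit check: compare a model's size against GPU memory, then against GPU plus system memory. A minimal sketch of that logic, with hypothetical function names and thresholds (the app's actual accounting of "memory available for AI" is more involved):

```python
def fit_label(model_gb: float, vram_gb: float, ram_gb: float) -> str:
    """Classify how a model fits the machine (hypothetical thresholds)."""
    if model_gb <= vram_gb:
        return "GPU"          # fits entirely in VRAM: full speed
    if model_gb <= vram_gb + ram_gb:
        return "GPU + Slow"   # spills into system RAM: partial CPU offload
    return "Too large"        # exceeds combined memory: not offered

# Example: an 8 GB GPU with 16 GB of system RAM
print(fit_label(2.0, 8.0, 16.0))   # → GPU
print(fit_label(12.0, 8.0, 16.0))  # → GPU + Slow
print(fit_label(30.0, 8.0, 16.0))  # → Too large
```

The middle case is what hybrid offload enables: layers that fit go to the GPU and the remainder runs on the CPU, which is why those models carry the "+ Slow" badge rather than being unavailable.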
Currently in active development; expect quality and model selection to keep improving.