Open Dungeon Enables Local Roleplay Under 8GB RAM

LocalAI AI-Agents Research

TL;DR: An open-source local roleplay project combines Gemma 4 QAT and FLUX to run a 256K-context narrator and an image generator entirely on consumer hardware.

Summary: Open Dungeon provides a fully offline, private alternative to AI Dungeon. It uses Gemma 4 (12B, QAT Q4) via Ollama for story generation and FLUX for inline image generation. Because Gemma 4's KV cache reportedly grows slowly as the context fills, the project runs the 12B model at its full 256K context in approximately 7.7GB of RAM.
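To make the KV cache point concrete, here is a rough sizing sketch. The source does not say how the cache is kept small. One common technique, assumed here purely for illustration, is to give most layers a fixed sliding attention window so that only a few "global" layers cache the whole context. Every architecture number below (layer counts, heads, head size, window) is hypothetical and is not Gemma 4's actual configuration.

```python
def kv_cache_bytes(tokens, layers, kv_heads, head_dim, bytes_per_value=2):
    """Bytes needed to cache keys and values for `tokens` positions."""
    # Factor of 2: one key vector and one value vector per head, per layer.
    return 2 * layers * kv_heads * head_dim * bytes_per_value * tokens


def hybrid_kv_cache_bytes(context_tokens, global_layers, local_layers,
                          window, kv_heads, head_dim, bytes_per_value=2):
    """Cache size when only some layers attend over the full context."""
    # Global layers keep every token in the context.
    global_part = kv_cache_bytes(context_tokens, global_layers,
                                 kv_heads, head_dim, bytes_per_value)
    # Sliding-window layers never keep more than `window` tokens.
    local_part = kv_cache_bytes(min(context_tokens, window), local_layers,
                                kv_heads, head_dim, bytes_per_value)
    return global_part + local_part


if __name__ == "__main__":
    CONTEXT = 256 * 1024  # "256K" read as 262,144 tokens (assumption)
    # Hypothetical model shape, chosen only to show the scaling behaviour.
    full = kv_cache_bytes(CONTEXT, layers=48, kv_heads=4, head_dim=128)
    hybrid = hybrid_kv_cache_bytes(CONTEXT, global_layers=8, local_layers=40,
                                   window=1024, kv_heads=4, head_dim=128)
    gib = 1024 ** 3
    print(f"all layers global: {full / gib:.2f} GiB")
    print(f"hybrid layout:     {hybrid / gib:.2f} GiB")
```

With these made-up numbers the hybrid layout needs roughly a sixth of the memory of the all-global layout at the same context length, which is the kind of saving that would make a long context fit in a small RAM budget.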

Why it matters: It suggests that a long-context LLM and a diffusion model can run concurrently on consumer-grade local hardware. Developers can adapt the same memory-saving configuration for resource-constrained edge deployments and private agent setups.
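As a starting point for adapting such a setup, the sketch below asks a local Ollama server for a story continuation with an enlarged context window. It uses Ollama's documented /api/generate endpoint and its num_ctx option. The model tag is a hypothetical placeholder (the source does not give one), and nothing here is taken from Open Dungeon's own code.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local address
MODEL_TAG = "your-gemma-qat-tag"  # hypothetical placeholder; use the tag you pulled
FULL_CONTEXT = 256 * 1024  # "256K" read as 262,144 tokens (assumption)


def build_request(prompt, num_ctx=FULL_CONTEXT):
    """Build the JSON payload for a single non-streaming generation."""
    return {
        "model": MODEL_TAG,
        "prompt": prompt,
        "stream": False,                  # return one JSON object, not chunks
        "options": {"num_ctx": num_ctx},  # context window size in tokens
    }


def narrate(prompt, num_ctx=FULL_CONTEXT):
    """Send the prompt to a running Ollama server and return the generated text."""
    body = json.dumps(build_request(prompt, num_ctx)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Calling narrate("You wake in a torchlit cell.") requires Ollama to be running locally with the chosen model already pulled. Lowering num_ctx is the simplest lever if the machine runs short of memory.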

Source: r/localllama