
I Wired Up 9 Gemini Models to Make Stories Feel Like Movies. Here's What Happened.
*This article was created as an entry in the Gemini Live Agent Challenge hackathon.*

I got tired of AI storytelling being a text box. You type something, you get paragraphs back, and that's it. No pictures. No voices. No music. Just words on a screen.

So I spent three weeks building something different. OmniWeave takes a single prompt and turns it into a full cinematic experience -- illustrated scenes, multi-voice narration, ambient background music, even video clips. You can also talk to it in real time through your mic, and it narrates back while generating images on the fly.

Try it live | Source on GitHub

## How It Actually Works

The backend runs on Cloud Run using Google's Agent Development Kit (ADK) for TypeScript. Nine Gemini models handle different jobs simultaneously:

| What it does | Model | How fast |
| --- | --- | --- |
| Writes the story script | gemini-2.5-flash | ~6 seconds |
| Reviews it for consistency | gemini-2.5-flash-lite | ~1 second |
| Generates illustrations | gemini-3.1-flash-image-preview | |
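The fan-out behind that table can be sketched in plain TypeScript. This is a minimal illustration, not the ADK's actual API: `callModel` is a stub standing in for a Gemini request, and the prompts and latencies are placeholders. The shape it shows is the real point -- the script must be written first, but review and illustration each depend only on the script, so they can run concurrently with `Promise.all`.

```typescript
// Hedged sketch of the stage fan-out. `callModel` is a stub, not the ADK's
// API; it returns a tagged string so the flow is easy to trace.
const delay = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function callModel(model: string, prompt: string): Promise<string> {
  await delay(5); // placeholder latency, not a real network call
  return `${model} -> ${prompt}`;
}

async function runPipeline(idea: string): Promise<Record<string, string>> {
  // The script has to exist before anything downstream can run.
  const script = await callModel("gemini-2.5-flash", `write script: ${idea}`);

  // Review and illustration both consume the script but not each other,
  // so they are dispatched concurrently: total wall time is roughly the
  // slowest of the two, not their sum.
  const [review, images] = await Promise.all([
    callModel("gemini-2.5-flash-lite", "review for consistency"),
    callModel("gemini-3.1-flash-image-preview", "generate illustrations"),
  ]);

  return { script, review, images };
}
```

The same pattern extends to the remaining stages (narration, music, video): anything that only needs the finished script joins the `Promise.all`, and only true dependencies stay sequential.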
Continue reading on Dev.to



