
Building Iris: A Real-Time Spatial Awareness Agent with the Gemini Live API
Created for the Gemini Live Agent Challenge #GeminiLiveAgentChallenge

What is Iris?

Iris is a real-time spatial awareness agent that sees through your camera and talks to you. Point your device at anything (a room, a street, a workspace) and Iris describes what it sees, warns you about obstacles, reads signs, and identifies people and their gestures. All through voice, hands-free.

It's not just an accessibility tool. Iris is built for anyone who needs an extra pair of eyes: a warehouse worker navigating a crowded floor, a cyclist wanting awareness of their blind spot, a remote worker showing their setup to a colleague, or a visually impaired person walking through an unfamiliar building. The camera becomes a conversation partner.

Why This Matters

We interact with AI through text boxes. We type, we wait, we read. But spatial awareness doesn't work in turns: the real world moves continuously, and the information you need is often urgent. "There's a step ahead of you" is useless five
Continue reading on Dev.to



