
A Serverless Blueprint for Multimodal Video Search on AWS
Originally published on Build With AWS. Subscribe for weekly AWS builds. This design was inspired by Miguel Otero Pedrido and Alex Razvant's "Kubrick" course, but rebuilt using native AWS primitives instead of custom frameworks.

Video is impossible to search. You can scrub through it manually, or rely on YouTube's auto-generated captions, which only match exact keywords. But what if you want to find "the outdoor mountain scene" or "where they discuss AI ethics"? Traditional video platforms fail here because they treat video as a single data type.

This system treats video as three parallel search problems. Speech is transcribed with word-level timestamps and indexed for semantic search. Every frame gets a semantic description through Claude Vision, which goes into a separate index. Those same frames also become 1,024-dimensional vectors for visual similarity search. Users ask questions in natural language, and an intelligent agent figures out which index to query. Results come back with e
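The frame-to-vector step can be sketched with Amazon Bedrock. This is a minimal sketch, assuming the Titan Multimodal Embeddings model (`amazon.titan-embed-image-v1`), which returns 1,024-dimensional vectors by default; the article does not name its embedding model, so that choice is an assumption.

```python
import base64
import json


# Assumed embedding model; the article only specifies 1,024-dim vectors.
EMBED_MODEL_ID = "amazon.titan-embed-image-v1"


def build_embedding_request(frame_bytes: bytes, dim: int = 1024) -> dict:
    """Build the JSON body for a Bedrock InvokeModel call on one frame."""
    return {
        "inputImage": base64.b64encode(frame_bytes).decode("utf-8"),
        "embeddingConfig": {"outputEmbeddingLength": dim},
    }


def embed_frame(bedrock_runtime, frame_bytes: bytes) -> list:
    """Return the frame's embedding vector.

    `bedrock_runtime` is a boto3 client created with
    boto3.client("bedrock-runtime").
    """
    resp = bedrock_runtime.invoke_model(
        modelId=EMBED_MODEL_ID,
        body=json.dumps(build_embedding_request(frame_bytes)),
        contentType="application/json",
        accept="application/json",
    )
    return json.loads(resp["body"].read())["embedding"]
```

Each resulting vector would then be written to the visual-similarity index alongside the frame's timestamp.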
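The routing decision the agent makes can be illustrated without an LLM. The sketch below is a toy keyword heuristic, not the article's actual agent: the index names (`transcript`, `description`, `visual`) and the cue lists are illustrative assumptions that mirror the three indexes described above.

```python
# Toy stand-in for the routing agent: the real system lets an LLM choose,
# but the decision space is the same three indexes.
VISUAL_CUES = {"scene", "looks like", "similar to", "shot of"}
SPEECH_CUES = {"says", "discuss", "mention", "talk about", "quote"}


def route_query(query: str) -> str:
    """Pick which of the three indexes should serve this query."""
    q = query.lower()
    if any(cue in q for cue in SPEECH_CUES):
        return "transcript"   # word-timestamped speech index
    if any(cue in q for cue in VISUAL_CUES):
        return "visual"       # frame-embedding similarity index
    return "description"      # Claude Vision caption index (default)
```

For example, "where they discuss AI ethics" routes to the transcript index, while "the outdoor mountain scene" routes to the visual index.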
Continue reading on Dev.to



