I built a pay-per-use video transcription tool with Next.js and Whisper — here's the full breakdown

via Dev.to Webdev, by Filipi Youssef

Why I built this

I kept running into the same frustration: I had a video file and I needed the text inside it. The available options were either too manual (typing it yourself), too unreliable (YouTube auto-captions), or too expensive for occasional use (most transcription SaaS tools charge a flat monthly fee regardless of how much you actually transcribe).

So I built Tonivox, a web app that accepts a video file, extracts the audio, runs it through a transcription model, and returns the full text. Pay per transcription, no subscription.

This post covers the technical decisions, the problems I ran into, and what I'd do differently.

The stack

Next.js 15 (App Router): frontend and API routes
Prisma + PostgreSQL: data layer
Better Auth: authentication (email/password + email verification)
Stripe: credit purchases via Checkout Sessions
OpenAI Whisper: transcription model
FFmpeg: audio extraction from video files
Tailwind CSS: styling

How it works

The flow is straightforward: User up
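To illustrate the audio-extraction step in the pipeline described above (video in, audio out, then transcription), here is a minimal sketch. It is not Tonivox's actual code. The helper name `buildAudioExtractionArgs` and the choice of mono 16 kHz WAV output are assumptions made for the example; the flags themselves are standard FFmpeg options.

```typescript
// Hypothetical helper (not from the original post): builds the argument list
// for an FFmpeg call that drops the video stream and writes audio only.
// Mono 16 kHz WAV is an assumed target format, chosen because speech models
// such as Whisper operate on 16 kHz audio.
function buildAudioExtractionArgs(inputPath: string, outputPath: string): string[] {
  return [
    "-y",            // overwrite the output file if it already exists
    "-i", inputPath, // input video file
    "-vn",           // disable video: keep only the audio stream
    "-ac", "1",      // downmix to a single (mono) channel
    "-ar", "16000",  // resample to 16 kHz
    "-f", "wav",     // force WAV container for the output
    outputPath,
  ];
}

// Example: the arguments for extracting audio from an uploaded file.
const exampleArgs = buildAudioExtractionArgs("upload.mp4", "upload.wav");
console.log(["ffmpeg", ...exampleArgs].join(" "));
```

In a Node.js API route, an array like this could be passed to `child_process.spawn("ffmpeg", args)`. Passing arguments as an array rather than interpolating them into a shell string avoids quoting problems with user-supplied file names.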
