Building an AI Profanity Filter with Vocal Separation

I built an online tool that automatically detects and bleeps profanity in video and audio files. Here's the high-level architecture.

The problem

Manual profanity censoring takes 45+ minutes for a 10-minute video. You have to listen through, find each word, razor the audio, and drop in a beep effect. For songs, it's nearly impossible without destroying the music.

The solution

AI speech recognition + neural vocal separation.

How it works

- User uploads a file or pastes a YouTube URL
- Audio is extracted with FFmpeg
- AI speech-to-text transcribes the audio (AssemblyAI / Deepgram)
- Profanity is detected using morphological analysis (lemmatization)
- Each word is replaced with beep/silence/custom sound via FFmpeg
- For songs: Demucs AI separates vocals from instruments first

Song mode: the hard part

Demucs by Meta AI does the heavy lifting, splitting audio into vocal and instrumental tracks. Profanity detection runs only on the vocal track, then the censored vocals are mixed back with the original instrum
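
To make the detection and replacement steps more concrete, here is a minimal Python sketch. It is not the tool's actual code. It assumes the speech-to-text service returns word-level timestamps, which are modelled here by a hypothetical `Word` class. A toy suffix stripper, `naive_lemma`, stands in for real lemmatization, and `BLOCKLIST` is a placeholder word list. Only the "silence" option is shown: it builds an FFmpeg `volume` filter that is switched on during each flagged span through the filter's `enable` timeline option.

```python
import re
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical shape of one transcribed word with timestamps in seconds."""
    text: str
    start: float
    end: float


# Placeholder data: a real filter would use a proper lemmatizer and word list.
BLOCKLIST = {"damn", "hell"}
SUFFIXES = ("ing", "ed", "s")


def naive_lemma(token: str) -> str:
    """Lowercase, drop punctuation, and strip one common suffix (toy lemmatizer)."""
    word = re.sub(r"[^a-z']", "", token.lower())
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def find_profanity(words, pad=0.05):
    """Return (start, end) spans, padded by `pad` seconds, for blocklisted words."""
    return [
        (max(0.0, w.start - pad), w.end + pad)
        for w in words
        if naive_lemma(w.text) in BLOCKLIST
    ]


def mute_filter(spans):
    """Build an FFmpeg audio filter that silences every span."""
    if not spans:
        return "anull"  # pass-through filter when nothing was flagged
    # between() yields 1 inside a span; summing the terms acts as a logical OR.
    condition = "+".join(f"between(t,{a:.3f},{b:.3f})" for a, b in spans)
    return f"volume=enable='{condition}':volume=0"


def ffmpeg_mute_command(src, dst, spans):
    """Argument list suitable for subprocess.run (no shell quoting needed)."""
    return ["ffmpeg", "-y", "-i", src, "-af", mute_filter(spans), dst]


if __name__ == "__main__":
    transcript = [Word("Well", 0.0, 0.3), Word("damned!", 0.5, 0.9), Word("hello", 1.0, 1.4)]
    print(ffmpeg_mute_command("in.wav", "out.wav", find_profanity(transcript)))
```

Matching on the lemma rather than the surface form is what lets "damned" or "damning" hit a single blocklist entry. A beep instead of silence would additionally need a tone mixed in over the same spans, and in song mode the same filter would be applied to the separated vocal track only.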
