
Speech-To-Action - Turning Voice Into CLI Pipelines with Whisper.cpp
I recently built a small tool called Forge STA (Speech-To-Action). The idea is simple: Speech should not just become text. Speech should become actions.

So instead of:

Speech → Text

we get:

Speech → Text → CLI → Anything

The Problem

Most speech tools stop at transcription. You dictate something and it becomes text. But developers often want something different:

turn speech into code comments
generate prompts
trigger scripts
pipe into tools
format into HTML / Markdown / XML

In short: Speech should be part of a toolchain.

The Architecture

Forge STA uses a very simple design.

Mic
↓
Speech Recognition
↓
Post Processing (CLI)
↓
Output / Action

The interesting part is that post-processing is external. No plugins. No internal scripting. Just CLI tools.

Example:

STA → python formatter.py
STA → bash script.sh
STA → custom binary

The tool receives text via stdin and returns processed output via stdout. This keeps the core small and stable.

Running Whisper on a Separate Machine

The latest
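To make the stdin/stdout contract from the architecture section concrete, here is a minimal sketch of what a post-processing tool such as formatter.py could look like. The file contents, the function name format_as_comment, and the prefix and width parameters are all assumptions for illustration; the post only says that the tool receives text via stdin and returns processed output via stdout. This version covers the "turn speech into code comments" use case.

```python
#!/usr/bin/env python3
"""Hypothetical formatter.py: turn a dictated transcript into comment lines.

Illustrative only. It assumes nothing about Forge STA beyond the contract
stated in the post: text arrives on stdin, the result goes to stdout.
"""
import sys
import textwrap


def format_as_comment(text: str, prefix: str = "# ", width: int = 72) -> str:
    """Collapse whitespace in the transcript and wrap it into comment lines."""
    cleaned = " ".join(text.split())
    if not cleaned:
        return ""  # nothing was dictated, so emit nothing
    # Reserve room for the prefix so each full line stays within `width`.
    wrapped = textwrap.wrap(cleaned, width=width - len(prefix))
    return "\n".join(prefix + line for line in wrapped) + "\n"


def main() -> None:
    transcript = sys.stdin.read()  # text handed over by the speech tool
    sys.stdout.write(format_as_comment(transcript))  # processed output


if __name__ == "__main__":
    main()
```

Because the script is an ordinary filter, it can be tried without any speech tool involved: `echo "fix the off by one error" | python formatter.py` prints `# fix the off by one error`. Swapping the prefix for `// ` or wrapping the text in HTML instead would only change this one file, which is the point of keeping post-processing external.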
Continue reading on Dev.to




