
# I built a validation pipeline that blocks AI-generated files from reaching disk if they fail schema checks
## The problem

I've been using local LLMs to generate structured Markdown knowledge files: architecture docs, runbooks, API references. After a few hundred files, the knowledge base becomes noise. Wrong field types. Invalid enum values. Dates in the wrong format. Domains that don't exist in the taxonomy. Dataview queries return nothing. The graph becomes useless.

The issue isn't the model. It's that there's no contract between "LLM output" and "file that reaches disk."

## The solution

A validation gate, AKF, sits between the LLM and the filesystem:

Prompt → LLM → Validation Engine → Error Normalizer → Retry Controller → Commit Gate → File

1. The LLM generates a Markdown file with YAML frontmatter.
2. The Validation Engine checks it: binary VALID/INVALID, with typed error codes (E001–E007).
3. If invalid, the Error Normalizer translates the errors into correction instructions and sends them back to the LLM.
4. The Retry Controller retries up to 3 times and aborts if the same error fires twice (prevents infinite cost loops).
5. The Commit Gate …
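The loop above can be sketched in a few lines of Python. This is a minimal illustration, not the post's actual implementation: the error codes, the allowed-domain taxonomy, and the `generate` callback are all hypothetical stand-ins, and a real Validation Engine would check the full frontmatter schema rather than two fields.

```python
import re

# Hypothetical codes mirroring the post's E001–E007 scheme (assumed meanings).
E001_MISSING_FIELD = "E001"
E002_BAD_DATE = "E002"
E003_BAD_ENUM = "E003"

ALLOWED_DOMAINS = {"architecture", "runbook", "api-reference"}  # assumed taxonomy


def validate(frontmatter: dict) -> list[str]:
    """Validation Engine: binary verdict via a list of typed error codes.

    An empty list means VALID; anything else is INVALID.
    """
    errors = []
    if "domain" not in frontmatter:
        errors.append(E001_MISSING_FIELD)
    elif frontmatter["domain"] not in ALLOWED_DOMAINS:
        errors.append(E003_BAD_ENUM)
    # Dates must be ISO 8601 (YYYY-MM-DD), per the "wrong format" failure mode.
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", str(frontmatter.get("date", ""))):
        errors.append(E002_BAD_DATE)
    return errors


def commit_with_retries(generate, max_retries: int = 3) -> dict:
    """Retry Controller + Commit Gate.

    `generate(feedback)` is the LLM call (stubbed here); `feedback` carries the
    Error Normalizer's correction instructions. Aborts early if the exact same
    error set fires twice in a row, to avoid an infinite cost loop.
    """
    feedback, last_errors = None, None
    for _ in range(max_retries):
        frontmatter = generate(feedback)
        errors = validate(frontmatter)
        if not errors:
            return frontmatter  # Commit Gate: only valid output reaches disk
        if errors == last_errors:
            raise RuntimeError(f"aborting: repeated errors {errors}")
        last_errors = errors
        feedback = f"fix these errors before regenerating: {errors}"
    raise RuntimeError("max retries exceeded")
```

A stub generator shows the control flow: a generator that returns valid frontmatter commits on the first pass, while one that keeps emitting the same invalid output trips the repeated-error abort before the retry budget is spent.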
Continue reading on Dev.to

