
SoyLM: Building a Zero-Dependency Local RAG Tool in a Single Python File
SoyLM started as a simple idea: build a RAG (Retrieval-Augmented Generation) tool that runs entirely on your machine. No cloud APIs. No vector databases. No Docker containers. Just one Python file.

Then someone pointed out that my README said "NotebookLM compatible" when it had nothing to do with NotebookLM. Which led to a documentation rewrite. Which led to removing Gemini API dependencies I'd forgotten about. Which led to rethinking the entire project identity. Building the tool took a weekend. Figuring out what it actually is took much longer.

What SoyLM Actually Does

SoyLM lets you upload documents (PDFs, text files, URLs, YouTube videos), then have a conversation about them with a local LLM. Behind the scenes:

- Source ingestion: documents are chunked, indexed in SQLite FTS5, and pre-analyzed by the LLM
- Query processing: your question triggers a BM25 search to find relevant chunks
- Response generation: the LLM receives your question plus the relevant chunks and generates a grounded response
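The ingestion-and-search half of that pipeline can be sketched with nothing but the standard library, which is what makes the zero-dependency claim plausible. This is a minimal illustration, not SoyLM's actual code: the schema, `chunk`, `ingest`, and `search` names are hypothetical, and it assumes your Python's sqlite3 was compiled with FTS5 (most are). FTS5 ranks `MATCH` results by BM25 when you order by `rank`.

```python
import sqlite3

# Hypothetical sketch of a SoyLM-style local index: chunk text,
# store it in an SQLite FTS5 virtual table, retrieve by BM25.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE chunks USING fts5(source, body)")

def chunk(text, size=400):
    """Split a document into fixed-size word chunks (illustrative sizing)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def ingest(source, text):
    """Chunk a document and index every chunk under its source name."""
    conn.executemany("INSERT INTO chunks VALUES (?, ?)",
                     [(source, c) for c in chunk(text)])

def search(query, k=3):
    """Return the top-k chunks; FTS5's `rank` is BM25-based."""
    rows = conn.execute(
        "SELECT body FROM chunks WHERE chunks MATCH ? ORDER BY rank LIMIT ?",
        (query, k))
    return [r[0] for r in rows]

def build_prompt(question, hits):
    """Assemble the grounded prompt sent to the local LLM."""
    context = "\n---\n".join(hits)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

The missing piece, pre-analysis of sources and the call to the local model, is where an actual LLM backend would plug in; everything above runs without one.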
Continue reading on Dev.to




