
Three API Calls That Make Your LLM Workflow Dramatically Better
Building on top of LLMs is mostly a plumbing problem. The model itself is capable; the hard part is everything around it: managing context, crafting prompts that reliably produce good output, and dealing with all the unstructured text that needs to become structured data. Three tools in my stack handle the most common versions of these problems. Here's how I use them.

Problem 1: Context Overflows — Token Counter

If you're building any agent that processes documents, maintains conversation history, or chains multiple LLM calls, you've hit this error:

    openai.BadRequestError: This model's maximum context length is 128000 tokens. However, your messages resulted in 134521 tokens.

The naive fix is to truncate blindly. The real fix is to count tokens before you make the call and make a decision. The problem is that token counting is model-specific, non-obvious, and the libraries that do it (tiktoken, @dqbd/tiktoken) have quirks around special tokens, system prompts, and multi-turn conversations.
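The pre-flight check described above can be sketched like this. This is a minimal illustration, not the article's code: `trim_to_budget`, `total_tokens`, the `PER_MESSAGE_OVERHEAD` fudge factor, and the reserve figures are all assumptions. The token counter is pluggable so the sketch stays self-contained; in production you would pass in something like `enc = tiktoken.encoding_for_model("gpt-4o")` and `count_tokens = lambda text: len(enc.encode(text))`.

```python
from typing import Callable

# Rough allowance for role/formatting tokens per chat message (assumption;
# the exact overhead varies by model and is one of the quirks noted above).
PER_MESSAGE_OVERHEAD = 4

def total_tokens(messages: list[dict], count_tokens: Callable[[str], int]) -> int:
    """Estimate the token footprint of a chat-completions message list."""
    return sum(count_tokens(m["content"]) + PER_MESSAGE_OVERHEAD for m in messages)

def trim_to_budget(messages: list[dict],
                   count_tokens: Callable[[str], int],
                   max_context: int = 128_000,
                   reserve_output: int = 4_000) -> list[dict]:
    """Drop the oldest non-system messages until the request fits.

    Keeping the system prompt and the most recent turns is usually the
    least destructive truncation strategy for conversational agents.
    """
    budget = max_context - reserve_output
    kept = list(messages)
    while total_tokens(kept, count_tokens) > budget:
        # Find the oldest non-system message and drop it.
        for i, m in enumerate(kept):
            if m["role"] != "system":
                del kept[i]
                break
        else:
            break  # only system messages left; nothing more to drop
    return kept

# Example with a stand-in whitespace counter (a real app would use tiktoken):
msgs = [
    {"role": "system", "content": "be brief"},
    {"role": "user", "content": "one two three four five"},
    {"role": "user", "content": "six seven"},
]
count = lambda s: len(s.split())
trimmed = trim_to_budget(msgs, count, max_context=30, reserve_output=10)
```

The key design choice is counting against `max_context - reserve_output` rather than the raw window, so the model still has room to generate a response after your prompt is accepted.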
Continue reading on Dev.to


