I wanted to build an Agent Memory System and blundered my way into 92% on LongMemEval

Shane Farkas, via Dev.to

Like most users of AI agents such as Claude Code, I have been frustrated by the agent memory problem. The models have gotten extremely good and no longer lose focus within one long conversation the way they used to, but across sessions memory is spotty: a conversation with an LLM recalls imperfect or irrelevant details from previous chats, and a new Claude Code session feels like Groundhog Day, onboarding a brand-new employee who is smart and talented but knows nothing about my world.

So I started looking into memory systems. I tried a folder of markdown files, Obsidian vaults, and so on, but every AI memory system I tried had the same design: dump text into a vector store, retrieve by cosine similarity, hope for the best. That works fine for "what did we talk about last week?" but falls apart the moment you need real reasoning — when facts contradict each other, when the answer requires connecting information from three different conversations, or wh
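The naive pattern described above — embed each memory, store it, rank by cosine similarity at query time — can be sketched in a few lines. This is a toy illustration, not any particular system's implementation: a bag-of-words counter stands in for a real embedding model, and the memory strings are invented. It shows the contradiction problem directly — both conflicting facts rank highest, and similarity alone gives no signal about which one is current.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; a real system would use a neural embedding model.
    return Counter(text.lower().split())

def cosine(a, b):
    # Standard cosine similarity over sparse count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical stored memories; the first two contradict each other.
memories = [
    "user prefers dark mode in the editor",
    "user switched to light mode last week",
    "user's project uses Python 3.12",
]

def retrieve(query, k=2):
    # Pure similarity ranking: no recency, no conflict resolution.
    scored = sorted(memories, key=lambda m: cosine(embed(query), embed(m)), reverse=True)
    return scored[:k]

results = retrieve("what editor mode does the user prefer?")
print(results)  # both contradictory mode facts come back at the top
```

The retriever happily returns both the "dark mode" and "light mode" facts, because nothing in cosine similarity encodes time, supersession, or entity identity — exactly the gap that pushes toward a more structured memory design.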

Continue reading on Dev.to


