
Sleep-time compute for personal data: what your AI should do while you sleep
Your personal AI assistant sits idle most of the day. You send it a message, it responds, then it waits. For hours. Maybe all night. The compute is there: the model is loaded, the database is running, the server is warm. But nothing happens until you type the next message.

That's test-time compute: work done when the user asks for it. Letta's research (arXiv:2504.13171) showed that shifting processing to idle periods, called sleep-time compute, achieves 5× fewer tokens at test time and 15% more correct answers. But their implementation only processes conversation memory. Nobody had applied it to structured personal data. We did.

The idea

Instead of the agent doing all its thinking when you ask a question, it does most of the thinking in the background, during idle periods when you're not using the system. When you finally ask "what's going on with Project Tempest?", the answer is already half-assembled.

The system maintains four background jobs that run every 30 minutes, but only when you



