Scaling LLMs at the Edge: A journey through distillation, routers, and embeddings

I have extensively edited this article after an LLM agent combed through my codebase and prepared the initial draft. Originally published at the Sisyphus Consulting blog.

At Sisyphus Consulting, we recently launched a unique product: physical facilitation cards plus digital tools for virtual facilitation. They're named Wu Wei Cards. But this write-up is not about the product. I want to share the behind-the-scenes story of how I tinkered with LLMs and embeddings, and the trial and error along the way. If you're building something in the AI space, I hope this is helpful to you.

First, let me give some background so that you know the WHATs and WHYs.

The Product and the Constraint

Wu Wei Cards is a deck of 50 hand-drawn metaphorical cards for facilitators, coaches, and therapists. People use them in workshops to help participants reflect, open up, and explore ideas through objects and images rather than direct questioning. This is what they look like:

Anyone who buys the p
