
Token0 v0.2.0: Streaming Support + Updated Benchmarks: 35-42% Savings Across 4 Vision Models
A few days ago I launched Token0 -- an open-source API proxy that makes vision LLM calls cheaper by optimizing images before they hit the model. The response was great, so here is the first real update: v0.2.0, with full streaming support and expanded benchmarks.

## What's New in v0.2.0

### 1. Streaming support (`stream=true`)

This was the most requested feature. Token0 now supports Server-Sent Events (SSE) streaming across all four providers -- OpenAI, Anthropic, Google, and Ollama.

How it works: Token0 optimizes your images before streaming begins, then tokens flow word by word exactly like the native provider APIs. You get the cost savings without sacrificing the real-time UX.

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # point the SDK at the Token0 proxy
    api_key="sk-...",
)

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image"},
            # placeholder URL -- the original snippet is truncated at this point
            {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
        ],
    }],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```
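For intuition on where image savings like this can come from, it helps to see how vision models bill images by tile. The sketch below is not Token0's actual code; it implements OpenAI's published high-detail token formula for gpt-4o-class models (scale to fit 2048x2048, scale the shortest side down to 768px, then 85 base tokens plus 170 per 512px tile), and shows how downscaling an image changes the tile count:

```python
import math

def gpt4o_image_tokens(width: int, height: int) -> int:
    """Estimate high-detail image tokens per OpenAI's documented tiling rule."""
    # Scale to fit within a 2048 x 2048 square.
    scale = min(1.0, 2048 / max(width, height))
    width, height = width * scale, height * scale
    # Scale down (never up) so the shortest side is at most 768px.
    scale = min(1.0, 768 / min(width, height))
    width, height = width * scale, height * scale
    # 85 base tokens + 170 tokens per 512px tile.
    tiles = math.ceil(width / 512) * math.ceil(height / 512)
    return 85 + 170 * tiles

print(gpt4o_image_tokens(1024, 1024))  # → 765 (4 tiles)
print(gpt4o_image_tokens(512, 512))    # → 255 (1 tile)
```

Shrinking an image below a tile boundary drops whole 170-token tiles at once, which is why pre-optimizing images in a proxy can cut cost without touching the prompt.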



