
Cloudflare Workers AI Has a Free API: Run AI Models at the Edge with Zero Infrastructure
What is Workers AI?

Workers AI lets you run AI models on Cloudflare's edge network: text generation, image classification, embeddings, speech-to-text, translation, and more. No GPU provisioning, no model hosting.

Free tier: 10,000 neurons/day (enough for ~100+ requests).

Quick Start

    npm create cloudflare@latest my-ai-app -- --template worker-typescript
    cd my-ai-app

    # wrangler.toml
    [ai]
    binding = "AI"

Text Generation (LLM)

    export default {
      async fetch(request: Request, env: Env) {
        const response = await env.AI.run("@cf/meta/llama-3.1-8b-instruct", {
          messages: [
            { role: "system", content: "You are a helpful assistant." },
            { role: "user", content: "Explain WebAssembly in 3 sentences." },
          ],
          max_tokens: 256,
        });
        return Response.json(response);
      },
    };

Streaming Responses

    const stream = await env.AI.run("@cf/meta/llama-3.1-8b-instruct", {
      messages: [{ role: "user", content: "Write a poem about coding" }],
      stream: true,
    });
    return new Re
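On the consuming side, a streamed reply has to be reassembled into text. The sketch below is one way that might look, assuming the stream arrives as server-sent events where each line has the form `data: {"response":"..."}` and the final event is `data: [DONE]`. The helper name `parseSseChunk` and the `SseResult` type are hypothetical and not part of the Workers AI API.

```typescript
// Hypothetical helper, not part of the Workers AI API.
// Assumption: each event line looks like `data: {"response":"..."}` and the
// stream ends with `data: [DONE]`.
interface SseResult {
  text: string; // concatenated `response` fragments found in this chunk
  done: boolean; // true once the `[DONE]` sentinel has been seen
}

function parseSseChunk(chunk: string): SseResult {
  let text = "";
  let done = false;

  for (const rawLine of chunk.split("\n")) {
    const line = rawLine.trim();
    // Skip blank separator lines and anything that is not a data field.
    if (!line.startsWith("data:")) continue;

    const payload = line.slice("data:".length).trim();
    if (payload === "[DONE]") {
      done = true;
      break;
    }

    try {
      const parsed = JSON.parse(payload) as { response?: unknown };
      if (typeof parsed.response === "string") {
        text += parsed.response;
      }
    } catch {
      // Ignore payloads that are not complete JSON, for example an event
      // that was split across two network chunks.
    }
  }

  return { text, done };
}
```

A client would typically decode each network chunk with `TextDecoder` and pass the resulting string to `parseSseChunk`, appending `text` to the output until `done` is true. Because an event can be split across chunks, a more robust version would buffer the trailing partial line instead of dropping it.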
Continue reading on Dev.to Tutorial



