
Ollama Has a Free API — Run LLMs Locally Without OpenAI or Cloud Costs
## Run GPT-Level Models on Your Laptop

Ollama lets you run open-source LLMs (Llama 3, Mistral, Gemma, Phi) locally with a simple API. Free forever: no API keys, no rate limits, no cloud.

## Setup

```sh
# Install
curl -fsSL https://ollama.com/install.sh | sh

# Pull a model
ollama pull llama3.2   # 2GB, runs on most laptops
ollama pull mistral    # 4GB, great for coding
ollama pull phi3       # 1.7GB, fastest
```

## API (OpenAI-Compatible)

```python
import requests

def chat(prompt, model="llama3.2"):
    r = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
    )
    return r.json()["response"]

print(chat("Write a Python function to check if a number is prime"))
```

## Chat Conversations

```python
def chat_conversation(messages, model="llama3.2"):
    r = requests.post(
        "http://localhost:11434/api/chat",
        json={"model": model, "messages": messages, "stream": False},
    )
    return r.json()["message"]["content"]

messages = [{"role": "user", "content": "Explain list comprehensions in one sentence"}]
response = chat_conversation(messages)
print(response)
```
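One detail worth noting about the chat example above: `/api/chat` is stateless, so the model only ever sees the `messages` list you send it. Multi-turn memory means appending each reply and resending the whole history on every call. A minimal sketch, where `add_turn` is a hypothetical helper name and the commented-out calls assume the `chat_conversation` function and a running Ollama server from the example above:

```python
def add_turn(history, role, content):
    """Append one message dict to the running conversation history."""
    history.append({"role": role, "content": content})
    return history

# Hypothetical multi-turn flow (uncomment once an Ollama server is running):
history = []
add_turn(history, "user", "My name is Sam. Remember it.")
# reply = chat_conversation(history)      # first model reply
# add_turn(history, "assistant", reply)   # keep the reply in the history
# add_turn(history, "user", "What is my name?")
# print(chat_conversation(history))       # the model now sees every prior turn
```

Forgetting to append the assistant's own replies is a common bug: the model then answers each question with no memory of what it previously said.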
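Both examples set `"stream": False` for simplicity, but Ollama streams by default: the server returns one JSON object per line as tokens are generated, ending with an object whose `"done"` field is true. A rough sketch of consuming that stream (`stream_generate` and `parse_chunk` are hypothetical names, and the default `localhost:11434` server is assumed):

```python
import json
import requests

def parse_chunk(line):
    """Parse one NDJSON line from Ollama's streaming response.

    Returns (token_text, done_flag)."""
    chunk = json.loads(line)
    return chunk.get("response", ""), chunk.get("done", False)

def stream_generate(prompt, model="llama3.2"):
    """Yield tokens as they arrive instead of waiting for the full reply."""
    with requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": prompt, "stream": True},
        stream=True,
    ) as r:
        r.raise_for_status()
        for line in r.iter_lines():
            if not line:
                continue
            token, done = parse_chunk(line)
            yield token
            if done:
                break

# Usage (requires a running Ollama server):
# for token in stream_generate("Tell me a joke"):
#     print(token, end="", flush=True)
```

Streaming matters for interactive use: with a 7B model on a laptop, a long answer can take many seconds to finish, and printing tokens as they arrive makes the wait feel much shorter.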