FlareStart
Best LLMs for Ollama on 16GB VRAM GPU

via Dev.to · Rost · 1mo ago

Running large language models locally gives you privacy, offline capability, and zero API costs. This benchmark shows what to expect from 9 popular LLMs running on Ollama on an RTX 4080. With a 16GB VRAM GPU, I faced a constant trade-off: bigger models with potentially better quality, or smaller models with faster inference.

TL;DR

Here is the comparison table of LLM performance on RTX 4080 16GB with Ollama 0.15.2:

Model                  RAM+VRAM Used   CPU/GPU Split   Tokens/sec
---------------------  --------------  --------------  ----------
gpt-oss:20b            14 GB           100% GPU        139.93
ministral-3:14b        13 GB           100% GPU        70.13
qwen3:14b              12 GB           100% GPU        61.85
qwen3-vl:30b-a3b       22 GB           30%/70%         50.99
glm-4.7-flash          21 GB           27%/73%         33.86
nemotron-3-nano:30b    25 GB           38%/62%         32.77
devstral-small-2:24b   19 GB           18%/82%         18.67
mistral-small3.2:24b   19 GB           18%/82%         18.51
gpt-oss:120b           66 GB           78%/22%         12.64

Key insight: Models that fit entirely in VRAM are dramatically faster. GPT-OSS 20B achieves 139.93 tokens/sec, while GPT-OSS 120B with heavy CPU offloading crawls at 12.64 tokens/sec, an 11x speed difference.
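
The arithmetic behind the key insight can be checked with a short sketch. The figures are copied from the comparison table; the names `RESULTS`, `speedup`, and `tokens_per_second` are illustrative and not from the article. The excerpt does not say how throughput was measured, so the `tokens_per_second` helper is only an assumption: it derives the rate from the `eval_count` and `eval_duration` (nanoseconds) fields that Ollama's `/api/generate` response reports.

```python
# Figures copied from the comparison table (RTX 4080 16GB, Ollama 0.15.2).
# Each entry: model -> (ram_plus_vram_gb, gpu_percent, tokens_per_sec)
RESULTS = {
    "gpt-oss:20b": (14, 100, 139.93),
    "ministral-3:14b": (13, 100, 70.13),
    "qwen3:14b": (12, 100, 61.85),
    "qwen3-vl:30b-a3b": (22, 70, 50.99),
    "glm-4.7-flash": (21, 73, 33.86),
    "nemotron-3-nano:30b": (25, 62, 32.77),
    "devstral-small-2:24b": (19, 82, 18.67),
    "mistral-small3.2:24b": (19, 82, 18.51),
    "gpt-oss:120b": (66, 22, 12.64),
}

VRAM_GB = 16  # capacity of the RTX 4080 used in the benchmark


def speedup(fast: str, slow: str) -> float:
    """Ratio of tokens/sec between two models in RESULTS."""
    return RESULTS[fast][2] / RESULTS[slow][2]


def tokens_per_second(response: dict) -> float:
    """Generation throughput from an Ollama /api/generate response.

    Assumption: the rate is eval_count (tokens generated) divided by
    eval_duration (nanoseconds); the article's own method is not stated.
    """
    seconds = response["eval_duration"] / 1e9
    return response["eval_count"] / seconds


# Split the models by whether they ran entirely on the GPU.
fully_on_gpu = [m for m, (_, gpu, _) in RESULTS.items() if gpu == 100]
offloaded = [m for m in RESULTS if m not in fully_on_gpu]

if __name__ == "__main__":
    print(f"gpt-oss 20b vs 120b: {speedup('gpt-oss:20b', 'gpt-oss:120b'):.1f}x")
    print("fully on GPU:", ", ".join(fully_on_gpu))
    print("partly offloaded to CPU:", ", ".join(offloaded))
```

In this data set every model that reports 100% GPU uses at most 16 GB, and the slowest of them (qwen3:14b at 61.85 tokens/sec) still outpaces the fastest partly offloaded model (qwen3-vl:30b-a3b at 50.99 tokens/sec).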

Continue reading on Dev.to
