
Docker Model Runner Brings vLLM to macOS with Apple Silicon
via Docker Blog, by Yiwen Xu
vLLM has quickly become the go-to inference engine for developers who need high-throughput LLM serving. We first brought vLLM to Docker Model Runner for NVIDIA GPUs on Linux, then extended it to Windows via WSL2, which still left macOS without a vLLM backend. That changes today: Docker Model Runner now supports vllm-metal, a new backend that brings vLLM inference to macOS using Apple Silicon's...

