
# How I Run 6 AI Services Simultaneously on RTX 5090 + WSL2 + Docker (And You Can Too)

**TL;DR:** I built a multi-service local AI stack (image generation, video generation, voice synthesis, voice cloning) running on an RTX 5090 via Docker under WSL2. The key breakthrough was solving the GPU driver passthrough layer that nobody had documented. Here's the architecture, the critical `gpu-run` function, and everything I learned the hard way.

## The Problem Nobody Solved

In August 2025, I bought an RTX 5090. Blackwell architecture. 32 GB GDDR7. Compute capability `sm_120`. And nobody could make it work with WSL2 + Docker + PyTorch.

The issue wasn't any single component. `nvidia-smi` worked fine in containers. `libcuda.so.1` loaded correctly. But PyTorch kept returning `torch.cuda.is_available() = False` with a cryptic `Error 500: named symbol not found`.

I spent roughly 40 hours debugging. Here's what I found, and how I turned it into a production multi-service AI environment.

## The Root Cause

The failure point was the interaction layer between WSL2's driver mounting and Docker's GPU runtime. When you run `--gpus all`
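The failure mode described above is confusing precisely because each layer can pass its own check while the stack as a whole is broken. A minimal sketch of a layer-by-layer diagnostic (the function name `diagnose_gpu_stack` is illustrative, not from the article; it only requires PyTorch for the final check):

```python
# Hedged diagnostic sketch: probe each layer of the GPU stack separately,
# since nvidia-smi can succeed while PyTorch's CUDA init still fails.
import ctypes.util
import shutil
import subprocess


def diagnose_gpu_stack():
    report = {}
    # Layer 1: can the dynamic loader find the driver library
    # (the WSL2-mounted libcuda stub)?
    report["libcuda_found"] = ctypes.util.find_library("cuda") is not None
    # Layer 2: is nvidia-smi on PATH and does it exit cleanly?
    smi = shutil.which("nvidia-smi")
    report["nvidia_smi_ok"] = (
        smi is not None
        and subprocess.run([smi], capture_output=True).returncode == 0
    )
    # Layer 3: does PyTorch actually initialize CUDA? This is the check
    # that returned False with "Error 500: named symbol not found".
    try:
        import torch
        report["torch_cuda"] = torch.cuda.is_available()
    except ImportError:
        report["torch_cuda"] = None  # torch not installed in this env
    return report


if __name__ == "__main__":
    for key, value in diagnose_gpu_stack().items():
        print(f"{key}: {value}")
```

Running this inside the container makes it obvious which boundary is broken: layers 1 and 2 passing while layer 3 fails points at the driver/runtime interaction rather than at PyTorch itself.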
Continue reading on Dev.to


