
How I bypassed PyTorch OOM errors with a Zero-Copy C++ Graph Engine
If you have ever tried to train a Graph Neural Network (GNN) on a massive dataset, you already know the pain of the "Memory Wall." Loading a dataset like Papers100M into PyTorch Geometric almost always ends the exact same way on a standard machine: an instant 24GB+ Out-Of-Memory (OOM) allocation crash. Standard libraries try to load the entire edge list and feature matrix into RAM before moving it to the GPU.

I got tired of my laptop crashing, so I built GraphZero (v0.2.0): a custom C++ data engine that bypasses system RAM entirely and streams datasets natively from the SSD. Here is how I built a zero-copy pipeline that lets PyTorch train on 30GB of data while allocating 0 bytes of RAM.

🧠 The Architecture: mmap and Zero-Copy

The core philosophy of GraphZero is simple: let the Operating System do the heavy lifting. Instead of parsing CSVs into Python lists or Pandas DataFrames, GraphZero compiles raw data into two heavily optimized binary formats: .gl files: Stores the graph topology (e
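
To make the mmap idea concrete, here is a minimal sketch in Python using only the standard library. It is not GraphZero's code, and it does not reproduce the .gl format, whose layout is not described in this excerpt. The flat file of native 64-bit (src, dst) pairs and the names write_edges, MappedEdges and demo are all hypothetical, chosen only to illustrate the technique: the file is mapped once, and the operating system pages data in from disk when an edge is actually read.

```python
import array
import mmap
import os
import tempfile


def write_edges(path, edges):
    """Write (src, dst) pairs as a flat run of native 64-bit signed ints.

    Hypothetical toy layout for illustration only, not the .gl format.
    """
    flat = array.array("q", [node for edge in edges for node in edge])
    with open(path, "wb") as f:
        flat.tofile(f)


class MappedEdges:
    """Read-only view over an edge file, backed by mmap instead of a list."""

    def __init__(self, path):
        self._file = open(path, "rb")
        # Length 0 maps the whole file; pages are loaded lazily by the OS.
        self._mm = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)
        # Reinterpret the mapped bytes as int64 without copying them.
        self._raw = memoryview(self._mm)
        self._view = self._raw.cast("q")

    def __len__(self):
        return len(self._view) // 2

    def edge(self, i):
        """Return the i-th (src, dst) pair, read straight from the mapping."""
        return self._view[2 * i], self._view[2 * i + 1]

    def close(self):
        # Views must be released before the mapping can be closed.
        self._view.release()
        self._raw.release()
        self._mm.close()
        self._file.close()


def demo():
    """Round-trip a three-edge toy graph through a temporary file."""
    edges = [(0, 1), (1, 2), (2, 0)]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "toy_edges.bin")
        write_edges(path, edges)
        mapped = MappedEdges(path)
        try:
            return [mapped.edge(i) for i in range(len(mapped))]
        finally:
            mapped.close()
```

Calling demo() reads every edge back through the mapping. On a real dataset the same pattern means only the pages a batch touches are ever read from the SSD, and the process never builds a full in-memory copy of the edge list.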


