
Benchmarking Polars, DuckDB & Dask for RADIS: My GSoC 2026 Proposal Deep Dive
I've spent the last few months diving deep into RADIS, one of the fastest open-source line-by-line spectroscopic codes available. RADIS can simulate high-resolution infrared spectra of molecules like CO₂, H₂O, and CH₄, and it's used by researchers studying combustion diagnostics, exoplanet atmospheres, and plasma physics. But while contributing to the codebase, I discovered a critical problem hiding underneath the performance, one that could eventually break RADIS for large databases entirely.

This post is my deep dive into that problem, the solution I'm proposing for GSoC 2026, and the technical work I've already done to prove it works.

The Problem: RADIS is Sitting on a Time Bomb

1. Vaex is Unmaintained

RADIS currently uses Vaex for lazy loading of large spectroscopic databases. Vaex is a brilliant library that uses memory mapping and zero-copy lazy computations to handle datasets that don't fit in RAM. But here's the uncomfortable truth: Vaex is no longer actively maintained. The



