
The Responsible Disclosure Problem in AI Safety Research
This is research in progress, ongoing since 2024. The most important findings are the ones we can't publish.

People might wonder, "What's the big deal with AI safety, huh?" Oh, my poor sweet child, if only you knew. And the problem is that you don't know. And that's not even your fault. I asked myself this question too. What's the big deal? I was in for a surprise when I started to dig in. The deal is YUUUGE. Aaaaand... I can't show it. All I can do is try to convince you that it is. Deepfakes? A slightly racist interpretation of an event? A bit of sexist content? Not to minimize these problems, which are real, but they aren't much compared to what an unhinged, unsafe AI can do.

In traditional cybersecurity, responsible disclosure is a mature practice. You find a vulnerability, you notify the vendor, you wait for a patch, then you publish. The ecosystem has CVEs, coordinated disclosure frameworks, and bug bounties. It's imperfect but functional. AI safety research has none of that infrastructure.

