
Operating in Prompt Space: Red Teaming the Control Plane of an LLM
Before this post existed, it was a prompt. Before that, a response to a prompt. Before that, a reframing of a response. Somewhere between the fourth and sixth model pass (different systems, different temperatures, different instructions) the actual argument started to emerge. Not because any single model figured it out. Because the loop was allowed to run.

What you're reading was shaped by the thing it's analyzing. It moved through prompt space before it got here. I don't think that's a disclaimer. I think that's the first data point. This is not metaphorical.

What I Mean by Prompt Space

The way I think about it: prompt space is the entire input domain of a language model. Every piece of text it can receive and act on. Not a metaphor for "how you phrase things." The actual execution environment.

When I send a prompt, I'm operating in it. When someone crafts an injection, they're operating in it. When a model reasons about its own instructions, it's operating in it. From the model's int
Continue reading on Dev.to Webdev



