
We Started With an Image Editor. We Ended Up Building Infrastructure
Last year, we started building an object-based image editor. The idea was ambitious: let users work with images as layered compositions instead of flat outputs. Move objects around, replace parts of a scene, generate edits with actual control. Something closer to creative work than typing a prompt and crossing your fingers. Sounded hard, but we had a few promising ideas.

What we didn't expect: the editor wasn't the hard part. Everything underneath it was.

Once we got deeper in, we realized building on top of image models is way messier than it looks from the outside. Not because the models are bad — because the infrastructure around them is fragmented in ways that hurt the moment you try to ship something real.

"Just send a prompt, get an image back." Sure, in a demo. In practice, one workflow might chain four or five different steps. Text-to-image generation. Inpainting. Object segmentation. Upscaling. Maybe a style transfer pass. And the moment you pick the best model for each step,
Continue reading on Dev.to



