The model looked great on validation until one real invoice broke four assumptions

via Dev.to (angu10)

An empirical note on what synthetic invoice data taught a Gemma fine-tune, what it hid, and how one real document exposed the gap.

I fine-tuned a small Gemma model to parse Indian invoices because I wanted a path that was cheaper, more private, and easier to deploy than calling a hosted API for every document. The training metrics looked excellent. Then I ran the model on one real invoice. It got the total right, the supplier right, and the address right, and it still failed in four ways that would make the output unusable in a real finance workflow. That one invoice was more useful than another few hundred synthetic examples.

None of the headline conclusions here are new to anyone with ML experience:

- synthetic data has a domain gap
- synthetic validation can be overly optimistic
- real data changes what you trust

What felt worth documenting was the concrete shape of the failure:

- which fields broke first
- which assumptions in the synthetic distribution caused it
- what the training curves looked like befo…
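The excerpt cuts off before the four failure modes are named, but the pattern it describes (individual fields extracted correctly, output still unusable downstream) points at structural validation of the parsed result rather than per-field accuracy alone. As a minimal sketch, with entirely hypothetical field names (`supplier`, `total`, `line_items`, `gstin`) standing in for whatever schema the fine-tune actually emits, a post-parse sanity check might look like:

```python
import re

# Hypothetical output schema: the article's actual field names are not
# shown in this excerpt, so everything below is an illustrative assumption.
REQUIRED_FIELDS = {"supplier", "address", "total", "line_items"}

# Commonly cited GSTIN shape: 2-digit state code, 10-char PAN, entity
# code, literal 'Z', checksum character. Treat as a plausibility filter,
# not an authoritative validator.
GSTIN_RE = re.compile(r"^\d{2}[A-Z]{5}\d{4}[A-Z][A-Z\d]Z[A-Z\d]$")

def validate_invoice(parsed: dict) -> list[str]:
    """Return human-readable problems; an empty list means 'plausible'."""
    problems = []

    missing = REQUIRED_FIELDS - parsed.keys()
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")

    # A field can be present but unusable: a total like 'N/A' or '1,180'
    # passes string-level checks yet breaks numeric consumers.
    total = None
    try:
        total = float(str(parsed.get("total", "")).replace(",", ""))
    except ValueError:
        problems.append(f"total is not numeric: {parsed.get('total')!r}")

    # Cross-field consistency: line items should roughly sum to the total.
    items = parsed.get("line_items") or []
    if total is not None and items:
        items_sum = sum(float(i.get("amount", 0)) for i in items)
        if abs(items_sum - total) > 0.01 * max(total, 1.0):
            problems.append(f"line items sum to {items_sum}, total says {total}")

    gstin = parsed.get("gstin")
    if gstin and not GSTIN_RE.match(gstin):
        problems.append(f"GSTIN fails format check: {gstin!r}")

    return problems
```

Checks like these catch the "fields look right, document is wrong" class of failure that per-field validation accuracy on synthetic data never exercises.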

Continue reading on Dev.to
