
How to Extract Structured Contact Data from Messy Emails Using AI (and Validate Italian VAT Numbers)
As developers, we’ve all been there: a client asks you to build a system to capture leads from incoming emails, WhatsApp messages, or a generic "Contact Us" text area. You expect structured data, but what you actually get from users is this:

"Hi, I'm Mario Rossi from Milan. I need a quote. You can call me at 333 12 34 567. My company VAT is 12345678901. Thanks."

Good luck parsing that with Regex! 😅 Phone numbers have random spaces, names are mixed with cities, and validating the VAT number usually requires writing a custom Modulo 10 algorithm.

The Solution: AI + Mathematical Validation

I got tired of maintaining fragile regular expressions, so I decided to build a dedicated backend using Node.js, Express, and OpenAI's GPT-4o-mini. The goal was simple: send raw text in, get a guaranteed clean JSON out.

Instead of just relying on the LLM to guess if a VAT number is valid, I built a hybrid system:

- The AI extracts the entities (Name, Phone, City, VAT, Intent).
- The Node.js backend processes
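
The excerpt cuts off before showing the backend code, so here is a minimal sketch of what the deterministic half of such a hybrid system could look like. It assumes the LLM has already returned an object with `name`, `city`, `phone`, `vat`, and `intent` fields. The function names `isValidItalianVat` and `validateExtractedLead` and the `vatValid` field are hypothetical and not taken from the original project. The check itself is the standard control-digit rule for the 11-digit Partita IVA, a Luhn-style modulo 10 sum.

```javascript
// Hypothetical helper (not from the original article): verifies the control
// digit of an 11-digit Italian VAT number (Partita IVA) with a modulo 10 check.
function isValidItalianVat(raw) {
  // Remove whitespace and an optional "IT" country prefix.
  const vat = String(raw).replace(/\s+/g, "").replace(/^IT/i, "");
  if (!/^\d{11}$/.test(vat)) return false;

  let sum = 0;
  for (let i = 0; i < 10; i++) {
    let digit = Number(vat[i]);
    // Digits in even positions (2nd, 4th, ...) are doubled;
    // a doubled value above 9 has 9 subtracted from it.
    if (i % 2 === 1) {
      digit *= 2;
      if (digit > 9) digit -= 9;
    }
    sum += digit;
  }

  // The 11th digit must bring the total up to a multiple of 10.
  const expectedCheckDigit = (10 - (sum % 10)) % 10;
  return expectedCheckDigit === Number(vat[10]);
}

// Hypothetical post-processing step: takes the entities the LLM extracted
// and adds deterministic checks on top of them.
function validateExtractedLead(lead) {
  return {
    ...lead,
    // Simplification: keep digits only, which also drops a leading "+".
    phone: lead.phone ? lead.phone.replace(/\D/g, "") : null,
    vatValid: lead.vat ? isValidItalianVat(lead.vat) : false,
  };
}

// The sample message from the article, as the LLM might have extracted it.
const lead = validateExtractedLead({
  name: "Mario Rossi",
  city: "Milan",
  phone: "333 12 34 567",
  vat: "12345678901",
  intent: "quote",
});

console.log(lead.phone, lead.vatValid); // prints: 3331234567 false
```

Note that the sample VAT number from the message, 12345678901, fails the check: its first ten digits call for a control digit of 3, so 12345678903 would pass. This is the kind of error an LLM can easily miss, which is the case for doing the arithmetic in plain code.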




