I tried parsing emails with regex. It went exactly how you think.

via Dev.to · Nikola Mitrovic

Recently I needed to process incoming emails automatically. The idea sounded simple:

Email arrives → extract some fields → trigger a webhook

Things like:

- order confirmations
- invoice emails
- shipping notifications
- support messages

Nothing complicated. Or so I thought.

Attempt #1: Regex

Like most developers, I started with regex.

    const price = email.match(/Total:\s\$(\d+)/)

For the first email it worked perfectly. Then the next email came in and said:

    Amount paid: $29

Then another one said:

    Total price: USD 29

Then an HTML email arrived with nested tables, inline styles, and formatting from what looked like 2004 Outlook templates.

At this point my regex slowly evolved into something like this:

    /(Total|Amount|Price).*?(\$|USD)?\s?(\d+(\.\d+)?)/

Which is usually the moment you realize the approach is already doomed.

Attempt #2: Parsing the HTML

Okay fine. Let's parse the HTML instead. That led to code like this:

    const dom = new JSDOM(emailHtml)

Which sometimes worked. Except email HTML
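To make the regex problem from Attempt #1 concrete, here is a minimal sketch that runs both patterns quoted above against the three sample lines from the post, plus two inputs that are purely hypothetical (they do not appear in the original emails). The helper name `extractAmount` is also an assumption made for illustration. The sketch suggests that the evolved pattern handles all three quoted samples yet quietly returns the wrong number as soon as the wording or number format shifts.

```javascript
// Patterns quoted in the article.
const firstPattern = /Total:\s\$(\d+)/;
const evolvedPattern = /(Total|Amount|Price).*?(\$|USD)?\s?(\d+(\.\d+)?)/;

// Hypothetical helper: returns the captured amount as a string, or null
// when the pattern does not match at all.
function extractAmount(pattern, text, group) {
  const match = text.match(pattern);
  return match ? match[group] : null;
}

// The three sample lines quoted in the article.
const samples = ["Total: $29", "Amount paid: $29", "Total price: USD 29"];

// Hypothetical inputs (not from the article) that expose silent misreads.
const tricky = ["Total items: 3, amount due: $29", "Total: $1,299.00"];

for (const text of [...samples, ...tricky]) {
  const first = extractAmount(firstPattern, text, 1);     // amount is capture group 1
  const evolved = extractAmount(evolvedPattern, text, 3); // amount is capture group 3
  console.log(`${text} -> first: ${first}, evolved: ${evolved}`);
}
```

A `null` at least tells you the pattern missed. The harder failure is a match that captures the wrong digits, because the webhook would then fire with a wrong amount and nothing would flag it.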

Continue reading on Dev.to
