
Parsing bank transaction strings is way harder than you think
Look at this string:

POS 1234 AMZN Mktp NL 29.99 EUR

Seems easy to parse, right? You see "AMZN," you think Amazon, done.

Now try extracting all of this:

- The actual merchant name
- What category the purchase belongs to
- Where the purchase happened
- Whether a payment processor was involved
- How confident you are in all of the above

Now do it for these too:

UBER BV HELP.UBER.COM NL
CRV*UBER EATS 123456
SPOTIFY P1234 STOCKHOLM SE
SQ *VERVE COFFEE ROASTERS SAN FRAN
TST* RESTAURANT DE HAVEN AMSTERDAM
APPLE PAY *SQ *BLUE BOTTLE
PAYPAL #12367121, Milhouse Hostel, Buenos Aires

Every single one of those follows different conventions. Different banks, different countries, different payment processors, different abbreviations. No standard format. No shared merchant IDs. No consistency whatsoever.

I've spent the better part of two years working on this problem, and in this post I want to share what I learned about why it's so deceptively difficult and what approaches actually work.

Why bank transactions
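To make the extraction targets listed earlier concrete, here is a minimal sketch of what a parser's output could look like, plus one small step: peeling payment-processor prefixes off the front of a string. Everything in it is an assumption for illustration: the `ParsedTransaction` fields, the `strip_processors` helper, and the prefix table (which treats `SQ *` as Square and `TST*` as Toast) are hypothetical and not taken from the original post. Real statements use many more conventions than this table covers.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical prefix table. These mappings are illustrative assumptions,
# not a complete or authoritative list of processor conventions.
PROCESSOR_PREFIXES = [
    ("APPLE PAY *", "Apple Pay"),
    ("SQ *", "Square"),
    ("TST*", "Toast"),
    ("PAYPAL", "PayPal"),
]


@dataclass
class ParsedTransaction:
    """One field per extraction target; None means 'not determined yet'."""
    raw: str
    remainder: str                      # text left after removing processor prefixes
    processors: List[str]               # processors detected, outermost first
    merchant: Optional[str] = None
    category: Optional[str] = None
    location: Optional[str] = None
    confidence: float = 0.0             # left at 0.0: this sketch does no scoring


def strip_processors(raw: str) -> Tuple[List[str], str]:
    """Repeatedly remove known processor prefixes from the start of the string."""
    found: List[str] = []
    rest = raw.strip()
    matched = True
    while matched:
        matched = False
        for prefix, name in PROCESSOR_PREFIXES:
            if rest.upper().startswith(prefix):
                found.append(name)
                rest = rest[len(prefix):].strip()
                matched = True
                break
    return found, rest


def parse(raw: str) -> ParsedTransaction:
    """Fill in only what this sketch can determine: processors and the remainder."""
    processors, remainder = strip_processors(raw)
    return ParsedTransaction(raw=raw, remainder=remainder, processors=processors)


if __name__ == "__main__":
    for s in ["APPLE PAY *SQ *BLUE BOTTLE", "POS 1234 AMZN Mktp NL 29.99 EUR"]:
        print(parse(s))
```

Note that the loop handles stacked prefixes such as `APPLE PAY *SQ *BLUE BOTTLE`, but it deliberately leaves merchant, category, and location empty. Prefix matching only answers the processor question, and a string like `CRV*UBER EATS 123456` passes through untouched because its prefix is not in the table.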
Continue reading on Dev.to Webdev



