
How to Extract Emails and Contacts from Any Website (Node.js)
Contact data extraction is one of the most requested scraping tasks. Here's a reliable approach.

The Regex Pattern

    const EMAIL_REGEX = /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g;
    const PHONE_REGEX = /\+[1-9]\d{0,2}[\s\-.]\(?\d{2,4}\)?[\s\-.]?\d{3,4}[\s\-.]?\d{3,4}|\(\d{3}\)[\s\-.]?\d{3}[\s\-.]?\d{4}|\b\d{3}[\-.]\d{3}[\-.]\d{4}\b/g;

Full Extractor

    const cheerio = require('cheerio');

    async function extractContacts(url) {
      const res = await fetch(url, { headers: { 'User-Agent': 'ContactBot/1.0' } });
      const html = await res.text();
      const $ = cheerio.load(html);

      // Remove scripts/styles
      $('script, style').remove();
      const text = $('body').text();

      const emails = [...new Set(text.match(EMAIL_REGEX) || [])];
      const phones = [...new Set(text.match(PHONE_REGEX) || [])];

      // Also check mailto: links
      $('a[href^="mailto:"]').each((i, el) => {
        const email = $(el).attr('href').replace('
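To see how the two patterns behave on their own, here is a minimal, dependency-free sketch that runs them against a sample string. The helper names `matchContacts` and `emailFromMailto` are hypothetical and do not come from the article, and the mailto handling (strip the `mailto:` prefix, drop any `?subject=...` query) is an assumption about what a typical extractor would do, not the author's code.

```javascript
// Patterns as given in the article.
const EMAIL_REGEX = /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g;
const PHONE_REGEX = /\+[1-9]\d{0,2}[\s\-.]\(?\d{2,4}\)?[\s\-.]?\d{3,4}[\s\-.]?\d{3,4}|\(\d{3}\)[\s\-.]?\d{3}[\s\-.]?\d{4}|\b\d{3}[\-.]\d{3}[\-.]\d{4}\b/g;

// Hypothetical helper: returns de-duplicated matches from plain text.
function matchContacts(text) {
  return {
    emails: [...new Set(text.match(EMAIL_REGEX) || [])],
    phones: [...new Set(text.match(PHONE_REGEX) || [])],
  };
}

// Hypothetical helper (assumed behavior): reduce a mailto: href to a bare
// address by stripping the scheme and any query string such as ?subject=...
function emailFromMailto(href) {
  return href.replace(/^mailto:/i, '').split('?')[0].trim();
}

const sample =
  'Sales: sales@example.com, sales@example.com. ' +
  'Call (555) 123-4567 or 555-987-6543.';

console.log(matchContacts(sample));
// emails: [ 'sales@example.com' ]
// phones: [ '(555) 123-4567', '555-987-6543' ]

console.log(emailFromMailto('mailto:info@example.com?subject=Hello'));
// info@example.com
```

The `Set` removes the duplicate address, mirroring the de-duplication in the extractor. Note that the email pattern is deliberately loose: a string such as `logo@2x.png` also matches it, so filtering results against known file extensions may be worthwhile.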



