How to Build a Web Scraper with Playwright in 20 Lines of Code


via Dev.to Tutorial · TateLyman

Playwright makes basic web scraping straightforward. Here's a complete scraper in 20 lines:

    const { chromium } = require("playwright");

    async function scrape(url) {
      const browser = await chromium.launch(); // headless by default
      const page = await browser.newPage();
      await page.goto(url);
      const title = await page.title();
      const text = await page.textContent("body");
      const links = await page.$$eval("a[href]", els =>
        els.map(a => a.href).filter(h => h.startsWith("http"))
      );
      const images = await page.$$eval("img[src]", els =>
        els.map(img => img.src)
      );

      await browser.close();
      return { title, text: text.slice(0, 1000), links, images };
    }

    scrape("https://example.com").then(console.log);

Playwright handles:

- JavaScript-rendered pages (React, Vue, Angular)
- Auto-waiting for elements
- All modern browser engines: Chromium (Chrome), Firefox, and WebKit (Safari)
- Headless or headed mode

Want it as an API? I built a free scraping API: GET devtools-site-delta.vercel.app/api/scra
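The scraper above only launches headless Chromium, so it does not exercise the other browsers or headed mode mentioned in the list. Here is a minimal sketch of one way to cover them. It assumes a hypothetical helper named `scrapeWith`, which is not part of Playwright or the original article. The helper takes the browser type as a parameter and closes the browser in a `finally` block, so a failed navigation does not leave a browser process running.

```javascript
// Sketch only: `scrapeWith` and its `headless` option are illustrative names.
// `browserType` is expected to be Playwright's chromium, firefox, or webkit object.
async function scrapeWith(browserType, url, { headless = true } = {}) {
  const browser = await browserType.launch({ headless });
  try {
    const page = await browser.newPage();
    await page.goto(url);

    const title = await page.title();
    // Collect absolute http(s) links, as in the original scraper.
    const links = await page.$$eval("a[href]", els =>
      els.map(a => a.href).filter(h => h.startsWith("http"))
    );
    // Drop duplicate URLs while keeping first-seen order.
    return { title, links: [...new Set(links)] };
  } finally {
    // Runs even if goto() or $$eval() throws.
    await browser.close();
  }
}

// Usage (needs the playwright package and its browser binaries installed):
// const { firefox } = require("playwright");
// scrapeWith(firefox, "https://example.com", { headless: false }).then(console.log);
```

Passing the browser type in, rather than requiring it inside the function, also means the function can be tried against a stub object that has the same `launch`, `newPage`, and `close` methods.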

Continue reading on Dev.to Tutorial
