Crawlee Has a Free Web Scraping Framework — Build Reliable Scrapers with Auto-Retry and Proxy Rotation

via Dev.to JavaScript · Alex Spinov · 2h ago

Why Crawlee?

Crawlee (by Apify) is a web scraping framework with automatic retries, proxy rotation, request queuing, and both HTTP and browser-based scraping.

    npx crawlee create my-scraper
    cd my-scraper && npm start

HTTP Scraping (Fast)

    import { CheerioCrawler } from 'crawlee'

    const crawler = new CheerioCrawler({
      async requestHandler({ request, $ }) {
        const title = $('h1').text()
        const price = $('.price').text()
        console.log({ url: request.url, title, price })
      },
    })

    await crawler.run(['https://example.com/product/1', 'https://example.com/product/2'])

Browser Scraping (JavaScript-Heavy Sites)

    import { PlaywrightCrawler } from 'crawlee'

    const crawler = new PlaywrightCrawler({
      async requestHandler({ page, request }) {
        await page.waitForSelector('.product-list')
        const products = await page.$$eval('.product', (els) =>
          els.map((el) => ({
            name: el.querySelector('.name')?.textContent,
            price: el.querySelector('.price
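To make "automatic retries, proxy rotation, request queuing" concrete, here is a minimal plain-JavaScript sketch of the idea. It is not Crawlee's implementation or API: `runQueue`, `fetchPage`, `proxyUrls`, and `maxRetries` are hypothetical names chosen for illustration, and the round-robin rotation is an assumed strategy.

```javascript
// Conceptual sketch only: NOT Crawlee's implementation or API.
// All names here (runQueue, fetchPage, proxyUrls, maxRetries) are hypothetical.

// Process a queue of URLs. A failed request is put back on the queue until
// it has been retried maxRetries times, and every attempt uses the next
// proxy in round-robin order.
async function runQueue(urls, { fetchPage, proxyUrls, maxRetries = 3 }) {
  const queue = urls.map((url) => ({ url, retryCount: 0 }));
  const results = [];
  const failed = [];
  let attempt = 0; // total attempts so far; drives the proxy rotation

  while (queue.length > 0) {
    const request = queue.shift();
    const proxyUrl = proxyUrls[attempt % proxyUrls.length];
    attempt += 1;

    try {
      const body = await fetchPage(request.url, proxyUrl);
      results.push({ url: request.url, proxyUrl, body });
    } catch (error) {
      if (request.retryCount < maxRetries) {
        // Re-enqueue at the back so other requests are not blocked.
        queue.push({ url: request.url, retryCount: request.retryCount + 1 });
      } else {
        // Out of retries: record the failure and move on.
        failed.push({ url: request.url, error: error.message });
      }
    }
  }

  return { results, failed };
}
```

Here `fetchPage(url, proxyUrl)` stands for any async function that downloads a page through the given proxy and throws on failure. According to the article, Crawlee handles retries, proxy rotation, and queuing itself, so a scraper built on it does not need a hand-written loop like this.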

Continue reading on Dev.to JavaScript
