
I Built 77 Web Scrapers — Here Are the 10 Patterns That Actually Work
After building 77 scrapers, every problem is a variation of the same 10 patterns.

I've published 77 web scrapers on Apify Store. Reddit, Hacker News, Google News, Trustpilot, YouTube, Bluesky — you name it. Here are the 10 patterns I use in every single one.

## Pattern 1: Always use sessions

```python
import requests

# Bad: new connection every request
for url in urls:
    requests.get(url)  # TCP handshake every time

# Good: reuse the connection
session = requests.Session()
for url in urls:
    session.get(url)  # reuses the TCP connection
```

Impact: 2-5x faster for multiple requests to the same domain.

## Pattern 2: Exponential backoff on errors

```python
import time

def fetch(url, max_retries=3):
    for i in range(max_retries):
        try:
            resp = session.get(url, timeout=10)
            if resp.status_code == 429:  # rate limited: back off and retry
                time.sleep(2 ** i)
                continue
            resp.raise_for_status()
            return resp
        except Exception:
            if i == max_retries - 1:
                raise
            time.sleep(2 ** i)
```

## Pattern 3: Extract data with CSS selectors, not XPath

```python
from bs4 import BeautifulSoup
```
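As a minimal sketch of what CSS-selector extraction with BeautifulSoup's `select_one()` looks like in practice (the HTML snippet, tag names, and classes below are illustrative assumptions, not from the original article):

```python
from bs4 import BeautifulSoup

# Hypothetical HTML for illustration only
html = """
<div class="post">
  <h2 class="title">Hello, world</h2>
  <a class="link" href="https://example.com">Read more</a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# select_one() takes the same CSS selectors you test in browser devtools
title = soup.select_one("div.post h2.title").get_text(strip=True)
link = soup.select_one("a.link")["href"]

print(title)  # Hello, world
print(link)   # https://example.com
```

The upside of CSS selectors over XPath is that you can copy them straight out of the browser's inspector and they read the same way as your stylesheets.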
Continue reading on Dev.to




