
# I got rate-limited scraping 100 pages. Here's what actually worked
Broke a scraper last Tuesday because I was too impatient. Hit rate limits on page 47 of 100, lost all the data, had to start over. Fun times.

## The Problem

I needed product data from an e-commerce site. Simple job: name, price, availability. But their API was locked behind enterprise pricing ($500/month, no thanks), so scraping it was.

First attempt: blasted through requests as fast as possible.

```python
import requests
from bs4 import BeautifulSoup

for page in range(1, 101):
    response = requests.get(f'https://example.com/products?page={page}')
    soup = BeautifulSoup(response.text, 'html.parser')
    # Extract data...
```

Result: banned at page 47. Zero data collected.

## What Actually Worked

Three changes made it work:

### 1. Add random delays

```python
import time
import random

time.sleep(random.uniform(2, 5))  # 2-5 second delays
```

### 2. Rotate user agents

```python
user_agents = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64)...',
    'Moz...',
]
```
_Continue reading on Dev.to_


