
How to Handle Pagination in Web Scraping: URL Patterns, Infinite Scroll, and Load More
Every scraper eventually hits a wall: the data you need spans multiple pages. Pagination is one of the most common challenges in web scraping, and the right approach depends on how the site loads its next batch of results. This guide covers the five main pagination patterns you'll encounter, with working Python code for each.

1. URL-Based Page Numbers

The simplest pattern. The page number appears directly in the URL, either as a query parameter or as a path segment.

    https://example.com/products?page=1
    https://example.com/products?page=2
    https://example.com/products/page/3

Solution: Increment the page number in a loop, and stop when a request fails or a page comes back with no items.

    import requests
    from bs4 import BeautifulSoup

    def scrape_numbered_pages(base_url, max_pages=50):
        all_items = []
        for page in range(1, max_pages + 1):
            url = f"{base_url}?page={page}"
            resp = requests.get(url)
            if resp.status_code != 200:
                break
            soup = BeautifulSoup(resp.text, "html.parser")
            items = soup.select(".product-card")
            if not items:  # No more results
                break
            for item in items:
                all_items.append({"na
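
The listing above breaks off mid-statement in this excerpt, so here is a self-contained sketch of the same "increment until a page is empty" loop, written with the standard library only. It is not the article's code: `page_url`, `collect_pages`, and the `fetch` and `extract` callables are hypothetical names chosen for illustration. One thing it handles that a plain `f"{base_url}?page={page}"` does not is a base URL that already carries a query string.

```python
from typing import Callable, List, Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def page_url(base_url: str, page: int, param: str = "page") -> str:
    """Return base_url with the page parameter set, keeping other query parameters."""
    parts = urlsplit(base_url)
    # Drop any existing page parameter so it is not duplicated.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != param]
    query.append((param, str(page)))
    return urlunsplit(parts._replace(query=urlencode(query)))


def collect_pages(
    base_url: str,
    fetch: Callable[[str], Optional[str]],   # hypothetical: returns HTML, or None on a failed request
    extract: Callable[[str], List[str]],     # hypothetical: pulls items out of one page of HTML
    max_pages: int = 50,
) -> List[str]:
    """Walk page=1, 2, 3, ... and stop at the first failed or empty page."""
    all_items: List[str] = []
    for page in range(1, max_pages + 1):
        html = fetch(page_url(base_url, page))
        if html is None:      # request failed (for example, a non-200 status)
            break
        items = extract(html)
        if not items:         # an empty page means there are no more results
            break
        all_items.extend(items)
    return all_items
```

With `requests` and BeautifulSoup, `fetch` could wrap `requests.get(url, timeout=10)` and return `resp.text` only when `resp.status_code == 200`, while `extract` could wrap `soup.select(".product-card")`. Passing them in as callables keeps the loop testable without network access.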
Continue reading on Dev.to




