Scraping Goodreads in 2026: Books, Ratings & Author Data

Goodreads has no public API anymore; it was deprecated in 2020. But the site still holds one of the richest book databases on the internet. Here's how to extract the data you need using Python.

What You Can Scrape

Goodreads serves most of its content as plain HTML. The main data targets are:

- Book pages: title, author, rating, review count, genres, description, ISBN, publication date
- Search results: paginated lists of books matching a query
- Author pages: biography, book list, average rating
- Lists: curated collections like "Best of 2025" or genre-specific rankings
- Shelves: user-created collections (public ones only)

Setup

Install the required packages:

    pip install requests beautifulsoup4 lxml

We'll use lxml as the parser for speed. html.parser works too if you prefer no extra dependencies.

Scraping Book Details

Every Goodreads book has a URL like goodreads.com/book/show/12345. Here's how to extract structured data from a book page:

    import requests
    from bs4 import BeautifulSoup
    import json
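
As a minimal sketch of the book-page extraction described above: the code below assumes the page embeds a schema.org JSON-LD block (a `<script type="application/ld+json">` tag) carrying the title, authors, rating, and ISBN. That is an assumption about the current markup, not something guaranteed to hold, so the parser returns None when the block is missing. The names `parse_book`, `fetch_book`, and `HEADERS`, the User-Agent string, and the output field names are all hypothetical and chosen for illustration.

```python
import json

import requests
from bs4 import BeautifulSoup

# Assumption: an illustrative User-Agent value; adjust to identify your script.
HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; book-research-script)"}


def parse_book(html, parser="lxml"):
    """Extract book fields from a page's JSON-LD block, if one is present."""
    soup = BeautifulSoup(html, parser)
    tag = soup.find("script", type="application/ld+json")
    if tag is None or not tag.string:
        return None  # No structured data found; the layout may differ.
    try:
        data = json.loads(tag.string)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None  # This sketch only handles a single JSON-LD object.

    rating = data.get("aggregateRating") or {}
    authors = data.get("author") or []
    if isinstance(authors, dict):  # schema.org allows a single object here
        authors = [authors]

    return {
        "title": data.get("name"),
        "authors": [a.get("name") for a in authors],
        "rating": rating.get("ratingValue"),
        "rating_count": rating.get("ratingCount"),
        "review_count": rating.get("reviewCount"),
        "isbn": data.get("isbn"),
    }


def fetch_book(book_id, parser="lxml"):
    """Download one book page and parse it; raises on HTTP errors."""
    url = f"https://www.goodreads.com/book/show/{book_id}"
    response = requests.get(url, headers=HEADERS, timeout=10)
    response.raise_for_status()
    return parse_book(response.text, parser)
```

Keeping the parsing separate from the download means `parse_book` can be run against saved HTML while you work out which fields are present, and `fetch_book(12345)` only touches the network once the parsing behaves. Pass `parser="html.parser"` if lxml is not installed.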
