Scraping Goodreads in 2026: Books, Ratings & Author Data

via Dev.to Tutorial (agenthustler)

Goodreads no longer has a public API; it was deprecated in 2020. But the site still holds one of the richest book databases on the internet. Here's how to extract the data you need using Python.

What You Can Scrape

Goodreads serves most of its content as plain HTML. The main data targets are:

Book pages: title, author, rating, review count, genres, description, ISBN, publication date
Search results: paginated lists of books matching a query
Author pages: biography, book list, average rating
Lists: curated collections like "Best of 2025" or genre-specific rankings
Shelves: user-created collections (public ones only)

Setup

Install the required packages:

    pip install requests beautifulsoup4 lxml

We'll use lxml as the parser for speed. html.parser works too if you prefer no extra dependencies.

Scraping Book Details

Every Goodreads book has a URL like goodreads.com/book/show/12345. Here's how to extract structured data from a book page:

    import requests
    from bs4 import BeautifulSoup
    import json
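A minimal sketch of the book-page extraction described above, under stated assumptions rather than as the article's own code. It assumes the book page embeds a schema.org JSON-LD block in a script tag of type application/ld+json, and that the block uses the standard field names (name, author, aggregateRating, isbn). The function names parse_book_page and fetch_book, the HEADERS value, and the timeout are hypothetical choices for illustration. If the page layout differs, the selectors would need adjusting.

```python
import json

import requests
from bs4 import BeautifulSoup

# Hypothetical User-Agent string, chosen for illustration only.
HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; book-research-script)"}


def parse_book_page(html, parser="lxml"):
    """Extract book fields from a page's JSON-LD block.

    Assumes the page carries a <script type="application/ld+json"> tag
    with schema.org-style fields. Returns None if no such block is found.
    """
    soup = BeautifulSoup(html, parser)
    tag = soup.find("script", type="application/ld+json")
    if tag is None or not tag.string:
        return None

    data = json.loads(tag.string)
    rating = data.get("aggregateRating") or {}

    # The author field may be a single object or a list of objects.
    authors = data.get("author") or []
    if isinstance(authors, dict):
        authors = [authors]

    return {
        "title": data.get("name"),
        "authors": [a.get("name") for a in authors],
        "rating": rating.get("ratingValue"),
        "rating_count": rating.get("ratingCount"),
        "review_count": rating.get("reviewCount"),
        "isbn": data.get("isbn"),
    }


def fetch_book(book_id, timeout=10):
    """Download a book page by numeric ID and parse it."""
    url = f"https://www.goodreads.com/book/show/{book_id}"
    response = requests.get(url, headers=HEADERS, timeout=timeout)
    response.raise_for_status()
    return parse_book_page(response.text)
```

Keeping the parsing separate from the download makes it possible to test the parser on saved HTML without hitting the site. Passing parser="html.parser" avoids the lxml dependency, as noted in the setup step.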
