
An AI News Aggregator That Clusters 30+ Sources in Real-Time — Here's How It Works
Hey Dev community,

Sharing a project: Best AI News Today, a real-time AI news aggregator that pulls from 30+ sources, clusters duplicate stories, and scores everything by quality. It's live at https://best-ai.news.

The Problem

If you're a developer working with AI (and who isn't these days), you probably check multiple sources daily: Hacker News, Reddit, tech blogs, lab announcements. The same story appears across several platforms, and you often read two or three versions before realizing it's the same announcement.

The Solution

The aggregator fetches from RSS feeds, the Reddit JSON API, and Medium RSS every 15 minutes. Articles go through a pipeline:

– Fetch from 30+ sources (RSS, Reddit OAuth, Medium via rss2json proxy)
– Score for quality (0–100 based on source tier, recency, content depth, engagement)
– Cluster using Union-Find on title similarity (Jaccard coefficient ≥ 0.5)
– Detect breaking news (3+ sources within 4 hours)
– Match entities using compiled regex patterns (65+
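To make the clustering and breaking-news steps concrete, here is a minimal Python sketch. It is not the project's actual code: the article fields (`title`, `source`, `published`), the whitespace tokenizer, and the function names are all assumptions for illustration. Only the Jaccard threshold of 0.5 and the "3+ sources within 4 hours" rule come from the pipeline description above.

```python
from __future__ import annotations

from datetime import datetime, timedelta


def tokens(title: str) -> set[str]:
    """Lowercased word set for a title (assumed tokenizer: split on whitespace)."""
    return set(title.lower().split())


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard coefficient: size of the intersection over size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


class UnionFind:
    """Disjoint-set structure over the indices 0..n-1."""

    def __init__(self, n: int) -> None:
        self.parent = list(range(n))

    def find(self, x: int) -> int:
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a: int, b: int) -> None:
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def cluster(articles: list[dict], threshold: float = 0.5) -> list[list[dict]]:
    """Group articles whose titles are linked by Jaccard similarity >= threshold."""
    token_sets = [tokens(a["title"]) for a in articles]
    uf = UnionFind(len(articles))
    # Pairwise comparison is O(n^2); acceptable for a small batch per fetch cycle.
    for i in range(len(articles)):
        for j in range(i + 1, len(articles)):
            if jaccard(token_sets[i], token_sets[j]) >= threshold:
                uf.union(i, j)
    groups: dict[int, list[dict]] = {}
    for i, article in enumerate(articles):
        groups.setdefault(uf.find(i), []).append(article)
    return list(groups.values())


def is_breaking(
    group: list[dict],
    min_sources: int = 3,
    window: timedelta = timedelta(hours=4),
) -> bool:
    """True if at least min_sources distinct sources published within one window."""
    items = sorted(group, key=lambda a: a["published"])
    for i, first in enumerate(items):
        sources = {
            a["source"]
            for a in items[i:]
            if a["published"] - first["published"] <= window
        }
        if len(sources) >= min_sources:
            return True
    return False
```

Union-Find makes the grouping transitive: if title A matches B and B matches C, all three land in one cluster even when A and C fall below the threshold on their own. Whether that is the desired behavior depends on how noisy the titles are.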
Continue reading on Dev.to Webdev




