How I Built a Soccer Coach Contact Extractor for Messy Athletics Websites

Most athletics websites look simple until you try to extract structured data from them at scale. Coach pages are especially messy. One school gives you a clean staff directory with mailto: links. Another hides emails behind Cloudflare. Another puts names on the roster page and the actual contact info on a separate bio page. Another sends back an empty shell and expects JavaScript to do the rest.

That is the problem this project solves. football-soccer-emails is a TypeScript-based extractor that pulls soccer and football coach contact information from athletics websites and turns it into structured records. It accepts direct URLs and public Google Sheets as input, and it supports an Apify workflow for batch runs.

The reason I built it this way is simple: I tried a version of this problem around 2017 or 2018 using heuristics only, and it was roughly 40% accurate. That was about as far as rules alone would take me. With LLMs and a multi-stage extraction flow, the same class of problem can now reach the 90%+ range.
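To make the first two cases concrete, here is a minimal sketch of what a rules-only pass over raw HTML might look like. It is not the project's actual code: the ExtractedEmail shape and the function names are hypothetical, and it assumes that "behind Cloudflare" means Cloudflare's email obfuscation, where the address is stored in a data-cfemail attribute as hex whose first byte is an XOR key for the rest.

```typescript
// Hypothetical record shape for illustration; not the project's real schema.
interface ExtractedEmail {
  email: string;
  source: "mailto" | "cfemail";
}

// Decode a Cloudflare-obfuscated address (assumed data-cfemail format):
// the first hex byte is the key, every following byte is XORed with it.
function decodeCfEmail(hex: string): string {
  const key = parseInt(hex.slice(0, 2), 16);
  let decoded = "";
  for (let i = 2; i < hex.length; i += 2) {
    decoded += String.fromCharCode(parseInt(hex.slice(i, i + 2), 16) ^ key);
  }
  return decoded;
}

// Collect emails from plain mailto: links and from data-cfemail attributes.
function extractEmails(html: string): ExtractedEmail[] {
  const found: ExtractedEmail[] = [];
  const seen = new Set<string>();

  const add = (email: string, source: ExtractedEmail["source"]): void => {
    const normalized = email.trim().toLowerCase();
    if (normalized.length > 0 && !seen.has(normalized)) {
      seen.add(normalized);
      found.push({ email: normalized, source });
    }
  };

  // Case 1: href="mailto:someone@school.edu?subject=..." (query string dropped).
  const mailtoPattern = /href=["']mailto:([^"'?\s]+)/gi;
  let match: RegExpExecArray | null;
  while ((match = mailtoPattern.exec(html)) !== null) {
    add(match[1], "mailto");
  }

  // Case 2: data-cfemail="<hex>" as emitted by Cloudflare's obfuscation.
  const cfPattern = /data-cfemail=["']([0-9a-fA-F]+)["']/g;
  while ((match = cfPattern.exec(html)) !== null) {
    add(decodeCfEmail(match[1]), "cfemail");
  }

  return found;
}
```

Calling extractEmails on a fetched page body returns deduplicated, lowercased addresses tagged with how they were found. It does nothing for the other two cases (contact info split across roster and bio pages, or pages rendered by JavaScript), and it cannot tell which address belongs to which coach, which is where rules alone tend to stall.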
