FlareStart
HomeNewsHow ToSources
FlareStart

Where developers start their day. All the tech news & tutorials that matter, in one place.

Quick Links

  • Home
  • News
  • Tutorials
  • Sources
  • Privacy Policy

Connect

© 2026 FlareStart. All rights reserved.

Back to articles
Show HN: Sup AI, a confidence-weighted ensemble (52.15% on Humanity's Last Exam)
How-ToMachine Learning

Show HN: Sup AI, a confidence-weighted ensemble (52.15% on Humanity's Last Exam)

via Hacker Newssupai1w ago

Hi HN. I'm Ken, a 20-year-old Stanford CS student. I built Sup AI. I started working on this because no single AI model is right all the time, but their errors don’t strongly correlate. In other words, models often make unique mistakes relative to other models. So I run multiple models in parallel and synthesize the outputs by weighting segments based on confidence. Low entropy in the output token probability distributions correlates with accuracy. High entropy is often where hallucinations begin. My dad Scott (AI Research Scientist at TRI) is my research partner on this. He sends me papers at all hours, we argue about whether they actually apply and what modifications make sense, and then I build and test things. The entropy-weighting approach came out of one of those conversations. In our eval on Humanity's Last Exam, Sup scored 52.15%. The best individual model in the same evaluation run got 44.74%. The relative gap is statistically significant (p < 0.001). Methodology, eval code, d

Continue reading on Hacker News

Opens in a new tab

Read Full Article
0 views

Related Articles

The Boring Skills That Make Developers Unstoppable in 2026
How-To

The Boring Skills That Make Developers Unstoppable in 2026

Medium Programming • 1d ago

I Installed This VS Code Extension… and My Code Got Instantly Better
How-To

I Installed This VS Code Extension… and My Code Got Instantly Better

Medium Programming • 1d ago

The Age of Personalized Software
How-To

The Age of Personalized Software

Medium Programming • 1d ago

Automating Checkout Add-On Recommendations in WordPress for WooCommerce
How-To

Automating Checkout Add-On Recommendations in WordPress for WooCommerce

Dev.to • 1d ago

How-To

Start Here: Learning to develop your own way with SCSIC

Medium Programming • 2d ago

Discover More Articles