Newstech24.com

Amazon’s bet that AI benchmarks don’t matter

By Admin | 03/12/2025

This is an excerpt of Sources by Alex Heath, a newsletter about AI and the tech industry, syndicated just for The Verge subscribers once a week.

Amazon’s AI chief has a message for the model benchmark obsessives: Stop looking at the leaderboards.

“I want real-world utility. None of these benchmarks are real,” Rohit Prasad, Amazon’s SVP of AGI, told me ahead of today’s announcements at AWS re:Invent in Las Vegas. “The only way to do real benchmarking is if everyone conforms to the same training data and the evals are completely held out. That’s not what’s happening. The evals are frankly getting noisy, and they’re not showing the real power of these models.”

It’s a contrarian stance when every other AI lab is quick to boast about how fast its new models climb the leaderboards. It’s also convenient for Amazon, given that the previous version of Nova, its flagship model, was sitting at spot 79 on LMArena when Prasad and I spoke last week. Still, dismissing benchmarks only works if Amazon can offer a different story about what progress looks like.

The centerpiece of today’s re:Invent announcements is Nova Forge, a service that Amazon claims lets companies train custom AI models in ways previously impossible without spending billions of dollars. The problem Forge addresses is real. Most companies trying to customize AI models face three bad options: fine-tune a closed model (but only at the edges), train on open-weight models (but without the original training data and risking capability regression, where the AI becomes an expert on new data but forgets its original, broader skills), or build a model from scratch at enormous cost.

Forge offers something else: access to Amazon’s Nova model checkpoints at the pre-training, mid-training, and post-training stages. Companies can inject their proprietary data early in the process, when the model’s “learning capacity is highest,” as Prasad put it, rather than just tweaking model behavior at the end.

“What we have done is democratize AI and frontier model development for your use cases at fractions of what it would cost [before],” Prasad said. Forge was created because Amazon’s internal teams wanted a tool to inject their domain expertise into a base model without having to build from scratch.

“We built Forge because our internal teams wanted Forge,” he said. It’s a familiar Amazon pattern. AWS itself famously began as infrastructure built for Amazon’s own retail operation before becoming the company’s profit engine.

Reddit has been using Forge to build custom safety models trained on 23 years of community moderation data. “I haven’t seen anything like it yet,” Chris Slowe, Reddit’s CTO and first employee, told me. “We’ve had a distinguished engineer who’s just been like a kid in the candy store.”

Slowe said Reddit ran a continued pre-training job last week that’s “looking really promising.” The goal: Replace multiple bespoke safety models with a single Reddit-expert model that understands the nuances of community moderation, including the notoriously subjective rule that appears across subreddits everywhere: “Don’t be a jerk.”

“Having an expert model, it’s going to know the community,” Slowe said. “It’s gonna have a pretty good notion of what jerk means.”

Slowe explained that Forge enables Reddit to control its models, avoid surprises from API changes, retain ownership of its weights, and avoid sending sensitive data to third-party model providers. He said Reddit is already exploring the same approach for Reddit Answers and other products.

When I asked Slowe whether it mattered that Nova isn’t a top-tier model on benchmarks, he was blunt: “In this context, what matters is the Reddit expertness of the model.” That’s the thread Amazon wants developers to pull on: not raw IQ points, but control and specialization.

With Forge, Amazon is making a calculated bet that the model race has commoditized and that it can succeed by being the place where companies can build specialized AI for specific business problems. It’s a very AWS-shaped view of the world: infrastructure over intelligence and customization over raw capability. The strategy also lets Amazon sidestep direct comparisons with OpenAI and Anthropic, both of which it once hoped to compete with on the model layer.

Whether Forge is genuinely pioneering or just clever positioning depends, of course, on developer adoption. Amazon insists that the model race, as it’s widely understood, doesn’t matter. If that ends up being true, the scoreboard shifts to something much quieter and harder to game: whether AI models actually deliver real-world utility.