this post was submitted on 23 Feb 2026
633 points (97.6% liked)

Technology


Screenshot of this question was making the rounds last week. But this article covers testing against all the well-known models out there.

Also includes outtakes on the 'reasoning' models.

[–] rimu@piefed.social 157 points 1 day ago (7 children)

Very interesting that only 71% of humans got it right.

[–] anomnom@sh.itjust.works 5 points 14 hours ago

The same 29% that keeps fascists in power around the world.

[–] SnotFlickerman@lemmy.blahaj.zone 144 points 1 day ago* (last edited 1 day ago) (2 children)

I mean, I've been saying this since LLMs were released.

We finally built a computer that is as unreliable and irrational as humans... which shouldn't be considered a good thing.

I'm under no illusion that LLMs are "thinking" in the same way that humans do, but god damn if they aren't almost exactly as erratic and irrational as the hairless apes whose thoughts they're trained on.

[–] Peekashoe@lemmy.wtf 36 points 1 day ago

Yeah, the article cites that as a control, but it's not at all surprising since "humanity by survey consensus" is accurate to how LLM weighting trained on random human outputs works.

It's impressive up to a point, but you wouldn't exactly want your answers to complex math operations or other specialized areas to track layperson human survey responses.

[–] CaptDust@sh.itjust.works 52 points 1 day ago* (last edited 1 day ago)

That "30% of population = dipshits" statistic keeps rearing its ugly head.

[–] Lost_My_Mind@lemmy.world 11 points 1 day ago

As someone who takes public transportation to work, SOME people SHOULD be forced to walk through the car wash.

[–] daychilde@lemmy.world 11 points 1 day ago

I'm not afraid to say that it took me a sec. My brain went "short distance. Walk or drive?" and skipped over the car wash bit at first. Then I laughed because I quickly realized the idiocy. :shrug:

[–] LifeInMultipleChoice@lemmy.world -3 points 1 day ago* (last edited 1 day ago) (1 children)

Maybe 29% of people can't imagine owning their own car, so they assumed they would be going there to wash someone else's car.

[–] Bronzebeard@lemmy.zip 3 points 16 hours ago (1 children)

Then they can't read. Because it's very clearly asking for advice for someone who has possession of a car.

[–] LifeInMultipleChoice@lemmy.world 1 points 15 hours ago

Yeah, it was a joke. People appear to have had a hard time catching that though, lol