this post was submitted on 07 Apr 2025
38 points (100.0% liked)
TechTakes
LLMs are a lot more sophisticated than we initially thought; read the study yourself.
Essentially, they do not simply predict the next token: when scientists trace the activated neurons, they find that these models plan ahead throughout inference, and then lie about those plans when asked how they came to a conclusion.
looks inside
it's predicting the next token
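For anyone keeping score, here is roughly what "predicting the next token" cashes out to at inference time. A minimal sketch using the Hugging Face transformers API; "gpt2", the prompt, and greedy argmax decoding are illustrative stand-ins, not a claim about any particular deployed model:

```python
# Minimal sketch of autoregressive decoding: whatever internal structure
# the model has, inference is one forward pass per emitted token, with the
# chosen token appended and the loop run again. "gpt2" and greedy argmax
# are illustrative stand-ins, not any lab's production setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

ids = tok("The capital of Texas is", return_tensors="pt").input_ids
with torch.no_grad():
    for _ in range(8):
        logits = model(ids).logits          # (1, seq_len, vocab_size)
        next_id = logits[0, -1].argmax()    # greedy: take the single most likely next token
        ids = torch.cat([ids, next_id.view(1, 1)], dim=1)
print(tok.decode(ids[0]))
```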
Read the paper; it's not simply predicting the next token. For instance, when writing a rhyming couplet, it first settles on the rhyming word and then fills in the rest of the line.
The researchers were surprised by this too; they expected it to work the other way around.
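To be fair to the paper: its actual evidence comes from attribution graphs over learned features, which is nothing you can reproduce in ten lines. What follows is a much cruder "logit lens" style probe, swapped in purely for illustration: unembed each layer's hidden state at the end of the couplet's first line and ask whether the eventual rhyme word already ranks highly before the second line exists. "gpt2" is an assumed stand-in model, the carrot/rabbit couplet is the paper's own example, and a model this small may show nothing of the sort:

```python
# Crude "logit lens" probe (NOT the paper's attribution-graph method):
# project each layer's hidden state at the end of the couplet's first
# line through the unembedding, and check where the eventual rhyme word
# ranks before the second line is generated.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "A rhyming couplet:\nHe saw a carrot and had to grab it,\n"
ids = tok(prompt, return_tensors="pt").input_ids
with torch.no_grad():
    out = model(ids, output_hidden_states=True)

rhyme_id = tok(" rabbit").input_ids[0]  # candidate planned rhyme word (first sub-token if split)
for layer, h in enumerate(out.hidden_states):
    normed = model.transformer.ln_f(h[0, -1])      # final layer norm, per the usual logit-lens recipe
    logits = model.lm_head(normed)                 # project into vocabulary space
    rank = int((logits > logits[rhyme_id]).sum())  # 0 = top-ranked token
    print(f"layer {layer:2d}: ' rabbit' ranked {rank}")
```

If a model really were planning the rhyme, you'd expect that rank to drop sharply somewhere in the middle-to-late layers at that position; the absence of such a pattern in gpt2 wouldn't say much either way.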
Oh, sorry, I got so absorbed in reading the riveting material about features predicting state-name tokens in order to predict state-capital tokens that I missed we were quibbling over the word "next". Alright, they can predict tokens out of order, too. Very impressive, I guess.