22
you are viewing a single comment's thread
view the rest of the comments
view the rest of the comments
this post was submitted on 18 Nov 2024
22 points (100.0% liked)
TechTakes
1437 readers
122 users here now
Big brain tech dude got yet another clueless take over at HackerNews etc? Here's the place to vent. Orange site, VC foolishness, all welcome.
This is not debate club. Unless it’s amusing debate.
For actually-good tech, you want our NotAwfulTech community
founded 1 year ago
MODERATORS
Dude discovers that one LLM model is not entirely shit at chess, spends time and tokens proving that other models are actually also not shit at chess.
The irony? He's comparing it against Stockfish, a computer chess engine. Computers playing chess at a superhuman level is a solved problem. LLMs have now slightly approached that level.
Writeup https://dynomight.net/more-chess/
HN discussion https://news.ycombinator.com/item?id=42206817
I remember when several months (a year ago?) when the news got out that gpt-3.5-turbo-papillion-grumpalumpgus could play chess around ~1600 elo. I was skeptical the apparent skill wasn't just a hacked-on patch to stop folks from clowning on their models on xitter. Like if an LLM had just read the instructions of chess and started playing like a competent player, that would be genuinely impressive. But if what happened is they generated 10^12 synthetic games of chess played by stonk fish and used that to train the model- that ain't an emergent ability, that's just brute forcing chess. The fact that larger, open-source models that perform better on other benchmarks, still flail at chess is just a glaring red flag that something funky was going on w/ gpt-3.5-turbo-instruct to drive home the "eMeRgEnCe" narrative. I'd bet decent odds if you played with modified rules, (knights move a one space longer L shape, you cannot move a pawn 2 moves after it last moved, etc), gpt-3.5 would fuckin suck.
Edit: the author asks "why skill go down tho" on later models. Like isn't it obvious? At that moment of time, chess skills weren't a priority so the trillions of synthetic games weren't included in the training? Like this isn't that big of a mystery...? It's not like other NN haven't been trained to play chess...