[–] joaomarrom@hexbear.net 75 points 5 days ago (3 children)

Of course it did. LLMs are terrible at any real task that involves actual reasoning and can't be done by a stochastic natural-sounding-text-extruding machine. Check out this study by a bunch of Apple engineers that points out exactly this: https://machinelearning.apple.com/research/illusion-of-thinking

[–] BodyBySisyphus@hexbear.net 49 points 5 days ago (1 children)

It seems obvious to us, but out in the untamed wilderness of LinkedIn and Medium there is a veritable flood of posts claiming that LLMs are capable of reasoning.

[–] AnarchoAnarchist@hexbear.net 14 points 5 days ago (1 children)

I do not fear these LLMs gaining sentience.

I do fear what happens when the text regurgitating machine sounds sentient enough to convince the average person, and tech companies start selling it as such.

[–] footfaults@lemmygrad.ml 4 points 5 days ago

I do fear what happens when the text regurgitating machine sounds sentient enough to convince the average person

"Hello, I'm from McKinsey and I'm here to help"

[–] 7bicycles@hexbear.net 17 points 5 days ago (5 children)

I get how the LLM is bad at chess (I think most people's games of chess suck ass by definition), but I'm kind of baffled that it apparently not only played badly but played wrong. How is there a big enough dataset of people yucking it up for that to happen this consistently?

[–] joaomarrom@hexbear.net 39 points 5 days ago (1 children)

It's because the LLM is incapable of understanding symbols, so it couldn't even make sense of the chessboard and the images representing the pieces. That capacity for abstract thinking is something human brains do incredibly well (sometimes too well, which is how you get pareidolia), but it's completely outside the bounds of what an LLM is, or ever will be, able to do.

[–] D61@hexbear.net 15 points 5 days ago

"I hate it when my chessboard has the wrong number of fingers..."

[–] fox@hexbear.net 27 points 5 days ago

I'm sure they've digested every public piece of chess notation ever written, but they have no capacity for comprehension; they're programs that emit text shaped like chess notation if you make that request of them.
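
The difference between "text shaped like chess notation" and an actual move is easy to see if you validate the output against a real rules engine. A minimal sketch using the python-chess library (the "Nf6" here is a made-up stand-in for whatever an LLM might emit, not a real transcript):

```python
import chess

board = chess.Board()
board.push_san("e4")  # 1. e4
board.push_san("e5")  # 1... e5

llm_output = "Nf6"  # chess-shaped text, but White has no knight that can reach f6

try:
    move = board.parse_san(llm_output)
    print(f"legal: {board.san(move)}")
except ValueError:
    print(f"'{llm_output}' parses as notation but is not a legal move here")
```

The string passes every surface-level test for "looks like chess"; only the rules engine, which actually models the board, can reject it.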

[–] blame@hexbear.net 20 points 5 days ago

When people here call it a text extrusion machine, that's literally what it is. In fact it doesn't even look at text, it looks at tokens, and there are a limited number of tokens (Llama uses a vocabulary of about 32k, I think). It takes all of the previously entered input and output, turns it into tokens, and then each token "attends" to every other token (is weighted by some learned coefficient). Then it all goes through more gigantic layers of matrix multiplication, and at the end you have the statistically most likely next token. Then it does the whole thing again, autoregressively, until it decides it has reached the end of the output. It may also never decide, in which case it has to be cut off.

So it's not really looking at the game. It is in a way, but it doesn't really know the rules; it's just producing the next most likely token, which is not necessarily the next best move, or even a legal one.
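
For anyone curious, that loop is surprisingly small in code. A minimal sketch of greedy next-token decoding with a HuggingFace-style causal LM ("gpt2" is just a small stand-in checkpoint, not whatever model got benchmarked):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

input_ids = tokenizer("1. e4 e5 2.", return_tensors="pt").input_ids

for _ in range(20):  # hard cap, since the model may never emit end-of-text
    with torch.no_grad():
        logits = model(input_ids).logits   # a score for every token in the vocabulary
    next_id = logits[0, -1].argmax()       # the statistically most likely next token
    input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=1)
    if next_id.item() == tokenizer.eos_token_id:
        break  # the model "decided" it reached the end of its output

print(tokenizer.decode(input_ids[0]))
```

Note that nothing in there knows what a board is; the prompt being a chess game is invisible to the loop.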

[–] 4am@lemm.ee 13 points 5 days ago

An LLM can summarize the rules of chess, because it predicts the sequence of words needed to produce that summary with incredible accuracy. That's also why it's so weird when it goes wrong: if one part is off, it throws the rest of the output out of balance.

But all it is doing is statistical analysis of the writing it has been trained on to determine the best next word (some later models predict words in groups, or out of order).

That doesn't tell it fuck-all about how to make a chess move. It isn't ingesting information in a way that lets it build a model that can tell you the next best chess move, solve linear algebra, or do any other activity that requires procedural thought (see the sketch below).

It’s just a chatterbox that tells you whatever you want to hear. No wonder the chuds love it

[–] Zuzak@hexbear.net 9 points 5 days ago

If I say, "Knight to B4," does that sound like something a person playing chess might say? Then it did its job.

Think of an LLM as an actor. You don't cast someone as a grandmaster in a movie based on their skill at chess; they might not even know how to play, but if they deliver the lines convincingly, that's what you're looking for. There are chess AIs that are incredibly good at chess, because that's what they're designed for and trained on. That's why this is a very silly test; it's like testing a fish on its tree-climbing ability. The only thing sillier than the test is that people are surprised by it.
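
And asking one of those dedicated engines for a move is trivial. A minimal sketch using python-chess's UCI wrapper (this assumes a Stockfish binary is installed and on your PATH):

```python
import chess
import chess.engine

engine = chess.engine.SimpleEngine.popen_uci("stockfish")

board = chess.Board()
result = engine.play(board, chess.engine.Limit(time=0.1))  # think for 0.1 seconds
print("engine plays:", board.san(result.move))

engine.quit()
```

Same interface, completely different machinery underneath: search over an explicit game tree instead of next-token prediction.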

[–] Luffy879@lemmy.ml 13 points 5 days ago

"Text extrusion machine" is a phrase I'm sure some AI bro has used at some point without having any idea what it means.