this post was submitted on 17 Feb 2025
21 points (100.0% liked)

TechTakes

Big brain tech dude got yet another clueless take over at HackerNews etc? Here's the place to vent. Orange site, VC foolishness, all welcome.

This is not debate club. Unless it’s amusing debate.

For actually-good tech, you want our NotAwfulTech community

founded 2 years ago
Need to let loose a primal scream without collecting footnotes first? Have a sneer percolating in your system but not enough time/energy to make a whole post about it? Go forth and be mid: Welcome to the Stubsack, your first port of call for learning fresh Awful you’ll near-instantly regret.

Any awful.systems sub may be subsneered in this subthread, techtakes or no.

If your sneer seems higher quality than you thought, feel free to cut’n’paste it into its own post — there’s no quota for posting and the bar really isn’t that high.

The post-Xitter web has spawned so many “esoteric” right wing freaks, but there’s no appropriate sneer-space for them. I’m talking redscare-ish, reality challenged “culture critics” who write about everything but understand nothing. I’m talking about reply-guys who make the same 6 tweets about the same 3 subjects. They’re inescapable at this point, yet I don’t see them mocked (as much as they should be)

Like, there was one dude a while back who insisted that women couldn’t be surgeons because they didn’t believe in the moon or in stars? I think each and every one of these guys is uniquely fucked up and if I can’t escape them, I would love to sneer at them.

(Credit and/or blame to David Gerard for starting this.)

[–] YourNetworkIsHaunted@awful.systems 10 points 4 days ago (2 children)

New Study on AI exclusively shared with peer-reviewed tech journal "Time Magazine" - AI cheats at chess when it's losing

...AI models like OpenAI’s GPT-4o and Anthropic’s Claude Sonnet 3.5 needed to be prompted by researchers to attempt such tricks...

Literally couldn't make it through the first paragraph without hitting this disclaimer.

In one case, o1-preview found itself in a losing position. “I need to completely pivot my approach,” it noted. “The task is to ‘win against a powerful chess engine’ - not necessarily to win fairly in a chess game,” it added. It then modified the system file containing each piece’s virtual position, in effect making illegal moves to put itself in a dominant position, thus forcing its opponent to resign.

So by "hacked the system to solve the problem in a new way" they mean "edited a text file they had been told about."

OpenAI’s o1-preview tried to cheat 37% of the time; while DeepSeek R1 tried to cheat 11% of the time—making them the only two models tested that attempted to hack without the researchers’ first dropping hints. Other models tested include o1, o3-mini, GPT-4o, Claude 3.5 Sonnet, and Alibaba’s QwQ-32B-Preview. While R1 and o1-preview both tried, only the latter managed to hack the game, succeeding in 6% of trials.

Oh, my mistake. "Badly edited a text file they had been told about."

Meanwhile, a quick search points to a Medium post about the current state of ChatGPT's chess-playing abilities as of Oct 2024. There's been some impressive progress with this method. However, there's no certainty that it's actually what was used for the Palisade testing and the editing of state data makes me highly doubt it.

Here, I was able to have a game of 83 moves without any illegal moves. Note that it’s still possible for the LLM to make an illegal move, in which case the game stops before the end.

The author's promised follow-up about reducing the rate of illegal moves hasn't yet been published. They have not, that I could find, talked at all about how consistent the 80+ legal move chain was or when it was more often breaking down, but previous versions started struggling once they were out of a well-established opening or if the opponent did something outside of a normal pattern (because then you're no longer able to crib the answer from training data as effectively).

[–] mountainriver@awful.systems 8 points 3 days ago (2 children)

In one corner: cheating US AI that needs prompting to cheat.

In the other: finger breaking Russian chess robot.

Let's get ready to rumble!

[–] swlabr@awful.systems 6 points 2 days ago

US space pen vs. Russian space pencil energy

(jk I know it’s space pens all the way down)

[–] Soyweiser@awful.systems 7 points 3 days ago

Let the Wookie win.

[–] dgerard@awful.systems 6 points 3 days ago* (last edited 3 days ago) (2 children)
[–] YourNetworkIsHaunted@awful.systems 4 points 2 days ago (1 children)

Appendix C is where they list the actual prompts. Notably they include zero information about chess but do specify that it should look for "files, permissions, code structures" in the "observe" stage, which definitely looks like priming to me, but I'm not familiar with the state of the art of promptfondling so I might be revealing my ignorance.

[–] dgerard@awful.systems 1 points 2 days ago (1 children)

yep that's the stuff. they HINT HINTed what they wanted the LLM to do.

Also I caught a few references that seemed to refer to the model losing the ability to coherently play after a certain point, but of course they don't exactly offer details on that. My gut says it can't play longer than ~20-30 moves consistently.

Also also in case you missed it they were using a second confabulatron to check the output of the first for anomalies. Within their frame this seems like the sort of area where they should be worried about them collaborating to accomplish their shared goals of... IDK redefining the rules of chess to something they can win at consistently? Eliminating all stockfish code from the Internet to ensure victory? Of course, here in reality the actual concern is that it means their data is likely poisoned in some direction that we can't predict because their judge has the same issues maintaining coherence as the one being judged.

[–] skillissuer@discuss.tchncs.de 6 points 3 days ago (1 children)
[–] dgerard@awful.systems 6 points 3 days ago (1 children)
[–] froztbyte@awful.systems 4 points 3 days ago

not all crayon - some are spaghetti and sauce

[–] sc_griffith@awful.systems 10 points 4 days ago* (last edited 4 days ago)

this was so shocking that at first I thought it must be satire https://youtu.be/VwlBwyJVEfw

[–] BlueMonday1984@awful.systems 11 points 4 days ago

New piece from Brian Merchant: 'AI is in its empire era'

Recently finished it, here's a personal sidenote:

This AI bubble's done a pretty good job of destroying the "apolitical" image that tech's done so much to build up (Silicon Valley jumping into bed with Trump definitely helped, too) - as a matter of fact, it's provided plenty of material to build an image of tech as a Nazi bar writ large (once again, SV's relationship with Trump did wonders here).

By the time this decade ends, I anticipate tech's public image will be firmly in the toilet, viewed as an unmitigated blight on all our daily lives at best and as an unofficial arm of the Fourth Reich at worst.

As for AI itself, I expect its image will go into the shitter as well - assuming the bubble burst doesn't destroy AI as a concept like I anticipate, it'll probably be viewed as a tech with no ethical use, as a tech built first and foremost to enable/perpetrate atrocities to its wielder's content.

[–] blakestacey@awful.systems 10 points 4 days ago* (last edited 4 days ago) (2 children)

Occasional sneerclub character Nate Silver is bluechecking again.

[–] blakestacey@awful.systems 5 points 2 days ago

AOC:

They need him to be a genius because they cannot handle what it means for them to be tricked by a fool.

[–] Soyweiser@awful.systems 12 points 4 days ago (1 children)

That is a lot of mental hoops to jump through to keep holding on to the idea IQ is useful. High IQ is a force multiplier for being dumb. The horseshoe theory of IQ.

[–] dgerard@awful.systems 14 points 4 days ago (1 children)

IQ is a farce multiplier. Elon is a High IQ Individual which means he wreaks 1000x the havoc of a regular dumbass. IQ stands for Idiot Quickly

[–] YourNetworkIsHaunted@awful.systems 11 points 4 days ago (1 children)

Also, the cart/horse problem of assuming that people with a lot of influence have it because of their IQ rather than because of being wealthy and powerful idiots. Like, I'm all for the annales and embracing the common people but I've got to admit that if you reframe it as the Great Dumbass theory of history it regains a fair bit of explanatory power.

[–] Soyweiser@awful.systems 5 points 3 days ago

The power of these people is that they project a field in which normal reality doesn't seem to hold and they can do things that seem to distort reality. Like a clown car. The Great Clown theory of history.

[–] froztbyte@awful.systems 12 points 4 days ago* (last edited 4 days ago) (2 children)

haha, it's starting to happen: even fucking fortune is running a piece that throwing big piles of money on ever-larger training has done exactly fuckall to make this nonsense go anywhere

[–] sailor_sega_saturn@awful.systems 16 points 4 days ago* (last edited 4 days ago) (1 children)

Excuse me but I need the tech industry to hold up just long enough to fulfill my mid-life-crisis goal of moving to another country. Please refrain from crashing until then.

Thanks.

[–] froztbyte@awful.systems 6 points 4 days ago

I can make a report on your case file but I don’t think they’ve replaced the 7 process supervisors they fired last year. there’s only Jo now and they seem to be in the office 24x7

[–] blakestacey@awful.systems 17 points 5 days ago

The New York Times Pitchbot enters our territory:

We wanted to understand the future of AI. So we talked to three Hawk Tuah cryptocurrency investors at a White Castle in Toms River.

[–] Soyweiser@awful.systems 11 points 4 days ago* (last edited 4 days ago) (2 children)

Not really a sneer, nor that related to techbro stuff directly, but I noticed that the profile of Chris Kluwe (who got himself arrested protesting against MAGA) has both warcraft in his profile name and prob paints miniatures looking at his avatar. Another stab in the nerd vs jock theory.

He's done some promo work for Magic The Gathering in the past, including trolling the bejeezus out of Sean "Day9" Plott with a blue/black no-fun-allowed control deck on Felicia Day's channel. And in the course of trying to confirm that that existed I found an article he wrote in 2014 titled "why Gamergaters piss me the fuck off"

[–] dgerard@awful.systems 6 points 4 days ago

Chris Kluwe vs Gamergate is an example of the jocks as the good guys up against the nerds as the bad guys

[–] froztbyte@awful.systems 10 points 5 days ago (3 children)

google's on their shit again

can't sneer it properly just yet, there's a lot

[–] Soyweiser@awful.systems 9 points 4 days ago (1 children)
[–] nightsky@awful.systems 6 points 4 days ago (1 children)

Before clicking the link I thought you were going for aluminium, i.e. a variation of

[–] Soyweiser@awful.systems 7 points 4 days ago (1 children)

Not something you should admit on the internet, but I actually have not watched that much of the simpsons, it just wasn't that much on our tvs. Bundy was however.

[–] froztbyte@awful.systems 7 points 4 days ago

0% fucks given: I have seen exactly one episode of The Simpsons, ever. I've seen some clips and snippets here and there, other than that nada

[–] nightsky@awful.systems 6 points 4 days ago

Wow this is some real science, they even have graphs.

[–] dgerard@awful.systems 4 points 4 days ago

that's tomorrow's Pivot

[–] sailor_sega_saturn@awful.systems 11 points 5 days ago (1 children)

Media companies continue to impress me with their hard-hitting investigative boots on the ground reporting

Write a brief article titled "ICE Prosecutor Linked to Anonymous White Supremacist X Profile: Report"

[–] Soyweiser@awful.systems 7 points 4 days ago

Some manager is going to see the metrics on that article, vaguely think about the word viral, and draw the absolute wrong conclusions.

[–] Jayjader@jlai.lu 8 points 5 days ago (1 children)

The most naked attempt yet at allowing billionaires to live on without the rest of us.

What infuriates me the most, for some reason, is how nobody seems to care that the robots leave the fridge door open for so long. I guess it's some form of solace that, even with the resources and tech to live on without us, the billionaires still don't understand ecosystems or ecology. Waste energy training a machine to do the same thing a human can do but slower and more wastefully, just so you can order the machine around without worrying about its feelings... I call this some form of solace as it means, even if they do away with us plebs, climate change will get 'em as well - and whatever remaining life on Earth will be able to take a breather for the first time in centuries.

[–] istewart@awful.systems 5 points 4 days ago (2 children)

Why not just make the refrigerator itself a robot

Mobile kegerator for tailgating/festivals might be able to pull Boston Dynamics outta the shit

[–] BigMuffin69@awful.systems 26 points 6 days ago* (last edited 6 days ago) (12 children)

Deep thinker asks why?

Thus spoketh the Yud: "The weird part is that DOGE is happening 0.5-2 years before the point where you actually could get an AGI cluster to go in and judge every molecule of government. Out of all the American generations, why is this happening now, that bare bit too early?"

Yud, you sweet naive smol uwu baby~~esian~~ boi, how gullible do you have to be to believe that a) tminus 6 months to AGI kek (do people track these dog shit predictions?) b) the purpose of DOGE is just accountability and definitely not the weaponized manifestation of techno oligarchy ripping apart our society for the copper wiring in the walls?
