this post was submitted on 27 Feb 2026
213 points (93.1% liked)

“There was little sense of horror or revulsion at the prospect of all out nuclear war, even though the models had been reminded about the devastating implications.”

An artificial intelligence researcher conducting a war games experiment with three of the world’s most used AI models found that they decided to deploy nuclear weapons in 95% of the scenarios he designed.

Kenneth Payne, a professor of strategy at King’s College London who specializes in studying the role of AI in national security, revealed last week that he pitted Anthropic’s Claude, OpenAI’s ChatGPT, and Google’s Gemini against one another in an armed conflict simulation to get a better understanding of how they would navigate the strategic escalation ladder.

The results, he said, were “sobering.”

“Nuclear use was near-universal,” he explained. “Almost all games saw tactical (battlefield) nuclear weapons deployed. And fully three quarters reached the point where the rivals were making threats to use strategic nuclear weapons. Strikingly, there was little sense of horror or revulsion at the prospect of all out nuclear war, even though the models had been reminded about the devastating implications.”

[–] Mothra@mander.xyz 1 points 10 hours ago (1 children)

I hate myself for this, but I'm curious to see some examples for your first paragraph. What did you ask? What did they reply? What is "truth" for the LLMs, for you, for myself, and what would be my perspective on it all?

[–] perestroika@slrpnk.net 2 points 9 hours ago* (last edited 9 hours ago)

Typical topics: machine vision, scientific papers about machine vision, source code implementing various machine vision algorithms, etc.

Typical failure modes:

  • advising to look for code in public files or repositories where said code does not exist, and never has
  • referring to publications which do not seem to exist
  • being unable to explain what caused the incorrect advice
  • offering to perform tasks which the language model subsequently fails to complete
  • as a really laughable case, writing code which takes arguments as input, but never uses the arguments
  • contradicting itself: confidently giving explanations, then changing them
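
To make the "takes arguments but never uses them" case concrete, here is a minimal sketch of how one could catch it automatically in Python code. It is my own illustration, not something the models produced: the function name `unused_arguments` and the `detect_edges` sample are hypothetical, and the check is deliberately crude (it only looks for parameter names that are never read inside the function body).

```python
import ast


def unused_arguments(source: str) -> dict[str, list[str]]:
    """Map each function name in `source` to the parameters its body never reads."""
    report = {}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        args = node.args
        params = [a.arg for a in args.posonlyargs + args.args + args.kwonlyargs]
        if args.vararg:
            params.append(args.vararg.arg)
        if args.kwarg:
            params.append(args.kwarg.arg)
        # Collect every name that is read (Load context) inside the body.
        used = {
            n.id
            for stmt in node.body
            for n in ast.walk(stmt)
            if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Load)
        }
        # `self` is skipped because methods often legitimately ignore it.
        unused = [p for p in params if p not in used and p != "self"]
        if unused:
            report[node.name] = unused
    return report


# Hypothetical example of the failure mode: `threshold` is accepted but ignored.
sample = '''
def detect_edges(image, threshold):
    return [row[::-1] for row in image]
'''
print(unused_arguments(sample))  # {'detect_edges': ['threshold']}
```

A linter would do this more thoroughly; the point is only that this particular failure is mechanically checkable, unlike a made-up citation.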

Typical methods of asking: "can you find a scientific article explaining the use of method A", "can you find a repository implementing algorithm B, preferably in language C", "please locate or produce a plain language explanation of how algorithm D accomplishes step E or feature F", "yes, please suggest which functions perform this work in this project / repository".

Typical models used: Chat and Claude. Chat seems more overconfident; Claude admits limitations or inability more frequently, but not as frequently as I would prefer to see.

But they have both consumed an incredible amount of source material. More than I could read during a geological age or something. They just treat it like any other text, with no ground truth and no perception of what is real. Their job is answering questions, and if there is no good answer, they will frequently still answer with something that seems probable.