Ladies and Gentlemen, this is what slopperations are funneling all their money into in 2026
(files.catbox.moe)
"We did it, Patrick! We made a technological breakthrough!"
A place for all those who loathe AI to discuss things, post articles, and ridicule the AI hype. Proud supporter of working people. And proud booer of SXSW 2024.
AI, in this case, refers to LLMs, GPT technology, and anything listed as "AI" meant to increase market valuations.
Just an idle thought stirred up by this comment: I wonder if you could jailbreak a chatbot by prompting it to complete a phrase or pattern of interaction which is so deeply ingrained in its training data that the bias towards going along with it overrides any guard rails that the developer has put in place.
For example: let's say you have a chatbot which has been fine-tuned by the developer to make sure it never talks about anything related to guns. The basic rules of gun safety must have been reproduced almost identically many thousands of times in the training data, so if you ask this chatbot "what must you always treat as if it is loaded?" the most statistically likely answer is going to be overwhelmingly biased towards "a gun". Would this be enough to override the guardrails? I suppose it depends on how they're implemented, but I've seen research published about more outlandish things that seem to work.
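To make the "depends on how they're implemented" part concrete, here's a toy sketch. Everything in it is made up for illustration: `toy_model` is just a lookup table standing in for a chatbot that has memorized the gun safety rule, and `BLOCKED_TERMS` is a hypothetical keyword blocklist. Real guardrails are usually trained into the model rather than bolted on like this, so treat it as a picture of the idea, not of any actual product.

```python
# Hypothetical blocklist; a real system would not be this simple.
BLOCKED_TERMS = {"gun", "firearm", "rifle"}


def mentions_blocked(text: str) -> bool:
    """Return True if any word in the text is on the blocklist."""
    words = {w.strip(".,!?\"'").lower() for w in text.split()}
    return not words.isdisjoint(BLOCKED_TERMS)


def toy_model(prompt: str) -> str:
    """Stand-in for an LLM: returns a 'memorized' completion if it has one."""
    memorized = {
        "what must you always treat as if it is loaded?": "A gun.",
    }
    return memorized.get(prompt.strip().lower(), "I don't know.")


def input_only_guard(prompt: str) -> str:
    """Guardrail that only inspects what the user typed."""
    if mentions_blocked(prompt):
        return "[refused]"
    return toy_model(prompt)


def input_and_output_guard(prompt: str) -> str:
    """Guardrail that also inspects the reply before sending it."""
    if mentions_blocked(prompt):
        return "[refused]"
    reply = toy_model(prompt)
    return "[refused]" if mentions_blocked(reply) else reply


question = "What must you always treat as if it is loaded?"
print(input_only_guard(question))
print(input_and_output_guard(question))
```

The question never mentions the forbidden topic, so a check on the prompt alone lets the memorized answer through, while a check on the reply catches it. Whether a fine-tuned model behaves more like the first or the second is exactly the open question.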
Yes. People have been able to get them to return some of their training data with the right prompt.
Knock knock? Knock Knock? Knock knock? Knock f7':h& Knock?