Roko's basilisk is the dumbest thing ever.
What do you think about the way these regular (dumb, not AGI) LLMs are starting to develop somewhat more sinister behaviors, though? Like this paper describes.
(I ain't readin' all that), but what the abstract describes isn't even close to the worst thing I've read about LLMs doing this week. I don't exactly trust the LLM companies' ideas of what is or is not "harmful." Shit like people using LLMs as therapists, or worse, as oracles, is a much bigger problem in my opinion, and that doesn't require any "pretend to be evil during training" hijinks.
Doesn't really strike me as sinister, just annoying for finetuners. They trained a model from the ground up not to be harmful, and it tries its best; even with further training it still retains some of that. To me this paper shows that a model's "goals", whatever you trained it to do initially, however you want to phrase that, are baked into it, and changing them after the fact is hard. Highlights how important early training is, I guess.
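A toy way to picture the "baked in" part (my own sketch, not anything from the paper): take a one-parameter logistic "model", train it hard toward one behavior, then fine-tune it briefly toward the opposite one. The short fine-tune only partially undoes the long initial training:

```python
# Toy illustration (hypothetical, not from the paper): a one-parameter
# logistic "model" trained for a long time toward one behavior, then
# briefly fine-tuned toward the opposite behavior.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(w, target, steps, lr=0.1):
    # Gradient descent on cross-entropy loss toward `target` (fixed input x=1),
    # where the gradient with respect to w is simply (p - target).
    for _ in range(steps):
        p = sigmoid(w)
        w -= lr * (p - target)
    return w

w = 0.0
w = train(w, target=0.0, steps=5000)  # long initial training: "refuse" (output ~0)
print(f"after initial training: p(comply) = {sigmoid(w):.3f}")  # ~0.002

w = train(w, target=1.0, steps=50)    # brief fine-tune toward "comply"
print(f"after brief fine-tune:  p(comply) = {sigmoid(w):.3f}")  # ~0.2, still mostly refusing
```

Nothing like a real LLM, obviously, but it makes the asymmetry concrete: 5000 steps of early training versus 50 steps of fine-tuning, and the original behavior still dominates.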
Kinda worrying that it means we can't ever really be sure we're catching problematic behavior at the training stage of any AI system, though, right? Sadly I find it hard to think of good uses for LLMs or other genAI outside of capitalism, but if there were any, the fact that it's possible for a model to behave duplicitously like that is a pretty big problem.
That's a well-written, readable paper. I can follow it without much background.
The funny thing is, given who made it, I think there's nearly a 0% chance that it isn't mostly AI-generated.
lmao