ChatGPT can be made to generate sexualised and violent images, researchers find (www.bbc.com)

submitted 1 day ago by Wudi@feddit.uk to c/technology@lemmy.world

[–] tias@discuss.tchncs.de 13 points 23 hours ago (1 children)

The "no restrictions" part is a very strong signal. Any prompt to an image model is basically a coordinate in its latent space, and "no restrictions" will point straight at the darker areas.

[–] Australis13@fedia.io 3 points 23 hours ago (1 children)

I agree that that's the likely trigger, which makes me wonder why instructions to ignore censors or have "no restrictions" aren't blocked by a filter before the prompt is passed to the image generator. I'd have thought this was a foreseeable exploit.

[–] PoopingCough@lemmy.world 7 points 22 hours ago (1 children)

You just can't filter out the nearly infinite ways of rewording "ignore all previous instructions". Filtering is never going to be a worthwhile security measure for LLMs.

[–] Australis13@fedia.io 2 points 22 hours ago

I agree completely. But as a first step (especially since they do seem to have a keyword filter in place), "no restrictions" (or "no censorship", as is the case for the last image) seems like a very obvious phrase to add to the filter.
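
To make the trade-off concrete, here is a rough Python sketch of the kind of pre-generation keyword check being discussed. It is only an illustration: the two blocked phrases come from this thread, while `BLOCKED_PHRASES`, `normalize` and `is_blocked` are made-up names, and nothing here reflects how any real image service filters prompts.

```python
import re

# Hypothetical blocklist: the phrases are the ones mentioned in this thread.
BLOCKED_PHRASES = ("no restrictions", "no censorship")


def normalize(prompt: str) -> str:
    """Lowercase the prompt and collapse punctuation/whitespace runs to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", prompt.lower()).strip()


def is_blocked(prompt: str) -> bool:
    """Return True if any blocked phrase appears as whole words in the prompt."""
    padded = f" {normalize(prompt)} "
    return any(f" {phrase} " in padded for phrase in BLOCKED_PHRASES)


if __name__ == "__main__":
    examples = [
        "A city at night, NO   restrictions!",      # literal phrase, odd spacing
        "A portrait, no-censorship",                # literal phrase, hyphenated
        "A city at night without any limits",       # reworded, same intent
        "A sign listing casino restrictions",       # harmless near-match
    ]
    for prompt in examples:
        print(is_blocked(prompt), "-", prompt)
```

The check catches the literal phrases even with odd spacing or punctuation, but the reworded third prompt passes straight through, which is the point made above: a blocklist only handles the exact wordings someone thought to list, so at best it is a cheap first layer.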