32
submitted 11 months ago* (last edited 11 months ago) by LeylaLove@hexbear.net to c/the_dunk_tank@hexbear.net

Never talk morals with libs because these are the things they spend their time questioning. Whether or not it's okay to use AI to detect CP. Real thinker there

https://lemm.ee/post/12171882

top 18 comments
[-] LanyrdSkynrd@hexbear.net 24 points 11 months ago

Google already has an ML model that detects this stuff, and they use it to scan everyone's private Google photos.

https://www.eff.org/deeplinks/2022/08/googles-scans-private-photos-led-false-accusations-child-abuse

They must have collected and used a bunch of child porn to train the model, and I have no problem with that; it's not generating new CP or abusing anyone. I'm more uncomfortable with them running all our photos through an AI model, sending the results to the US government, and not telling the public.

[-] WayeeCool@hexbear.net 7 points 11 months ago* (last edited 11 months ago)

They just run it on photos stored on their servers. Microsoft, Apple, Amazon, and Dropbox all do the same. There are also employees in their security departments with the fkd up job of having to verify anything flagged and then alert law enforcement.

Everyone always forgets that "cloud storage" means files are stored on someone else's machine. I don't think anyone, even soulless companies like Google or Microsoft, wants to be hosting CSAM. So it is understandable that they scan the contents of Google Photos or Microsoft OneDrive; even if they didn't have a legal obligation, there would be a moral one.

[-] chickentendrils@hexbear.net 21 points 11 months ago

Seems pretty cut and dry to me. As a tool for moderators to verify, rather than an unwilling witness having to report it.

[-] LeylaLove@hexbear.net 13 points 11 months ago

Exactly. Like why is this an "ethical question"?

[-] has_com@hexbear.net 2 points 11 months ago

Actually Kenyan workers HAVE TO be traumatized for CSA to end

[-] HornyOnMain@hexbear.net 17 points 11 months ago

196 and anti anti CP takes, name a more iconic duo

[-] kristina@hexbear.net 14 points 11 months ago* (last edited 11 months ago)

when they talk about this, there are identifiers that detect it and remove it automatically, you arent actually storing it in any way. this is standard operation for any major website.

yall really need to stop reading 'AI' and having your brains shut off in general, not really referring to this case just in general

[-] LeylaLove@hexbear.net 13 points 11 months ago

Hash lists exist, yeah. But American law actually requires website hosts to keep the CP as evidence instead of deleting it. It's why DivideBy0's tool isn't supposed to be used on American Lemmy instances. Like if you upload a flagged image to Google Drive, Google is supposed to flag it, save it, and call the cops.
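The hash lists mentioned here work roughly like this toy sketch. Assumptions up front: real systems use robust perceptual hashes (PhotoDNA, Meta's PDQ) computed from decoded images, not this simplistic "average hash," and an 8x8 grayscale grid stands in for a real image here.

```python
# Toy sketch of hash-list matching. A perceptual hash maps an image to a
# short fingerprint that survives re-encoding/resizing, so uploads can be
# checked against a list of known-bad fingerprints without storing images.

def average_hash(pixels):
    """pixels: 8x8 grid of 0-255 grayscale values -> 64-bit int.
    Each bit records whether a pixel is above the grid's mean brightness."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p >= mean else 0)
    return bits

def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")

def is_flagged(upload_hash, blocklist, max_distance=5):
    """Near-duplicate match: perceptual hashes tolerate small edits, so we
    compare within a Hamming distance rather than requiring exact equality."""
    return any(hamming(upload_hash, h) <= max_distance for h in blocklist)
```

A slightly edited copy of a listed image still matches (its hash differs by only a couple of bits), while an unrelated image does not.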

[-] kristina@hexbear.net 13 points 11 months ago* (last edited 11 months ago)

i get that its supposed to be for evidence but its really fucked up to have to put small time server owners through that shit, terrible law. got to be some other way to handle that

[-] LeylaLove@hexbear.net 12 points 11 months ago

I agree. The DivideBy0 tool should be standard on here. Instantly deleting it when it's uploaded and saving the poster's IP is the best solution. I'm more just explaining that anybody in the position to make a tool like that wouldn't have to go out of their way to get source material, because legally speaking, they should already have some. There are site hosts that ignore this law and just delete and ban instantly (as they should), but I think it's important to explain why these tech companies just happen to have large repositories of CP to train AI on.

[-] kristina@hexbear.net 10 points 11 months ago* (last edited 11 months ago)

hexbear doesnt log ip at all afaik, security risk

[-] ModernRisk@lemmy.dbzer0.com 13 points 11 months ago

Isn’t that a good thing? Quicker to find it, remove it, and hopefully find whoever's spreading it and send them to prison.

[-] LeylaLove@hexbear.net 6 points 11 months ago

It is a great thing, hence why I'm posting this. Why the fuck is there anybody thinking about the moral implications of using AI to handle CP? What moral implications? What's wrong with it?

[-] drhead@hexbear.net 12 points 11 months ago

Bottom comment is technically correct: you can bypass any dataset-related ethical concerns. You could very easily make a classifier model that just tries to estimate age and combine it with one of many existing NSFW classifiers, flagging any image that scores high on both.
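The two-classifier combination could be sketched like this. `estimate_age` and `nsfw_score` are hypothetical stand-ins for the two real models (an age-estimation network and an off-the-shelf NSFW classifier); only the combination logic is shown.

```python
# Sketch: flag an image only when BOTH independent signals fire, so no
# special training data is needed beyond what each model already uses.

def should_flag(image, estimate_age, nsfw_score,
                age_cutoff=18.0, nsfw_cutoff=0.8):
    """estimate_age returns an estimated age in years; nsfw_score returns
    a probability in [0, 1]. Flag for human review only if the subject
    scores below the age cutoff AND the image scores high on NSFW."""
    return estimate_age(image) < age_cutoff and nsfw_score(image) >= nsfw_cutoff
```

Either signal alone (a minor in an innocuous photo, or adult NSFW content) leaves the image unflagged; the cutoffs here are illustrative.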

But there are already organizations which have large databases of CSAM that are suitable for this purpose, using it to train a model would not create any additional demand, it would not result in any more harm, and it would likely result in a better model. Keep in mind that these same organizations already use these images in the form of a perceptual hash database that social media and adult content sites can check uploaded images against to see if they are known images while also not sharing the images themselves. This is just a different use of the data for a similar purpose.

The only actual problem I can think of would be people trusting its outputs blindly instead of manually reviewing images, and reporting everything that scores high directly to the police. But that is a problem with inappropriate use of the model, not with the training or existence of the model.

It's very safe to respond like that to images flagged by NCMEC's phash database, because those are known images, and if any false positives happen they have the originals to compare to, so they can be cleared up. But even a 99% accurate classifier model is orders of magnitude more prone to false positives than the phash database, and it can be very difficult to find out why it generates false positives and correct the problem, because... well, that involves extensive auditing of the dataset. I don't think this is enough reason not to make such a model, but it is a reason to tread carefully when looking at how to apply it.

[-] Frank@hexbear.net 4 points 11 months ago

Usually when people have this debate it's about, like, Mengele or Unit 731's "research," except that research was almost entirely insane sadism with at most a veneer of science. Whereas this is really cut and dry: image classifiers can be trained to recognize patterns, you've got ready access to a training dataset (or the FBI does, or whatever), and you can train your model to flag and remove CSAM.

[-] LeylaLove@hexbear.net 7 points 11 months ago

Yeah. I'm not justifying people defending Mengele or Unit 731, but there's at least enough there that you can understand why people came to that conclusion. Plus, Westerners pardoned Unit 731 and similar German scientists for the exact reason of "let's get the data." Even after they found out the data wasn't useful, the West had to stick by its decision so as not to seem absolutely insane. There has been a propaganda push, along with the West's poorly developed utilitarian worldview, that makes defending Unit 731 understandable. Defending Unit 731 really isn't that much of a jump from defending capitalists putting workers through many of the same horrific deaths via workplace austerity. If it's okay to cook people to death because you wanted to make a few bucks, what's the issue with freezing someone to death? Not okay, mind you, but I can clearly see the route the brainworms took.

This though? Like what do people expect? There is no good reason to not want AI on CP enforcement. Just because "there's a database of CP"? What do people want LE to do with evidence of abuse? I WISH the videos of my abuse were in some Fed's database. Instead my abuser walked off, and now I just get the occasional creepy message talking about how hot my abuse was. Tracking pedophiles is pretty much the only consistently good thing the feds do. Why are people criticizing THIS of all things? There are no grounds to make this a real moral debate.

[-] WhatDoYouMeanPodcast@hexbear.net 3 points 11 months ago

Only moral quandary that comes to mind is that if you give an AI a bunch of CSAM and it then leaks, 1) someone has it, and 2) AI gets better at generating it. Who's training the AI? Who verifies it? What are the security protocols?

[-] drhead@hexbear.net 7 points 11 months ago

This would be a classifier model, incapable of making images. Most classifier models output just a dictionary with a single floating-point value per class they were trained on, representing the chance the image belongs to that class. It'd probably be trained by an organization like NCMEC, possibly working with a very well trusted AI firm. For verifying it, you usually reserve a modest representative sample of the database and don't train on it, then use that to measure how accurate the model is and to decide what score threshold is appropriate for flagging.
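The threshold-picking step could be sketched like this. The held-out scores and the 5% target rate below are made up for illustration; a real evaluation would use the reserved sample of actual model outputs.

```python
# Sketch: choose a flagging threshold from scores the model assigns to a
# held-out set of benign ("negative") images it never trained on.

def pick_threshold(negative_scores, max_fp_rate):
    """Lowest threshold whose false-positive rate on the held-out
    benign images stays at or below max_fp_rate."""
    ranked = sorted(negative_scores)
    allowed = int(len(ranked) * max_fp_rate)  # benign images we may flag
    # flag only the top `allowed` scores; the threshold sits at that boundary
    return ranked[len(ranked) - allowed] if allowed > 0 else float("inf")

def fp_rate(negative_scores, threshold):
    """Fraction of held-out benign images this threshold would flag."""
    return sum(s >= threshold for s in negative_scores) / len(negative_scores)
```

With 100 held-out benign scores and a 5% budget, the threshold lands just above the 95th score, so only the top 5 benign images get flagged.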

this post was submitted on 22 Oct 2023
32 points (100.0% liked)
