[-] imadabouzu@awful.systems 0 points 1 month ago

I am not a lawyer. But you wouldn't be surprised to hear that:

  1. I don't have the inside story on Bing in Germany. It could be that Microsoft either doesn't want to do it well, or hasn't yet done it well enough. I'm not promising either in particular, but it can be done.
  2. Generally, as an engineer, you have a pile of options with trade-offs. You absolutely can build nuanced solutions, as often the law and the lawyers live in nuanced realities. That is the reality of even the best sorts of tech companies who are trying.

My commitment is that maximalism or strict binary assumptions won't work on either end and don't satisfy what anyone truly wants or needs. If we're not careful about what it takes to move the needle, we agree with them by saying 'it can't be done, so it won't be done.'

[-] imadabouzu@awful.systems 2 points 1 month ago* (last edited 1 month ago)

That's a good question, because there is nuance here! It's interesting because while working on similar projects I also ran into this issue. First off, it's important to understand what your obligation is and the way that you can understand data deletion. No one believes it is necessary to permanently remove all copies of anything, any more than it is necessary to prevent all forms of plagiarism. No one is complaining that it is possible at all to plagiarise; we're complaining that major institutions are continuing to do so with ongoing disregard of the law.

Only maximalists fall into the trap of thinking of the world in a binary sense: either all in or do nothing at all.

For most of us, it's about economics and risk profiles. Open source models get trained continuously over time; there won't be one version. Saying that open source operators do have some obligation to curate future training in good faith to comply has a long tail impact on how that model evolves. Previous PII or plagiarized data might still exist, but its value and novelty and relevance to economic life goes down sharply over time. No artist or writer argues that copyright protections need to exist forever. They literally just need to have survivable working conditions and respect for attribution. The same thing with PII: no one claims that they must be completely anonymous. They just desire cyber crime to be taken seriously rather than abandoned in favor of one party taking the spoils of their personhood.

Also, yes, there are algorithms that can control how further learning promotes or demotes growth and connections relative to various policies. The point isn't that any one policy is perfect; a mere willingness to adopt policies in good faith would already matter (most such LLM filters are intentionally weak so that those with $$ who pay for API access can outright ignore them, while the operators can turn around and claim it can't be solved, too bad, so sad).

Yes. It is possible to perturb and influence the evolution of a continuously trained neural network based on external policy, and they're carefully lying through omission when they say they can't 100% control it or 100% remove things. Fine. That's not necessary, in either copyright or privacy law. It never has been.
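
To make the "curate future training" idea a bit more concrete, here is a minimal sketch in Python. Everything in it is hypothetical (the `Record` fields, the `curate_batch` name, the opt-out sets); it only shows the shape of a good-faith filter applied before each new training round, not anyone's actual pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    source: str  # hypothetical: where the text was collected from
    author: str  # hypothetical: attributed author or data subject
    text: str


def curate_batch(records, opted_out_authors, blocked_sources):
    """Split a batch into records to keep and records covered by a request."""
    kept, dropped = [], []
    for record in records:
        if record.author in opted_out_authors or record.source in blocked_sources:
            dropped.append(record)
        else:
            kept.append(record)
    return kept, dropped


# Illustrative data only; none of these names refer to real people or sites.
batch = [
    Record("example.org/blog", "alice", "a post"),
    Record("example.org/forum", "bob", "a comment"),
    Record("example.net/art", "carol", "a caption"),
]
kept, dropped = curate_batch(
    batch,
    opted_out_authors={"bob"},
    blocked_sources={"example.net/art"},
)
print([r.author for r in kept])     # ['alice']
print([r.author for r in dropped])  # ['bob', 'carol']
```

Older checkpoints still carry whatever they saw, but every later round trained on the curated batch dilutes it, which is the long tail effect I mean.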

[-] imadabouzu@awful.systems 3 points 2 months ago

NovelAI

I'll step up and say, I think this is fine, and I support your use. I get it. I think that there are valid use cases for AI where the unethical labor practices become unnecessary, and where ultimately the work still starts and ends with you.

In a world, maybe not too far in the future, where copyright law is strengthened, where artist and writer consent is respected, and it becomes cheap and easy to use a smaller model trained on licensed data and your own inputs, I can definitely see how a contextual autocomplete that follows your style and makes suggestions is totally useful and ethical.

But I understand people's visceral reaction to the current world. I'd say, it's ok to stay your course.

[-] imadabouzu@awful.systems -1 points 2 months ago

Maybe hot take, but when I see young people (recent graduates) doing questionable things in pursuit of attention and a career, I cut them some slack.

Like it's hard for me to be critical of someone starting off making it in, um, gestures about, this world today. Besides, they'll get the sense knocked into them through pain and tears soon enough.

I don't find it strange or malicious; I find it a symptom of why it was easier for us to find honest work then, and harder for them now.

[-] imadabouzu@awful.systems 1 points 3 months ago

Fwiw, this is also why I -do- think it's important to talk more frankly about where science is moving towards, à la things like FEP or scale free dynamics. An alternative story on things like what energy, computation, and participation really mean is useful, not for prescribing the future, but the opposite: putting ambiguity and the importance of participation back in it.

The current world view, that somehow things are cleanly separated and in nice little ontological boxes of capability and shape and form, leads to closed-system delusions. It's fragile and we know it, I hope. Von Neumann's "last invention" is wrong because, unfortunately, most "smart people's" view of intelligence has become reductive in lieu of a bigger picture.

In addition to our sneers, we should want to tell a more robust story about all of these things.

[-] imadabouzu@awful.systems 3 points 3 months ago

We might as well call the moon a computer since it is ‘calculating’ the effect of a gravitational field on a moon sized object.

Yes. In fact, that's sort of my point. There is no privileged sense of computation. They can be different even if they do have invariants.

But as far as all the other process a brain does (breathing/maintaining heart rate/etc.) describing that as ‘a computer’ seems such an abuse of notation as to render the original definition meaningless.

I tend to agree that oftentimes the terminology of 'attendance' is better than the terminology of computation, but I don't think that there isn't -any- meaning in keeping the computer metaphor, because I do think it has practical implications.

At the risk of going down another rabbit hole, I'd really say that the Free Energy Principle does a pretty good job of showing why keeping a wide, but nonetheless useful, definition of computation on the table can be useful. As in, a principled tool that can shed some light on scale free dynamics (and not as an absolute, definitive answer to all questions).

https://www.youtube.com/watch?v=KQk0AHu_nng

Maybe another reason I'm ok with the computer metaphor (in which we retain the lack of privilege, and in which the attendance metaphor is kept) is that it does sort of provide us some interesting technical intuitions, too. Like, how the maximum power principle affects the design and building of technology of all kinds (whether it's chemistry, electronics, energy, gardening), how ambiguity (that is, the unknowable embedded environment) is an important functional element of deploying any sort of technology (or policy, or behavior), and how, yeah.

One day, the fact that simple and even slow things (like water, or the moon, or chemicals, or rocks, or animals) are capable computationally, but attend to different things, is, in fact, going to be meaningful and important.

[-] imadabouzu@awful.systems 3 points 4 months ago

It absolutely is effective -- but there's economics at play. You can't 100% close the hole on anything. Scrapers can themselves employ expensive techniques to try to sort or clean content pre-training.

But altering the economics is meaningful, even if it won't give you strong guarantees. Big, maximalist systems fall from a million paper cuts. They live or die on the economics of the smaller parts.

[-] imadabouzu@awful.systems 3 points 4 months ago

Maybe hot take, but I actually feel like the world doesn't, strictly speaking, need more documentation tooling at all, LLM / RAG or otherwise.

Companies probably actually need to curate down their documents so that simpler things work, so that it doesn't cost ever increasing infrastructure to overcome the problems that previous investment actually literally caused.

[-] imadabouzu@awful.systems 2 points 4 months ago

I don’t know exactly what you think I want.

I don't know precisely what you want, and I never will. It was practical advice about writing grounded in an analogy, mostly because they are two things I like. If it's not helpful, you are free not to internalize it.

Getting the attention (even maladaptively) may make some progress towards solving my problem.

Ok.

[-] imadabouzu@awful.systems 3 points 4 months ago

Maybe another way of thinking about "being economical" is thinking of writing as a relationship. Half the work is yours. But half of the work is the audience's.

I just hope that whatever you do, you find peace and a bit of fun in it. And broadly that means letting go of some things so that you can focus on others.

imadabouzu

joined 4 months ago