[-] General_Effort@lemmy.world 1 points 4 days ago

The more I learn about neural networks, the more they seem like very convoluted statistics

How so?

[-] General_Effort@lemmy.world 3 points 4 days ago

That's where the almost comes in. Unfortunately, there are many traps for the unwary stochastic parrot.

Training a neural net can be seen as a generalized regression analysis. But that's not where it comes from. Inspiration comes mainly from biology, and also from physics. It's not a result of developing better statistics. Training algorithms, like Backprop, were developed for the purpose. It's not something that the pioneers could look up in a stats textbook. This is why the terminology is different. Where the same terms are used, they don't mean quite the same thing, unfortunately.

Many developments crucial for LLMs have no counterpart in statistics, like fine-tuning, RLHF, or self-attention. Conversely, what you typically want from a regression - such as neatly interpretable parameters with error bars - is conspicuously absent in ANNs.

Any ideas you have formed about LLMs, based on the understanding that they are just statistics, are very likely wrong.

[-] General_Effort@lemmy.world -1 points 5 days ago

Neural nets, including LLMs, have almost nothing to do with statistics. There are many different methods in Machine Learning. Many of them are applied statistics, but neural nets are not. If you have any ideas about how statistics are at the bottom of LLMs, you are probably thinking about some other ML technique. One that has nothing to do with LLMs.

[-] General_Effort@lemmy.world 7 points 5 days ago

The noun doesn't matter after an adjective like 'multiple.' Nothing good ever follows 'multiple.'

-Terry Pratchett, Guards! Guards!

[-] General_Effort@lemmy.world 20 points 6 days ago

Not like this is the first time. It's not even the second time.

[-] General_Effort@lemmy.world 66 points 4 months ago

The FTC is worried that the big tech firms will further entrench their monopolies. They are doing a lot of good stuff lately; an underappreciated boon of the Biden Presidency. Lina Khan looks to be really set on fixing decades of mistakes.

I guess they just want to know if these deals lock out potential competitors.

[-] General_Effort@lemmy.world 60 points 4 months ago

The wars of the future will not be fought on the battlefield or at sea. They will be fought in space, or possibly on top of a very tall mountain. In either case, most of the actual fighting will be done by small robots. And as you go forth today remember always your duty is clear: To build and maintain those robots.

[-] General_Effort@lemmy.world 77 points 4 months ago

Despite the fact that Nvidia is now almost the main beneficiary of the growing interest in AI, the head of the company, Jensen Huang, does not believe that additional trillions of dollars need to be invested in the industry.

*Because of

You heard it, guys. There's no need to create competition to Nvidia's chips. It's perfectly fine if all the profits go to Nvidia, says Nvidia's CEO.

[-] General_Effort@lemmy.world 51 points 5 months ago

Currently, AI means Artificial Neural Network (ANN). That's only one specific approach. What an ANN boils down to is one huge system of equations.

The file stores the parameters of these equations. They are arranged in what's called a matrix in math. A parameter is simply a number by which something is multiplied. Colloquially, such a file of parameters is called an AI model.

2 GB is probably an AI model with 1 billion parameters with 16-bit precision (16 bits is 2 bytes per parameter, so 1 billion parameters take about 2 GB). Precision is how many digits you have. The more digits you have, the more precisely you can give a value.
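To make the size estimate concrete, here is a rough back-of-the-envelope sketch. The function name `model_size_gb` is just made up for illustration, and it ignores any metadata a real model file would carry.

```python
def model_size_gb(n_params: int, bits_per_param: int) -> float:
    """Approximate file size in GB (10^9 bytes), ignoring file metadata."""
    bytes_total = n_params * bits_per_param / 8  # 8 bits per byte
    return bytes_total / 1e9


print(model_size_gb(1_000_000_000, 16))  # 1 billion parameters at 16 bit
print(model_size_gb(1_000_000_000, 32))  # same model at 32 bit is twice as big
```

The first line gives 2.0, which is where the guess of "1 billion parameters at 16 bit" for a 2 GB file comes from.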

When people talk about training an AI, they mean finding the right parameters, so that the equations compute the right thing. The bigger the model, the smarter it can be.
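A toy version of "finding the right parameters", as a minimal sketch. This is not a neural net, just a single equation `y = w*x + b` with two parameters, fitted by plain gradient descent. The data, the step count and the learning rate are all made up for illustration.

```python
# Data produced by the "right" equation y = 3*x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [3.0 * x + 1.0 for x in xs]


def train(xs, ys, steps=2000, lr=0.02):
    """Nudge w and b step by step so that w*x + b gets closer to y."""
    w, b = 0.0, 0.0  # start with wrong parameters
    n = len(xs)
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


w, b = train(xs, ys)
print(w, b)  # should end up very close to 3 and 1
```

A real model does the same kind of thing with billions of parameters instead of two.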

Does that answer the question? It's probably missing a lot.

[-] General_Effort@lemmy.world 58 points 5 months ago

Explanation of how this works.

These "AI models" (meaning the free and open Stable Diffusion in particular) consist of different parts. The important parts here are the VAE and the actual "image maker" (U-Net).

A VAE (Variational AutoEncoder) is a kind of AI that can be used to compress data. In image generators, a VAE is used to compress the images. The actual image AI only works on the smaller, compressed image (the latent representation), which means it takes a less powerful computer (and uses less energy). It’s that which makes it possible to run Stable Diffusion at home.

This attack targets the VAE. The image is altered so that its latent representation is that of a very different image, while the image itself still looks roughly the same to humans. Say, you take images of a cat and of a dog. You put both of them through the VAE to get the latent representation. Now you alter the image of the cat until its latent representation is similar to that of the dog. You alter it only in small ways and use methods to check that it still looks similar to humans. So, what the actual image maker AI "sees" is very different from the image the human sees.
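The cat/dog procedure can be sketched with a toy stand-in. Everything here is an assumption for illustration: the "encoder" is a fixed random linear map rather than a real VAE, the "images" are lists of 16 random numbers, and the "still looks similar" check is a crude cap of `eps` on how far each pixel may move (real tools use a perceptual similarity measure instead).

```python
import random

random.seed(0)
DIM, LATENT = 16, 4  # toy sizes: 16 "pixels" -> 4 latent numbers

# Hypothetical stand-in for the VAE encoder: a fixed random linear map.
W = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(LATENT)]


def encode(x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def dist2(a, b):
    """Squared distance between two latent representations."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def attack(image, target, eps=0.1, steps=500, lr=0.005):
    """Alter image in small ways so its latent moves toward target."""
    delta = [0.0] * DIM
    for _ in range(steps):
        z = encode([p + d for p, d in zip(image, delta)])
        err = [zi - ti for zi, ti in zip(z, target)]
        # Gradient of the squared latent distance per pixel: 2 * W^T err
        grad = [2 * sum(W[k][j] * err[k] for k in range(LATENT))
                for j in range(DIM)]
        # Take a step, then clip so no pixel changes by more than eps.
        delta = [max(-eps, min(eps, d - lr * g)) for d, g in zip(delta, grad)]
    return [p + d for p, d in zip(image, delta)]


cat = [random.random() for _ in range(DIM)]
dog = [random.random() for _ in range(DIM)]
target = encode(dog)

poisoned = attack(cat, target)
before = dist2(encode(cat), target)
after = dist2(encode(poisoned), target)
max_change = max(abs(p - c) for p, c in zip(poisoned, cat))
print(before, after, max_change)
```

The latent distance to the dog shrinks while no pixel moves by more than `eps`. Note that `attack` needs `W` to compute the gradient, which mirrors the point that the attacker needs access to the encoder.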

Obviously, this only works if you have access to the VAE used by the image generator. So, it only works against open source AI; basically only Stable Diffusion at this point. Companies that use a closed source VAE cannot be attacked in this way.


I guess it makes sense if your ideology is that information must be owned and everything should make money for someone. I guess some people see cyberpunk dystopia as a desirable future. I wonder if it bothers them that all the tools they used are free (e.g., the method to check whether images look similar to humans).

It doesn’t seem to be a very effective attack, but it may have some long-term PR effect. Training an AI costs a fair amount of money. People who give that away for free probably still have some ulterior motive, such as being liked. If instead you get the full hate of a few anarcho-capitalists who threaten digital vandalism, you may be deterred. Well, my two cents.


General_Effort

joined 6 months ago