Technology


Post articles or questions about technology

founded 3 years ago
1
 
 

Anthropic said it will “abruptly disable” its most advanced AI models for all users after the US government ordered it to suspend access to the models for foreign nationals, citing national security concerns.

The company received the export control directive to suspend access to Fable 5 and Mythos 5 for all foreign nationals, without being given specific details of the national security concern, Anthropic said in a statement.

It is Anthropic’s understanding that the government believes there is a method of bypassing, or “jailbreaking”, a safeguard that would prevent Fable 5 from being used in identifying software vulnerabilities, the company said.

2
 
 

There's a really interesting quirk in modern LLM architectures that a lot of people have been noticing lately, which the paper calls the Curse of Depth. Basically, if you look at popular models like Llama, Qwen, or DeepSeek, you will find that the deeper layers are surprisingly ineffective. You can prune away large chunks of the later transformer blocks without much hurting the model's performance. The representations in these deep layers end up looking practically identical to each other, and it's a massive waste of GPU hours because we are training billions of parameters that end up doing almost nothing.

The authors trace the root cause directly to Pre-Layer Normalization (Pre-LN). Pre-LN makes training massive transformers far more stable than the older Post-LN setups, but the catch is that as you pass data through more and more Pre-LN layers, the output variance grows exponentially with depth. Because of how the math works out, this exploding variance pushes the derivatives of the deep blocks toward the identity matrix, turning each such layer into a pass-through that cannot learn any meaningful new transformation.

It turns out that the problem can be fixed with a remarkably simple tweak called Layer Norm Scaling. They literally just scale the output of the layer norm by the inverse of the square root of the layer depth. This stops the variance from blowing up as you go deeper into the network. Because the variance stays under control, the deep layers actually wake up and start contributing to representation learning.
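To make the scaling step concrete, here is a minimal pure-Python sketch of the idea as described above. It is not the authors' code: the function names, the 1-based `layer_index` convention, and the `eps` value are my own assumptions, and it leaves out the learnable gain and bias that a real layer norm usually has.

```python
import math


def layer_norm(x, eps=1e-5):
    """Plain layer norm over one feature vector (no learnable gain or bias)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def scaled_layer_norm(x, layer_index, eps=1e-5):
    """Layer norm whose output is multiplied by 1 / sqrt(layer_index).

    layer_index is assumed to be 1-based (first block = 1), so the
    first block is left unchanged and deeper blocks are damped more.
    """
    if layer_index < 1:
        raise ValueError("layer_index must be >= 1")
    scale = 1.0 / math.sqrt(layer_index)
    return [scale * v for v in layer_norm(x, eps)]


def variance(x):
    """Population variance, used only to inspect the result."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x) / len(x)


if __name__ == "__main__":
    features = [1.0, 2.0, 3.0, 4.0]
    for depth in (1, 4, 16):
        out = scaled_layer_norm(features, depth)
        # The output variance shrinks roughly as 1 / depth.
        print(depth, round(variance(out), 4))
```

The point of the sketch is only that the normalized output's variance falls off as roughly 1 over the layer index, which is what keeps the residual stream from accumulating variance as fast in deep blocks.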

They tested this trick on models ranging from tiny 130M-parameter setups all the way to 7B-parameter models. In every case, Layer Norm Scaling beat out standard Pre-LN and other normalization tricks. The pre-training loss drops significantly, and those gains carry right over into supervised fine-tuning tasks. Best of all, it requires zero new hyperparameters or learnable weights. It is just a clean mathematical fix to a fundamental architectural flaw.

11
 
 

Lemmy.Zip appears to have disappeared/deleted this post. I think moderators on Lemmy.Zip get a bit touchy about hearing that AI is complete bullshit, or maybe they think this somehow doesn't have to do with technology?

...as if an article about how AI works really well wouldn't be on topic in a technology community?

Hot damn, I am getting so tired of people who are lost in the AI slop...

https://lemmy.zip/c/technology?dataType=Post&sort=New

From the Lemmy.zip Technology community sidebar.

If article mentions “AI” in a sentence and then talks about business economics that doesn’t make it tech news.

This is the kind of line you get from people who can't admit AI is bullshit and don't want to face that fact, so they pretend they are just tired of hearing about AI. What they really don't want is for people to keep pointing out, in rational, hard numbers, that AI is a bullshit technology.

How in the hell is it not relevant to Technology that what is being promised as the next transformative technology is utter bullshit? When this bubble crashes it will destroy the US economy and eliminate many tech companies. Of course this is relevant to Technology. Shame on Lemmy.Zip.

12
 
 

Predicting the end of bubbles is impossible, so this one could run on for years. But in my view, this AI bubble should pop. It’s a bad policy choice to focus most of our economic investment in data centers and copyright theft. This strategy is now so important to growth that President Trump is even supporting a moratorium on state regulation of AI, which is a very bad idea.

That said, the end of financial bubbles is often dangerous and unpredictable. And I don’t have a lot of confidence the people who run our central banking order will recognize what to do. On the other hand, at least this time Larry Summers won’t be the architect of whatever we end up choosing.

Some advice, if you have money in the US stock market that you are counting on, for how to get as far away from the coming AI bubble crash as possible.

https://www.thisismoney.co.uk/money/pensions/article-15888823/Six-steps-Ive-taken-protect-pension-AI-bubble-ANDREW-OXLADE.html

14
 
 

Lemmy.Zip appears to have disappeared/deleted this post; I don't know why.

20
 
 

A German court has ruled that Google is directly liable for what its AI search overviews say. Previous case law shielding search engine operators from liability doesn't apply to AI overviews.

25
 
 

Meta has said it is filing a federal US court contempt order against Israeli spyware firm NSO Group for violating a permanent injunction that barred it from ever targeting WhatsApp and its users.

The company said on Monday that its WhatsApp messaging service disrupted new spear phishing attempts linked to NSO, an entity blacklisted by the United States government for engaging in activities that are contrary to national security or foreign policy interests.
