brucethemoose

[–] brucethemoose@lemmy.world 12 points 16 hours ago (1 children)

even if it meant publishing on F-Droid instead of Google Play.

Sadly this means it’s not accessible to like 95% of people, even if driven to install it.

[–] brucethemoose@lemmy.world 3 points 17 hours ago* (last edited 17 hours ago) (2 children)

But is anyone who voted for Trump going to see this, and if they do, worry?

I don’t know a good alternative, but it feels like they’re preaching to the choir.

[–] brucethemoose@lemmy.world 10 points 17 hours ago* (last edited 17 hours ago) (1 children)

One thing about Anthropic/OpenAI models is they go off the rails with lots of conversation turns or long contexts. Like when they need to remember a lot of vending machine conversation I guess.

A more objective look: https://arxiv.org/abs/2505.06120v1

https://github.com/NVIDIA/RULER

Gemini is much better. TBH the only models I’ve seen that are half decent at this are:

  • “Alternate attention” models like Gemini, Jamba Large or Falcon H1, depending on the iteration. Some recent versions of Gemini kinda lose this, then get it back.

  • Models finetuned specifically for this, like roleplay models or the Samantha model trained on therapy-style chat.

But most models are overtuned for one-shots like "fix this table" or "write me a function," and don't invest much in long context performance because it's not very flashy.
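Benchmarks like RULER boil down to hiding a fact in a mass of filler text and asking the model to retrieve it. A minimal sketch of that idea (the filler text and needle wording here are my own illustration, not RULER's actual tasks):

```python
import random

def make_needle_test(context_tokens: int, seed: int = 0) -> tuple[str, str]:
    """Build a needle-in-a-haystack prompt: filler text with one hidden fact."""
    random.seed(seed)
    secret = f"{random.randint(100000, 999999)}"
    needle = f"The magic number is {secret}."
    # Roughly 8 filler tokens per repetition of this sentence pair.
    filler = "The grass is green. The sky is blue. " * (context_tokens // 8)
    # Bury the needle at a random position in the filler.
    pos = random.randint(0, len(filler))
    haystack = filler[:pos] + needle + " " + filler[pos:]
    prompt = haystack + "\nWhat is the magic number?"
    return prompt, secret

prompt, secret = make_needle_test(4096)
```

Scoring is then just checking whether the model's answer contains `secret`; models that "go off the rails" at long context fail this well below their advertised window.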

[–] brucethemoose@lemmy.world 14 points 18 hours ago (5 children)

And 500 are about to get fired?

[–] brucethemoose@lemmy.world 27 points 22 hours ago* (last edited 22 hours ago) (1 children)

Most of the US believes in this, or is just unaware. That's how it's been for most of history around the world.

...The remarkable issue here is that the elites/rulers we handed the reins now drink their own Kool-Aid. The very top of most authoritarian regimes is at least cognizant of some of its hypocrisy, even if ideology eats at them some.

The other is that people are more 'connected' than ever, but to disinformation streams. I feel like a lot of the world (especially the US) fancies themselves super smart on shit they know nothing about because of something they saw on Facebook or YouTube.

[–] brucethemoose@lemmy.world 1 points 22 hours ago* (last edited 22 hours ago)

Not at all. Not even close.

Image generation is usually batched and takes seconds, so 700W (a single H100 SXM) for a few seconds for a batch of a few images to multiple users. Maybe more for the absolute biggest (but SFW, no porn) models.

LLM generation takes more VRAM, but is MUCH more compute-light. Typically one has banks of 8 GPUs in multiple servers serving many, many users at once. Even my lowly RTX 3090 can serve 8+ users in parallel with TabbyAPI (and a modestly sized model) before becoming compute-bound.

So in a nutshell, imagegen (on an 80GB H100) is probably more like 1/4-1/8 of a video game at once (not 8 at once), and only for a few seconds.

Text generation is similarly efficient, if not more so. Responses take longer (many seconds, except on special hardware like Cerebras CS-2s), but it's parallelized over dozens of users per GPU.


This is excluding more specialized hardware like Google's TPUs, Huawei NPUs, Cerebras CS-2s and so on. These are clocked far more efficiently than Nvidia/AMD GPUs.


...The worst are probably video generation models. These are extremely compute intense and take a long time (at the moment), so you are burning like a few minutes of gaming time per output.

ollama/sd-web-ui are terrible analogs for all this because they are single user, and relatively unoptimized.
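The comparison above is just amortization arithmetic: a GPU's wattage for a burst of work, split across everyone in the batch. A quick sketch (the wattages, batch size, and durations are illustrative assumptions, not measurements):

```python
def energy_per_request_wh(gpu_watts: float, seconds: float, batch_size: int) -> float:
    """Energy attributed to one request when work is batched across users."""
    return gpu_watts * seconds / 3600 / batch_size

# Illustrative: one 700W H100 generating a batch of 8 images in 5 seconds...
image = energy_per_request_wh(700, 5, 8)
# ...vs. a 300W gaming GPU serving a single player for those same 5 seconds.
game = energy_per_request_wh(300, 5, 1)
print(f"imagegen: {image:.3f} Wh/request, gaming: {game:.3f} Wh")
```

With these assumed numbers the batched image comes out to a fraction of the gaming energy, which is the "1/4-1/8 of a video game" ballpark; single-user setups like ollama/sd-web-ui lose exactly this batching denominator.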

[–] brucethemoose@lemmy.world 0 points 22 hours ago* (last edited 22 hours ago)

TBH most people still use old SDXL finetunes for porn, even with the availability of newer ones.

[–] brucethemoose@lemmy.world 1 points 1 day ago

Also, one other thing is that Nvidia clocks their GPUs (aka the world's AI accelerators) very inefficiently, because they have a pseudo monopoly, and they can.

It doesn't have to be this way, and likely won't be in the future.

[–] brucethemoose@lemmy.world 2 points 1 day ago* (last edited 1 day ago)

The UC paper above touches on that. I will link a better one if I find it.

But specifically:

streaming services

Almost all the power from this is from internet infrastructure and the end device. Encoding videos (for them to be played thousands/millions of times) is basically free since it's only done once, with the exception being YouTube (which is still very efficient). Storage servers can handle tons of clients (hence they're dirt cheap), and (last I heard) Netflix even uses local cache boxes to shorten the distance.

TBH it must be less per capita than CRTs. Old TVs burned power like crazy.
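The "encoding is basically free" point is an amortization argument: a one-time encode cost divided by the view count vanishes next to the per-view delivery cost. A sketch with illustrative (assumed) numbers:

```python
def amortized_wh_per_view(encode_wh: float, views: int, delivery_wh_per_view: float) -> float:
    """Per-view energy: one-time encode spread over all views, plus delivery."""
    return encode_wh / views + delivery_wh_per_view

# Illustrative: a 500 Wh encoding job for a video watched 1,000,000 times,
# vs. ~30 Wh burned by the network and end device per viewing.
per_view = amortized_wh_per_view(500, 1_000_000, 30)
print(f"{per_view:.4f} Wh per view")
```

The encode's share works out to a fraction of a percent of the total, so the infrastructure and end device dominate regardless of the exact numbers assumed.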

[–] brucethemoose@lemmy.world 4 points 1 day ago* (last edited 1 day ago)

Bingo.

Altman et al want to kill open source AI for a monopoly.

This is what the entire AI research space already knew even before deepseek hit, and why they (largely) think so little of Sam Altman.

The real battle in the space is not AI vs no AI, but exclusive use by AI Bros vs. open models that bankrupt them. Which is what I keep trying to tell /c/fuck_ai, as the "no AI" stance plays right into the AI Bros' hands.

[–] brucethemoose@lemmy.world 2 points 1 day ago (1 children)

I think that’s going a bit far. ML models are tools to augment people, mostly.

[–] brucethemoose@lemmy.world 13 points 1 day ago* (last edited 1 day ago) (2 children)

Only because of brute force over efficient approaches.

Again, look up Deepseek's FP8/multi GPU training paper, and some of the code they published. They used a microscopic fraction of what OpenAI or X AI are using.

And models like SDXL or Flux are not that expensive to train.

It doesn’t have to be this way, but they can get away with it because being rich covers up internal dysfunction/isolation/whatever. Chinese trainers, and other GPU constrained ones, are forced to be thrifty.
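The gist of FP8 training is doing matmuls in an 8-bit float with a per-tensor scale, trading precision for memory and throughput. Below is a toy round-trip simulation of an e4m3-style format in pure Python; this is my own sketch of the general idea, not Deepseek's code (real FP8 packs a 4-bit exponent and 3-bit mantissa into one byte, here I only mimic the ~448 max value and coarse mantissa):

```python
import math

FP8_E4M3_MAX = 448.0  # largest finite value representable in e4m3

def fake_fp8_roundtrip(values: list[float]) -> list[float]:
    """Simulate per-tensor-scaled FP8: scale to the FP8 range,
    round the mantissa to ~3 bits, then rescale back."""
    scale = max(abs(v) for v in values) / FP8_E4M3_MAX
    out = []
    for v in values:
        x = v / scale
        if x == 0.0:
            out.append(0.0)
            continue
        m, e = math.frexp(x)       # x = m * 2**e, with 0.5 <= |m| < 1
        m = round(m * 16) / 16     # keep mantissa on a 1/16 grid (~3 bits)
        out.append(math.ldexp(m, e) * scale)
    return out

weights = [0.013, -0.4, 0.0021, 0.9, -0.07]
approx = fake_fp8_roundtrip(weights)
```

Even this crude version stays within a few percent of the original values, which is why the format is viable for training when the scales are managed carefully.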

 

As to why it (IMO) qualifies:

"My children are 22, 25, and 27. I will literally fight ANYONE for their future," Greene wrote. "And their future and their entire generation's future MUST be free of America LAST foreign wars that provoke terrorists attacks on our homeland, military drafts, and NUCLEAR WAR."

Hence, she feels her support is threatening her kids.

"MTG getting her face eaten" was not on my 2025 bingo card, though she is in the early stage of face eating.

 

"It's not politically correct to use the term, 'Regime Change' but if the current Iranian Regime is unable to MAKE IRAN GREAT AGAIN, why wouldn't there be a Regime change??? MIGA!!

 

Video is linked. SFW, but keep your volume down.

 

In a nutshell, he’s allegedly frustrated by too few policies favorable to him.

 
  • The IDF is planning to displace close to 2 million Palestinians to the Rafah area, where compounds for the delivery of humanitarian aid are being built.
  • The compounds are to be managed by a new international foundation and private U.S. companies, though it's unclear how the plan will function after the UN and all aid organizations announced they won't take part
 

Qwen3 was apparently posted early, then quickly pulled from HuggingFace and Modelscope. The large ones are MoEs, per screenshots from Reddit:

screenshots

Including a 235B/22B active and a 30B/3B active.

Context appears to 'only' be 32K unfortunately: https://huggingface.co/qingy2024/Qwen3-0.6B/blob/main/config_4b.json

But it's possible they're still training them to 256K:

from reddit

Take it all with a grain of salt, configs could change with the official release, but it appears it is happening today.
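The 32K figure comes straight from the model config. Assuming the leaked JSON follows the standard Hugging Face schema (where `max_position_embeddings` is the native context window in tokens), checking it is one line of parsing:

```python
import json

# Minimal fragment of a Hugging Face config.json; the key name is the
# standard one, the value matches the 32K context seen in the leak.
config = json.loads('{"max_position_embeddings": 32768}')

ctx_k = config["max_position_embeddings"] // 1024
print(f"native context: {ctx_k}K tokens")
```

If they do extend training to 256K, this is the field that would change in the official release.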

 

This is one of the "smartest" models you can fit on a 24GB GPU now, with no offloading and very little quantization loss. It feels big and insightful, like a better (albeit dry) Llama 3.3 70B with thinking, and with more STEM world knowledge than QwQ 32B, but it comfortably fits thanks to the new exl3 quantization!

Quantization Loss

You need to use a backend that supports exl3, like (at the moment) text-gen-web-ui or (soon) TabbyAPI.
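"Fits on a 24GB GPU" is easy to sanity-check: weight memory is roughly parameters × bits-per-weight / 8, and whatever is left over goes to KV cache and activations. A back-of-envelope sketch, assuming a 32B-class model (the exact parameter count and bpw here are illustrative):

```python
def weight_vram_gb(n_params_b: float, bits_per_weight: float) -> float:
    """Approximate VRAM for the weights alone: params (billions) * bpw / 8."""
    return n_params_b * bits_per_weight / 8

# Illustrative: a 32B-parameter model at 4.0 bits per weight.
weights = weight_vram_gb(32, 4.0)
print(f"{weights:.1f} GB of weights, {24 - weights:.1f} GB left for KV cache")
```

At 4.0bpw the weights land around 16GB, leaving real headroom on a 24GB card; at FP16 the same model would need about 64GB, hence the quantization.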

 

"It makes me think that maybe he [Putin] doesn't want to stop the war, he's just tapping me along, and has to be dealt with differently, through 'Banking' or 'Secondary Sanctions?' Too many people are dying!!!", Trump wrote.

 

The U.S. expects Ukraine's response Wednesday to a peace framework that includes U.S. recognition of Crimea as part of Russia and unofficial recognition of Russian control of nearly all areas occupied since the 2022 invasion, sources with direct knowledge of the proposal tell Axios.

What Russia gets under Trump's proposal:

  • "De jure" U.S. recognition of Russian control in Crimea.
  • "De-facto recognition" of Russia's occupation of nearly all of Luhansk oblast and the occupied portions of Donetsk, Kherson and Zaporizhzhia.
  • A promise that Ukraine will not become a member of NATO. The text notes that Ukraine could become part of the European Union.
  • The lifting of sanctions imposed since 2014.
  • Enhanced economic cooperation with the U.S., particularly in the energy and industrial sectors.

What Ukraine gets under Trump's proposal:

  • "A robust security guarantee" involving an ad hoc group of European countries and potentially also like-minded non-European countries. The document is vague in terms of how this peacekeeping operation would function and does not mention any U.S. participation.
  • The return of the small part of Kharkiv oblast Russia has occupied.
  • Unimpeded passage of the Dnieper River, which runs along the front line in parts of southern Ukraine.
  • Compensation and assistance for rebuilding, though the document does not say where the funding will come from.

Whole article is worth a read, as it’s quite short/dense as Axios usually is. For those outside the US, this is an outlet that’s been well sourced in Washington for years.

 

Seems there's not a lot of talk about relatively unknown finetunes these days, so I'll start posting more!

Openbuddy's been on my radar, but this one is very interesting: QwQ 32B, post-trained on openbuddy's dataset, apparently with QAT applied (though it's kinda unclear) and context-extended. Observations:

  • Quantized with exllamav2, it seems to show lower distortion levels than normal QwQ. It works conspicuously well at 4.0bpw and 3.5bpw.

  • Seems good at long context. Have not tested 200K, but it's quite excellent in the 64K range.

  • Works fine in English.

  • The chat template is funky. It seems to mix up the <|think|> tags in particular (why don't they just use ChatML?), and needs some wrangling with your own template.

  • Seems smart, can't say if it's better or worse than QwQ yet, other than it doesn't seem to "suffer" below 3.75bpw like QwQ does.

Also, I reposted this from /r/locallama, as I feel the community generally should do going forward. Given its spirit, it seems like we should be on Lemmy instead?

 

So I had a clip I wanted to upload to a lemmy comment:

  • Tried it as an (avc) mp4... Failed.
  • OK, too big? I shrink it to 2MB, then 1MB. Failed.
  • VP9 Webm maybe? 2MB, 1MB, failed. AV1? Failed.
  • OK, fine, no video. Let's try an animated AVIF. Failed. It seems lemmy doesn't even take static AVIF images.
  • WebP animation then... Failed. Animated PNG, failed.

End result: I have to burden the server with a massive, crappy-looking GIF after trying a dozen formats. With all due respect, this is worse than some aging service like Reddit that doesn't support new media formats.

For reference, I'm using the web interface. Is this just a format restriction of lemmy.world, or an underlying software support issue?
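For what it's worth, shrinking a clip to a hard size cap like 2MB is just a bitrate budget: target bits = file size × 8, divided by duration. A quick sketch (the clip duration and audio bitrate below are assumptions for illustration):

```python
def target_bitrate_kbps(size_mb: float, duration_s: float, audio_kbps: float = 0) -> float:
    """Video bitrate (kbit/s) that lands a clip at roughly a given file size."""
    total_kbits = size_mb * 8 * 1000
    return total_kbits / duration_s - audio_kbps

# Illustrative: a 10-second clip squeezed into 2MB with no audio track.
print(f"{target_bitrate_kbps(2, 10):.0f} kbit/s")
```

That budget is perfectly serviceable for a short VP9 or AV1 clip, which makes the GIF fallback all the more wasteful.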

 

53% of Americans approve of Trump so far, according to a newly released CBS News/YouGov poll conducted Feb. 5 to 7, while 47% disapproved.

A large majority, 70%, said he was doing what he promised in the campaign, per the poll that was released on Sunday.

Yes, but: 66% said he was not focusing enough on lowering prices, a key campaign trail promise that propelled Trump to the White House.

44% of Republicans said Musk and DOGE should have "some" influence, while just 13% of Democrats agreed.
