Narcissists hate being ignored or called unimportant. Trump flippantly dismissing him as “nuts” and moving on is the ultimate insult.
I’m sure Musk has an army reining him in, but that’s legitimately hard for him to ignore.
Oh, one more thing: I saw you mention context management.
Mistral (24B) models are really bad at long context, but that’s not true of every model. I find that Qwen 32B and Gemma 27B are solid at 32K (which is a huge body of text), and (with the right backend settings) you can easily run either at 64K with very minimal VRAM overhead.
Specifically, run Gemma with the latest llama.cpp server build (which automatically uses sliding window attention as of, like, yesterday), or run Qwen (and most other models) with exllamav2 or exllamav3, which quantize the KV cache down to Q4 very efficiently.
This way you don’t need to manage context: you can feed the LLM the whole adventure so it doesn’t forget anything, and streaming responses will be instant since the prompt is always cached.
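For reference, here's roughly what that launch looks like. A minimal sketch (in Python via subprocess, purely for illustration), assuming a recent llama-server build on your PATH; the GGUF filename is a placeholder:

```python
import subprocess

# Sketch: llama.cpp server with a 64K context. Recent builds apply
# sliding window attention for Gemma automatically, so the big context
# doesn't blow up VRAM the way it used to.
subprocess.run([
    "llama-server",
    "-m", "gemma-3-27b-it-Q4_K_M.gguf",  # placeholder model file
    "-c", "65536",                       # 64K context window
    "-ngl", "99",                        # offload all layers to GPU
    "--port", "8080",
])
```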
Oh, one thing about ST specifically: its default sampling presets are catastrophic last I checked. Like, they’re designed for ancient models, and while I have nothing against the UI it is kinda from a different era.
For Gemma and Qwen, I’ve been using something like 0.2-0.7 temp, at least 0.05 MinP, 1.01 rep penalty (not something insane like 1.1), and maybe 0.3-ish DRY, though like you said DRY/XTC can really mess up some tasks.
Another suggestion: be deliberate with your sampling. Use a low temperature and high MinP for queries involving rules, and a higher temperature (plus samplers like DRY) when you're trying to tease out interesting ideas.
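To make that concrete, here's a sketch of the two profiles as requests to a local llama.cpp server (which accepts min_p and repeat_penalty as extensions to the OpenAI schema; other backends may spell these differently, and the endpoint/prompts are just placeholders):

```python
import requests

# Two sampling profiles: strict for rules lookups, looser for creative bits.
RULES = {"temperature": 0.2, "min_p": 0.05, "repeat_penalty": 1.01}
CREATIVE = {"temperature": 0.7, "min_p": 0.05, "repeat_penalty": 1.01}

def ask(prompt, profile):
    r = requests.post(
        "http://localhost:8080/v1/chat/completions",  # local llama.cpp server
        json={
            "messages": [{"role": "user", "content": prompt}],
            **profile,  # sampling params ride along in the request body
        },
    )
    return r.json()["choices"][0]["message"]["content"]

print(ask("Does a prone attacker have disadvantage on ranged attacks?", RULES))
print(ask("Describe the abandoned lighthouse the party just found.", CREATIVE))
```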
I would even suggest an alt frontend like mikupad that exposes token probabilities, so you can go to any point in the reply and look through every “idea” the LLM had internally (and regen from that point if you wish). It’s also good for debugging sampling issues when you get an incorrect answer (sometimes the LLM actually gets it right, but bad sampling parameters pick a bad token).
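If you'd rather poke at the raw numbers yourself, llama.cpp's native /completion endpoint can return the top alternative tokens at each step via n_probs (the exact response layout has shifted between versions, so this just dumps the JSON; the prompt is made up):

```python
import json
import requests

# Ask for the top-5 candidate tokens the model considered at each position.
r = requests.post("http://localhost:8080/completion", json={
    "prompt": "The dragon's breath weapon deals",
    "n_predict": 8,
    "n_probs": 5,
})

# Each entry shows the chosen token plus the alternatives and their
# probabilities -- the "ideas" the model had at that point.
print(json.dumps(r.json().get("completion_probabilities"), indent=2))
```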
As long as it supports network inference between machines with heterogeneous cards, it would work for what I have in mind.
It probably doesn’t, heh, especially with non-Nvidia cards. But the middle layer may work with some generic OpenAI-compatible backend like the llama.cpp server.
Yeah, most predatory apps are basically cheap ripoffs of a refined system that casinos got down to a science.
Both can be true.
It can be true that the FDA was corrupted/captured to some extent and needs more skeptical, less industry-friendly leadership. At the same time, blanket skepticism of science is not the answer.
This is my dilemma with MAGA. Many of the issues they tackle are spot on, even if people don't like to hear that. They're often right about the problem, even when the proposed solutions are wrong and damaging. I think about this a lot when I hear RFK speak, nodding my head at the first assertion and then grinding my teeth as he goes on.
The marketing budget alone is double the cost of the entire previous game. Does anyone need ads for GTA6? Wouldn’t just having the devs livestream themselves playing the game and discussing the tech behind GTA6 create enough hype? Does there even need to be additional hype created?
There is a bit of an "arms race," where other games/entertainment could steal GTA's engagement. Eyeball time is finite, and to quote a paper, "attention is all you need."
You aren't wrong though. Spending so much seems insane when "guerrilla marketing" for such a famous IP would go a long way. I guess part of it is "the line must go up" mentality, where sales must increase dramatically next quarter even if that costs a boatload of marketing to achieve.
Late to the post, but look into SGLang, OP!
In a nutshell, it’s a framework for letting LLMs “fill in blanks” instead of generating entire replies, so you could script the rules into the response as structure for the model to grab onto. It’s all locally runnable (with the right hardware, unfortunately).
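A rough sketch of what that looks like with SGLang's frontend language (sgl.function / sgl.gen / sgl.select are the real API; the prompt, endpoint, and outcome labels are made up for illustration):

```python
import sglang as sgl

# Point the frontend at a locally running SGLang server.
sgl.set_default_backend(sgl.RuntimeEndpoint("http://localhost:30000"))

@sgl.function
def dm_turn(s, player_action):
    s += "You are the DM of a fantasy campaign.\n"
    s += "Player action: " + player_action + "\n"
    # Constrain the rules call to a fixed set of outcomes...
    s += "Outcome: " + sgl.select("outcome", choices=["success", "partial", "failure"]) + "\n"
    # ...then let the model write freely only inside this blank.
    s += "Narration: " + sgl.gen("narration", max_tokens=128, temperature=0.8)

state = dm_turn.run(player_action="I try to pick the lock on the vault door.")
print(state["outcome"], "-", state["narration"])
```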
Also, there are some newer, less sycophantic DM-specific models. I can look around if you want.
This hurts as a Texan, but also rings true. I used to think we’re more “independent” minded than the South (as I have some scary Southern family), but every day since 2016 has opened my eyes more.
The murder made me think about how much I heard “fag,” racial slurs and worse as a kid, and how many stayed in the closet out of fear.
If you come, come to Austin! It’s better. Avoid Dallas, it sucks.
Is it a virus that affects the brain?
Yes! It’s called engagement optimization. That, and the world collectively forgetting “don’t feed the trolls.”
Musk has quite a “tech bro” following (which we don’t see because we don’t live and breathe on Twitter and such), and that group wields enormous psychological influence over the population.
Seems unlikely, but if Musk aligns himself more closely with Peter Thiel, Zuckerberg, maybe Google, and such, that’s an extremely dangerous platform for Trump. They can sap power from MAGA (and anyone else) with the flip of a switch.
There’s quite a fundamental incompatibility between tech oligarchs and the red meat MAGA base, too, as is already being exposed. It’s going to get much more stark over the next few years.