TechLich

joined 3 years ago
[–] TechLich@lemmy.world 4 points 18 hours ago (1 children)

While this advice is true for all models, when it comes to agentic tasks (add this small feature/write this test harness/find bugs/suggest improvements), open source models are still way behind, vibe code or not.

Claude Fable or even Opus in an editor like Zed have a 1 million token context window and will "think" through the goals of the application, test their changes, work through debugging processes the way a programmer would, stop to ask for clarification, check diagnostic tools and linters, prompt to run test code, etc.

Llama, Gemma, Qwen, etc. do lack a lot of the world knowledge needed to get the goals of the application, but they also just don't have the debugging skills, won't test their code, don't always tool call correctly, and get confused as the context grows. And nobody has enough VRAM to run large context sizes locally.
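
To put rough numbers on the VRAM point, here is a back-of-the-envelope sketch of how big the KV cache gets at long context. The model dimensions (32 layers, 8 KV heads, head size 128, fp16 cache) are made-up but plausible values, not the specs of any particular model, and this ignores the weights themselves and any cache quantisation.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, context_tokens, bytes_per_value):
    """Approximate KV cache size: one key and one value vector per layer, per KV head, per token."""
    return 2 * layers * kv_heads * head_dim * context_tokens * bytes_per_value


# Hypothetical model dimensions, chosen only for illustration.
size = kv_cache_bytes(
    layers=32,
    kv_heads=8,
    head_dim=128,
    context_tokens=1_000_000,
    bytes_per_value=2,  # fp16
)
print(f"{size / 1e9:.1f} GB")  # prints "131.1 GB"
```

That is the cache alone for a single 1 million token session, which is already far beyond any consumer GPU.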

They can do autocomplete on small functions but aren't really there for more complex tasks.

On top of that, the biggest problem is that the best open source models are trained and released by the same giant tech conglomerates that have an interest in not competing with their own products. Qwen is Alibaba, Llama is Meta, gpt-oss is OpenAI. Even the more "independent" ones, Kimi (Moonshot) and GLM (z.ai), are mostly funded by Alibaba and Tencent. They're released for research and marketing purposes and to please their corporate backers with inflated stock. Almost nobody has the resources to train new models from scratch. People make lots of merges and fine tunes but AI is not democratised the way that traditional programming tools have been.

Maybe some day there will be enough cheap compute for open source communities to pool together resources to build competing models but they're not really there yet :(

[–] TechLich@lemmy.world 1 points 1 day ago

I don't think the "companies making ready-to-drink coffee products at industrial scale" that the article says this is designed for are running a 1500 watt appliance for 15 minutes a day.

They're thinking more that factories won't need to use traditional extractors which generally need to heat stuff to high temperatures to make coffee milk drinks and soluble instant coffee, etc.

[–] TechLich@lemmy.world 1 points 1 month ago (1 children)

They don't need to have one.

You can report it here: https://cveform.mitre.org/

Use the CNA-LR since I don't think they have a CNA.

You were probably trying to do the right thing disclosing, just know that there is a better process for it (even if you think the devs are asshats, it's good to do it like that for the community who aren't).

Even if it only affects admins, that includes admins of forks etc.

I'm sure there's probably more vulnerabilities to find.

[–] TechLich@lemmy.world 3 points 1 month ago (1 children)

> This, assumes the vendor acts in good faith

Responsible disclosure does not assume the vendor acts in good faith. Usually the disclosure period is around 90 days, after which the vulnerability is made public whether it has been fixed or not (although this is negotiable with a good faith vendor).

Forks etc. could have been informed privately first too if possible.

> amateurs now have access to tools they should not, and WILL forgo proper standardized communication channels to disclose issues

This is not a good argument. Undisclosed zero days in the wild have always been part of the threat model. Amateurs with LLMs or not, a large percentage of vulnerabilities are not disclosed responsibly and are only fixed after damage has been done. Putting people and their personal information at risk because you want to make a point about the dangers of zero days (which everyone is already aware of) is woefully unethical.

> Not everyone is privileged enough to afford security courses, and standardized education.

That doesn't mean we should abandon these things. The vendor can report the CVE too. Or anyone else with an interest in it. It doesn't have to be the untrained amateur grey hat asking Claude for vulns. A malicious threat actor exploiting a system doesn't report it either. The community benefits from skilled people handling things properly. Pretending that it doesn't because most people don't have those skills is silly.

[–] TechLich@lemmy.world 11 points 1 month ago (6 children)

Public disclosure is good, but responsible disclosure usually involves informing the dev first, giving them a period of time to push out a patch and then publicly disclosing for the community to learn from.

[–] TechLich@lemmy.world 9 points 1 month ago

If writing a lot of bash scripts, I really recommend shellcheck. It's a linter for bash that gives a lot of good advice and points out common issues/inefficiencies and errors. There's plugins for most editors or you can just run it in a terminal. I also like that it has good documentation that tells you why something might be wrong or inadvisable.

https://github.com/koalaman/shellcheck
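
If you want to run it from a script rather than an editor plugin, here is a minimal sketch that feeds a small bash script to shellcheck on stdin and prints whatever it reports. It assumes shellcheck is installed and on your PATH; the sample script and the `lint` helper are made up for illustration.

```python
import json
import shutil
import subprocess

# Sample script with a classic mistake: an unquoted variable passed to rm.
SCRIPT = """#!/bin/bash
file=$1
rm $file
"""


def lint(script_text):
    """Return shellcheck findings as a list of dicts, or None if shellcheck is not installed."""
    if shutil.which("shellcheck") is None:
        return None
    # "-" makes shellcheck read the script from stdin; it exits non-zero
    # when it finds issues, so don't treat that as a failure.
    result = subprocess.run(
        ["shellcheck", "--format=json", "-"],
        input=script_text,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout or "[]")


findings = lint(SCRIPT)
if findings is None:
    print("shellcheck not found on PATH")
else:
    for f in findings:
        print(f"line {f['line']}: SC{f['code']} ({f['level']}): {f['message']}")
```

On a script like this I'd expect it to flag the unquoted `$file`, and each SC code has a wiki page explaining why it matters.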

[–] TechLich@lemmy.world 1 points 1 month ago

Yeah. Wikipedia calls it "link aggregation", and the standard, IEEE 802.1AX, calls it that too, with LACP as the protocol. I think the real reason for so many names is that the standard wasn't developed until later, so everyone built their own competing, incompatible implementations with different names and it was a mess for years.

Linux implemented it with the Linux bonding driver and switch manufacturers made up their own proprietary extensions for it, but the standard didn't become a thing until like 2000. Seems like "teaming" is one of the most popular names for it.

[–] TechLich@lemmy.world 0 points 1 month ago (2 children)

Why does this have so many names?

Some stuff calls it bonded, sometimes it's teamed, sometimes LAGed or aggregated or bundled or link channelled or ethertrunked or smartgrouped or Multi-link trunked etc. etc.

[–] TechLich@lemmy.world 2 points 9 months ago (1 children)

I want to know what the 3 minutes of mind blowing entertainment on Mel Croucher's Computer Fun Line was.

[–] TechLich@lemmy.world 14 points 9 months ago

Also "Thou mayest blame" and "Canst thou say"

Hurts my brain a little.

[–] TechLich@lemmy.world 2 points 10 months ago (1 children)

You could do this with logprobs. The language model itself has basically no real insight into its confidence but there's more that you can get out of the model besides just the text.

The problem is that those probabilities are really "how confident are you that this text should come next in this conversation" not "how confident are you that this text is true/accurate." It's a fundamental limitation at the moment I think.
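
For what it's worth, this is roughly what you can compute once you have per-token logprobs back from an API. It's a minimal sketch: the logprob values are made up, and `sequence_confidence` is a hypothetical helper, not part of any library.

```python
import math


def sequence_confidence(token_logprobs):
    """Summarise per-token log probabilities (natural log) for one generated answer."""
    total = sum(token_logprobs)
    avg = total / len(token_logprobs)
    return {
        "sequence_prob": math.exp(total),   # probability of the whole sequence
        "mean_token_prob": math.exp(avg),   # geometric mean of the token probabilities
        "perplexity": math.exp(-avg),       # higher means the model was less sure
        "least_likely_token": min(range(len(token_logprobs)), key=lambda i: token_logprobs[i]),
    }


# Made-up logprobs for a four-token answer; the third token is the shaky one.
example = [-0.05, -0.20, -1.60, -0.01]
stats = sequence_confidence(example)
for name, value in stats.items():
    print(name, value)
```

Same caveat as above: a low number here means "the model found this text unlikely to come next", not "this is probably false".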

[–] TechLich@lemmy.world 0 points 10 months ago (1 children)

I feel like this isn't quite true, and it's something I hear a lot of people say about AI: that it's good at following requirements and conforming and being a mechanical and logical robot, because that's what computers are like and that's how it is in sci-fi.

In reality, it seems like that's what they're worst at. They're great at seeing patterns and creating ideas but terrible at following instructions or staying on task. As soon as something is a bit bigger than they can track context for, they'll get "creative", and if they see a pattern that they can complete, they will, even if it's not correct. I've had Copilot start writing poetry in my code because there was a string it could complete.

Get it to make a pretty looking static web page with fancy css where it gets to make all the decisions? It does it fast.

Give it an actual, specific programming task in a full sized application with multiple interconnected pieces and strict requirements? It confidently breaks most of the requirements, and spits out garbage. If it can't hold the entire thing in its context, or if there's a lot of strict rules to follow, it'll struggle and forget what it's doing or why. Like a particularly bad human programmer would.

This is why AI is automating art and music and writing and not more mundane/logical/engineering tasks. Great at being creative and balls at following instructions for more than a few steps.


Apparently as a result of terrorism according to Data. Brexit 2 Northern Ireland edition coming soon?

Memory Alpha page
