Avicenna

joined 1 month ago
[–] Avicenna@programming.dev 20 points 2 weeks ago

lol this is golden

[–] Avicenna@programming.dev 2 points 2 weeks ago* (last edited 2 weeks ago)

I suspect it mostly relates to how much code there is on the internet about the topic. For instance, if you make it use a niche library, it quite commonly makes up methods that don't exist in that library but do exist in related libraries. When I point this out, it also hallucinates, saying "It was removed after version bla". I also may not be using the most cutting-edge LLM (a mix of freely available and open-source ones).

The other day I asked it whether there is a Python library that can do linear algebra over F2. It pointed me in the correct direction (Galois), but when I asked it for examples of how to do certain things, it just came up with wrong functions over and over again:

In the end it was probably still faster than searching Google for this, but all of these errors happened one after the other in the span of five minutes, so yeah. If I recall correctly, some of its claims about these namespaces, versions etc. were also hallucinated. For instance, vstack does not exist in Galois either, but it does exist in a very popular package called numpy that can do regular linear algebra (and which Galois also uses behind the scenes).
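To make "linear algebra over F2" concrete without leaning on any library's API, here is a minimal plain-Python sketch of row reduction where addition is XOR. The names `rref_gf2` and `stack_rows` are made up for this illustration; they are not functions from Galois or numpy.

```python
def rref_gf2(rows):
    """Row-reduce a 0/1 matrix over F2. Returns (reduced_matrix, rank).

    Hypothetical helper for illustration only, not part of any library.
    """
    m = [[x & 1 for x in row] for row in rows]  # copy and reduce entries mod 2
    n_rows = len(m)
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Look for a pivot: a row at or below `rank` with a 1 in this column.
        pivot = next((r for r in range(rank, n_rows) if m[r][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Clear this column in every other row; adding rows over F2 is XOR.
        for r in range(n_rows):
            if r != rank and m[r][col]:
                m[r] = [a ^ b for a, b in zip(m[r], m[rank])]
        rank += 1
    return m, rank


def stack_rows(a, b):
    """Stack two matrices vertically (a plain-list stand-in for a vstack)."""
    return [list(r) for r in a] + [list(r) for r in b]


A = [[1, 1, 0],
     [0, 1, 1]]
B = [[1, 0, 1]]  # equals row 0 + row 1 of A over F2

reduced, rank = rref_gf2(stack_rows(A, B))
print(reduced, rank)
```

The stacked matrix has rank 2 over F2 because the third row is the sum of the first two, whereas over the reals the same three rows are independent (the determinant is 2). That difference is why a regular linear algebra routine cannot simply be reused here.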

[–] Avicenna@programming.dev 4 points 2 weeks ago* (last edited 2 weeks ago)

In my case it does hallucinate regularly. It makes up functions that don't exist in that library but do exist in similar libraries. So the end result is useful as a keyword, though the code is not. My favourite part is that if you point out that the function does not exist, the answer is ALWAYS "I am sorry, you are right, since version bla of this library this function no longer exists", whereas in reality it never existed in that library at all. For me the best use case for LLMs is as a search engine, and that is because of the shitty state most current search engines are in.

Maybe LLMs can be fine-tuned to do the grinding aspects of coding (like boilerplate for test suites etc.), with human supervision. But this will often end up being a situation where junior coders are fired or no longer hired, and senior coders are expected to babysit LLMs to do those jobs. This is not entirely different from supervising junior coders, except it is probably more soul-destroying. But the biggest flaw in this design is that it assumes LLMs will one day be good enough to do senior coding tasks, so that when senior coders also retire*, LLMs take their place. If this LLM breakthrough is never realized and this trend of keeping a low number of junior coders sticks, we will likely have a programmer crisis in the future.

*: I say retire, but for many CEOs it is their wet dream to be able to let go of all coders and have LLMs do all the tasks.

[–] Avicenna@programming.dev 3 points 2 weeks ago

Golden age of internet

[–] Avicenna@programming.dev 3 points 2 weeks ago* (last edited 2 weeks ago) (1 children)

This is my nightmare. It also looks like a design problem; did they not compensate you?

[–] Avicenna@programming.dev 58 points 2 weeks ago* (last edited 2 weeks ago) (5 children)

Uh no, this is not factual. CEOs decide between a 70ft yacht and a 100ft yacht. Get your facts straight.

[–] Avicenna@programming.dev 7 points 2 weeks ago (1 children)

Yea even a college kid could write a romance novel with a sexy billionaire vampire. Try writing it with this instead:

[–] Avicenna@programming.dev 5 points 2 weeks ago

this should be in not the onion if real

[–] Avicenna@programming.dev 24 points 2 weeks ago (3 children)

Not surprising, about 50% of Democrats are impostors. They would easily switch to the Republican party if they risked losing their seats.

[–] Avicenna@programming.dev 4 points 2 weeks ago

Putin never bets on just one horse

[–] Avicenna@programming.dev 1 points 2 weeks ago* (last edited 2 weeks ago)

Based on the star distribution today (everyone who has solved part 1 has also solved part 2), I suspect at least part 2 is easy, even if I can't make a dent in part 1 yet.
