We mourn our craft (nolanlawson.com)

submitted 1 day ago by codeinabox@programming.dev to c/programming@programming.dev

[–] Glitchvid@lemmy.world 4 points 20 hours ago (3 children)

Because this comes up so often, I have to ask, specifically what kind of boilerplate? Examples would be great.

[–] Meron35@lemmy.world 1 points 6 hours ago

IIRC there were some polls for how helpful LLMs were by language/professions, and data science languages/workflows consistently rated LLMs very highly. Which makes sense, because the main steps of 1) data cleaning, 2) estimation and 3) presenting results all have lots of boilerplate.

Data cleaning really just revolves around a few core functions such as filter, select, and join; joins in particular can get very complicated to keep track of for big data.

For estimation, the more complicated models all require lots of hyperparameters, all of which need to be set up (instantiated, if you use an OOP implementation like Python's) and looped over some validation set. Even with dedicated high-level libraries like scikit-learn, there is still a lot of boilerplate.
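To make the "set up hyperparameters and loop over a validation set" step concrete, here is a minimal sketch in plain Python. It is not from the original comment: the toy k-nearest-neighbours model, the data, and every name (`knn_predict`, `validation_mse`, `grid_search`) are assumptions chosen for illustration, and a real workflow would more likely lean on a library.

```python
from itertools import product

def knn_predict(train, x, k, weighted):
    """Predict y at x from the k nearest (x, y) training points."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    if not weighted:
        return sum(y for _, y in nearest) / len(nearest)
    # Inverse-distance weights; the small constant avoids division by zero.
    weights = [1.0 / (abs(px - x) + 1e-9) for px, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)

def validation_mse(train, valid, k, weighted):
    """Mean squared error on the validation set for one hyperparameter setting."""
    errors = [(knn_predict(train, x, k, weighted) - y) ** 2 for x, y in valid]
    return sum(errors) / len(errors)

# Made-up data following y = x^2.
train = [(0.0, 0.0), (1.0, 1.0), (2.0, 4.0), (3.0, 9.0), (4.0, 16.0)]
valid = [(1.5, 2.25), (2.5, 6.25)]

# The hyperparameter grid: every combination gets evaluated.
grid = {"k": [1, 2, 3], "weighted": [False, True]}

def grid_search(grid, train, valid):
    """Return (best_score, best_params) over the full grid."""
    names = list(grid)
    results = []
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        results.append((validation_mse(train, valid, **params), params))
    return min(results, key=lambda r: r[0])

best_score, best_params = grid_search(grid, train, valid)
```

Even in this toy form, most of the lines are plumbing (building the grid, unpacking combinations, tracking the best score) rather than modelling, which is the point being made above.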

Presentation usually consists of visualisation and cleaning up results for tables. Professional visualisations require titles, axis labels, reformatted axis labels etc, which is 4-5 lines of boilerplate minimum. Tables are usually catted out to HTML or LaTeX, both of which are notorious for boilerplate. This isn't even getting into fancier frontends/dashboards, which is its own can of worms.

The fact that these steps tend to be quite bespoke for every dataset also means that they couldn't be easily automated by existing autocomplete, e.g. formatting SYS_BP to "Systolic Blood Pressure (mmHg)" for the graphs/tables.
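As a small illustration of that kind of bespoke relabelling (a sketch, not something from the original comment), the mapping usually ends up as a hand-written lookup table. Only the `SYS_BP` entry comes from the example above; the other column names and the helper names `display_label` and `relabel_row` are invented.

```python
# Hypothetical lookup table. Only SYS_BP is taken from the comment;
# the remaining entries are made up for illustration.
DISPLAY_LABELS = {
    "SYS_BP": "Systolic Blood Pressure (mmHg)",
    "DIA_BP": "Diastolic Blood Pressure (mmHg)",
    "HR": "Heart Rate (bpm)",
}

def display_label(column: str) -> str:
    """Return the human-readable label, falling back to the raw column name."""
    return DISPLAY_LABELS.get(column, column)

def relabel_row(row: dict) -> dict:
    """Rename the keys of one result row for use in a table or plot legend."""
    return {display_label(key): value for key, value in row.items()}
```

The table itself is the part no generic autocomplete can guess, since it depends on what each column means in that particular dataset.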

[–] astronaut_sloth@mander.xyz 1 points 13 hours ago

Totally fair question. One of my go-to examples is data visualization: just having an LLM spit out basic graphs with the parameters in the function call. Same with mock-ups of basic user interfaces. I'm not a front-end person at all, and I usually want something basic and routine (but still time-consuming), like CRUD or something, so just prompting for that and getting a reasonably decent product is a helpful time saver.

For anything more than basic stuff, I don't think I've ever gotten more than a single small function that I then verify line by line.

[–] Feyd@programming.dev 2 points 17 hours ago (1 children)

I always wonder this as well... I will use tools to help me write some repetitive stuff periodically. Most often I'll use a regex replace, but occasionally I'll write a little perl or sed or awk. I suspect the boilerplate these people talk about is either this or setting up projects, which I think there are also better tools for.
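For readers unfamiliar with the regex-replace approach, a minimal sketch of the idea follows. It uses Python's `re` module rather than perl, sed, or awk, and the input snippet is invented for illustration.

```python
import re

# Hypothetical input: repetitive assignments to be turned into dict entries.
source = """\
user.name = row[0]
user.email = row[1]
user.age = row[2]
"""

# Capture the field name and the index on each line; MULTILINE makes
# ^ and $ match at every line boundary instead of only the whole string.
pattern = re.compile(r"^user\.(\w+) = row\[(\d+)\]$", flags=re.MULTILINE)

# \1 and \2 refer back to the captured field name and index.
rewritten = pattern.sub(r'"\1": row[\2],', source)
```

Each `user.<field> = row[<n>]` line becomes a `"<field>": row[<n>],` entry, and the same pattern works unchanged on three lines or three hundred.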

[–] Glitchvid@lemmy.world 2 points 17 hours ago

My experience as well.

I've been writing Java lately (not my choice), which has boilerplate, but it's never been an issue for me because the Java IDEs all have tools (and have had for a decade or more) that eliminate it. Class generation, main, method stubs, default implementations, and interface stubs can all be done easily in, for example, Eclipse.

Same for tooling around (de)serialization and class/struct definitions. I see that being touted as a use case for LLMs, but like... tools for doing that (e.g. https://transform.tools/json-to-java) existed before LLMs, and they're deterministic and computationally free compared to neural nets.
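To show why that kind of generation can be deterministic, here is a rough sketch of the idea in Python. It emits dataclass source rather than Java, handles only flat JSON objects, and is not how transform.tools or any particular tool is implemented; the function name `dataclass_source` is made up.

```python
import json

# Map JSON scalar types to Python type names (a deliberately small subset).
_TYPE_NAMES = {str: "str", bool: "bool", int: "int", float: "float"}

def dataclass_source(class_name: str, sample_json: str) -> str:
    """Generate dataclass source text from one flat sample JSON object.

    Nested objects and arrays are typed loosely as dict/list; a real
    tool would recurse into them and generate nested types.
    """
    sample = json.loads(sample_json)
    lines = ["@dataclass", f"class {class_name}:"]
    for key, value in sample.items():
        type_name = _TYPE_NAMES.get(type(value), type(value).__name__)
        lines.append(f"    {key}: {type_name}")
    if not sample:
        lines.append("    pass")  # keep the generated class syntactically valid
    return "\n".join(lines)

generated = dataclass_source("User", '{"id": 1, "name": "a", "active": true}')
```

The same input always yields the same output, and the whole thing is a dictionary lookup per field.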