this post was submitted on 11 Apr 2024

Are there any things in Linux that need to be started over from scratch? (lemmy.ca)

submitted 1 year ago by sepulcher@lemmy.ca to c/linux@lemmy.ml

I'm curious how software can be created and evolve over time. I'm afraid that at some point, we'll realize there are issues with the software we're using that can only be remedied by massive changes or a complete rewrite.

Are there any instances of this happening? Where something is designed with a flaw that doesn't get realized until much later, necessitating scrapping the whole thing and starting from scratch?

[–] taladar@sh.itjust.works 15 points 1 year ago (3 children)

I would say the whole set of C-based assumptions underlying most modern software, specifically errors being just an integer constant that is translated into text, so they carry no details about the attempted operation (who tried to do what to which object, and why that failed).

[–] teawrecks@sopuli.xyz 8 points 1 year ago (2 children)

You mean 0 indicating success and any other value indicating some arbitrary meaning? I don't see any problem with that.

Passing around extra error handling info for the worst case isn't free, and the worst case doesn't happen 99.999% of the time. No reason to spend extra cycles and memory hurting performance just to make debugging easier. That's what debug/instrumented builds are for.

[–] taladar@sh.itjust.works 3 points 1 year ago (1 children)

Passing around extra error handling info for the worst case isn’t free, and the worst case doesn’t happen 99.999% of the time.

The case "I want to know why this error happened" is basically 100% of the time when an error actually happens.

And "Permission denied" or similar useless nonsense without any details quite regularly costs me hours of my life in debugging time that wouldn't be necessary if the error just told me whose permission to do what to which object was denied.

[–] teawrecks@sopuli.xyz -1 points 1 year ago (1 children)

"0.001% of the time, I wanna know every time 👉😎👉"

Yeah, I get that. But are we talking about during development (which is why we're choosing between C and something else)? In that case, you should be running instrumented builds, or with debug functionality enabled. I agree that most programs just fail and don't tell you how to go about enabling debug info or anything, and that could be improved.

For the "Permission Denied" example, I also assume we're making system calls and having them fail? In that case it seems straightforward: the user you're running as can't access the resource you were actively trying to access. But if we're talking about some random log file just saying "Error: permission denied" and leaving you nothing to go on, that's on the program dumping the error to produce more useful information.

In general, you often don't want to leak more info than just Worked or Didn't Work for security reasons. Or a mix of security/performance reasons (possible DOS attacks).

[–] taladar@sh.itjust.works 0 points 1 year ago (2 children)

During development is just about the only time when that doesn't matter because you have direct access to the source code to figure out which function failed exactly. As a sysadmin I don't have the luxury of reproducing every issue with a debug build with some debugger running and/or print statements added to figure out where exactly that value originally came from. I really need to know why it failed the first time around.

[–] teawrecks@sopuli.xyz 1 points 1 year ago (1 children)

Yeah, so it sounds like your complaint is actually with applications not propagating relevant error handling information to where it's most convenient for you to read it. Linux is not at fault in your example, because as you said, it returns all the information needed to fix the issue to the one who developed the code, and then they just dropped the ball.

Maybe there's a flag you can set to dump those kinds of errors to a log? But even then, some apps use the fail case as part of normal operation (try to open a file, if we can't, do this other thing). You wouldn't actually want to know about every single failure, just the ones that the application considers fatal.
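The "failure as part of normal operation" case could look roughly like the sketch below. The function name `open_config` and both paths are hypothetical; the only assumption is a POSIX system where `open()` sets `errno` to `ENOENT` for a missing file.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>

/*
 * Hypothetical helper: try a per-user config file first and fall back to a
 * system-wide one. A missing user file (ENOENT) is an expected, non-fatal
 * failure, so it is not worth reporting; any other error goes to the caller.
 *
 * Returns an open file descriptor, or -1 with errno set by the failing open().
 */
int open_config(const char *user_path, const char *system_path)
{
    assert(user_path != NULL && system_path != NULL);

    int fd = open(user_path, O_RDONLY);
    if (fd >= 0)
        return fd;

    /* Only "file does not exist" counts as normal operation here. */
    if (errno != ENOENT)
        return -1;

    return open(system_path, O_RDONLY);
}
```

Logging every failed `open()` here would be noise; only the final -1 is something the application considers an error.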

As long as you're running on a turing complete machine, it's on the app itself to sufficiently document what qualifies as an error and why it happened.

[–] taladar@sh.itjust.works 1 points 1 year ago

The whole point of my complaint is that shitty C conventions produce shitty error messages. If I could rely on the programmer to work around those stupid conventions every time by actually checking the error and then enriching it with all relevant information I would have no complaints.
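"Checking the error and enriching it" might look something like this minimal sketch. `open_with_context` and its message format are made up for illustration; the POSIX calls (`open`, `geteuid`, `strerror`) are real. It assumes `flags` does not include `O_CREAT`, since the sketch passes no mode argument.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical wrapper around open(): on failure it records who tried to do
 * what to which object, instead of leaving only the bare errno text.
 * Assumes flags does not contain O_CREAT (no mode argument is passed).
 *
 * Returns the file descriptor, or -1 with a description written to msg.
 */
int open_with_context(const char *path, int flags, char *msg, size_t msg_size)
{
    assert(path != NULL && msg != NULL && msg_size > 0);

    int fd = open(path, flags);
    if (fd >= 0) {
        msg[0] = '\0';
        return fd;
    }

    int saved = errno; /* later library calls may overwrite errno */
    snprintf(msg, msg_size, "open(\"%s\", flags=0x%x) as euid %ld: %s",
             path, (unsigned)flags, (long)geteuid(), strerror(saved));
    errno = saved;
    return -1;
}
```

The point of contention in the thread is that nothing in the convention forces a caller to write this wrapper; `strerror(errno)` alone would print only the last part of the message.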

[–] uis@lemm.ee 0 points 1 year ago (1 children)

As a sysadmin, you should know about strace.

[–] taladar@sh.itjust.works 0 points 1 year ago (1 children)

I know about strace. It still requires me to reproduce the issue and then to look at backtraces if nobody bothered to include any detail in the error.

[–] uis@lemm.ee 0 points 1 year ago

Somehow the (lack of a) backtrace and details in an error is a "C based assumption".

[–] atzanteol@sh.itjust.works 2 points 1 year ago (2 children)

Ugh, I do not miss C...

Errors and return values are, and should be, different things. Almost every other language figured this out and handles it better than C.

[–] teawrecks@sopuli.xyz 4 points 1 year ago (1 children)

It's more of an ABI thing though, C just doesn't have error handling.

And if you do exception handling wrong in most other languages, you hamstring your performance.

[–] taladar@sh.itjust.works 3 points 1 year ago

The unofficial C motto "Make it fast, who gives a shit about correctness"

[–] uis@lemm.ee 1 points 1 year ago

Errors and return values are, and should be, different things.

That's why errno and return value are different things.
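One place where the split between return value and `errno` is visible is `strtol`, where the return value carries the parsed number and `errno` is the only way to tell an overflow from a legitimate `LONG_MAX`. A small sketch, with `parse_long` being a hypothetical helper name:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a base-10 long.
 * Returns 0 on success and stores the value in *out; otherwise returns an
 * errno-style code (EINVAL for malformed input, ERANGE for overflow).
 */
int parse_long(const char *text, long *out)
{
    assert(text != NULL && out != NULL);

    char *end = NULL;
    errno = 0;                    /* strtol never clears errno itself */
    long value = strtol(text, &end, 10);

    if (end == text || *end != '\0')
        return EINVAL;            /* no digits at all, or trailing garbage */
    if (errno == ERANGE)
        return ERANGE;            /* LONG_MAX/LONG_MIN is ambiguous without errno */

    *out = value;
    return 0;
}
```

Note that the caller has to remember to reset `errno` before the call, which is part of what the other side of this thread is complaining about.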

[–] smileyhead@discuss.tchncs.de 8 points 1 year ago (2 children)

You have stderr to throw errors into. And the constants are just error codes, like HTTP error codes. Without them, how would the computer know whether the program executed correctly?

[–] taladar@sh.itjust.works 2 points 1 year ago

stderr is useless if the syscall already returns a single integer only because of stupid C conventions.

[–] atzanteol@sh.itjust.works 0 points 1 year ago (2 children)

You throw an exception like a gentleman. But C doesn't support them. So you need to abuse the return type to also indicate "success" as well as a potential value the caller wanted.

[–] 0x0@programming.dev 3 points 1 year ago* (last edited 1 year ago) (2 children)

Exceptions are bad coding, and what's abusive about using the full range of an integer? 0 success, everything else, error - check the API for details or call strerror.
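For readers unfamiliar with the convention being defended, a minimal sketch follows. `remove_file` and `describe` are hypothetical names; `remove()` and `strerror()` are standard C.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper following "0 is success, anything else is an error code". */
int remove_file(const char *path)
{
    assert(path != NULL);

    if (remove(path) != 0)
        return errno;   /* e.g. ENOENT or EACCES */
    return 0;
}

/* Translate a code into text the way a caller typically would. */
const char *describe(int code)
{
    return code == 0 ? "success" : strerror(code);
}
```

A caller would write something like `int rc = remove_file(p); if (rc != 0) fprintf(stderr, "%s\n", describe(rc));`, which prints the errno text and nothing about `p` unless the caller adds it.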

[–] taladar@sh.itjust.works 1 points 1 year ago (1 children)

Returning error codes in-band is the reason for a significant percentage of C bugs and security holes when the return value is used without checking. Something like Rust's Result type that forces you to distinguish the two cases is much better design here. And no, you are not working with a whole language ecosystem of "sufficiently disciplined programmers" so that nobody ever forgets to check a return value.
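The kind of bug meant here can be sketched with `read()`, whose return value is either a byte count or -1. The helper name `read_text` is hypothetical; the unchecked pattern is shown only in the comment.

```c
#include <assert.h>
#include <unistd.h>

/*
 * Hypothetical helper: read up to cap-1 bytes and NUL-terminate the buffer.
 * The unchecked version of this pattern is the classic in-band bug:
 *     ssize_t n = read(fd, buf, cap - 1);
 *     buf[n] = '\0';            // n == -1 writes one byte before the buffer
 * Returns the number of bytes read, or -1 with buf set to "".
 */
ssize_t read_text(int fd, char *buf, size_t cap)
{
    assert(buf != NULL && cap > 0);

    ssize_t n = read(fd, buf, cap - 1);
    if (n < 0) {
        buf[0] = '\0';   /* the error value must never be used as a length */
        return -1;
    }
    buf[n] = '\0';
    return n;
}
```

Nothing in C stops a caller from skipping the `n < 0` branch, which is the difference to a type like Rust's Result that the comment above refers to.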

Not to mention that errno is just a very broken design in the times of modern thread and event systems, signals, interrupts and all kinds of other ways to produce race conditions and overwrite the errno value before it is checked.

[–] uis@lemm.ee 1 points 1 year ago

errno is not shared between threads. Also:

signal handlers that call functions that may set errno or modify the floating-point environment must save their original values, and restore them before returning.

This does not add more race conditions, because signal handlers execute in one of the regular threads. In a single-threaded program, signal handlers are functions that the OS can call at any point of execution, but they do not execute at the same time as the threads.
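The save-and-restore rule from the quote can be sketched as follows, assuming a POSIX system. The handler and installer names are hypothetical, and the `write()` to file descriptor -1 is a deliberately failing call used only to clobber `errno`.

```c
#define _POSIX_C_SOURCE 200809L  /* for sigaction under strict -std= modes */
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

static volatile sig_atomic_t got_signal = 0;

/*
 * Handler that may clobber errno (write() can fail), so it saves the value on
 * entry and restores it before returning to the interrupted code.
 */
static void on_sigusr1(int signo)
{
    int saved_errno = errno;

    (void)signo;
    got_signal = 1;

    /* Deliberately failing call for illustration: fd -1 sets errno to EBADF. */
    ssize_t ignored = write(-1, "x", 1);
    (void)ignored;

    errno = saved_errno;
}

/* Install the handler; returns 0 on success, -1 on failure (errno set). */
int install_sigusr1_handler(void)
{
    struct sigaction sa;

    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGUSR1, &sa, NULL);
}
```

Without the two `saved_errno` lines, code interrupted between a failing call and its `errno` check would see `EBADF` instead of the original value, so the rule works only as long as every handler author follows it.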

[–] atzanteol@sh.itjust.works 1 points 1 year ago

errno is bad programming.

[–] uis@lemm.ee 3 points 1 year ago* (last edited 1 year ago)

So you need to abuse the return type to also indicate "success" as well as a potential value the caller wanted.

You don't need to.

Returning structs, returning by pointer, signals, error flags, setjmp/longjmp, using cxa for exceptions (lol, now THIS is real abuse).
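The first option in that list, returning a struct, might look like this sketch. `struct size_result` and `file_size` are hypothetical names; `stat()` is the real POSIX call.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical result type: error code and value travel in separate fields. */
struct size_result {
    int       err;   /* 0 on success, otherwise an errno value */
    long long size;  /* valid only when err == 0 */
};

/* Report the size of a file in bytes without overloading the return value. */
struct size_result file_size(const char *path)
{
    assert(path != NULL);

    struct size_result r = { 0, 0 };
    struct stat st;

    if (stat(path, &st) != 0)
        r.err = errno;               /* capture errno right at the failure */
    else
        r.size = (long long)st.st_size;
    return r;
}
```

The caller can still read `r.size` without looking at `r.err`, so this separates the two channels but does not force the check.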

[–] uis@lemm.ee 1 points 1 year ago (1 children)

Assembly doesn't have a concept of objects.

[–] taladar@sh.itjust.works 1 points 1 year ago (1 children)

It does very much have the concept of objects as in subject, verb, object of operations implemented in assembly.

As in who (user foo) tried to do what (open/read/write/delete/...) to which object (e.g. which socket, which file, which Linux namespace, which memory mapping,...).

[–] uis@lemm.ee 1 points 1 year ago* (last edited 1 year ago) (1 children)

implemented in assembly.

Indeed. Assembly is (can be) used to implement them.

As in who (user foo) tried to do what (open/read/write/delete/...) to which object (e.g. which socket, which file, which Linux namespace, which memory mapping,...).

The kernel implements it in software (except memory mappings, which are implemented in the MMU). There are no sockets, files or namespaces in the ISA.

[–] taladar@sh.itjust.works 1 points 1 year ago (1 children)

You were the one who brought up assembly.

And stop acting like you don't know what I am talking about. Syscalls implement operations that are called by someone who has certain permissions and operate on various kinds of objects. Nobody who wants to debug why that call returned "Permission denied" or "File does not exist" without any detail cares that there is hardware several layers of abstraction deeper down that doesn't know anything about those concepts. Nothing in the hardware forces people to make APIs with bad error reporting.

[–] uis@lemm.ee 1 points 1 year ago (1 children)

And why is "Permission denied" bad reporting?

[–] taladar@sh.itjust.works 1 points 1 year ago (1 children)

Because if a program dies and just prints strerror(errno) it just gives me "Permission denied" without any detail on which operation had permissions denied to do what. So basically I have not enough information to fix the issue or in many cases even to reproduce it.

[–] uis@lemm.ee 0 points 1 year ago* (last edited 1 year ago)

It may just not print anything at all. This is a logging issue, not a "C based assumption". I wouldn't be surprised if you called "403 Forbidden" a "C based assumption" too.

But since we are talking about a local program, a competent sysadmin can strace the program. It will print arguments and error codes.