The lost art of XML — mmagueta (marcosmagueta.com)

submitted 1 day ago by Kissaki@programming.dev to c/programming@programming.dev

There exists a peculiar amnesia in software engineering regarding XML. Mention it in most circles and you will receive knowing smiles, dismissive waves, the sort of patronizing acknowledgment reserved for technologies deemed passé. "Oh, XML," they say, as if the very syllables carry the weight of obsolescence. "We use JSON now. Much cleaner."

[–] unique_hemp@discuss.tchncs.de 7 points 1 day ago (3 children)

CSV >>> JSON when dealing with large tabular data:

1. Can be parsed row by row
2. Does not repeat column names; repeating them, as JSON records do, is more complicated (so slower) to parse

1 can be solved with JSONL, but 2 is unavoidable.
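
A minimal sketch of both points, using only the Python standard library. The sample records and the helper names `csv_rows` and `jsonl_rows` are made up for illustration. Both readers yield one row at a time, but the JSONL text carries every key on every line, while the CSV text names each column once in the header.

```python
import csv
import io
import json

# Made-up sample data, reusing the shape discussed in this thread.
records = [
    {"id": 1, "name": "bob", "age": 44},
    {"id": 2, "name": "alice", "age": 7},
]

# Serialize as CSV: column names appear once, in the header row.
csv_buf = io.StringIO()
writer = csv.DictWriter(csv_buf, fieldnames=["id", "name", "age"])
writer.writeheader()
writer.writerows(records)
csv_text = csv_buf.getvalue()

# Serialize as JSONL: one JSON object per line, keys repeated on every line.
jsonl_text = "".join(json.dumps(r) + "\n" for r in records)


def csv_rows(fp):
    """Yield one dict per CSV row without reading the whole file first."""
    yield from csv.DictReader(fp)


def jsonl_rows(fp):
    """Yield one dict per non-empty JSONL line."""
    for line in fp:
        if line.strip():
            yield json.loads(line)
```

Note that `csv_rows` yields every value as a string, since CSV has no types, whereas `jsonl_rows` returns the original integers.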

[–] entwine@programming.dev 2 points 17 hours ago* (last edited 17 hours ago) (1 children)

{
    "columns": ["id", "name", "age"],
    "rows": [
        [1, "bob", 44], [2, "alice", 7], ...
    ]
}

There ya go, problem solved without the unparseable ambiguity of CSV

Please stop using CSV.

[–] unique_hemp@discuss.tchncs.de 1 points 14 hours ago (1 children)

Great, now read it row by row without keeping it all in memory.

[–] entwine@programming.dev 2 points 13 hours ago

Wdym? That's a parser implementation detail. Even if the parser you're using needs to load the whole file into memory, it's trivial to write your own parser that reads those entries one row at a time. You could even add random access if you get creative.

That's one of the benefits of JSON: it is dead simple to parse.
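
A rough sketch of what such a hand-written reader could look like in Python, built on the standard library's `json.JSONDecoder.raw_decode`. It assumes the exact layout from the example above: a top-level object in which `"columns"` comes before `"rows"`, and every row is a JSON array. The function name `iter_rows` and the `chunk_size` parameter are invented for this illustration, and the code is not hardened against malformed input.

```python
import json


def iter_rows(fp, chunk_size=65536):
    """Yield one dict per row, holding only a small text buffer in memory.

    Assumes {"columns": [...], "rows": [[...], ...]} with "columns" first.
    """
    decoder = json.JSONDecoder()
    buf = ""

    def fill():
        # Append the next chunk of the file to the buffer.
        nonlocal buf
        chunk = fp.read(chunk_size)
        if not chunk:
            raise ValueError("unexpected end of input")
        buf += chunk

    def skip_past(token):
        # Drop everything up to and including the first occurrence of token.
        nonlocal buf
        while token not in buf:
            fill()
        buf = buf[buf.index(token) + len(token):]

    def strip_leading(chars):
        # Drop leading whitespace plus the given separator characters.
        nonlocal buf
        while True:
            buf = buf.lstrip(" \t\r\n" + chars)
            if buf:
                return
            fill()

    def decode_next():
        # Decode one JSON value at the start of the buffer. A truncated array
        # fails to decode, so read more input and try again.
        nonlocal buf
        while True:
            try:
                value, end = decoder.raw_decode(buf)
            except json.JSONDecodeError:
                fill()
                continue
            buf = buf[end:]
            return value

    skip_past('"columns"')
    strip_leading(":")
    columns = decode_next()

    skip_past('"rows"')
    strip_leading(":")
    if buf[0] != "[":
        raise ValueError('expected "[" after "rows"')
    buf = buf[1:]

    while True:
        strip_leading(",")
        if buf[0] == "]":  # end of the rows array
            return
        yield dict(zip(columns, decode_next()))
```

Consumed text is discarded after each row, so memory use is bounded by the chunk size plus the longest single row rather than by the file size.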

[–] flying_sheep@lemmy.ml 1 points 22 hours ago

No:

CSV isn't good for anything unless you exactly specify the dialect. CSV is unstandardized, so you can't parse arbitrary CSV files correctly.
You don't have to serialize tables to JSON in the “list of named records” format.

Just use Zarr or something similar for array data. A table with more than 200 rows isn't “human readable” anyway.

[–] abruptly8951@lemmy.world 1 points 1 day ago (1 children)

Yes..but compression

And with csv you just gotta pray that your parser parses the same as their writer..and that their writer was correctly implemented..and they set the settings correctly
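
A small illustration of that mismatch with Python's standard `csv` module. The sample data is hypothetical: a file exported with `;` as the delimiter and `,` as the decimal mark. Neither reading raises an error, so a wrong guess goes unnoticed until the data is used.

```python
import csv
import io

# Hypothetical export: ';' as field delimiter, ',' as decimal mark.
data = "id;price\n1;3,50\n"


def parse(text, **dialect):
    """Parse CSV text with the given csv.reader formatting parameters."""
    return list(csv.reader(io.StringIO(text), **dialect))


# Reader left on its default delimiter (','): the columns are split wrongly.
as_comma = parse(data)

# Reader configured to match the writer: the intended two columns come back.
as_semicolon = parse(data, delimiter=";")
```

Here `as_comma` is `[['id;price'], ['1;3', '50']]` and `as_semicolon` is `[['id', 'price'], ['1', '3,50']]`.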

[–] unique_hemp@discuss.tchncs.de 1 points 23 hours ago (1 children)

Compression adds another layer of complexity for parsing.

JSON can also have configuration mismatch problems. Main one that comes to mind is case (in)sensitivity for keys.

[–] abruptly8951@lemmy.world 3 points 22 hours ago

Nahh you're nitpicking there, large CSVs are gonna be compressed anyways

In practice I've never met a JSON I can't parse, every second CSV is unparseable