submitted 10 months ago* (last edited 10 months ago) by empireOfLove@lemmy.one to c/datahoarder@lemmy.ml

A TorrentFreak article got me spooked, so I fired up the ol' yt-dlp. Got the entire channel, including comments, description metadata, and thumbnail images.

A significant number of videos were actually unavailable because of an odd YouTube bug where 15+ year old videos were listed as "currently being processed". I may re-run this later (since I ran it in archive file mode) to get the missing videos, as it seems there may be about 300 out of 4911 videos missing.
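
For anyone planning a similar re-run, here is a minimal Python sketch of checking what is still missing. It assumes "archive file mode" means yt-dlp's `--download-archive` option and that the archive file uses the usual one-entry-per-line layout of extractor name followed by video ID. The function names and the list of channel IDs are hypothetical, not something from the post.

```python
from pathlib import Path


def load_archived_ids(archive_path):
    """Return the set of video IDs recorded in a yt-dlp download archive.

    Assumes one entry per line in the form "<extractor> <video id>",
    for example "youtube dQw4w9WgXcQ". Blank or malformed lines are skipped.
    """
    ids = set()
    for line in Path(archive_path).read_text(encoding="utf-8").splitlines():
        parts = line.split()
        if len(parts) == 2:
            ids.add(parts[1])
    return ids


def still_missing(all_ids, archived_ids):
    """Return the IDs from the channel listing that are not in the archive yet."""
    return [video_id for video_id in all_ids if video_id not in archived_ids]
```

Comparing a full listing of the channel's video IDs against the archive file this way would show roughly how many of the "currently being processed" videos remain to be fetched.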

[-] mtcerio@lemmy.world 16 points 10 months ago

Gives an idea of the amount of data YouTube is storing, if only this one channel is 250GB!

[-] people_are_cute 4 points 10 months ago* (last edited 10 months ago)

And that 250GB is probably just the downloaded and HEVC-compressed files. YouTube actually promotes uploading in raw formats for best quality; just 3-4 full-length movies would be enough to fill 250GB for them.

[-] empireOfLove@lemmy.one 3 points 10 months ago* (last edited 10 months ago)

And mind you, they have a high number of videos, but most are short clips and all of them are low res, 360p or 480p max. Any other channel uploading HD or 4K content will be orders of magnitude larger for fewer videos.

[-] notasandwich1948@sh.itjust.works 3 points 10 months ago

Makes me wonder how the whole thing is sustainable for them; on average it seems to be about 6GB per 100 videos.
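
As a rough check on that per-100-videos figure, the numbers quoted in the thread (250GB, 4911 listed videos, about 300 missing) can be plugged into a short Python sketch. The helper name is made up for illustration, and the inputs are the thread's approximations rather than measured values.

```python
ARCHIVE_SIZE_GB = 250   # size quoted in the thread
LISTED_VIDEOS = 4911    # videos listed on the channel
MISSING_VIDEOS = 300    # approximate number that failed to download


def gb_per_100_videos(total_gb, video_count):
    """Average storage per 100 videos, in GB."""
    return total_gb / video_count * 100


downloaded = LISTED_VIDEOS - MISSING_VIDEOS
print(round(gb_per_100_videos(ARCHIVE_SIZE_GB, downloaded), 1))     # 5.4
print(round(gb_per_100_videos(ARCHIVE_SIZE_GB, LISTED_VIDEOS), 1))  # 5.1
```

Either way the result lands a little under the 6GB estimate, which fits a channel made up mostly of short, low-resolution clips.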

[-] socks@kbin.social 1 points 8 months ago

IIRC YouTube is currently a net loss for Google, hence them constantly trying to stuff it with ads, YouTube Premium, etc.

[-] MossyFeathers@pawb.social 16 points 10 months ago

Nice! What're you gonna do with them? Are you gonna upload them somewhere, or just hold onto them?

[-] empireOfLove@lemmy.one 16 points 10 months ago

They still happily exist on YouTube, for now. So there's no point in re-hosting; they'll get squirreled away into the Giant Hard Drive of Doom.

If something happens to the actual archive project in the near future, I'll likely section them up into 20GB pieces and post them out on a torrent someplace.
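
One way the sectioning could be done is a simple greedy grouping by file size, sketched below in Python. This is only an illustration under assumptions: the function name, the (name, size) input format, and the decimal 20GB limit are not from the post, and it is not a description of how the poster actually plans to split the files.

```python
def group_into_pieces(files, piece_limit_bytes=20 * 10**9):
    """Greedily pack (name, size_bytes) pairs into pieces within the limit.

    Files are taken in the order given. A single file larger than the
    limit ends up in a piece of its own.
    """
    pieces, current, current_size = [], [], 0
    for name, size in files:
        # Close the current piece if adding this file would overflow it.
        if current and current_size + size > piece_limit_bytes:
            pieces.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        pieces.append(current)
    return pieces
```

Keeping files in their original order means each piece covers a contiguous run of uploads, at the cost of some pieces being less than full.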

[-] Appoxo@lemmy.dbzer0.com 5 points 10 months ago

Just upload it to archive.org before your backup dies. No need to hoard it for yourself.

[-] raoulraoul@lemmy.world 13 points 10 months ago

As if they're not having enough trouble with hosting "questionable" content! You obviously didn't read the TorrentFreak article making the rounds.

Internet Archive != the Pirate Bay.

For now, DON'T contaminate the IA with the Classic Chicago Television channel.

[-] empireOfLove@lemmy.one 6 points 10 months ago

Nah. IA doesn't need to deal with this volume of shit and they already have enough of a hard time dealing with copyright trolls.

If this channel is impacted in the future, I'll probably put out a few torrents with the videos and post them here.

[-] NikkiNikkiNikki@kbin.social 13 points 10 months ago

I plan to do this with a lot of the entertainment videos I watch. Considering how ban-happy some websites have been with content creators, being able to still see their craft after it is gone is worthwhile.

Just need to buy a fuckton of storage though

[-] empireOfLove@lemmy.one 2 points 10 months ago

Me too. There's a couple channels I've downloaded in their entirety, but they're nothing like the size of this one.

[-] RiderExMachina@lemmy.ml 7 points 10 months ago
[-] Sailing7@lemmy.ml 5 points 10 months ago

Nice!

Could you tell us what tool you used to also get the description text and the comments? With yt-dlp I only found the option of downloading the video itself.

[-] totallynotfbi@lemm.ee 7 points 10 months ago

yt-dlp does support fetching comments and description text - if you use the --write-info-json and --write-comments options, it will save them as a JSON file alongside other video metadata.

[-] empireOfLove@lemmy.one 1 points 10 months ago

Yep. Those two and --write-thumbnail
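
Put together, the three options named here could be assembled into a command line like the Python sketch below. The helper name and the placeholder URL are assumptions for illustration; the thread does not give the exact command that was run, and it may have used further options.

```python
import shlex


def build_metadata_command(url):
    """Return a yt-dlp argument list using the options named in the thread."""
    return [
        "yt-dlp",
        "--write-info-json",  # save per-video metadata as a JSON file
        "--write-comments",   # include comments in that JSON file
        "--write-thumbnail",  # save the thumbnail image next to the video
        url,
    ]


if __name__ == "__main__":
    # Placeholder URL: substitute the real channel or playlist link.
    command = build_metadata_command("https://www.youtube.com/playlist?list=PLACEHOLDER")
    print(shlex.join(command))
```

The printed line can be pasted into a shell, or the list can be handed to `subprocess.run` directly.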

[-] Sailing7@lemmy.ml 1 points 10 months ago

Niiice! Didn't know that this was supported so far.

Thank you mate!

[-] roofuskit@kbin.social 4 points 10 months ago

Let me know when the torrent is up.

[-] skankhunt42@lemmy.ca 2 points 10 months ago

I have room to help seed. Commenting to check back later.

[-] Asimo@lemmy.world 2 points 10 months ago

Probably a silly question but where did you even find that?

[-] empireOfLove@lemmy.one 2 points 10 months ago

Find what? The channel, or the tool?

[-] Asimo@lemmy.world 2 points 10 months ago

Ah found in GitHub. For some reason I never put 2 and 2 together to automate YouTube video downloads. What a noob hah.

[-] empireOfLove@lemmy.one 2 points 10 months ago

Yeah, the yt-dlp fork is still actively maintained and has very nice results, glad you found it.

[-] Asimo@lemmy.world 1 points 10 months ago

Bit of both, the tool would be good to know mainly

[-] venusenvy47@reddthat.com 1 points 10 months ago
[-] empireOfLove@lemmy.one 1 points 10 months ago

Pretty much. You can either point it at the channel OR use a link to their Videos playlist (the playlist you get when hitting "play all"). I usually use the playlist to be consistent.
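
For the playlist route, a commonly observed convention (not stated in this thread, and not guaranteed by YouTube) is that a channel's uploads playlist ID is the channel ID with its leading "UC" replaced by "UU". A hypothetical Python helper based on that assumption:

```python
def uploads_playlist_url(channel_id):
    """Guess the uploads playlist URL for a channel ID.

    Assumes the "UC..." to "UU..." prefix convention holds for this channel.
    """
    if not channel_id.startswith("UC"):
        raise ValueError("expected a channel ID starting with 'UC'")
    return "https://www.youtube.com/playlist?list=UU" + channel_id[2:]
```

If the guess is wrong for a given channel, copying the link from the "play all" button works just as well.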

[-] MonkderZweite@feddit.ch 1 points 10 months ago
[-] empireOfLove@lemmy.one 7 points 10 months ago* (last edited 10 months ago)

What are what for? The downloaded videos?

I just didn't want this cool piece of history to disappear from the public eye because of corporate retards. Probably won't ever watch more than 1% of them.

this post was submitted on 08 Sep 2023
222 points (96.2% liked)

datahoarder


Who are we?

We are digital librarians. Among us are represented the various reasons to keep data -- legal requirements, competitive requirements, uncertainty of permanence of cloud services, distaste for transmitting your data externally (e.g. government or corporate espionage), cultural and familial archivists, internet collapse preppers, and people who do it themselves so they're sure it's done right. Everyone has their reasons for curating the data they have decided to keep (either forever or For A Damn Long Time). Along the way we have sought out like-minded individuals to exchange strategies, war stories, and cautionary tales of failures.

We are one. We are legion. And we're trying really hard not to forget.

-- 5-4-3-2-1-bang from this thread

founded 4 years ago