Lemmy search isn’t great, or I’m too new, and can’t tell if this has been posted here before.

[-] JoeyJoeJoeJr@lemmy.ml 2 points 1 year ago

I would imagine the source for most projects is hosted on GitHub, or similar platforms? Perhaps you could consider forks, stars, and followers as "votes" and sort each sub category based on the votes. I would imagine that would be scriptable - the script could be included in the awesome list repo, and run periodically. It would be kind of interesting to tag "releases" and see how the sort order changes over time. If you wanted to get fancy, the sorting could probably happen as part of a CI task.

If workable, the obvious benefit is you don't have to exclude anything for subjective reasons, but it's easier for readers of the list to quickly find the "most used" options.

Just an idea off the top of my head. You may have already thought about it, and/or it may be full of holes.
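A minimal sketch of the "votes" idea, assuming each project record already carries star, fork, and follower counts. The `Project` fields and the equal weighting are hypothetical choices for illustration, not anything the awesome-selfhosted tooling is known to use:

```python
from dataclasses import dataclass


@dataclass
class Project:
    # Hypothetical record layout; the real list data may use other field names.
    name: str
    category: str
    stars: int = 0
    forks: int = 0
    followers: int = 0


def votes(project):
    """Treat stars, forks, and followers as equally weighted votes (an assumption)."""
    return project.stars + project.forks + project.followers


def sort_by_votes(projects):
    """Group projects by category, then sort each group by votes, highest first."""
    by_category = {}
    for project in projects:
        by_category.setdefault(project.category, []).append(project)
    # sorted() is stable, so projects with equal votes keep their original order.
    return {
        category: sorted(items, key=votes, reverse=True)
        for category, items in sorted(by_category.items())
    }
```

Projects with no metadata default to zero votes and sink to the bottom of their category rather than being dropped.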

[-] vegetaaaaaaa@lemmy.world 7 points 1 year ago* (last edited 1 year ago)

would imagine that would be scriptable - the script could be included in the awesome list repo, and run periodically.

The next version of the list will be based on https://github.com/awesome-selfhosted/awesome-selfhosted-data (raw YAML data), so it will be much easier to integrate with scripts. There is already a CI system running at https://github.com/awesome-selfhosted/awesome-selfhosted-data/actions, and a preview of an enriched export at https://nodiscc.github.io/awesome-selfhosted-html-preview/ that takes stars, last update dates, and other metadata into account. This will all go live "soon".

Perhaps you could consider forks, stars, and followers as “votes” and sort each sub category based on the votes.
it’s easier for readers of the list to quickly find the “most used” options.

This would exclude (or move to the bottom of the list) all projects that are not hosted on these (mostly proprietary) platforms. Right now only metadata from GitHub is being parsed; in the future it will expand to GitLab, maybe Gitea instances or similar, but it will take time, and not all platforms have these stars/followers/forks features. This would also induce a huge bias, as GitHub projects will have a lot more forks/followers/... than projects hosted on independent forges. Star counts can also be (and absolutely are) manipulated by some projects that want to get "trending".

Also popularity != quality. A project whose code is hosted on cgit can be as good as, or even better than, a project on GitHub (even more so in the context of self-hosting...).

Just an idea off the top of my head. You may have already thought about it, and/or it may be full of holes.

It was a good idea :) But as you can see, it has its flaws.

[-] JoeyJoeJoeJr@lemmy.ml 1 points 1 year ago* (last edited 1 year ago)

it has its flaws.

Yep yep. I was aware of some of what you pointed out - I think this might be a "perfect is the enemy of good" scenario, though. GitHub alone accounts for over 84% (based on the awesome-selfhosted-data repo):

$ grep -r 'source_code_url' | cut -d ' ' -f 2 | cut -d '/' -f 3 | sort | uniq -c | sort -rn | head -n 15
   1068 github.com
     36 gitlab.com
      7 git.mills.io
      6 sourceforge.net
      6 framagit.org
      4 www.atlassian.com
      4 codeberg.org
      3 git.drupalcode.org
      3 git.cloudron.io
      2 repos.goffi.org
      2 git.tt-rss.org
      2 git.sr.ht
      2 cvsweb.openbsd.org
      1 yetishare.com
      1 www.wiz.cn

$ python -c "print($(grep -r 'source_code_url' . | grep github.com | wc -l) / $(ls -1 | wc -l))"
0.8422712933753943

Adding in gitlab gets you to 87%:

$ python -c "print($(grep -r 'source_code_url' . | grep -i -e github.com -e gitlab.com | wc -l) / $(ls -1 | wc -l))"
0.8706624605678234
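For reference, the same tally can be sketched in Python. This assumes each data file contains a plain `source_code_url: <url>` line, as the grep above implies; the function names are hypothetical, and the lines are split by hand instead of going through a real YAML parser:

```python
from collections import Counter
from urllib.parse import urlparse


def host_counts(lines):
    """Count the hostnames found in `source_code_url: <url>` lines."""
    counts = Counter()
    for line in lines:
        key, sep, value = line.partition(":")
        if sep and key.strip() == "source_code_url":
            # Drop surrounding whitespace and optional YAML quotes.
            url = value.strip().strip("'\"")
            host = urlparse(url).netloc
            if host:
                counts[host] += 1
    return counts


def share(counts, hosts, total):
    """Fraction of `total` entries whose source is on any of `hosts`."""
    return sum(counts[host] for host in hosts) / total
```

Here `total` plays the role of `ls -1 | wc -l` in the one-liners above, that is, the number of data files.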

Also popularity != quality.

True, but a thriving community generally means more resources, guides, etc, which can be important, especially for self-hosted solutions.

In any case, the project is great, and much appreciated. Additionally, the enriched html version looks fantastic, and exposes most of the metadata* I'd want to see, regardless of how it's sorted.

*One other item to track, that I thought about after making my previous comment - number of contributors. It gives an additional data point on the size of the community, as well as an idea of how many people can be hit by busses before the continued development of the project gets called into question.

[-] vegetaaaaaaa@lemmy.world 1 points 1 year ago

Thanks, you make an interesting point, I will have another look at it.

track [...] number of contributors

That would be an interesting stat; noted in https://github.com/awesome-selfhosted/awesome-selfhosted-data/issues/35. GitHub API rate limits will make it a bit tricky to update regularly (https://github.com/nodiscc/hecat/issues/112), but it can definitely be done, even if it's not updated every day.
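One way to live within a rate limit is to refresh only the stalest entries on each run. The sketch below is hypothetical and not taken from hecat; it assumes a mapping of project name to the ISO date of its last metadata fetch (`None` if never fetched) and a per-run request budget:

```python
def pick_for_refresh(last_updated, budget):
    """Return up to `budget` project names, stalest first.

    `last_updated` maps a project name to an ISO 8601 date string such as
    "2023-07-22", or to None when the project has never been fetched.
    """
    # Never-fetched projects sort first; ISO dates then sort chronologically
    # because their lexicographic order matches their date order.
    order = sorted(
        last_updated,
        key=lambda name: (last_updated[name] is not None, last_updated[name] or ""),
    )
    return order[:budget]
```

With a small daily budget, every project is still refreshed eventually, just not every day.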

this post was submitted on 22 Jul 2023
550 points (98.8% liked)
