I'm looking to maybe self-host my own instance, and I'm still learning about the fediverse. If a different instance that I federate with hosts something illegal, are there risks to me? Is anything from other instances hosted on my server, like a copy of it? Or would I only end up hosting things my users post? I'm paranoid, and sorry if this is a silly question.

[-] vynlwombat@lemmy.world 6 points 1 year ago

How much disk space would someone need to plan for a small Lemmy instance?

[-] pe1uca@lemmy.pe1uca.dev 8 points 1 year ago

I'm running it on Vultr's smallest VPS, with 25GB of disk.
This instance only has 3 users, and I'm the only active one. It says it's been up for almost a month and I've only used 3GB.

Below are the Docker volumes that hold the actual data of the instance. Inside the DB, the biggest table is the one called activity. The devs said it's only sometimes used to validate data and could be truncated if needed (there's a scheduled task that only keeps up to 6 months of it).
Another thing to keep in mind is to properly configure logging for whichever installation guide you follow.
After that, I've seen other admins (from bigger instances) say the next biggest consumer of disk is uploaded media.

$ du -h --max-depth=1
640K    ./pictrs
3.2G    ./postgres
3.2G    .

lemmy=# select
  table_name,
  pg_size_pretty(pg_relation_size(quote_ident(table_name))),
  pg_relation_size(quote_ident(table_name))
from information_schema.tables
where table_schema = 'public'
order by 3 desc;
         table_name         | pg_size_pretty | pg_relation_size
----------------------------+----------------+------------------
 activity                   | 2187 MB        |       2292867072
 comment                    | 56 MB          |         58212352
 person                     | 48 MB          |         50307072
 comment_like               | 45 MB          |         47161344
 post_like                  | 22 MB          |         22781952
 comment_aggregates         | 14 MB          |         14811136
 post                       | 13 MB          |         13623296
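
A side note on the query above: pg_relation_size only counts a table's main data, not its indexes or TOAST storage, so the real on-disk footprint may be somewhat larger than these numbers. Here is a rough sketch of a variant that uses pg_total_relation_size instead, wrapped in a small shell function. The function name table_sizes_sql and the database/username values in the usage line are placeholders, not anything taken from this instance.

```shell
#!/bin/sh
# Sketch only: prints a query that lists public tables by total on-disk
# size (table data + indexes + TOAST). table_sizes_sql is a made-up name.
table_sizes_sql() {
  cat <<'SQL'
select
  table_name,
  pg_size_pretty(pg_total_relation_size(quote_ident(table_name))) as total_size
from information_schema.tables
where table_schema = 'public'
order by pg_total_relation_size(quote_ident(table_name)) desc
limit 10;
SQL
}

# Usage (replace the placeholder database and username):
#   table_sizes_sql | psql --dbname=database --username=username
```

Sorting by the raw byte count rather than the pretty-printed column keeps the order numeric.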
[-] gabe565@lemmy.cook.gg 6 points 1 year ago

The activity table is also used to deduplicate incoming federation data, so instead of truncating it, I'd suggest deleting rows after a certain amount of time.

For my personal instance, I set up a cron job to delete entries older than 3 days, and my db is only ~500MB with a few weeks of content! I also haven't seen any duplicated posts or comments. Even with Lemmy's retries, 3 days seems to be long enough before dropping rows from that table.

[-] ipkpjersi@lemmy.one 2 points 1 year ago* (last edited 1 year ago)

Could you share the cron/script you use to do this? I'm interested in hosting my own Lemmy at some point, and having a script for that cleanup would be hugely helpful for me.

[-] gabe565@lemmy.cook.gg 2 points 1 year ago

Definitely! I'm hosting in Kubernetes so I won't post the full thing, but here's the actual command that I run hourly. Make sure to replace the values for database, username, and password.

PGPASSWORD=password psql --dbname=database --username=username --command="DELETE FROM activity WHERE published < NOW() - INTERVAL '3 days';"
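
For anyone not on Kubernetes, the command above could be wrapped in a small script and called from cron. This is only a sketch under some assumptions: the helper name prune_sql, the RETENTION_DAYS, LEMMY_DB and LEMMY_DB_USER variables, and the script path in the crontab line are all made up for illustration. The password is expected to come from PGPASSWORD or a ~/.pgpass file, both of which psql reads on its own.

```shell
#!/bin/sh
# Sketch of a cron-friendly wrapper around the DELETE command.
# prune_sql and the RETENTION_DAYS / LEMMY_DB / LEMMY_DB_USER names are
# illustrative choices, not part of Lemmy.

# Print the DELETE statement for a retention period in whole days.
# Anything that is not a positive integer (no leading zero) is rejected,
# so a typo cannot end up inside the SQL.
prune_sql() {
  days=$1
  case $days in
    ''|*[!0-9]*|0*)
      echo "retention must be a positive whole number of days" >&2
      return 1
      ;;
  esac
  printf "DELETE FROM activity WHERE published < NOW() - INTERVAL '%s days';\n" "$days"
}

# Only touch the database when called with --run, e.g. hourly from cron:
#   0 * * * * /path/to/prune-activity.sh --run
if [ "${1:-}" = "--run" ]; then
  sql=$(prune_sql "${RETENTION_DAYS:-3}") || exit 1
  psql --dbname="${LEMMY_DB:-database}" \
       --username="${LEMMY_DB_USER:-username}" \
       --command="$sql"
fi
```

Running the script without --run does nothing, which makes it safe to source and check what prune_sql 3 prints before scheduling it.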
[-] ipkpjersi@lemmy.one 1 points 1 year ago

Awesome, that was just as straightforward as I was hoping it was, thanks! I am more familiar with MySQL as I haven't used Postgres a ton but SQL is SQL after all lol

[-] Thief@lemmy.myserv.one 1 points 1 year ago

Hi - can you help me set this up or share the script that you use to do this? Many thanks :)

[-] pe1uca@lemmy.pe1uca.dev 1 points 1 year ago

Ah! I didn't know exactly what it was being used for.
Yeah, then it can only be trimmed, not truncated.

This is a great idea, thank you!

[-] redcalcium@c.calciumlabs.com 1 points 1 year ago

I'll have to try this later. Thanks for the tip!

[-] Thief@lemmy.world 1 points 1 year ago* (last edited 1 year ago)

Can you help me set this up also or share the script I would run to do this? Many thanks.

[-] gabe565@lemmy.cook.gg 1 points 1 year ago

Sure! My script will look a little different since I'm hosting Lemmy in Kubernetes, but basically you will want to run the following command hourly. Make sure to replace the values for database, username, and password.

PGPASSWORD=password psql --dbname=database --username=username --command="DELETE FROM activity WHERE published < NOW() - INTERVAL '3 days';"
this post was submitted on 02 Jul 2023
195 points (97.6% liked)

Selfhosted
