[-] supakaity@lemmy.blahaj.zone 28 points 3 months ago* (last edited 3 months ago)

The pict-rs upgrade is ongoing.

From what I can tell it'll be about another 5 hours. I'm going to have to go to bed and check on it in the morning.

Unfortunately, the stock-standard lemmy-ui doesn't like that pict-rs is migrating to a new database version and not serving images, so it's stubbornly refusing to work at all.

100
Lemmy updated to v0.19.5 (lemmy.blahaj.zone)

Hey all!

Our lemmy.blahaj.zone has been updated to v0.19.5.

Let us know if you notice any issues with the upgrade!

[-] supakaity@lemmy.blahaj.zone 24 points 5 months ago

Our best haj, Shonky (they/them), is now available over at their own GitHub repository for use when referring to Blåhaj Lemmy.

I'm guessing we'll need them to make an appearance for the Canvas template.

50
Test upload image (lemmy.blahaj.zone)

Just testing image upload.

114
Alternative frontends (lemmy.blahaj.zone)

Hi all our lovely users,

Just a quick post to let you all know that alongside the upgrade to 0.19.3, we've also added a couple of alternate UIs to the Blåhaj lemmy for you.

Obviously the default lemmy-ui at https://lemmy.blahaj.zone still exists and has been updated to 0.19.3 alongside the lemmy server update.

There's also now an Alexandrite UI at https://alx.lemmy.blahaj.zone, which is a more modern, smoother UI written in Svelte by sheodox.

And for those who are nostalgic for the Reddit days of yore, and memories of when PHP websites last ruled the earth, there's MLMYM (courtesy of rystaf) at https://mlmym.lemmy.blahaj.zone.

Please enjoy, and I hope the upgrades work well for you.

25
The butterfly, very pretty! (lemmy.blahaj.zone)

This is a test

46
Testing image upload (lemmy.blahaj.zone)

Test

[-] supakaity@lemmy.blahaj.zone 30 points 1 year ago

Migration has been completed!

115
submitted 1 year ago* (last edited 1 year ago) by supakaity@lemmy.blahaj.zone to c/main@lemmy.blahaj.zone

We're currently in the process of migrating our pict-rs service (the thing responsible for storing media/images/uploads etc) to the new infrastructure.

This involves an additional step of moving our existing file-based storage to object storage, so this process will take a little time.

New images/uploads may not work properly during this migration, however existing images should continue to load. We expect this migration to take about an hour.

[EDIT]

Migration has completed.

685,271 files / 153.38 GB were migrated. Copying to object storage took about 1.5 hours. Starting the service back up on the new server and debugging took another 30 minutes.

Timeline:

  • Migration started at 2023-10-01 22:43 UTC.
  • [+1h32m] Objects finished uploading to object storage at 2023-10-02 00:15 UTC.
  • [+2h03m] Migration was completed at 2023-10-02 00:46 UTC.
65
submitted 1 year ago* (last edited 1 year ago) by supakaity@lemmy.blahaj.zone to c/main@lemmy.blahaj.zone

Blåhaj Lemmy will be down for database migration to the new servers in approximately 1.5 hours from now (06:00 UTC).

Downtime is estimated at under an hour.

I will have more details on the maintenance page during the migration and update the status as the migration progresses.

[-] supakaity@lemmy.blahaj.zone 70 points 1 year ago

I have been watching my love tie herself in knots over the last several days, having to deal with the drama that has been brought on, trying her best to bring everyone back together.

There's been bad behaviour from both sides, and I'm really disappointed to see that some of the worst of it came from our users, who didn't keep to the moral high ground, disregarded our instance rules and stooped to levels of behaviour worse than that leveled against them.

There have been accusations against us (or Ada specifically) that we are a safe harbour for bad behaviour and cause harm to trans people through our inaction.

This is perhaps the cruelest accusation they could have leveled at Ada, as she works tirelessly to maintain a safe space for our community. For all the effort she was investing in this issue, I was hoping she could make it work despite my own reservations, but this last attack on her impeccable morality has made me very angry.

I'm sorry for those that wanted to remain federated, sorry that it came to this, but I am glad it's over now, purely for the mental health of my precious beloved.

[-] supakaity@lemmy.blahaj.zone 25 points 1 year ago

Okay, so that was way more painful than expected... /sigh

56

The server will be briefly down while we install a new updated version of lemmy and restart it.

The maintenance window is 15 minutes, but the actual downtime should be much shorter.

237
State of the shork! (lemmy.blahaj.zone)

So it's been a few days, where are we now?

I also thought, given the technical inclination of a lot of our users, that you might be somewhat interested in the what, how and why of our decisions here, so I've included a bit of the more techy side of things in this update.

Bandwidth

So one of the big issues we had was heavy bandwidth usage caused by a massive amount of downloaded content (not in terms of storage space, but many people downloading the same content).

In terms of bandwidth, we were seeing the top 10 single images resulting in around 600GB+ of downloads in a 24 hour period.

This has been resolved by setting up a frontline caching server at pictrs.blahaj.zone. It sits on a small, unlimited 400 Mbps connection, running a tiny Caddy cache that reverse proxies to the actual lemmy server and caches the images locally in a file store on its 10 TB drive. The nginx in front of lemmy 301-redirects internet-facing static image requests to the new caching server.
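For anyone curious what the caching front-end does in practice, here's a minimal sketch of the idea in Python. The real setup uses Caddy and nginx as described above; the upstream URL, cache path and port here are just illustrative. The point is simply: fetch each image from the origin once, store it on local disk, and serve every later request from that local copy.

```python
import hashlib, os, urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

UPSTREAM = "https://lemmy.blahaj.zone"   # origin that actually serves the images
CACHE_DIR = "/var/cache/pictrs"          # hypothetical local file store

class CachingProxy(BaseHTTPRequestHandler):
    def do_GET(self):
        key = hashlib.sha256(self.path.encode()).hexdigest()
        cached = os.path.join(CACHE_DIR, key)
        if not os.path.exists(cached):
            # First request for this image: pull it from the origin once...
            with urllib.request.urlopen(UPSTREAM + self.path) as resp:
                data = resp.read()
            os.makedirs(CACHE_DIR, exist_ok=True)
            with open(cached, "wb") as f:
                f.write(data)
        # ...then every later request is served from local disk, so the
        # expensive origin bandwidth is only paid once per image.
        with open(cached, "rb") as f:
            body = f.read()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("", 8080), CachingProxy).serve_forever()
```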

This one step alone is saving over $1,500/month.
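As a very rough sanity check on that figure (the per-GB egress price here is my assumption, not something stated above):

```python
# Back-of-the-envelope check on the bandwidth saving.
gb_per_day = 600            # top-10 images alone, per the numbers above
egress_usd_per_gb = 0.09    # assumed typical AWS data-transfer-out rate
print(gb_per_day * 30 * egress_usd_per_gb)  # ~1620 USD/month
```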

Alternate hosting

The second step is to get away from RDS and our current fixed-instance hosting to a stand-alone and self-healing infrastructure. That's what I've been doing over the last few days: setting up the new servers and configuring the new cluster.

We could be doing this cheaper with a lower cost hosting provider and a less resilient configuration, but I'm pretty risk averse and I'm comfortable that this will be a safe configuration.

I wouldn't normally recommend this setup to anyone hosting a small or single-user instance, as it's a bit overkill for us at this stage, but in this case I have decided to spin up a full production-grade kubernetes cluster with a stacked etcd inside a dedicated HA control plane.

We have rented two bigger dedicated servers (64GB RAM, 8 CPUs, 2TB RAID 1, 1 Gbps bandwidth) to run our 2 databases (main/standby), redis and so on. Then the control plane is running on 3 smaller instances (2GB RAM, 2 CPUs each).

All up this new infrastructure will cost around $9.20/day ($275/m).

Current infrastructure

The current AWS infrastructure is still running at full spec and (minus the excess bandwidth charges) is still costing around $50/day ($1500/m).

Migration

Apart from setting up kubernetes, nothing has been migrated yet. This will be next.

The first step will be to get the databases off the AWS infrastructure, which will be the biggest bang for buck, as the RDS is costing around $34/day ($1,000/m).

The second step will be the next biggest machine which is our Hajkey instance at Blåhaj zone, currently costing around $8/day ($240/m).

Then the pictrs installation, and lemmy itself.

And finally everything else will come off and we'll shut the AWS account down.

[-] supakaity@lemmy.blahaj.zone 38 points 1 year ago

So, one thing I'd mention is the systems and admin work involved in running an instance.

This is on top of the community moderation, and involves networking with other instance admins, maintaining good relations, deciding who to defederate from, dealing with unhappy users, etc.

Then there's the setup and maintenance of the servers, security, hacks, DDoSing, backups, redundancy, monitoring, downtime, diagnosis, fixing performance issues, patching, coding, upgrades etc.

I wouldn't be here doing this without @ada. We make a formidable team, and without any self-effacement, we are both at the top of our respective roles with decades of experience.

Big communities also magnify the amount of work involved. We're almost at the point where we are starting to consider getting additional people involved.

Moreover we're both here for the long haul, with the willingness and ability to personally cover the shortfall in hosting costs.

I'm not trying to convince you to stay here. But in addition to free hardware, you're going to need a small staff to do these things for you, so my advice is to work out whether you have reliable AND trustworthy people (because these people will have access to confidential user data) who are committed to doing this work long term with you. Where will you be in 3 years, 5, 10?

[-] supakaity@lemmy.blahaj.zone 79 points 1 year ago

To be clear, $3k is an accurate, but unacceptable amount.

As in, that's what it's actually costing us, but it's not what it should be costing. I'd imagine something more like $250 is what we should be paying if I weren't using AWS in the silly way I am.

I'm admitting up front that I've been more focused on developing than on optimising operating costs, because I could afford to be a little frivolous with the cost in exchange for not having to worry about doing server stuff.

Even when the Reddit thing happened I was wilfully ignoring it, trying to solve the scaling issues instead of focusing on the increased costs.

And so I didn't notice when Lemmy was pushing a terabyte of data out of the ELB a day. And that's what got me.

About half that $3k is just data transfer costs.

Anyhow, the notice was just to let our users know what's going on and that there'll be some maintenance windows in their future, so it doesn't surprise anyone.

We have a plan and it will all work out.

Don't panic or have any kneejerk reactions, it's just an FYI.

[-] supakaity@lemmy.blahaj.zone 47 points 1 year ago

Just want to say, I don't blame anyone else but myself.

I certainly don't blame anyone at 196.

I hope I'm really clear about that. It's one of the reasons I specifically didn't name 196 in my announcement.

We've got a solution planned, we've already started to implement it and have the image transfer issue solved already.

We can afford to cover this ridiculous AWS bill; I just need to do some maintenance work so this doesn't continue, because I can't keep lining Jeff Bezos' pockets like this indefinitely.

134

Discussion of the current situation with the Blåhaj instances, and upcoming maintenance.

99
Lemmy updated to v0.18.2 (lemmy.blahaj.zone)

Our lemmy is now running the 0.18.2 release version, which should fix some lingering issues we've been having.

Let @ada or me know if there are any issues!

[-] supakaity@lemmy.blahaj.zone 28 points 1 year ago

Migration complete.

62

Hi everyone, I'll begin migrating the lemmy blåhaj database to the new server this morning in about 20 minutes.

Expected duration is about 1 hour for this migration.

There will be a maintenance page up during the migration, and I will be updating the status as we go to keep you informed of the progress.

Later today I'll also be upgrading the software to the latest release.

[-] supakaity@lemmy.blahaj.zone 28 points 1 year ago

It wasn't an actual emojo. The script processed the SQL header column names as an emojo and tried to add them. Unfortunately, publicUrl is not a valid URL, so lemmy's /api/v3/site metadata endpoint started returning an error ("relative URL without a base") instead of the JSON the website was expecting, and the site just stopped working for everyone the next time it tried to load that URL.
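To illustrate the failure mode, here's a made-up sketch, not the actual script (the real one was reading an SQL export rather than a CSV, and the emoji-creation call is a stand-in): the bug amounts to never skipping the header row, so the column names themselves get imported as an emojo.

```python
import csv

def create_custom_emoji(shortcode: str, image_url: str) -> None:
    # Stand-in for whatever call the real script made to add an emojo.
    print(f"adding :{shortcode}: -> {image_url}")

with open("custom_emoji_export.csv") as f:       # hypothetical export file
    for shortcode, image_url in csv.reader(f):   # bug: header row never skipped
        # The first iteration sees shortcode="shortcode", image_url="publicUrl".
        # "publicUrl" isn't an absolute URL, so once it's in the database,
        # /api/v3/site fails with "relative URL without a base" instead of
        # returning JSON, and the UI stops loading for everyone.
        create_custom_emoji(shortcode, image_url)
```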

[-] supakaity@lemmy.blahaj.zone 30 points 1 year ago* (last edited 1 year ago)

How do you know I haven't always been the hacker who's in control? :D

[-] supakaity@lemmy.blahaj.zone 22 points 1 year ago

Okay, I'm taking down the server now to upsize the instance hardware.
