submitted 1 year ago* (last edited 1 year ago) by ticoombs@reddthat.com to c/reddthat@reddthat.com

This is terribly hard to write. If you flushed your cache right now you would see all the newest posts without images. These are now 404s, even though the images exist. In 2 hours everyone will see this. Unfortunately there is no going back and no way of recovering the key store for all the "new" images.

What happened?

After the picture migration from our local file store to our object storage, I made a configuration change so that our Docker container no longer referenced the internal file store. This resulted in the picture service having an internal database that was completely empty and started from scratch 😔

What makes this worse is that this database was inside the ephemeral container. When the containers are recreated, that data is lost. This happened multiple times over the 2-day period.
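
One way to guard against this kind of silent reset is to have the service refuse to start when a data directory that should already hold data looks brand new. Below is a minimal sketch in Python, assuming a sentinel file is written once, deliberately, onto the persistent volume. The marker name, the exception and both function names are hypothetical and are not part of the picture service itself.

```python
from pathlib import Path

# Hypothetical sentinel file name; written once when the store is first set up.
MARKER_NAME = ".store-initialised"


class EmptyStoreError(RuntimeError):
    """Raised when a store that should already hold data looks brand new."""


def initialise_store(data_dir: Path) -> None:
    """Deliberately mark a data directory as initialised (run once, by hand)."""
    data_dir.mkdir(parents=True, exist_ok=True)
    (data_dir / MARKER_NAME).write_text("initialised\n")


def assert_store_persisted(data_dir: Path) -> None:
    """Fail fast if the data directory is missing or lacks the sentinel file.

    A recreated container without its volume mounted would see an empty
    directory here, so startup stops instead of building a fresh database.
    """
    if not (data_dir / MARKER_NAME).is_file():
        raise EmptyStoreError(
            f"{data_dir} has no {MARKER_NAME}; refusing to start with an empty store"
        )
```

Calling `assert_store_persisted` before the service opens its database turns a quiet "started from scratch" into a loud startup failure, which is much easier to notice than missing images days later.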

What made this harder to debug was that our CDN caching was hiding the issue, as we had a long cache time to reduce the load on our server.
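
When a cache sits in front of the origin like this, one common way to see what the server is really returning is to request the same image with a throwaway query parameter. The sketch below only builds such a URL. It assumes the CDN includes the query string in its cache key, which is not true of every CDN configuration, and the parameter name `cachebust` and the function name are made up for illustration.

```python
import time
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def cache_busting_url(url: str, token: Optional[str] = None,
                      param: str = "cachebust") -> str:
    """Return `url` with an extra query parameter appended.

    Assumption: the cache keys on the full query string, so a unique value
    forces a fetch from the origin instead of a cached copy.
    """
    parts = urlsplit(url)
    # Keep any existing query parameters and append the throwaway one.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, token if token is not None else str(time.time_ns())))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Fetching the resulting URL and comparing its status code with the cached one would show a 404 at the origin even while the cached copy still loads.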

The good news is that after you read this post, every picture will be correctly uploaded and added to the internal picture service database! 😊 The "better" news is that all the original images from the 28th of June and before will start working again instantly.

Timeframe

The issue existed from the 29th of June to the 1st of July.

Resolution

Right now. 1st of July 8:48 am UTC.
From now on, everything will work as expected.

Going forward

Our picture service migration has been fraught with issues and I cannot express how annoyed and disheartened I am by the accidents that have occurred. I have yet to provide a service that I would be happy with.

I am very sorry that this happened and I will strive to do better! I hope you can all accept this apology.

Tiff

top 2 comments
jeremy@reddthat.com 0 points 1 year ago

Containers are complex! You're doing great.

Honestly, stateful services in containers are... often a lot of work.

tsz@reddthat.com 0 points 1 year ago

often a lot of work

I have yet to find a legitimate use case where the infrastructure, time, etc. required to get this right results in a better product.

this post was submitted on 01 Jul 2023