submitted 11 months ago by TheImpressiveX@lemmy.ml to c/asklemmy@lemmy.ml
[-] JDubbleu@programming.dev 32 points 11 months ago* (last edited 11 months ago)

Not me personally, but one of my career mentor's friends took down the entirety of Google Ads as an intern for like 10 minutes. Apparently it was a multi-million dollar mistake, but they fixed the issue so it couldn't happen again and all was well afterward.

[-] scubbo@lemmy.ml 47 points 11 months ago

In my first couple months, I broke Amazon so that no-one in Europe could buy video for a few hours. On a Friday, right before going on a week's vacation.

The way that the ensuing investigation and response was carried out - 100% blame-free, and focused on "how did these tools let him down? How can we make sure no-one ever makes that same mistake again?" - gave me a career-long interest in Software Resiliency and Incident Management.

[-] foo@withachanceof.com 28 points 11 months ago* (last edited 11 months ago)

Junior dev: "I fucked up bad, I'm so fired"

Senior dev: "I have 3 production outages named after me lol"

Source: https://twitter.com/CarlaNotarobot/status/1481458190722207747

[-] fubo@lemmy.world 15 points 11 months ago

Yep. And every time there's a thread about an Internet service having an outage, there's some kid saying "oh, someone's getting so fired for this one!"

Yeah, the competent business folks know that if you fire people for outages, you lose everyone who even stands a chance of preventing outages. And you tell the rest of your staff to hide problems. Businesses that do that kind of thing tend to end up with a valuation in the single digits.

[-] Dubious_Fart@lemmy.ml 2 points 11 months ago

Not always.

Sometimes the internet service outage is due to a car taking out a green box or a pole.

[-] fubo@lemmy.world 3 points 11 months ago

If they fire the backhoe driver, then backhoe drivers will never learn.

[-] Dubious_Fart@lemmy.ml 2 points 11 months ago

Oh yeah, that too. Plus trenchers, and hole boring for things like fences or billboards.

[-] ikapoz@sh.itjust.works 14 points 11 months ago

If an intern (or damn near any employee) can be in a position to single-handedly take down a system of that scale, it’s not the intern that should be fired - it’s the architect that baked in that kind of weakness in the first place.

[-] fubo@lemmy.world 7 points 11 months ago

You're not a real SRE until you've caused at least a $100K outage. You're not a good SRE until you've fixed it so nobody can ever cause that particular one again.

this post was submitted on 15 Sep 2023
219 points (97.8% liked)
