A while back I wrote about what MTTR and MTTD are and why they matter in software delivery and across an organization. Today I’ll go a bit more in depth and talk about how to reduce incident resolution time at your company. I’ll start with some general basics, but this article assumes you already understand MTTR and are able to measure it. If you aren’t there yet, read my previous article to get an idea, or let me know and I can write another one. Let’s get into it! How do you reduce MTTR (mean time to resolution)?
Welcome back! I’m here with another article today to discuss some of the problems with isolated team-level Kanban and why you should aim to adopt it as high up in the organization as possible. I’ll share some of my experiences and go into the downsides of isolated team-level Kanban compared with Kanban at the organization level. This article isn’t here to downplay the impact of Kanban and Lean principles but to show some of the downsides when you’re practicing them alone, without support. For full context, I think Kanban and Lean are great, and they have shaped a lot of the background of what I do today.
Have you ever worked somewhere that deployed once a quarter? I have. It sucks and it’s super risky. On the other hand, I’ve been at places where we pushed to production more than 1,000 times a week. “But we have 75 people on the call and they’re all paying attention.” Yeah, OK. I’ve been on these calls and I’ve heard people sleeping. Midnight calls suck, and sleep-deprived people manually deploying large amounts of code with a lot of steps is RIPE for error. Mistakes happen, “short deployments” turn into hours, and you get delayed even further.