Me at my first job: "A bug? Oh, no! We MUST fix it!"
Me at Microsoft: "A bug? Oh, no! We CAN'T fix it!"
Me now: "A bug? Let's talk!"
When I started my career over twenty years ago, I found dealing with bugs easy: any reported defect had to be fixed as soon as possible. This approach could work because our company was small, and the software we built was not complex by today's standards.
My early years at Microsoft taught me the opposite: fixing any bug is extremely risky and should not be taken lightly. At that time, I worked on the .NET Framework (the one before .NET Core), which had an extremely high backward compatibility bar. The reason for this was simple: .NET Framework was a Windows component used by thousands of applications. Updates, often included in Windows Service Packs, were installed in place. Any update that changed the .NET Framework behavior could silently break users' applications. As a team, we spent more time weighing the risk of fixing bugs than fixing them.
Both of these situations were extremes and wouldn't be possible today. As software complexity has skyrocketed and users can choose from many alternatives, dealing with bugs has become much more nuanced. Impaired functionality is one aspect to consider when prioritizing a bug fix, but it is not always the most important one. Here are the most common criteria to look at when triaging bugs.
Is it a bug?
While in most situations there is no doubt that a reported issue is a bug, this is not always the case. When I was on the ASP.NET Core team, users reported some bugs because they expected or preferred an API to behave differently. That did not mean, however, that these issues were valid. For example, if you are building a spec-compliant HTTP server, you can't fix the typo in the Referer HTTP header, regardless of how many bug reports asking you to do so you receive.
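To make the Referer example concrete, here is a minimal Python sketch. The function name `get_referrer` and the plain dictionary of headers are my own illustrative assumptions, not code from any real server. It only shows that a compliant server has to look the header up under the misspelled name, so a "corrected" spelling is simply not found.

```python
from typing import Mapping, Optional


def get_referrer(headers: Mapping[str, str]) -> Optional[str]:
    """Return the value of the Referer header, or None if it is absent.

    Hypothetical helper: the spec spells the header "Referer" (one "r"),
    so that is the only name we look for.
    """
    # HTTP header field names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    return normalized.get("referer")


# The spec-compliant (misspelled) name is found.
print(get_referrer({"Referer": "https://example.com/"}))
# The "fixed" spelling is a different header as far as clients and servers care.
print(get_referrer({"Referrer": "https://example.com/"}))
```

Renaming the header on the server side would not fix anything. It would only stop the server from understanding every client that follows the spec.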
Security
Security bugs and vulnerabilities can lead to unauthorized access to sensitive data, financial losses, or operational disruptions. As they can also be exploited to deploy malware and infiltrate company networks, fixing security bugs is almost always the highest priority.
Regulatory and Compliance
Regulatory and Compliance bugs are another category of high-priority bugs. Even if they don't significantly impact the functionality, they may have serious legal and financial consequences.
Privacy
Bugs that lead to the disclosure of sensitive data are treated very seriously, and fixing them is always a top priority. In the U.S., the law requires that businesses and government agencies report data breaches.
Business Impact
Business impact is an important aspect of determining a bug's priority. Bugs that impact company revenue or other key business metrics will almost always be a higher priority than bugs that do not impact the bottom line.
I worked at Amazon during Prime Day 2018. Due to extremely heavy traffic, the website experienced issues for hours, making shopping impossible. Bringing the website back to life was the top priority for the company that day, and it was followed by months of bug fixing and reliability improvements.
Functional Impact
Impact on functionality is usually the first thing that comes to mind when hearing the word "bug." Rightly so! Functionality limited due to bugs leaves users extremely frustrated. Even small issues can lead to increased customer support tickets, customer loss, or damage to the company's reputation.
Timing
Timing can be an important factor in deciding whether or not to fix a bug. Hasty bug fixes merged just before releasing a new version of a product can destabilize it and block the release. Given the pressure and shortened validation time, assessing the severity of these bugs and the risk of the fixes is crucial. In many companies, bugs reported in the last days before a major release get a lot of scrutiny, and fixing them may require the approval of a Director or even a VP.
The cost and difficulty of fixing the bug
Sometimes bugs don't get fixed because the cost is too high. Even serious bugs may be punted for years if fixing them requires rewriting the product or has significant undesirable side effects. Interestingly, users often get used to these limitations with time and learn to live with them.
The impact of fixing a bug
Every bug fix changes the software's behavior. While the new behavior is correct, users or applications may rely on the old, incorrect behavior, and modifying it might be disruptive. In the case of the .NET Framework, many valid bugs were rejected because fixing them could break thousands of applications.
An extreme case of a bug fix that backfired spectacularly was when our team fixed a serious bug in one of the Microsoft Windows libraries. The fix broke a critical application of a big company, an important Microsoft customer. The company claimed that fixing and deploying the application on their side was not feasible. As we didn't want to (and couldn't) revert the fix that had shipped to hundreds of millions of PCs worldwide, we had to re-introduce the bug to bring back the old behavior and gate it behind a key in the Windows registry.
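The general shape of such a compatibility switch is easy to sketch. The Python below is an illustration only and is not the actual Windows code: the switch name `MYLIB_USE_LEGACY_PARSING`, the function `parse_quantity`, and the bug itself are all made up, and an environment-style mapping stands in for the registry key. The idea is that the fixed behavior is the default and the old, incorrect behavior is available only to those who explicitly opt in.

```python
import os
from typing import Mapping

# Hypothetical switch name. In the story above the switch was a registry key;
# an environment variable stands in for it here.
LEGACY_SWITCH = "MYLIB_USE_LEGACY_PARSING"


def legacy_behavior_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """The old behavior is opt-in; everyone else gets the fix."""
    return env.get(LEGACY_SWITCH, "0") == "1"


def parse_quantity(text: str, env: Mapping[str, str] = os.environ) -> int:
    """Hypothetical function whose old version silently accepted bad input."""
    if legacy_behavior_enabled(env):
        # Re-introduced bug: invalid input quietly becomes 0,
        # because some callers came to depend on it.
        try:
            return int(text)
        except ValueError:
            return 0
    # Fixed behavior: invalid input raises ValueError.
    return int(text)


# A customer who depends on the old behavior flips the switch.
print(parse_quantity("abc", {LEGACY_SWITCH: "1"}))
# Everyone gets the same result for valid input.
print(parse_quantity("12", {}))
```

The price of this approach is that the bug now lives in the codebase permanently, together with a second code path that has to be tested and maintained.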
Is there a workaround?
When triaging a bug, it is worth checking if it has an acceptable workaround. Even a cumbersome workaround is better than being unable to do something because of a bug. A reasonable workaround often reduces the priority of fixing a bug.
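To show how these criteria can interact, here is a rough Python sketch of a triage helper. Everything in it is my assumption for illustration: the `BugReport` fields, the weights, and the thresholds are invented, and no team I worked on used this exact logic. Treat the result as a starting point for the conversation, not a verdict.

```python
from dataclasses import dataclass


@dataclass
class BugReport:
    # Hypothetical fields, one per criterion discussed in this post.
    is_valid_bug: bool = True
    security: bool = False
    compliance: bool = False
    privacy: bool = False
    business_impact: bool = False
    functional_impact: bool = False
    near_release: bool = False
    fix_is_risky: bool = False
    has_workaround: bool = False


def triage(bug: BugReport) -> str:
    """Return a rough priority label for a reported bug."""
    if not bug.is_valid_bug:
        return "close: behaves as designed"
    # Security, privacy, and compliance issues go straight to the top.
    if bug.security or bug.privacy or bug.compliance:
        return "top priority"
    # Illustrative weights only; a real team would argue about these.
    score = 0
    score += 3 if bug.business_impact else 0
    score += 2 if bug.functional_impact else 0
    score -= 1 if bug.has_workaround else 0
    score -= 1 if bug.fix_is_risky else 0
    score -= 1 if bug.near_release else 0
    if score >= 3:
        return "high"
    if score >= 1:
        return "medium"
    return "low / discuss"


print(triage(BugReport(security=True)))
print(triage(BugReport(functional_impact=True, has_workaround=True)))
```

In practice, the cases a formula like this handles poorly are the most interesting ones, which is why "Let's talk!" is still the best answer.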
Thanks for reading!
-Pawel