Measuring Problem Management Quality in an ITIL Environment, Part 3

As we explored in the first and second articles in this series, assessing the quality of Problem Management within an ITIL environment isn’t always easy. From asking the right questions to determining which metrics should be used to measure performance, there are many variables to consider. In this article, we’ll take a deeper look at the “magic” involved when it comes to identifying a problem’s root cause.

The Essential Components of Problem Analysis

There are many ways to find the root cause of a problem, some more successful than others. Without a standard framework, different people will naturally take different approaches. The effectiveness of any group of troubleshooters falls somewhere along a bell curve: the top experts can confidently be given anything to work on, solid performers handle most tasks but have room to improve, and those with a poor troubleshooting reputation probably need help.

The Kepner-Tregoe (KT) method for Problem Analysis was researched and defined in the 1950s and has continued to be refined and tested in the decades since. Clearly, this was many years before computers were ubiquitous, let alone before ITIL stood for anything.

It has been argued that a method like KT’s couldn’t possibly be appropriate for the IT industry today, as neither IT nor ITIL existed at the time the KT method was first researched. It’s essential to take a closer look at the KT method for Problem Analysis to make a more appropriate judgment. The major steps in Problem Analysis consist of:

  • Describing the Problem
  • Listing Possible Causes
  • Evaluating Possible Causes
  • Proving the True Cause
  • Thinking Beyond the Fix

For each of these steps, there are clear intentions and sub-steps that help with phrasing questions and documenting answers to get the right data. All of this input funnels into the Problem Analysis thought process. None of it assumes a specific product or issue, much as ITIL works for all kinds of IT organizations. Problem Analysis is an approach for finding the root cause of many different problems, irrespective of industry or technology.

The 3 Triggers of Problem Analysis

Although KT’s method is appropriate for any type of problem, there is a very specific KT definition of the term “problem” that matches quite nicely with ITIL. According to KT, three criteria must be true before we trigger the Problem Analysis process:

  • There should be a gap between actual performance and desired performance (e.g. the machine is not working when it should be). This is what we call a deviation.
  • The cause of the deviation is unknown (e.g. it is not a known error).
  • There must be a need to know the cause of the deviation (e.g. knowing it enables us to take action).

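
The three criteria above can be read as a simple all-or-nothing check. The following is only a minimal sketch of that reading, not part of the KT method itself: the `Observation` record, its field names, and the `should_trigger_problem_analysis` function are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """Hypothetical record of what is known about a reported issue."""
    actual: str        # what is actually happening
    desired: str       # what should be happening
    cause_known: bool  # True if the cause is already known (e.g. a known error)
    need_cause: bool   # True if knowing the cause is needed to take action


def should_trigger_problem_analysis(obs: Observation) -> bool:
    """Return True only when all three trigger criteria hold."""
    deviation = obs.actual != obs.desired  # criterion 1: a gap exists
    cause_unknown = not obs.cause_known    # criterion 2: cause is unknown
    return deviation and cause_unknown and obs.need_cause  # criterion 3


# Illustrative data only: a deviation with an unknown cause that matters.
outage = Observation(
    actual="machine is not working",
    desired="machine is working",
    cause_known=False,
    need_cause=True,
)
print(should_trigger_problem_analysis(outage))  # True
```

If any one criterion fails, for example because the issue is already documented as a known error, the check returns False and Problem Analysis would not be triggered.
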
By going through a well-defined set of steps to find the root cause, troubleshooters can begin communicating and documenting what has already been done and what remains to be done in the process. An example of how gathered data is modeled to describe the symptoms of a problem is given in the picture below.

In the next and final article, we’ll explore how Problem Analysis provides the core framework to effectively measure performance in ITIL Problem Management.

Measuring Problem Management Quality in an ITIL Environment, Part 1
Measuring Problem Management Quality in an ITIL Environment, Part 2
Measuring Problem Management Quality in an ITIL Environment, Part 4