Managing the Problem-Solving Challenge

Fixing problems may be line management’s most critical responsibility. It would be a stretch, however, to say that effective problem solving is common to the industry. Anything less than an excellent performance in this domain can be very expensive.

This paper presents the current state of problem solving in the Nuclear Power Generation sector and further explains how to gain the positive step change in Issue Resolution that everybody wants. On a basic level, it presents the finding that most organizations use diverse tools very inconsistently. The key is to become more systematic in your cross-functional efforts, in the application of tools, and also with the incentives to performers to guide consistent results.

Discussion

At some nuclear plants, lead engineers spend two-thirds of their time, 30 hours per week or more, in service to their corrective action programs. Top managers spend significant hours of their own time in the same domain. At minimum, the cost of corrective action efforts is very high in terms of time spent on issue resolution. This does not include the opportunity cost of diverting highly skilled engineering resources away from valuable work such as creating process improvements or other value-added tasks. The question to ask is, “What is the value of the results that we are getting?”

The key to good customer service and economic health for electric generators is high equipment availability and reliability. Other than scheduled downtimes, generation assets need to be ready and running to fulfill their missions. Everyone is happy when the turbines are spinning! However, despite the technical sophistication and automation of the equipment, there is a frustrating trend of recurring problems resulting in forced downtime and increased cost around the industry. From our point of view the contributing factors causing this situation are mismanagement of both equipment and human performance.

It would be nice to have the time to prepare for the long-term health and care of equipment and human assets, but this luxury is not commonly available. Plant personnel are constantly engaged in the “fix it now” mode in order to restore operations as quickly as possible, and they are rarely given the time required to think these issues through.

As a result, opportunities for future focus are minimized and replaced by firefighting. Thus, problems that could have been prevented occur. The cycle repeats itself, bringing recurring problems, related problems, and problems caused by the fix.

How do plants move away from this pattern? Must they simply get better at problem solving? The answer is to improve skills for issue resolution in addition to identifying problem causes. A good Issue Resolution process looks at the overall situation surrounding an issue. To properly address an issue, the cause must be determined, a corrective action must be defined systematically, and the implications of taking that corrective action must be thought through logically.

This systematic approach reduces the time and cost spent on recurring problems, with the focus moving from “fix it now” to “fix it right.”

The current state of problem solving

As mentioned earlier, fixing problems may be line management’s most critical responsibility. This activity is fundamental to success, but many organizations see it as such a basic function of operations that they assume it is in place and don’t see the need to invest resources to maintain and develop these skill sets. Research shows, however, that the best performing plants also have the best organizational cultures for problem solving.

A recent INPO publication references a rise in problems associated with the conduct of daily activities. The publication notes the collective fracture of attention spans as a contributor, and calls for better management oversight and leadership around problem prevention. Overall, the document promotes organizational engagement and the creation of an environment that fosters a thinking organization.

The costs associated with a heavily loaded staff burdened with epidemic time constraints are many. These costs show up in every endeavor, from recurring problems, to significant events, to forced outages and extensions of planned outages. Regulatory concerns and cited findings are negative consequences for lapses in these areas.

There are straightforward, easily definable costs in the areas mentioned above. For each of their assets, generating organizations can often calculate the revenue lost during downtime, that is, revenue that is not being generated. The load on the system adds to the cost, as it may require the purchase of replacement power. These costs show up visibly on financial balance sheets.
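The arithmetic behind these visible costs can be sketched in a few lines of Python. This is a minimal illustration only: the function name, the parameters, and every number in the example are hypothetical and do not come from any plant. Whether the two components should be added together depends on a generator's contracts and market arrangements, so the sketch reports them separately.

```python
from dataclasses import dataclass


@dataclass
class OutageCost:
    lost_revenue: float            # revenue not generated during the outage
    replacement_power_cost: float  # cost of power purchased to cover the load


def estimate_outage_cost(capacity_mw, outage_hours, price_per_mwh,
                         replacement_mw=0.0, replacement_price_per_mwh=0.0):
    """Rough first-order estimate of the visible cost of a forced outage.

    All inputs are assumptions supplied by the user; nothing here models
    fuel savings, penalties, or contract terms.
    """
    inputs = (capacity_mw, outage_hours, price_per_mwh,
              replacement_mw, replacement_price_per_mwh)
    if min(inputs) < 0:
        raise ValueError("inputs must be non-negative")
    lost = capacity_mw * outage_hours * price_per_mwh
    replacement = replacement_mw * outage_hours * replacement_price_per_mwh
    return OutageCost(lost, replacement)


# Hypothetical example: a 1,000 MW unit is down for 48 hours at $40/MWh,
# and 600 MW of replacement power is bought at $55/MWh.
cost = estimate_outage_cost(1000, 48, 40.0,
                            replacement_mw=600,
                            replacement_price_per_mwh=55.0)
print(cost.lost_revenue)            # 1920000.0
print(cost.replacement_power_cost)  # 1584000.0
```

Even with made-up numbers, a two-day forced outage on a large unit quickly reaches seven figures, which is why these costs are the ones that get noticed first.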

What about the costs that do not show up so easily? For example, nuclear plants that become overrun with problems must dedicate huge resources to finding and fixing problems under the watchful eyes of increased NRC scrutiny. Many times additional public scrutiny accompanies these perceived issues with Problem Identification and Resolution. As a result, additional resources and time are expended resolving these issues as quickly as possible. Only when such efforts show success can the nuclear plant escape the watch list. Then the plant goes back to the steady state, meaning that it loses the focus that just got it out of trouble.

Most plants have on average 30 years or more of experience among their lead technical personnel. Comfort with this high level of experience often leads to complacency around rigorous problem solving. Too often, both management and technical staff rely too heavily on this experience and jump to conclusions about the causes of problems. Much time and effort can then be spent pursuing ineffective corrective actions aimed more at the effect than at the true cause.

Below are a few of the specific issues influencing mediocrity in this area:

What’s the problem?

It is a challenge to get a small group of people on the same page. Imagine the difficulties of getting over 1,000 people to focus on a primary task like problem solving. One typical obstacle is that groups of technical personnel don’t agree to a definition or scope for the term “problem.” The eyes and ears of the plant are the Operators and Technicians who are out there every day. However, if they aren’t consistent and accurate in recognizing, reporting, and collecting problem-related data, how can the cause analysts that follow them perform effectively?

When data collection isn’t timely, key data is lost, assumptions become facts, and analysis quality is low. Problem identification and resolution efforts then become personnel-based, with success dependent on the skills of the individual, rather than based on a uniform, systematic approach.

Procedures aren’t effective

Every plant has procedures that cover troubleshooting and cause analysis. The procedures tend to mirror regulatory guidance, meaning that the activities and requirements are described at a very high level. At the working level, however, personnel face confusion about what tools to use and how to use them.

Additionally, functions such as Operations, Engineering, and Maintenance often have separate procedures and methodologies for troubleshooting and root cause analysis. Cross-functional communications lose effectiveness and strength in these conditions. In the end, procedures can be more of an obstacle than a help. As a result, the focus is on filling out the form or ticking off the step in the procedure rather than the quality of the output.

Skill development isn’t enough

Over the years, plants typically have sponsored a variety of training to wide audiences on problem solving techniques. Rather than developing real skill, many of these efforts barely reach the awareness level, and there is rarely continuing training to keep skills fresh. The larger issue, however, is that there is no clear guidance on which technique to use for which application, and no expectation of use from the organization. Thus, users apply the skills ineffectively, become frustrated, and revert to methods that have worked for them in the past.

What’s the priority?

Competition for skill sets has put a heavy squeeze on plant staffing. Most people, particularly the leads, have a lot of assignments with tight deadlines. Unfortunately, everything is a priority, so the level of thinking gets reduced to focusing on interim actions performed to restore conditions by the deadline. The rationale is that the issue was fixed and plant operations were restored, but the question of how many times the issue has needed to be fixed falls to the bottom of the list.

At the macro level, many management teams fail to see problem solving as a value-added thrust. Corrective Action Programs (CAPs) in the nuclear industry are a good example. Too frequently, CAP activities are viewed as administrative functions. Outcomes are measured in terms of filing paperwork within designated time windows rather than reaching a high quality level of analysis and response. Because the CAPs are viewed as a compliance activity by management, a clear signal is sent to the staff that minimizes necessary engagement and rigor.

Is it equipment and/or people?

When things go wrong, it can be very difficult to segregate the issue to mechanical causes versus human performance issues. Possibly both of these variables worked together to yield a deviation, yet rarely is there a good model in place to unravel such scenarios. Untangling these multiple issues and taking ineffective actions to address the individual effects rather than the cause are time consuming and expensive.

…and what are we going to do about problems?

Problem solving can be overwhelming, but the real challenge is to design and implement corrective actions and preventive actions. Again, an effective model or framework rarely exists. Typical outcomes include incorrect fixes to misunderstood problems, too many fixes, and lack of consensus on the best path forward.

There is hope that enough actions will make the symptom go away, but wishing doesn’t help. The real measure of success of effective decision making is an evaluation of whether or not the objectives behind the decision were achieved. That evaluation is difficult if the objectives are not understood uniformly from the start.

Systematic problem solving and corrective actions

Daily Activities to Pursue — The best way to solve problems is to not have them in the first place. Therefore, capitalize on problem prevention efforts. Set aside time and establish measures and responsibilities designed to avoid problems. Problem prevention is critical to high equipment reliability. Here are a few daily activities to pursue.

Establish the Problem Prevention Expectations — The effort to correct problems can be greatly reduced with a more effective focus on systematic problem prevention. Management must lead the culture change toward this goal. For every actual problem, questions should be asked regarding how to prevent such problems in the future. For every activity – from daily tasks to planned outages – make sure that planning includes problem prevention considerations.

These actions result in key benefits. Industry Operational Effectiveness and trending efforts are maximized. Increased levels of equipment reliability and performance are achieved. Outage performance is improved, as issues are identified before implementation. Staff efforts become more future-focused, as high value engineering resources put time against problem prevention instead of problem solving. Budgets become more predictable and costs associated with problem resolution go down.

Capture Changes Systematically — If a performance was on target, and then a deviation occurred, by definition there has been a change. Much of the problem solving effort must be focused on finding the change that is related to the true cause of the deviation. Power plants make changes every day. Most of them achieve their expected purpose. However, some changes result in new problems, too. The key is to capture the changes and tweaks to equipment and programs in a systematic way.

Significant time and effort will be saved in understanding why deviations may have occurred. Going forward, people will learn how to troubleshoot potential changes, and then better avoid problems that might be caused by a change.
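One way to picture systematic change capture is as a simple log that can be queried when a deviation appears. The Python sketch below is illustrative only: the record fields, the function name, the 30-day lookback, and all log entries are hypothetical, and a real plant would rely on its own work management and configuration control systems.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ChangeRecord:
    when: datetime     # when the change was made
    system: str        # equipment or program that was changed
    description: str   # what was changed


def candidate_changes(log, system, deviation_time, lookback_days=30):
    """Return changes to `system` made within the lookback window before
    the deviation, most recent first, as a starting point for cause analysis."""
    start = deviation_time - timedelta(days=lookback_days)
    hits = [c for c in log
            if c.system == system and start <= c.when <= deviation_time]
    return sorted(hits, key=lambda c: c.when, reverse=True)


# Hypothetical log entries, invented for illustration.
log = [
    ChangeRecord(datetime(2024, 3, 1), "feedwater pump A", "replaced seal"),
    ChangeRecord(datetime(2024, 3, 20), "feedwater pump A", "adjusted lube oil setpoint"),
    ChangeRecord(datetime(2024, 3, 22), "condenser", "cleaned tubes"),
    ChangeRecord(datetime(2023, 11, 5), "feedwater pump A", "revised procedure"),
]

recent = candidate_changes(log, "feedwater pump A", datetime(2024, 3, 25))
for change in recent:
    print(change.when.date(), change.description)
```

The point of the sketch is the discipline rather than the tooling: if every change is recorded with a time and a target, the search for the change behind a deviation starts from a short list instead of from memory.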

When Deviations Actually Occur — Understanding problems effectively requires a rational framework for gathering and organizing information so that it can be evaluated for true cause. Developing best fixes to problems requires a similar framework that balances fix objectives with possible risks. When these conditions are met, both plant performance and regulatory relationships are optimized.

Enhance the First Observer Data Collection Step — Plant personnel need to be provided with a specific set of questions to ask as well as information to collect regarding observed deviations. Lack of accuracy, missing information, assumptions, faulty conclusions, etc. will cause the entire process to perform at lower standards.

Only through good problem identification can apparent cause and root cause analyses be accomplished effectively. This is the basis for any further cause evaluation. In addition, the early accuracy of problem identification is essential for correct extent-of-condition searches. Further, trend analyses will see greatly increased validity.

Clarify the Equipment/Human Performance/Organization Distinctions — Only through a clear understanding of the interface between people and equipment can real issues be resolved. Workable distinctions need to be clearly recognized, and analysis methodologies and the capabilities to apply them must be put in place.

The organizational influencers that often go unrecognized can be structured to drive the right performance rather than causing poor performance. Human performance issues have identifiable causes that can be addressed but the key to success is to go deeper than statements such as “operator error” and “weak procedures” in the quest for cause.

Develop Expert Cause Analysts — Despite broad staff experience, responsible problem solvers really benefit from expert leadership and coaching when trying to understand and resolve deviations. Basically, this approach matches unbiased questioners with the content expertise of problem “owners.” The ideal state is to prepare a cadre of personnel with good leadership and communications skills for this important function. Keep them engaged so that skills stay fresh. In addition to providing a process facilitator with the appropriate tools, a performance system that allows and encourages their active participation is a necessity.

With this expertise, apparent and root cause analyses are performed in a consistent manner, are of higher quality, and are produced more quickly. The critical thinking skills of the entire organization are improved from senior leadership down through operators.

Define a Structured Approach for Fixes — Corrective actions are where results happen, and where resources are consumed. At the end of the day, it is in this mode that asset optimization is achieved. Corrective action thinking has to have a place in the strategic planning cycle, and has to be balanced with safety and production concerns.

A shared decision making framework between management and technical resources will help develop the best paths of action that will satisfy safety, production, and financial goals. A structured risk assessment step will ensure that threats are considered carefully. Further, an element of project management can be applied to implementation planning early on, which will help manage the path forward.

This eliminates the discrepancy between technical objectives for a solution and what management wants and reduces time and effort pursuing ineffective solutions. Alternatives are rationally selected that meet formal objectives with manageable risks. A clear line connecting problems and fixes is established. Best of all, the number of fixes becomes more manageable. Preferred alternatives are documented, and easy to measure for success. This application of valid measures will prove the strategic value of the Corrective Action Program.

The framework for sustainable implementation — effective change management

The overall driving need is to move from the current compliance mind-set, to having a strategically-focused, highly effective problem solving capability. In short, such a capability maximizes equipment reliability and provides a positive cost-benefit to the site. Once in place, the noticeable culture change is a move from the daily focus on what’s broken, to having hundreds and perhaps thousands of hours of staff time available to head off problems and promote system health.

Moving to a systematic problem solving process is not unlike the implementation of any other lasting process change. Sustainable change implementation is focused on the four concepts shown in the model below.

The procedure system and business processes drive work on the job. Analysis and adjustment of these processes is vital to success. Skill development around solid problem solving skills is needed to give the performers the tools that they need. From there, most organizations need some dedicated coaching until new skills are learned. Then planned change is solidified when the human performance system is aligned properly with the desired behavior.

  1. Improve Processes and Procedures — The Problem Solving and Resolution process is a key business process for any organization. Improving this process first requires a tight description of desired staff responsibilities and cross-functional information handoffs along the problem solving/corrective action flow. Process tools and procedures then are embedded in the process, and the parts have to add up to the whole. Observable performance measures that allow senior management to monitor and evaluate the process must be included. Only then can useful initial and continuing training be developed and provided.
  2. Develop Effective Problem Solving Skills — Gaining this capability certainly requires an accurate description of responsibilities with performance measures for various levels in an organization. Around responsibilities, a targeted skill development effort can be applied. What are the expectations of knowledge level and process use for each level? At each responsibility layer, from problem identification through development and implementation of the fixes, expertise to resolve simple to complex problems has to be promulgated and maintained.
  3. Shape Human Performance — With human performance, you get what you ask for, sometimes. If a site is suffering from a low-performing culture of problem solving despite a high input of time and effort, there likely are clear causes rooted in the symptoms described earlier in this paper. Simply giving the performers the skill sets is not enough. There must be clear expectations set as to what good use of process looks like. The actual performance needs to be monitored, with pinpointed feedback given to performers to move them toward the desired performance. The balance of consequences for the performer and the organization needs to be such that the expected behavior is encouraged. These factors all function together to form the Human Performance System.
  4. Support the Supervisors — Of specific note, it is critical that the supervisory level is prepared to support change. It is also critical that second and third-level managers support first-level supervisors in making changes, as implementation is where initiatives typically fail. Management wants quality, but the message heard at the supervisory level is production. A balance has to be achieved, and it can only be achieved with supervisory involvement.
  5. Coach On-Job Applications — Too often expected change is launched by management directive without the necessary implementation coaching. People need some time and practice to develop confidence and competence under expert coaching. This phase is particularly important at the beginning of use of rational process tools. The goal is to move the performer to proficiency and thus to reduce the need for coaching support.

Conclusion

The factors that influence the performance level of an organization’s problem solving processes extend beyond simply giving people the right tools. The interactions of the various success factors discussed in this paper can be complex, and weaknesses are not always easy to identify.

If the reasons for sub-par performance of problem solving efforts are unclear or have been approached from many angles with limited success, Kepner-Tregoe can perform an independent assessment to identify the weaknesses and provide a structured plan for improvement.

If there is a clear understanding of what’s not working well, Kepner-Tregoe offers a pinpointed approach in order to address the broken piece(s) of the puzzle.

The target of our approach is to transfer the capability to manage problem solving and corrective actions to our clients. The change needs to be implemented properly so that it is permanent and sustainable. To that end we work with our clients until they are ready to take over the implemented tools and processes through self-management. Once in place, across-the-board improvements in plant performance are observed, regulatory relationships are improved, and the organization’s position in the marketplace is advanced.
