Nuclear power plant reliability depends on a culture of problem solving

At many nuclear plants, lead engineers spend two-thirds of their time in service to their corrective action programs. Top managers also spend significant hours in the same domain. The high cost of so much time spent on issue resolution is compounded by the opportunity cost of diverting highly skilled engineering resources away from value-adding work such as process improvement.

The key to good customer service and economic health for electric generators is high equipment availability and reliability. Other than scheduled downtimes, generation assets need to be ready and running to fulfill their missions. Despite the technical sophistication and automation of equipment, recurring problems persistently result in downtime and increased cost around the industry. A major contributing factor is poor problem solving, which is reflected in the mismanagement of both equipment and people.

Plant personnel are constantly engaged in the “fix it now” mode in order to restore operations as quickly as possible. The time required to think through issues is a luxury never given. As a result, future-focus is replaced by fire-fighting problem solving. Problems that could have been prevented occur, root causes go unexplored, and problems are fixed only to recur, along with new, related problems caused by the fix itself.

How do plants move away from this pattern? Research shows that the best performing plants also have the best organizational cultures for problem solving. Plants that adopt a systematic approach to problem solving look at the overall situation surrounding an issue to find the cause, define a corrective action, and consider the implications of taking that action. The focus moves from “fix it now” to “fix it right.”

Many organizations see problem solving as such a basic function of operations that they assume procedures for solving problems are in place. Most plants have, on average, 30 years or more of experience among their lead technical personnel. Comfort with this high level of experience often leads to complacency around rigorous problem solving. Too often, both management and technical staff rely too heavily on experience and jump to conclusions about the causes of problems. Much time and effort can then be spent pursuing ineffective corrective actions aimed more at the effect than at the true cause.

The recurring problems and ineffective corrective actions faced by a heavily burdened staff are compounded by significant events, forced outages, and extensions of planned outages, and can give rise to regulatory concerns and cited findings. Revenue that is not generated during downtime is a direct cost of ineffective problem solving. The load on the system adds to the cost, as it may require the purchase of replacement power. These costs show up visibly on financial statements.

What about the costs that do not show up so easily? For example, nuclear plants that become overrun with problems must dedicate huge resources to finding and fixing problems under increased NRC scrutiny. Additional resources and time are expended on resolving these issues as quickly as possible. Yet when such efforts succeed and the plant escapes the watch list, everything returns to the steady state, and the focus that got the plant out of trouble is gone.

It is a challenge to get a small group on the same page; how about getting a thousand people to focus on a primary task like problem solving? Below are six specific issues that perpetuate mediocre problem solving:

1. Problems are difficult to define. One typical obstacle is an inability to agree on the definition or scope of the problem. While operators and technicians are the eyes and ears of the plant every day, if they aren’t consistent and accurate in recognizing, collecting, and reporting problem-related data, how can the cause analysts perform effectively?

When data collection isn’t timely, key data is lost, assumptions become facts and analysis quality is low. Problem identification and resolution efforts need a uniform systematic approach and common language for identifying, collecting and using problem data that is recognized across functions.

2. Procedures aren’t effective. Every plant has procedures that cover troubleshooting and cause analysis. The procedures tend to mirror regulatory guidance, meaning that the activities and requirements are described at a very high level. At the working level, however, personnel face confusion about what tools to use and how to use them.

Additionally, functions such as Operations, Engineering, and Maintenance often have separate procedures and methodologies for troubleshooting and root cause analysis. Cross-functional communications lose effectiveness and strength in these conditions. In the end, procedures can be more of an obstacle than a help. As a result, the focus is on filling out the form or ticking off the step in the procedure rather than the quality of the output.

3. Training isn’t linked to application. Over the years plants typically have sponsored a variety of training to wide audiences on problem solving techniques. Rather than real skill development, many of the techniques barely reach the awareness level, and rarely is there continuing training to keep skills fresh. The larger issue is that there is no clear path matching technique to application and no expectation of use from the organization. Without relevant application and support, skills are used briefly and ineffectively, leading to frustration and a rapid return to methods from the past.

4. Everything is a priority. Competition for skillsets puts a heavy squeeze on plant staffing. Most people, particularly the leads, have a lot of assignments with tight deadlines. Everything is a priority, so the level of thinking gets reduced to focusing on interim actions performed to restore conditions by the deadline. The rationale is that the issue was fixed and plant operations were restored. The question of how many times the issue has needed to be fixed falls to the bottom of the list.

At the macro level, many management teams fail to see problem solving as value-adding. Corrective Action Programs (CAPs) in the nuclear industry are a good example. Too frequently, CAP activities are viewed as administrative functions. Outcomes are measured in terms of filing paperwork within designated time windows rather than reaching a high-quality level of analysis and response. When management treats the CAP as a compliance activity, it sends staff a clear signal that minimizes their engagement and rigor.

5. Is it equipment and/or people? When things go wrong, it can be difficult to segregate mechanical versus human performance issues. Possibly both of these variables worked together to yield a deviation, yet rarely is there a good model in place to unravel such scenarios. Untangling these multiple issues can be time consuming and expensive.

6. There is no framework for designing and implementing corrective and preventive actions. Problem solving can be overwhelming, but the real challenge is to design and implement corrective actions and preventive actions. An effective model or framework rarely exists, leading to incorrect fixes to misunderstood problems, too many fixes, or a lack of consensus on the best path forward. The hope is that enough actions will make the symptom go away. The real measure of success of any action is an evaluation of whether or not the objectives were achieved. That evaluation is difficult if the objectives are not understood uniformly from the start.

Changing Course: Systematic Problem Solving and Corrective Actions

A cluster of actions can be taken to improve problem solving at nuclear plants. Benefits can accrue quickly, creating an ROI that justifies the investment in time and money. Taking any one of these actions can lead to improvements, but taken together in a change initiative, they can be transformative, improving reliability, maximizing resources, and reducing costs.

Integrate problem prevention into daily activities. The best way to solve problems is to not have them in the first place. Management must lead the culture change toward this goal. To capitalize on problem prevention and increase equipment reliability, set aside time and establish measures and responsibilities designed to avoid problems. For every actual problem, questions should be asked regarding how to prevent such problems in the future. For every activity – from daily tasks to planned outages – planning must include problem prevention considerations.

Benefits: Problem prevention maximizes operational effectiveness, increases equipment reliability, and improves outage performance. When issues are identified before implementation and staff efforts are more future-focused, high-value engineering resources can focus on problem prevention instead of problem solving. Budgets become more predictable and costs associated with problem resolution go down.

Capture changes systematically. If performance is on target and then a deviation occurs, by definition there has been a change. Problem solving efforts focus on finding the change related to the true cause of the deviation. Power plants make changes every day. Most of them achieve their expected purpose. However, some changes result in new problems. The key is to capture the changes and tweaks to equipment and programs in a systematic way.

Benefits: Significant time and effort are saved by understanding why deviations may have occurred. Going forward, people will learn how to troubleshoot potential changes and then better avoid problems that might be caused by a change.
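As a rough illustration of what capturing changes systematically could look like in practice, the sketch below keeps a simple change log and, when a deviation is observed, lists the recent changes to the affected system as candidates for investigation. The record fields, the function name, the 30-day window, and the sample entries are all hypothetical; they are not drawn from any plant's actual corrective action program or from a specific methodology.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ChangeRecord:
    """One logged change to equipment or a program (hypothetical fields)."""
    when: datetime
    system: str
    description: str


def changes_before_deviation(log, system, deviation_time, window_days=30):
    """Return changes to `system` made in the window before the deviation, newest first."""
    start = deviation_time - timedelta(days=window_days)
    hits = [c for c in log if c.system == system and start <= c.when <= deviation_time]
    return sorted(hits, key=lambda c: c.when, reverse=True)


# Made-up sample entries, for illustration only.
log = [
    ChangeRecord(datetime(2024, 3, 1), "feedwater pump A", "Replaced seal with new vendor part"),
    ChangeRecord(datetime(2024, 3, 20), "feedwater pump A", "Adjusted lubrication schedule"),
    ChangeRecord(datetime(2024, 1, 5), "feedwater pump A", "Recalibrated vibration sensor"),
    ChangeRecord(datetime(2024, 3, 18), "condenser", "Cleaned tubes"),
]

# A deviation is observed on feedwater pump A on 25 March 2024.
candidates = changes_before_deviation(log, "feedwater pump A", datetime(2024, 3, 25))
for c in candidates:
    print(c.when.date(), c.description)
```

A list like this does not identify the cause by itself; it only narrows the set of changes worth examining first. The window length is an arbitrary choice here and would need to reflect how quickly a given change could plausibly show its effects.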

Gather and organize information effectively. When deviations actually occur, a rational framework for gathering and organizing information allows problem solvers to evaluate problems effectively. Plant personnel need to be provided with a specific set of questions to ask as well as information to collect regarding observed deviations. Inaccuracies, missing information, untested assumptions, and faulty conclusions will lower the quality of the entire process. A similar framework that balances “fix” objectives with possible risks helps troubleshooters choose among possible corrective actions.

Benefits: Only through good problem identification can apparent cause and root cause analyses be accomplished effectively. This is the basis for any further cause evaluation. In addition, the early accuracy of problem identification is essential for correct extent-of-condition searches. Further, trend analyses will see greatly increased validity. These frameworks support plant performance and enhance regulatory relationships by detailing the data and thinking behind problems solved and actions taken.

Clarify the equipment/human performance/organization distinctions. Only through a clear understanding of the interface between people and equipment can real issues be resolved. Workable distinctions need to be clearly recognized, and analysis methodologies and the capabilities to apply them must be put in place.

Benefits: Key personnel who often go unrecognized can help drive the right performance rather than causing poor performance. It’s not enough to reference “operator error” and “weak procedures” in the quest for cause.

Develop expert cause analysts. Problem solvers benefit from expert leadership and coaching when trying to understand and resolve deviations. The ideal state is to prepare a cadre of personnel with good leadership and communications skills and keep them engaged so that skills stay fresh. In addition to providing a troubleshooting facilitator with the appropriate tools, giving them the time and responsibility to participate is a necessity.

Benefits: When problem solving facilitators are provided, apparent and root cause analyses are performed in a consistent manner, are of higher quality, and are produced more quickly. The critical thinking skills of the entire organization are improved from senior leadership down through operators.

Define a structured approach for fixes. Corrective actions are where results happen—and where resources are consumed. At the end of the day, it is in this mode that asset optimization is achieved. Corrective action thinking has to have a place in the strategic planning cycle and has to be balanced with safety and production concerns.

A framework for corrective actions that is shared by management and technical resources will help develop paths of action that satisfy safety, production, and financial goals. Further, an element of project management can be applied to implementation planning early on, which will help manage the path forward.

Benefits: This approach eliminates the discrepancy between technical objectives for a solution and what management wants, and it eliminates time and effort wasted on ineffective solutions. Alternatives are rationally selected that meet formal objectives with manageable risks. A clear line connecting problems and fixes is established. Best of all, the number of fixes becomes more manageable. Preferred alternatives are documented and easy to measure for success. This application of valid measures will prove the strategic value of the Corrective Action Program.

Beyond Compliance with Effective Change Management

Moving beyond the current compliance mind-set to a strategically focused, effective problem-solving capability maximizes equipment reliability and provides a positive cost-benefit. Once this capability is in place, the culture noticeably shifts from a daily focus on what’s broken to having hundreds, and perhaps thousands, of hours of staff time available to head off problems and promote system health.

Installing a systematic problem-solving process is not unlike the implementation of any other lasting process change. Sustainable change implementation focuses on four key concepts, as illustrated in Figure 1: Implementation Model.

The problem-solving process is a key business process for any organization. Improving this process requires a tight description of desired staff responsibilities and cross-functional information handoffs along the problem solving/corrective action flow. Process tools and procedures that are embedded in the process and measures that allow senior management to monitor and evaluate the process must be included. Within this framework, skill development needs can be identified and provided.

At each responsibility layer, from problem identification through development and implementation of the fixes, expertise to resolve problems, ranging from simple to complex, has to be developed and maintained. In addition, there must be clear expectations for performance. Problem solving efforts need to be monitored with pinpointed feedback designed to support and improve problem solving skills. The balance of consequences for taking the time to solve problems must be encouraging, not punitive. Of specific note, it is critical that the supervisory level is prepared to support change and that second- and third-level managers support first-level supervisors in making changes. Implementation is where initiatives typically fail. Management wants quality, but the message heard at the supervisory level is production. A balance has to be achieved, and it can only be achieved with supervisory involvement.

Too often, expected change is launched by management directive without the necessary support. People need time and practice to develop confidence and competence under expert coaching. This phase is particularly important at the beginning of any change initiative, especially when applying new skills. The goal is to move the performer to proficiency and thus reduce the need for coaching support.


The factors that influence an organization’s problem-solving capabilities extend beyond simply giving people the right tools. If the reasons for sub-par problem solving efforts are unclear or improvement efforts have been made from many angles with limited success, an independent assessment may be needed to identify weaknesses and provide a structured plan for improvement. Once there is a clear understanding of what’s not working well, a pinpointed approach can address the broken piece(s) of the puzzle.

Change needs to be implemented properly so that it is permanent and sustainable. Once systematic problem solving is in place, with skills learned and problem-solving responsibilities supported by management and on-the-job coaching, then across-the-board improvements in plant performance are observed, regulatory relationships are improved, and an organization’s position in the marketplace is advanced.

Christian Green is the Delivery Excellence Manager for Kepner-Tregoe North America. He devotes most of his time to operations analysis, business process improvement and strategy development. He is a master at facilitating issue resolution through the use of Kepner-Tregoe’s renowned problem-solving tools.

Founded in 1958, and based on ground-breaking research regarding how people think, solve problems, and make decisions, Kepner-Tregoe provides a unique combination of training and consulting services to improve quality and effectiveness while reducing overall costs. The KT methodology is used at every level of client organizations: to implement strategy, achieve continuous improvement, increase customer satisfaction, and drive effective issue resolution throughout the organization.

