Optimizing Shutdowns, Turnarounds, and Outages

The traditional view of operational Shutdowns-Turnarounds-Outages (STOs) holds that they are maintenance and engineering events; this simplistic view is held by many organizations. A more realistic and holistic perspective, however, recognizes that the impact and scope of STOs extend far beyond the maintenance and engineering functions. STOs can command significant capital and operating budgets. They attract the attention of shareholders and boards of directors, and they affect inventory, supply chains and customer relationships. They are therefore ‘whole business events’, not simple function-specific ones.

Considering all the potential ramifications, well-executed STOs can represent a source of competitive advantage for an organization. They can drive commercial performance, boost morale, bring recognition to high-performing teams and accelerate individual careers.

The flip side of this picture of success, of course, is that poorly executed STOs can cost an organization millions of dollars in lost revenue, drive up operating costs, and cause permanent damage to the careers of those involved. This has long been true, but it is amplified in the current environment, in which most organizations are operating with reduced workforces and resources. Simply put: in today’s leaner and meaner business environments, STOs represent not only an increasingly significant challenge but also an increasingly significant opportunity.

The STO consists of the following phases of activity:

  • Detailed planning and organization of the work involved
  • Removal of assets from production
  • Inspection and work execution, product changes, repairs, improvement activities or a combination of these
  • Restart of the asset/unit/plant and restoration to ‘should’ performance levels

STO work is usually—but not always—recurring or cyclic in nature. An STO is unique in that it always involves the plant, unit or asset being taken offline or out of service. An STO is not considered to be complete when the individual work packages are completed; an STO is complete only when the asset, unit or item is returned to service and performing at the desired level.

STOs are more complex than other project-based events. Quite simply, they involve both planned activities and unplanned work resulting from the inspection of parts of a machine or asset which are not accessible or visible during normal operations. The potential for inspection to reveal previously unforeseen (emergent) work that must be performed within the defined time constraints of the STO adds the requirement for rapid troubleshooting and decision-making capabilities.

Take a moment to consider your organization’s current approach to conducting STOs. Is there a significant reliance on knowledge and experience? Are one or two team members considered critical to STOs because “they were there the last four times and know what happened”? All too often, the execution of an STO (with all the dependencies and ramifications we considered at the outset of this paper) rides on one or two highly experienced ‘hero’ employees ‘stepping up’ to solve the problem or ‘get it done’ during STOs. But given the demographic shift that is now upon us, many of these individuals are destined to leave the workforce in a relatively short time. This of course is in addition to the everyday issues of absence due to sickness, transfers to another part of the business, or career advancement with another company. If the knowledge and experience previously relied upon are no longer available for any reason, all the business issues which depend on the STO are in jeopardy. The challenge and opportunity, then, is to adopt a replicable, reliable, process-driven approach to STO management which harnesses team members’ knowledge and experience without being totally dependent upon it, and which enables easy knowledge transfer from one person to the next.

In Kepner-Tregoe’s (KT’s) experience, the top challenges in managing STOs lie in the following critical areas:

Ensuring the safety of the workforce, whether employees or contractors, is the number one priority for the STO management team. STOs present numerous challenges for safety. Large numbers of contractors may be working on site for the first time with little knowledge of equipment and processes. Employees will carry out many tasks which are not routine and occur only in STO situations. For example, cleaning, inspection and repair will often be carried out with special isolation requirements in confined spaces or other challenging environments.

The development, deployment and communication of an effective STO process which is clearly understood by all stakeholders, and which navigates all concerned parts of the organization through the complex challenges presented. Too frequently, the STO process is unclear, fragmented and not shared. Without a guiding framework, the coordination and execution of the complex tasks involved become extremely difficult. Many departments may need to plan inventory or resources if they are to be impacted in any way by the STO. A lack of general coordination is also compounded when the absence of a common approach results in a myriad of different methods being used, making coordination and communication close to impossible.

Managing project scope creep is typically one of the top challenges for most STO management teams. It is a particular issue in STOs where inspection is only possible once the process or asset is in the STO (e.g. opening up a furnace to establish the amount of re-lining required). Managers need prioritization tools to help them make better decisions about emergent work so that they stay within plan and budget targets. Without such tools, STOs can quickly experience scope creep, which leads to other work being cut from schedules and can have detrimental effects on operational performance after recommissioning.
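One possible shape for such a prioritization tool is a weighted score for each emergent work item, checked against the spare hours left in the schedule. The sketch below is illustrative only: the criteria, the weights, the example items and the names `EmergentItem` and `triage` are assumptions made for this example, not part of any KT tool.

```python
from dataclasses import dataclass

# Hypothetical criteria weights; a real STO team would agree its own.
WEIGHTS = {"safety": 0.5, "restart_risk": 0.3, "production_impact": 0.2}


@dataclass
class EmergentItem:
    name: str
    hours: float    # estimated effort to complete the work
    ratings: dict   # criterion -> rating from 0 (none) to 10 (severe)

    def score(self) -> float:
        """Weighted sum of the ratings; higher means more urgent."""
        return sum(WEIGHTS[c] * self.ratings.get(c, 0) for c in WEIGHTS)


def triage(items, spare_hours):
    """Accept items in descending score order while spare hours remain.

    Returns (accepted, deferred) lists of item names.
    """
    accepted, deferred = [], []
    for item in sorted(items, key=lambda i: i.score(), reverse=True):
        if item.hours <= spare_hours:
            accepted.append(item.name)
            spare_hours -= item.hours
        else:
            deferred.append(item.name)
    return accepted, deferred


# Invented example items, for illustration only.
items = [
    EmergentItem("reline furnace wall", 40,
                 {"safety": 8, "restart_risk": 9, "production_impact": 7}),
    EmergentItem("repaint handrails", 10,
                 {"safety": 2, "restart_risk": 0, "production_impact": 0}),
    EmergentItem("replace worn valve", 30,
                 {"safety": 5, "restart_risk": 6, "production_impact": 8}),
]
accepted, deferred = triage(items, spare_hours=75)
print("accepted:", accepted)
print("deferred:", deferred)
```

A simple greedy pass like this is easy to explain to stakeholders during a time-constrained STO; a team could equally replace it with a formal decision analysis if the stakes justify the extra effort.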

The capture, analysis and availability of relevant information and metrics via management information systems enables the appropriate management of activities and the identification of future improvements. Measuring the right things, the right way, at the right time, and communicating them appropriately, allows STO leadership to maintain control of the diverse range of activities when work is being executed. While poor planning is usually blamed for cost and time overruns, if problems continue in future STOs, they are very often symptomatic of the absence of a good measurement and control system. This absence will either hinder or totally prevent the organization from understanding and learning from the problems it experiences.

The existence of business processes which do not support the needs of the STO. We know from experience that organizations should continually evaluate (and if necessary adjust and align) their business processes in order to remain competitive, and that misaligned processes will cause inefficiencies. In most organizations, business processes are designed to enable normal day-to-day activities. They are generally not designed to cope with major peak loads, special cause events, and the other unusual demands that an STO places on them. A key opportunity for improvement in STO effectiveness lies in reengineering basic business processes so that they can accommodate the needs of an STO and its related potential emergent work requirements.

Cost management and control in executing complex STOs. Existing reporting and control systems do not provide STO budget performance data until some time after the STO is completed. The STO requires a cost monitoring program that provides timely data throughout the STO, enabling those controlling activities to make more informed choices on the course of action.
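As a rough illustration of what timely cost data could look like, the sketch below compares spend to date with the budgeted value of the work completed so far, a simplified earned-value style check. The function name `cost_status`, its parameters and the figures are all assumptions made for this example.

```python
def cost_status(budget, percent_complete, actual_cost_to_date):
    """Compare spend to date against the value of work completed.

    budget: total approved STO budget
    percent_complete: fraction of planned work finished (0.0 to 1.0)
    actual_cost_to_date: money spent so far
    """
    earned = budget * percent_complete        # budgeted cost of work done
    variance = earned - actual_cost_to_date   # negative means overspend
    # Forecast final cost, assuming current cost efficiency continues.
    if percent_complete:
        forecast = actual_cost_to_date / percent_complete
    else:
        forecast = None                       # no progress yet, no forecast
    return {"earned": earned, "variance": variance, "forecast": forecast}


# Illustrative figures only (not taken from any real STO).
status = cost_status(budget=1_000_000,
                     percent_complete=0.4,
                     actual_cost_to_date=500_000)
print(status)
```

Run daily during execution, a check of this kind would flag an overspend while there is still time to act, rather than after the STO has closed.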

The coordination and management of complex resources. STOs, and particularly larger ones, typically involve technical staff, corporate engineering, specialists, vendors, contractors and government bodies (safety, environmental, etc.) alongside internal employees, all of whom possess varying degrees of knowledge and experience. It is not uncommon in some operational environments for the number of people on site to grow by 300% when contractor resources are used to assist with STO execution. This puts a significant load on processes such as induction, isolation training, material supply and equipment procurement. Even if the internal team is generally experienced, an STO can still involve individuals in roles and carrying out key duties which are new to them. Without clear communication and management protocols, our experience has shown that up to thirty percent of the working day can be lost waiting for adequate instruction, or searching for a resolution when a problem occurs.

Transforming an organization from reactive to proactive. Shedding a reactive culture and moving towards anticipating and resolving issues before they impact is also critical to STO success. Every organization has a hero or two—people who are remembered for ‘saving the day’—and the individual’s reward for this kind of heroism can be great on many levels (job security, advancement, financial incentives, recognition, self-actualization). The problem is that heroism is only required when the organization is already in trouble. How many personnel are rewarded and recognized for the arguably more valuable heroism of thinking about and preventing things from going wrong? This is perhaps the most essential component in executing a successful STO. The few most efficient organizations we know have already replaced the “go and do it” mantra with that of “go and think it through properly first”, and have appropriately adapted their emphasis in terms of performance and recognition systems to encourage this type of behavior.

Managing the expectations of diverse stakeholders. As noted previously, STOs are more business issues than engineering events; yet, in many organizations, indirect stakeholders are rarely involved in the outage management process. One thing is certain, however: their voices will be heard if restart is problematic, or if supply to market becomes an issue. A key skill requirement, and often a skill gap for today’s STO leaders, is the engagement of key stakeholder groups early in the planning process. This enables adequate communication of the likely risks and consequences of the STO, and keeps all impacted personnel informed during execution so they can plan their own areas of operation accordingly.

Optimizing the STO Process

STO optimization requires a holistic approach to managing the entire set of complex activities and relationships which exist in the Shutdown-Turnaround-Outage (STO) process.

To do this, several elements should be considered. First, a clear and common process framework must exist for the STO activity. Secondly, the processes that drive the flow of information and activities within the STO framework must be aligned and efficient.

A typical STO model has three primary phases. They are outlined in Figure 1, with sub-elements within each phase:


The Definition Phase ensures the identification of major sponsors and customers at the initiation of the STO. It also provides communication channels for the business, business units and support functions to prepare the organization for the STO. The most suitable time frame for the STO is determined by gathering data on the operational process, customer requirements, equipment needs, resources and other constraints. Once this information is processed, the Definition Phase drives decision-making activities in the ‘charter and scope’ elements of the STO, where detailed objectives and boundaries for activity are defined. It should address the protocols for scope freeze and change control to ensure they are supported by the organization. Without such protocols, achievement of time and cost goals is practically impossible.

The Definition Phase leads the team through the detailed processes of defining work activities, determining work packages and the resources needed to carry out the work, and performing primary risk assessment activities. It should also consider defining work in areas of STO tasks such as the decommissioning or removal of the asset from production. This can have a major impact on restart effectiveness, and hamper the organization’s ability to synchronize restart with activities in other critical areas. The restart process itself is an area often under-planned and under-assessed by STO teams, which in turn leads to poor production efficiency for extended time periods after units are returned to production.

On completion of the Definition Phase, the leadership team will have the first indications of whether the goals and objectives set for the STO are attainable.


The Planning Phase is primarily concerned with the organization of the STO activity. Key activities take place to ensure resources are available to execute the work packages that make up the STO activities. Responsibility assignment should consider whether assigned resources have adequate levels of knowledge, skills and experience, and should identify bottlenecks where people, but not the required skills, are available.

At this point, tasks can be sequenced and scheduled to confirm the viability of the STO duration and whether the resources identified are adequate to complete the work within the cost constraints placed on the STO.

The planning activity also focuses the organization on two critical but frequently understated aspects of the STO: aligning business processes to facilitate STO implementation, and developing metrics and measurement systems that track a balanced set of indicators to drive future improvements.

All organizations have business processes guiding their daily operational activities. The organization may not be completely satisfied with such processes, but for the most part they do exist. The issue for the STO is that the operational processes in place are often incapable of dealing with the incremental load placed upon them by key STO activities such as induction, contractor management, procurement, vendor payment, cost control and reporting, to highlight a critical few.

The STO team must carry out a process review to establish the robustness of key business processes. If necessary, re-design and additions should take place so that the processes enable a more effective STO. Often, if this action is not taken, the post-STO review discovers that these same processes have negatively impacted the STO’s implementation phase.

Metric and measurement systems are a component that must also be addressed. These key systems are the providers of information for management decision making, control and the identification of areas for both recognition and improvement.

Metrics are an area that is often under-exploited in the STO. Generally, most outage metrics are limited to a single dimension of performance, covering time, cost and the achievement of overall STO objectives. While these are clearly the fundamental success factors, other important metric “families” (process, people, promotional, and political), which may benefit the team and the activity, are often overlooked.

However well selected the metrics, to be of value they must be monitored by an effective measurement system that has the appropriate frequency and has clearly understood protocols for escalation and feedback. Good metrics are often devalued, if not nullified, by the inadequacies of the measurement system.
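To make the idea of escalation protocols concrete, here is a minimal sketch in which each metric carries its own review and escalation thresholds. The class name `Metric`, the threshold fields and the example numbers are hypothetical; an actual measurement system would define its own metrics, check frequency and escalation routes.

```python
from dataclasses import dataclass


@dataclass
class Metric:
    name: str
    target: float        # planned value; must be non-zero
    warn_pct: float      # % over target that triggers a team-level review
    escalate_pct: float  # % over target that triggers escalation to leadership

    def check(self, actual: float) -> str:
        """Return 'ok', 'review' or 'escalate' for a reading above target."""
        deviation_pct = (actual - self.target) / self.target * 100
        if deviation_pct >= self.escalate_pct:
            return "escalate"
        if deviation_pct >= self.warn_pct:
            return "review"
        return "ok"


# Hypothetical metric: daily spend in thousands of dollars.
daily_cost = Metric("daily cost", target=100, warn_pct=5, escalate_pct=15)
for reading in (103, 110, 120):
    print(reading, daily_cost.check(reading))
```

Agreeing the thresholds before implementation begins means that nobody has to debate, in the middle of the STO, whether a given overrun deserves leadership attention.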

Prior to the commencement of STO implementation, a final round of risk assessment should be conducted on the interfaces between groups. This step ensures that planning and risk assessments done at a functional level can form part of an integrated STO plan. The process considers interfaces for master schedules, resource leveling, resource conflict, responsibility assignments, communications, and issue escalation. At the same time, all pre-STO work is checked for completion so that there are no surprises when the Implementation Phase begins. As the STO progresses, any emergent work will consume precious time and resources. These up-front risk assessments pay dividends in the long run by keeping emergent work to an absolute minimum. Tactical modifications of the plan ‘on-the-fly’ will of course be necessary to ensure on-time finish, but can be minimized with adequate focus and preparation.


However efficient the design and flow of the Implementation Phase, its ultimate effectiveness is dependent on the outcomes of the Definition and Planning Phases.

The Implementation Phase provides a process to ensure the work that has been organized gets done. It is specifically concerned with the mobilization and management of resources and the monitoring of activities to ensure that they accomplish STO results to the standards required in a safe and proper manner. The Implementation Phase establishes the standards of behavior required to deliver the STO objectives. On a daily basis, teams must communicate, resolve issues and provide updates among themselves and to the management team. The visual representation of performance, schedule and cost allows close tracking of the progress of the STO, as well as the surfacing of additional issues. This will help to ensure that the information needed for effective problem solving, issue escalation, and decision making is available where required.

As STO work packages are completed, and the asset or plant nears reinstatement to operational status, lessons learned from emergent work should drive a review of the restart plan and any necessary modifications to it. Risk assessment also needs to be conducted on any modifications before attempting to restart the plant so that potential problems are adequately addressed.

A key element in the Implementation Phase is the close monitoring and reporting of the restart activity itself, which is vital to the timely resumption of operations. Formal equipment acceptance and handover from external vendors and the STO team must be carried out to make sure potential problems are addressed and desired value is delivered.

When the restart has been completed, the team inputs the required data into the communication value stream to ensure learning capture and subsequent continuous improvement. The STO team can then be released for redeployment to its next project.

Tie That Binds

Our experience has taught us that the model described is extremely robust. That same experience has also taught us that to be effective as a holistic approach, connectivity—linking the critical elements in each of the three primary areas—must be present through the design and implementation of a communication stream which flows through the model. Without such a communication stream the organization runs the risk of:

  • Having a tendency to look at the optimization of each component block rather than the process as a whole.
  • Overlooking the complexity of interconnections between the components that can lead to gaps or white spaces in process or organization.
  • Having different module responsibilities and ownership leading to inconsistencies in standards in the implementation of the STO process.

The development of the communication stream provides constant information feedback loops among the key elements of the STO process model, and also facilitates closeout and review activities to gather and process information on lessons learned. The communication stream ensures that information flows effectively for the duration of the STO. Stakeholders and performance objectives will require metrics and measurement systems which ensure that implementation activities remain on track to deliver the goals required of the STO. These monitoring frameworks will use dashboards and other visual aids to create visibility for all stakeholder groups, and will promote active communication and discussion.

At closure, the objectives and deliverables for the STO are reviewed to determine whether performance and stakeholder expectations have been met. Lessons learned from all STO stakeholders, including contractors and vendors, are documented and codified for future reference.

What type of results have been seen using KT’s approach to STO management?

Development, deployment and communication of an effective STO process:

Having a defined process that STO stakeholders and the STO team understand, and can follow, minimizes the issues that organizations often experience at restart once STO work packages have been completed. The graph for Changeover Performance on the following page shows how an international display manufacturer improved its restarts over a number of STOs simply by using a visible process. There was a 318% improvement in output for the first 24 hours of production at restart.


Capture, analysis and availability of relevant information and metrics:

In the highly time-constrained world of an STO, having accurate and up-to-date information is key to making the right decisions. A dashboard that was used at a long products steel mill displayed metrics on progress among various teams, overall progress, costs, safety audits, extra work and other indicators. When the daily dashboard was distributed to stakeholders, it enabled sponsors, STO managers and team leaders to stay in close touch with the progress and performance of the STO. The STO completed all planned work within the scheduled period, and did not record a single case of lost time or medical injury.

Overcoming business processes which do not support the needs of the STO:

Work with an international mining company necessitated the creation or modification of a number of business processes that impacted STOs. By reviewing these processes (including Contractor Management, Procurement, Predictive Maintenance, Preventative Maintenance, Business Improvement, Lockout/Tagout, Reliability, Permitting and Compliance, among others), the organization was able to improve its ability to scope the STO, start work packages on time, and reduce crew waiting times. The impact was significant. Lockout/Tagout and parts issues fell by 75%, and equipment failures and safety incidents decreased by 30%, as shown on the % Issues vs Prior Shutdown chart.

Coordination and management of complex resources:

A major oil company undertook a turnaround at its Singapore refinery which, at a cost in excess of US$200 million, was the biggest and most expensive it had attempted worldwide. It involved specialists and sub-contractors with whom the organization had no prior working experience, and equipment which had never been used before. As the event neared, staff and sub-contractors found that they did not know whom to approach for resources, technical support, and issue resolution. In other words, there was no clarity on ownership of the work packages. Was it the asset owner, the main contractor, or both?

Establishing an effective communication process at this stage entailed, first, specifically defining roles (such as Approve, Lead and Support) for each major deliverable. Then, through a series of communication sessions, responsibility for these roles was assigned among the asset owner, main contractor and sub-contractors. The result was clear ownership, and resources who knew where to go when they needed specific types of support.
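A role assignment of this kind can be held in a simple matrix and checked for gaps. In the sketch below, the role names Approve, Lead and Support come from the case above; the deliverable names, the party names and the rule that each deliverable needs exactly one Approve and one Lead are assumptions made for illustration.

```python
# Assumed rule for this sketch: exactly one party per deliverable
# holds each of these roles. Support may have any number of parties.
SINGLE_OWNER_ROLES = ("Approve", "Lead")

# Hypothetical deliverables and assignments.
assignments = {
    "heat exchanger cleaning": {
        "Approve": ["asset owner"],
        "Lead": ["main contractor"],
        "Support": ["sub-contractor A"],
    },
    "catalyst change-out": {
        "Approve": ["asset owner"],
        "Lead": ["asset owner", "main contractor"],  # ambiguous ownership
        "Support": [],
    },
}


def ownership_gaps(matrix):
    """Return (deliverable, role, parties) for every unclear assignment."""
    gaps = []
    for deliverable, roles in matrix.items():
        for role in SINGLE_OWNER_ROLES:
            parties = roles.get(role, [])
            if len(parties) != 1:
                gaps.append((deliverable, role, parties))
    return gaps


def who_to_ask(matrix, deliverable):
    """Return the single Lead for a deliverable, or None if unclear."""
    leads = matrix[deliverable].get("Lead", [])
    return leads[0] if len(leads) == 1 else None


print(ownership_gaps(assignments))
print(who_to_ask(assignments, "heat exchanger cleaning"))
```

Running a check like `ownership_gaps` before mobilization surfaces the "asset owner, main contractor, or both?" question while there is still time to settle it in a communication session.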

Overcoming a reactive culture, moving toward anticipating and resolving issues before they impact:

A major building materials manufacturer ran five days over its planned STO. The next year, using the KT STO management process, an STO planned for thirty days was completed four days early. The reason for this improvement? Detailed risk management allowed the company to identify and prepare for issues before they occurred. Understanding the links between work packages, the resources used, and surrounding activities enabled the STO team to look at more than the list of tasks to be completed. The data collected also allowed them to plan their future STOs more accurately.

Managing the expectations of diverse stakeholders:

STO performance at a high-production concentrator improved greatly once the expectations of the organization’s diverse groups were managed. If perception really is reality, STO management needs not only to deliver, but to be seen to deliver. In this case, all key management personnel, from the General Manager through Corporate and Operations to key vendors, expressed confidence that the impact of the STO management had been better than ever before. VPs spent time in other areas of the organization; Capital Expenditure Authorizations were approved; Operations had confidence in Scheduling; Contractor and Vendor idle time decreased. STO quality was shown to have increased overall by 60% (measured by on time and on scope), while cost was shown to have decreased by 40% (cost to run the STO). Key to these results were providing a common language and process, building an understanding of both, and managing stakeholders’ expectations. Results and Expectations are two different areas of managing the performance of people; this means it is critical that key leaders have a shared understanding of, and good information on, both what is of value and what will be delivered.

About Kepner-Tregoe:

Founded in 1958, and based on ground-breaking research regarding how people think, solve problems, and make decisions, Kepner-Tregoe provides a unique combination of training and consulting services to improve quality and effectiveness while reducing overall costs. The KT methodology is used at every level of client organizations: to implement strategy, achieve continuous improvement, increase customer satisfaction, and drive effective issue resolution throughout the organization.

