Shift Left? No, ‘Shift Down’ for Services Support Success

How shift-left development practices can be successfully applied to your customer services and support organization

Much has changed in the landscape of development, developer tools, and methodologies over the last decade. More organizations are shifting from “waterfall” to an Agile development framework as well as DevOps principles for developing and supporting enterprise applications. The goal is to drive faster release cycles, improve quality, and deliver an overall better experience to your customers—the users of your systems and applications.

In particular, a philosophy called “shift left” has emerged as a way to improve application quality by moving testing cycles closer to development activities. It has been shown to reduce the number of defects found in production, and it has saved tens of thousands of dollars for organizations that have implemented it.

In this white paper we examine how the concepts behind shift-left development can be applied to your service and support organization. We call this “shifting down” because it places more responsibility “lower” in the organization, at the front end: empowering your Tier 1 workers as much as possible while also automating self-service where appropriate. We give you seven best practices for doing this, and show why not doing so could be costly to your bottom line.

Shift-left testing: a primer

Let’s start by explaining the shift-left philosophy in software testing. To shift left means to begin testing earlier in the software delivery lifecycle. This is as opposed to the traditional approach of handing testing off to a dedicated QA team at the very end of the development process. The reasoning is simple: the earlier in the development process you can find bugs and defects, the sooner you can give feedback to your developers, allowing them to be more productive in their work. Currently, more than four in ten development organizations have officially shifted left in application testing.

Even so, the majority of organizations still largely depend on the “waterfall model,” used in both project management and software development, in which requirements gathering, design, coding, and testing are done in a mostly sequential manner (see Figure 1).

The chief challenge of the waterfall method is that bugs or defects, whether major or minor, are identified only after development is complete. Minor defects are not necessarily show-stoppers (developers can quickly fix them without disrupting delivery schedules or adding too much cost), but major bugs are a different matter altogether.

In such cases, there can be delays releasing the product to the customer, and costs can escalate as more people are brought in to fix the problem. The only time having the testing phase at the very end of the software development lifecycle is ideal is when your product is bug free—but dream on if you still hope at the beginning of each product development initiative that this will be the case.

As more flexible software development models such as Agile emerged, enterprises began realizing that handing testing over to a siloed QA team in a one-off activity was in fact very risky. Defects found at that stage were the No. 1 reason for release delays and cost overruns.

Shift-left was born, where testing is done at each phase in development (see Figure 2).

Shifting left basically means shifting testing to the earliest point in time possible (to the left on the lifecycle axis) and doing it continuously. Shift-left practices involve integrating your testing into your software development process and uncovering bugs earlier when they are easier and less expensive to fix.

Although the practice emerged in the late 1990s, the concept of shift left was formally named by Larry Smith in Dr. Dobb’s Journal in 2001, where Smith explained how integrating development with QA at lower levels of management can expand your testing program while reducing manpower and equipment needs. In shift-left practice, testing, feedback, and revisions happen on a daily basis. This promotes agility and lets the project team scale their efforts to boost productivity (see Figure 3).

When DevOps entered the picture in 2007, shift-left was an integral part of it. DevOps combines software development and IT operations, creating more collective end-to-end ownership of the process, the product lifecycle, and the customer experience. The goal of DevOps, which is to shorten the software development lifecycle and provide continuous, high-quality delivery of code, has the shift-left philosophy at its heart.

The very real benefits of shifting left

A significant body of research has identified the advantages of shifting left. A study by Pulitzer Prize-nominated IT consultant and author James Martin found that the root causes of 56% of all defects identified in software projects are introduced during the requirements analysis and definition phase, 27% during the design phase, and only 7% during the development phase. S.A. Kelkar of the Indian Institute of Technology verified these numbers in Structured Systems Analysis and Design, and they were further confirmed by STBC in The Economics of Testing (see Figure 4).

Research from the IBM System Science Institute found that issues identified early in development cost approximately $80 to fix, but the same problems cost $8,000 to fix if detected in production, 100 times as much (see Figure 5).

Figure 5: Cost of detecting defects in various phases of the software development lifecycle.
Source: IBM System Science Institute

Applying shift-left to customer services support organizations

Now that we understand the concept of shift-left, we can reap its benefits by applying its principles to our IT customer service and support organizations.

Today, IT support teams are prioritizing the overhaul of their traditional customer support frameworks, introducing new processes based on IT management frameworks such as the IT Infrastructure Library (ITIL®) or Control Objectives for Information and Related Technology (COBIT). The chief reason for prioritizing IT customer support? Customers are increasingly focused on the actual availability and performance of cloud-based platforms and software applications that are under the full control of the vendor.

The challenge facing support organizations is that an increasing volume of incidents puts them in continuous fire-fighting mode. Proactive methods, such as trend analysis, preventive actions, and post-mortems of major issues, are effective ways to decrease the number of support requests. Unfortunately, IT functions often focus most of their resources on reactive activities and ignore proactive approaches despite the obvious benefits. As a consequence, recurring incidents lead to customer frustration and unnecessary downtime.

Many parallels between service support and software development

Shift left was adopted because coding issues are more expensive to fix the later they are caught, and service operations are in a similar situation. Take, for instance, an incident that was first worked on by a Tier 1 technician, then raised to Tier 2, and after some time finally required a conference call of Tier 3 subject matter experts (SMEs) before it could be resolved. The cost of each additional resource can easily add up to five- or six-figure sums.

This idea of diagnosing and solving problems at the front end of the process, rather than waiting for a major incident to occur, mirrors the findings that led to the shift-left revolution in development. Indeed, if the Ponemon study on costs can be applied to service operations, an issue resolved earlier in the process could cost as little as one hundredth of what it would cost later.

We call this “shifting down.”

Why down? Because you are, in effect, shifting responsibility down the organizational structure, so that Tier 1 and Tier 2 support people have more knowledge, more capability, more responsibility, and are empowered to act on behalf of the customer.

Current services support practices

When a customer (or employee, or other type of user) calls support, they get a Tier 1 engineer who has only minimal skills and knowledge. For slightly more complicated issues, such as “I’m having trouble accessing our sales system,” many Tier 1 support organizations only capture basic information, try to assess priority, and escalate to Tier 2. The Tier 2 engineer, in a second call, asks more technical questions (often product-centric) and (hopefully) can solve the issue. Otherwise it gets escalated again, to Tier 3, where the real product experts sit.

Even small changes can make this process more efficient. By shifting responsibility down—by training Tier 1 employees to ask a few critical diagnostic questions with respect to symptoms, impact and the actual deviation in performance, even if they are not product experts—you can resolve more issues right at the first touch point or, at a minimum, provide much better information to Tier 2 for faster resolution, less “customer-support ping-pong” and ultimately fewer escalations.

Training your Tier 1 people to provide basic triage and to resolve more sophisticated issues than, say, password or router resets will return significant benefits.

The shift down philosophy

The goal of shifting down is to enable and empower less technically experienced resources to solve more problems earlier in the lifecycle. Coupled with proactive problem management to avoid recurrence, and with moving interaction points closer to your customer, this reduces escalations, improves the first-time fix rate, lowers the cost of service, and improves the overall customer experience.

An example of shifting down is to build some core diagnostic questions into your self-help portal. With one client we have seen this approach increase case deflection by 35%. This also enables your Tier 1 organization to operate at a more informed level and to begin the actual troubleshooting earlier in the process.

When embarking on a shift-down program, you must align workflows, roles, and responsibilities accordingly. You must also align your performance system, including metrics, so that it reflects how these engineers are supported and rewarded.

A new era for IT service management

The IT Service Management (ITSM) function is truly in the midst of transformation.

A study by Forbes Insights and BMC found that most IT service organizations are evolving beyond merely focusing on IT-centric services. Instead, they are now seen as mission-critical teams that are on the front line ensuring that the customer experience is what it should be in the new digital age.

Among other key findings of this study:

• 56% say the pace of IT change or transformation is accelerating “significantly”
• Only 13% see ITSM keeping its same traditional organizational hierarchy
• 36% say the lack of adequate IT skills tops the list of challenges to achieving stellar customer support, followed closely by budget constraints
• 37% report that the majority of their entire IT budgets go to ongoing maintenance and management

Bottom line: 75% of IT executives agree that the amount of time, money, and resources spent on ongoing IT maintenance and management—which includes IT services support—is affecting the overall competitiveness of their businesses.

Improving both the quality and the cost-efficiency of IT support is therefore a strategic IT initiative, up there with cloud, big data, and mobility. Indeed, a majority of executives (56%) indicate that IT service management is “extremely important” in their enterprises’ cloud computing as well as big data initiatives. Fifty-four percent also indicate that ITSM is “extremely important” in supporting their mobile computing efforts.

Seven best practices in a shift-down program

For those who would like to try shifting down, here are seven best practices we’ve put together from our experience working with IT service organizations.

1. Gather data

To start your journey, it is essential to establish a baseline. Who is contacting your frontline? What kinds of issues are they reporting? What kinds of questions do they have? You need to gather data on your incoming call volume such as call type (incident, problem, question, request, follow-up), most frequently escalated call type and asset categories, the cost of servicing these calls (not forgetting labor and time), and finally, the skill level of the technician(s) handling each call.

For example, you may learn that of the one thousand calls you get in a month, five hundred are password reset requests, two hundred and fifty ask for fairly low-complexity solutions such as granting VPN access, and the final two hundred and fifty ask for high-complexity solutions such as figuring out why a database is down. That will shape your strategy going forward.
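As a minimal sketch of what such a baseline might look like once the data is collected, the short Python example below tallies a month of calls by type and reports each type's share of total volume. The call log and its labels are hypothetical and simply mirror the illustrative thousand-call month described here. A real baseline would also record cost and technician skill level for each call.

```python
from collections import Counter

# Hypothetical call log: one call-type label per call received in a month.
# The counts mirror the illustrative example in the text, not real data.
call_log = (
    ["password_reset"] * 500
    + ["vpn_access"] * 250
    + ["database_down"] * 250
)


def baseline_shares(calls):
    """Return (call_type, share_of_volume) pairs, largest share first."""
    counts = Counter(calls)
    total = sum(counts.values())
    return [(call_type, n / total) for call_type, n in counts.most_common()]


for call_type, share in baseline_shares(call_log):
    print(f"{call_type}: {share:.0%} of monthly volume")
```

Even a tally this simple makes the largest sources of demand visible before any shift-down decision is taken.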

2. Select high-value targets

Based on your analyses of your data, you will then seek high-value targets for shifting down. Which systems or assets are being frequently escalated, and have long resolution times and therefore very high costs associated with them? Which ones are being escalated but are easily resolved? Which ones seem to have very little impact on customer satisfaction or cost?

These are all candidates for shift down.
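One possible way to turn these questions into a ranking is sketched below in Python. It assumes you can estimate, for each asset category, the number of escalations per month, the average resolution time, and a blended hourly labor cost. All category names and figures are hypothetical, and the scoring rule (escalations multiplied by hours multiplied by hourly cost) is an illustrative assumption rather than a prescribed formula.

```python
from dataclasses import dataclass


@dataclass
class Category:
    name: str
    escalations_per_month: int    # hypothetical figure
    avg_resolution_hours: float   # hypothetical figure
    hourly_cost: float            # hypothetical blended labor rate

    @property
    def monthly_escalation_cost(self) -> float:
        """Rough monthly cost of escalations for this category."""
        return (self.escalations_per_month
                * self.avg_resolution_hours
                * self.hourly_cost)


def rank_targets(categories):
    """Order categories by estimated escalation cost, highest first."""
    return sorted(categories,
                  key=lambda c: c.monthly_escalation_cost,
                  reverse=True)


categories = [
    Category("sales_system_access", 120, 3.0, 90.0),
    Category("vpn_access", 200, 0.5, 60.0),
    Category("database_outage", 40, 8.0, 150.0),
]

for c in rank_targets(categories):
    print(f"{c.name}: {c.monthly_escalation_cost:,.0f} per month")
```

A cost ranking like this is only a starting point. Categories that are escalated often but resolved easily may be better first targets even when their total cost is lower.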

Take the previous example of identifying high levels of password reset calls. Among other things, this would mean that half of all support calls could be easily solved by automation, which is the ultimate shift-down move.

You could also drill down a little more. For example, if you realize that most of your high-complexity calls are coming from Europe, you could give your European customers a support number that goes directly to Tier 2 support. That way, you wouldn’t waste your time and money, or the customer’s, by bringing in a Tier 1 technician first, and Tier 1 technicians can focus on what they do best. The more you know about the other kinds of calls, the more accurately you can route those as well. The overarching theme here (again) is that you need to understand your customers, and understand your call patterns.

3. Empower lower-level employees

Kepner-Tregoe believes in training employees and managers in the fundamentals of effective problem-solving and decision-making. We call this critical thinking, or “rational process.” The Kepner-Tregoe critical thinking approach has made lasting contributions to business management for over six decades.

Indeed, the term “critical thinking” has a very specific meaning for Kepner-Tregoe. It is a pattern of logical, iterative thinking, driven by a series of questions, aimed at retrieving, organizing, and analyzing information for the purpose of reaching a sound conclusion.

Kepner-Tregoe’s belief is that good teamwork comes from training workers to apply consciously the thinking patterns many of them already use unconsciously.

They can do this by addressing four fundamental questions:

•   What’s going on?
•   Why did this happen?
•   Which course of action should we take?
•   What lies ahead?

Pushing more technical know-how lower into the organization will only go so far. Especially with the shelf-life of technical knowledge getting shorter and shorter, it is simply not a scalable approach. Therefore, you also need to train your engineers, especially at the frontline, in proper questioning techniques, the fundamentals of forming a problem statement, basic data gathering, troubleshooting, and critical thinking.

You’re not only teaching people at lower levels how to ask the questions that a more experienced technician might ask, but you’re also enabling them to understand how to clarify, verify, and respond to customers in those situations.

4. Establish escalation triggers

Escalation is still very important when shifting down. Based on the data you’ve gathered and analyzed, you need to establish clear triggers for when a customer issue does need to be escalated. The reason for this is simple: it’s one thing to empower a team, it’s another to overwhelm them. Your newly trained team must learn how to determine when it’s time to let go and ask for help.

Most commonly, triggers are based on a time parameter. If a Tier 1 rep has spent an hour on a problem without success, it may be time to escalate to Tier 2. Triggers can also be based on fix attempts: if you’ve tried three times to fix an issue and they all failed, then it’s probably time to escalate. Escalation can also be based on the customer’s attitude: if they’re getting upset with you, perhaps using strong language, that might be a trigger.
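The triggers described above can be sketched as a simple rule check. The Python example below is illustrative only: the one-hour and three-attempt thresholds come from the examples in the text, while the `Ticket` fields and the `escalation_reasons` function are hypothetical names introduced for this sketch. The customer-attitude flag is assumed to be set by the technician.

```python
from dataclasses import dataclass

MAX_MINUTES = 60       # time-based trigger, from the one-hour example
MAX_FIX_ATTEMPTS = 3   # attempt-based trigger, from the three-tries example


@dataclass
class Ticket:
    minutes_worked: int
    failed_fix_attempts: int
    customer_upset: bool   # assumed to be flagged by the Tier 1 technician


def escalation_reasons(ticket):
    """Return the list of triggers this ticket has hit (empty if none)."""
    reasons = []
    if ticket.minutes_worked >= MAX_MINUTES:
        reasons.append("time limit reached")
    if ticket.failed_fix_attempts >= MAX_FIX_ATTEMPTS:
        reasons.append("fix attempts exhausted")
    if ticket.customer_upset:
        reasons.append("customer frustration")
    return reasons


ticket = Ticket(minutes_worked=65, failed_fix_attempts=1, customer_upset=False)
reasons = escalation_reasons(ticket)
if reasons:
    print("Escalate to Tier 2:", ", ".join(reasons))
```

Returning the reasons, rather than a bare yes or no, gives Tier 2 some context on why the ticket arrived.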

5. Test your shift-down program

The next best practice is to set up A/B testing in your customer support center. Divide your team into two groups: one group uses the new shift-down techniques, typically on a very specific issue such as password resets, while the other handles business as usual. Which group returns the better results?

Why do this? It’s good management science practice. For every business initiative, regardless of type, you should follow Six Sigma or a similar quality plan, treating projects the way pharmaceutical companies treat the drugs they produce. Before a new practice is implemented companywide, it should be tested and proven to deliver the results you intended.

In this particular case, what you are doing will directly impact your customers. To truly put customers first, you must make sure that anything you change really does improve things for them.
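To make the comparison between the two groups concrete, the sketch below compares first-contact resolution rates using a two-proportion z-test. This particular test is our illustrative choice and is not prescribed by the approach described here. The call counts are hypothetical, and the function name `two_proportion_z` is introduced only for this example.

```python
import math


def two_proportion_z(resolved_a, calls_a, resolved_b, calls_b):
    """Two-sided z-test for a difference between two resolution rates.

    Returns (z, p_value). Uses the pooled-proportion standard error.
    """
    rate_a = resolved_a / calls_a
    rate_b = resolved_b / calls_b
    pooled = (resolved_a + resolved_b) / (calls_a + calls_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / calls_a + 1 / calls_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical pilot: 200 calls per group on a single issue type.
z, p = two_proportion_z(resolved_a=150, calls_a=200,   # shift-down group
                        resolved_b=120, calls_b=200)   # business as usual
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference is unlikely to be chance alone. In practice you would also compare handling time and customer satisfaction before drawing conclusions.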

6. Quantify results

You now need to measure how you did. Did the target team get more calls solved in less time? What was the cost of service by asset, by client/customer, by request type? At this point, you want to quantify how much money you saved versus how much you spent, so you can calculate the return on investment (ROI).

Armed with that knowledge, even a low-level manager who ran an experiment on their small support team can go to their boss and say, “Look what we did in our team in Canada. Imagine if we replicated this across North America.” Suddenly, the whole company is performing customer service support in a more professional manner, and that low-level manager is set up for career success.

In the example we’ve used, in which 500 of every 1,000 support calls are for password resets, this ROI analysis has the extra advantage of telling you how much you can invest to solve the problem. If you can save $2 million by investing $100,000 in a password reset tool that prevents 500 calls a day from ever reaching you, that’s money well spent.
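The arithmetic behind that judgment can be written out as a short calculation. The Python sketch below uses the common definition of ROI as net gain divided by investment, which is an assumption on our part since no formula is specified here, and plugs in the $2 million and $100,000 figures from the example.

```python
def roi(savings, investment):
    """Return on investment as a ratio: net gain divided by investment."""
    return (savings - investment) / investment


savings = 2_000_000     # savings from the example in the text
investment = 100_000    # cost of the password reset tool in the example

print(f"Net gain: ${savings - investment:,}")
print(f"ROI: {roi(savings, investment):.0%}")
```

On these figures, every dollar invested returns nineteen dollars of net savings, which is the kind of number that makes the case for scaling the program.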

7. Scale your program

When you are satisfied with your results from your first attempt at shifting down, then look and see what the data is telling you about other shifts down you can perform and where else to apply the process. Start over with Best Practices 1 or 2 to improve another area of your customer services support business.

The best services support metric: customer satisfaction

Ultimately, IT support is perhaps the most important enterprise function in the age of digital and cloud-based applications. As organizations grow increasingly dependent upon technology, having it work as needed, when needed, supersedes almost everything else. For firms of all types and industries (manufacturing, automotive, retail, financial), IT service management can play a pivotal role. Every service interruption directly impacts company productivity and ultimately the success of the company, whether that success is defined as profitability, efficiency, or faster service. The shift-left techniques developed over the last 20 years in the software development arena can have a profound impact when applied to IT service organizations.

Kepner-Tregoe has partnered with some of the largest and most successful businesses in the world, such as Microsoft, to improve speed of support, cost of call, and customer experience levels.

About Shane Chagpar, Head of Kepner-Tregoe Digital

Shane Chagpar uses Kepner-Tregoe and industry best-practice tools to design solutions to meet client needs. Typically overseeing multiple projects, Shane is accountable for both team performance and results. Within KT, he provides global thought leadership on IT service strategy, Digital Transformation, and both problem and incident management. He also has expertise in quality improvement, software development, and product management.

Shane Chagpar resides in Canada and implements projects globally for Kepner-Tregoe.