15 Oct What About the OEM’s SLA?
So, now the question becomes how to factor in SLAs. After all, aren’t they really the essence of the maintenance agreement? Your SLAs should spell out the service standards, equipment functions, and time to repair that will be maintained. Why else would we even bother?
And frankly, what I see is that most users don’t put enough time into understanding the SLAs, questioning them, or even knowing what it is they are really paying for or what the OEM puts on paper.
Ideally, you will start internally and discuss what SLAs you really need. SLAs relate to what you can accept for downtime, outages, equipment breakdowns, etc., so it’s important to look at what your history has been to answer these questions:
- How many times have you needed to bring in inside or outside maintenance or replace equipment?
- How long were you actually down?
- Which systems or equipment are mission-critical to the extent that almost no downtime is allowable?
- What workarounds or backup systems are available?
- What is the effect of failures on operations or customer service?
- Is 24/7/365 coverage with a 4-hour response time necessary?
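One way to ground these questions is to run the numbers on your own outage history. The sketch below is illustrative only: the `Outage` record, the `summarize` helper, and the sample figures are all hypothetical, and it assumes you can pull hours of downtime per incident from your ticketing system or maintenance logs.

```python
from collections import defaultdict
from dataclasses import dataclass

HOURS_PER_YEAR = 365 * 24  # 8,760 hours in the review period (assumed: one year)


@dataclass
class Outage:
    system: str        # hypothetical system name from your own records
    hours_down: float  # how long you were actually down


def summarize(outages, period_hours=HOURS_PER_YEAR):
    """Return incident count, total and longest downtime, and availability per system."""
    by_system = defaultdict(list)
    for outage in outages:
        by_system[outage.system].append(outage.hours_down)

    summary = {}
    for system, hours in by_system.items():
        total = sum(hours)
        summary[system] = {
            "incidents": len(hours),
            "total_hours_down": total,
            "longest_outage_hours": max(hours),
            "availability_pct": (1 - total / period_hours) * 100,
        }
    return summary


# Made-up sample history, for illustration only.
history = [
    Outage("storage-array", 6.0),
    Outage("storage-array", 2.0),
    Outage("backup-server", 30.0),
]

for system, stats in summarize(history).items():
    print(system, stats)
```

If the longest outage on a system was something the business absorbed without much pain, that is a hint that a premium 4-hour response on that system may not be worth paying for. If it was not, that system belongs on the mission-critical list.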
Keep in mind we’ve suggested a hybrid approach to your IT Maintenance, and that means that you may need to cover some equipment with OEMs and some with TPMs. Even with that, no one organization truly covers it all. We talk about that elsewhere as we discuss the Ecosystem of IT Maintenance in chapter 8. Multiple vendors, internal techs, equipment sources, etc. are all part of a comprehensive strategy for SLAs.
As you really look into the details of what the OEMs have as their stated service levels, you may find that it’s not really what you think it is.
Let’s Dive In
At a recent session, “How to Leverage Third-Party Hardware Maintenance Providers for Cost Optimization,” at the Gartner IT Orlando Management Summit, Christine Tenneson was asked if TPMs can meet the SLAs of the OEMs. Christine’s response: “Yes, the OEM’s SLA is best effort. They actually offer an SLO (Service Level Objective), not like an SLA that an MSP (Managed Service Provider) might offer with penalties.”
What is she really saying here? The OEMs do not commit to a guaranteed SLA, just an “objective” that is “best effort.” They will also use terms like “replacement parts target response objective,” “limited to commercially reasonable efforts,” “non-mission critical parts may be shipped overnight,” “based on severity level,” etc. Or they will factor in how close you are to a restocking facility or say overnight shipping is required, etc. So, while they are really hedging what they actually provide, you take your SLA literally and look at it as guaranteed. Then, when there is a problem, your expectations are not met.
In some cases, you need to downsize your expectations through the realistic internal analysis I mentioned above. You also need to know what you’re really buying. OEMs are happy to sell to an uninformed buyer.
Also know that many TPMs can help you analyze your SLAs realistically, while at the same time exceeding what the OEMs will do.
I’ll Pose a Question Nobody Wants to Ask
Is it reasonable for any OEM or TPM to have every part for every device for every location every time? The simple, honest answer is no. Let’s use IBM as an example. According to Gartner’s “Market Share Analysis: Hardware Support, Worldwide, 2017,” IBM is the world’s largest hardware maintenance and support provider ($6.228B). IBM has six major parts depots, with the main hub in Mechanicsburg, PA, and smaller stocking locations in select cities. Through a key partner, we were engaged with a professional sports league where IBM was the incumbent. The customer revealed that, on some occasions involving their four-hour SLA, they did not receive the replacement for a failed part within four hours “due to parts availability.” The point is that even the world’s largest hardware maintenance and support company does not guarantee its SLA.
The thought that the SLA is a type of guarantee versus a “best effort target objective” has led to a “keep up with the Joneses” mentality with some TPMs overstating their true ability to deliver. Some TPMs have made seemingly outlandish promises that far exceed what the OEM offers. Here are just a few we have seen on various TPM websites:
“90% critical on-site parts stocking level-guaranteed.”
“Onsite IT service and support with parts and technicians within 50 miles of your company regardless of your geographic location.”
Do the above claims pass the smell test? Or, better stated, the “just do the math” test? Simply analyze how many locations and parts stocking facilities would be necessary to provide the coverage noted in these claims. Keep in mind that IBM, the largest maintenance provider on the planet, has six major stocking locations and does not meet the criteria above. There are 3,797,000 square miles in the USA, so the math simply does not add up for a “within 50 miles” claim either.
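To make the “just do the math” test concrete, here is a rough back-of-the-envelope sketch. It assumes “within 50 miles” means a 50-mile straight-line radius around each stocking location and that coverage must span the full 3,797,000 square miles “regardless of your geographic location.” The function names are ours, purely for illustration.

```python
import math

US_AREA_SQ_MI = 3_797_000  # figure quoted in the text
RADIUS_MI = 50             # the "within 50 miles" claim, read as a radius


def min_depots_by_area(area_sq_mi, radius_mi):
    """Absolute floor: total area divided by the area of one coverage circle.

    This ignores the fact that circles cannot cover a region without
    overlapping, so the real number can only be higher.
    """
    circle_area = math.pi * radius_mi ** 2
    return math.ceil(area_sq_mi / circle_area)


def depots_hex_layout(area_sq_mi, radius_mi):
    """Estimate for depots on a hexagonal grid.

    Each depot effectively covers the regular hexagon inscribed in its
    circle, which has area (3 * sqrt(3) / 2) * r^2.
    """
    hex_area = 1.5 * math.sqrt(3) * radius_mi ** 2
    return math.ceil(area_sq_mi / hex_area)


print(min_depots_by_area(US_AREA_SQ_MI, RADIUS_MI))  # 484
print(depots_hex_layout(US_AREA_SQ_MI, RADIUS_MI))   # 585
```

Even under these generous simplifications, the claim implies several hundred stocked and staffed locations, against the six major depots cited for IBM. A provider could narrow the gap by covering only the areas where its customers actually are, but that is a different promise from “regardless of your geographic location.”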
By the way, we’ve heard of sales reps in the field claiming they have never missed an SLA. Really? Is that even believable given all the different types of service, the range of equipment, and the need for quality field engineers to troubleshoot and fix? Never is a pretty high standard.
What does all this mean? We see much confusion, which leads to unmet expectations. The remedy is an open and honest conversation regarding service delivery and the expectations for worst-case scenarios. It does not mean you just throw a “part response-within-4-hours” out the window, but everyone would be better served by being a little more transparent. Have an open and honest dialogue asking:
- Is the SLA a target or a guarantee? and
- If the SLA is a target, then what is the appropriate expectation and worst-case resolution?
Let me give you an example from a real conference call with a potential NetApp service customer. They posed a series of questions:
Are common parts generally stocked in your local depots?
Our Response: Yes, typically common parts like drives and power supplies.
What happens when we have a controller fail on a Friday night? (We assume they were wondering if they have to wait until Monday or worse.)
Our Response: We have a 24/7/365 warehouse that stocks 99% of all NetApp parts we service, and we would ship via FedEx the same day (meaning the controller will typically arrive within an 8-14 hour window). The potential customer’s response: “That’s perfectly acceptable.”
The result was that we won the opportunity, and a reasonable expectation of the worst-case scenario was set. Would the response meet a literal definition of four hours? No. But it set a reasonable expectation of what may occur in a worst-case scenario and allowed the potential customer to evaluate the risk appropriately. If we had discovered in pre-sales activity that the customer absolutely had to have a controller in four hours, the solution would have been simple: for an additional cost, a controller would be placed at their location or depot as an onsite spare dedicated to that customer only, rather than drawn from a shared spares pool.