Predictive Maintenance ROI in KSA: A Realistic Model
Most predictive-maintenance business cases collapse under a CFO's questions because they lean on vendor headlines instead of your own numbers. Here is how to build an ROI case that survives scrutiny.
Key takeaways
- Build the ROI case from your own measured baseline, not vendor headline percentages.
- Fully loaded downtime cost per hour is usually dominated by lost throughput valued at margin.
- Use the worksheet: downtime hours × SAR cost/hour × conservative reduction %, shown as low/base/stretch.
- MTBF and MTTR are the trackable levers; predictive work raises MTBF, work-order readiness cuts MTTR.
- Clean data — meters, failure codes, downtime tracking — is a prerequisite, and 'up to 30%' is a ceiling, not a promise.
Why most predictive-maintenance business cases fail
A reliability engineer walks into the CFO’s office with a slide that says "predictive maintenance cuts downtime by 30% and saves millions." The CFO asks three questions — Where did 30% come from? What is our downtime actually costing us today? What happens if we only get half of it? — and the case falls apart. The number was borrowed from a vendor brochure, not built from the plant’s own data.
A defensible model does the opposite. It starts from your measured baseline, applies conservative industry benchmarks as a range rather than a promise, and shows the math at every step. For KSA asset-intensive operators — utilities, desalination, petrochemicals, data centers, telecom, and government facilities — the stakes are high enough that finance will scrutinize every assumption. This article gives you a worksheet you can defend line by line.
Start with the real cost of unplanned downtime
The foundation of any maintenance ROI case is the fully loaded cost of one hour of unplanned downtime on a critical asset. This is almost always higher than operators first estimate, because the obvious cost — idle labor — is the smallest piece.
Build the hourly figure from the bottom up so each component is auditable. Most teams find that lost throughput dwarfs everything else, which is exactly why a credible model itemizes it instead of using a round number.
- Lost production or service output per hour (in SAR), valued at margin, not revenue
- Idle operator and maintenance labor during the stoppage
- Expedited spare parts, emergency logistics, and overtime premiums
- Secondary damage from running an asset to failure rather than intervening early
- Contractual or regulatory penalties — SLA breaches, missed supply commitments
- Knock-on effects: quality scrap, restart/ramp-up losses, safety exposure
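As a minimal sketch of the bottom-up build described above, the components can be held as named line items so finance can audit each one. Every figure below is a hypothetical placeholder (chosen only so the total matches the SAR 50,000 illustration used later in this article), and the field names are assumptions rather than a standard chart of accounts. Safety exposure is left out because it is rarely monetized in the hourly figure.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class HourlyDowntimeCost:
    """Itemized cost of one hour of unplanned downtime, all values in SAR."""
    lost_margin: float            # lost output valued at margin, not revenue
    idle_labor: float             # operators and maintenance crew standing by
    expedite_and_overtime: float  # rush parts, emergency logistics, overtime premiums
    secondary_damage: float       # run-to-failure damage, averaged per downtime hour
    penalties: float              # SLA or regulatory penalties, averaged per downtime hour
    knock_on: float               # quality scrap and restart/ramp-up losses

    def total(self) -> float:
        """Fully loaded cost per hour: the sum of every line item."""
        return sum(getattr(self, f.name) for f in fields(self))

    def shares(self) -> dict[str, float]:
        """Each line item as a fraction of the total."""
        total = self.total()
        return {f.name: getattr(self, f.name) / total for f in fields(self)}


# Hypothetical placeholder figures; replace with audited plant data.
example = HourlyDowntimeCost(
    lost_margin=38_000,
    idle_labor=1_500,
    expedite_and_overtime=3_000,
    secondary_damage=4_000,
    penalties=2_000,
    knock_on=1_500,
)

print(f"Fully loaded cost: SAR {example.total():,.0f} per hour")
for name, share in example.shares().items():
    print(f"  {name}: {share:.0%}")
```

Printing the shares alongside the total makes it obvious which component drives the figure, which is usually the first thing a CFO will probe.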
How PM compliance and condition monitoring reduce that cost
Two distinct levers reduce unplanned downtime, and a good case separates them. The first is basic preventive-maintenance (PM) compliance: are scheduled tasks actually completed on time? Many plants discover their real PM compliance sits well below 80%, which means failures that routine servicing would have caught are still happening. Lifting compliance is often the cheapest, fastest win — it needs discipline and scheduling, not new sensors.
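One way to measure that first lever is sketched below, assuming PM compliance is defined as the share of scheduled tasks completed on or before their due date. The `PMTask` record and the `grace_days` tolerance are hypothetical; use whatever on-time rule your site already reports against.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class PMTask:
    due: date
    completed: Optional[date]  # None means the task was never completed


def pm_compliance(tasks: list[PMTask], grace_days: int = 0) -> float:
    """Share of scheduled PM tasks completed by the due date plus a grace window."""
    if not tasks:
        raise ValueError("no scheduled PM tasks in the period")
    on_time = sum(
        1
        for t in tasks
        if t.completed is not None and (t.completed - t.due).days <= grace_days
    )
    return on_time / len(tasks)


# Hypothetical month of scheduled tasks on one asset group.
tasks = [
    PMTask(due=date(2024, 3, 1), completed=date(2024, 3, 1)),   # on time
    PMTask(due=date(2024, 3, 8), completed=date(2024, 3, 7)),   # early
    PMTask(due=date(2024, 3, 15), completed=date(2024, 3, 22)), # a week late
    PMTask(due=date(2024, 3, 22), completed=None),              # skipped
    PMTask(due=date(2024, 3, 29), completed=date(2024, 3, 29)), # on time
]

print(f"PM compliance: {pm_compliance(tasks):.0%}")
```

Skipped tasks count against compliance here; dropping them from the denominator would flatter the number and hide exactly the failures this lever is meant to catch.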
The second lever is condition-based and predictive maintenance: using vibration, temperature, oil analysis, or sensor data to intervene only when an asset shows early signs of degradation. Industry studies — including work summarized by the Society for Maintenance and Reliability Professionals (SMRP) and broader reliability literature — suggest condition-based programs can meaningfully reduce unplanned downtime and maintenance cost versus a purely reactive baseline. Treat those figures as directional benchmarks from the wider industry, not a guarantee for your specific assets.
A simple ROI worksheet you can defend
The core calculation is deliberately simple so finance can follow it. Annual downtime exposure equals unplanned downtime hours multiplied by fully loaded cost per hour. Your projected saving is that exposure multiplied by a conservative reduction percentage. Subtract program cost from the saving to get net benefit, and divide net benefit by program cost for ROI.
Work an illustrative example with placeholder numbers you replace with your own. Suppose a critical line suffers 200 hours of unplanned downtime per year at a fully loaded SAR 50,000 per hour: that is SAR 10 million of annual exposure. Apply a deliberately modest 15% reduction (not the headline 30%) and the avoided cost is SAR 1.5 million per year. If the program (sensors, EAM software, integration, and training) costs SAR 600,000 in year one, net benefit is SAR 900,000 and first-year ROI is 150%. Run a low case at 10% (SAR 1.0 million saved, SAR 400,000 net, roughly 67% ROI) and a stretch case at 20% (SAR 2.0 million saved, SAR 1.4 million net, roughly 233% ROI) so you present a range, never a single hero number.
- Annual exposure = downtime hours/year × fully loaded SAR cost/hour
- Projected saving = annual exposure × conservative reduction %
- Net benefit = projected saving − annual program cost
- ROI = net benefit ÷ annual program cost; payback in months = program cost ÷ (projected saving ÷ 12)
- Always present low / base / stretch scenarios, not one figure
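The worksheet above can be sketched in a few lines so that every scenario runs through the same arithmetic. The inputs are the article's illustrative placeholders (200 hours, SAR 50,000 per hour, SAR 600,000 program cost); the function name and the dictionary keys are assumptions made for this example.

```python
def roi_case(downtime_hours: float, cost_per_hour: float,
             program_cost: float, reduction: float) -> dict[str, float]:
    """Apply the worksheet formulas to one scenario (all money in SAR)."""
    if program_cost <= 0 or reduction <= 0:
        raise ValueError("program_cost and reduction must be positive")
    exposure = downtime_hours * cost_per_hour  # annual exposure
    saving = exposure * reduction              # projected saving
    net_benefit = saving - program_cost
    return {
        "exposure": exposure,
        "saving": saving,
        "net_benefit": net_benefit,
        "roi": net_benefit / program_cost,
        "payback_months": program_cost / (saving / 12),
    }


# Reduction percentages as fractions; placeholders to replace with your own range.
SCENARIOS = {"low": 0.10, "base": 0.15, "stretch": 0.20}

results = {
    name: roi_case(downtime_hours=200, cost_per_hour=50_000,
                   program_cost=600_000, reduction=reduction)
    for name, reduction in SCENARIOS.items()
}

for name, r in results.items():
    print(f"{name:>7}: saving SAR {r['saving']:,.0f}, "
          f"net SAR {r['net_benefit']:,.0f}, "
          f"ROI {r['roi']:.0%}, payback {r['payback_months']:.1f} months")
```

With these placeholders the base case reproduces the SAR 900,000 net benefit and 150% first-year ROI, with payback in about 4.8 months; the low case pays back in about 7.2 months and the stretch case in about 3.6.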
MTBF and MTTR are the operating levers
Reduction percentages are outcomes; the levers that move them are MTBF and MTTR. Mean Time Between Failures (MTBF) is the average operating time between failures, so raising it means fewer events. Mean Time To Repair (MTTR) is the average time needed to restore an asset after a failure, so lowering it shrinks every stoppage. Total downtime is roughly the number of failures multiplied by the average time each one takes, so your two paths to less downtime are clear and measurable.
Predictive techniques mainly extend MTBF by catching degradation before it becomes failure. Better work-order readiness — kitted parts, clear procedures, the right skills on shift — mainly cuts MTTR. The advantage of framing the case around MTBF and MTTR is that both are trackable month over month, so you can prove progress with hard numbers instead of asking finance to take the savings on faith.
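A minimal sketch of how the two levers combine, assuming MTBF is computed as operating hours divided by failure count and MTTR as total repair hours divided by failure count. The figures are hypothetical, chosen so the baseline matches the 200-hour illustration, and the 10% improvements are a what-if, not a forecast.

```python
def mtbf(operating_hours: float, failures: int) -> float:
    """Mean time between failures: operating hours per failure event."""
    if failures <= 0:
        raise ValueError("need at least one failure to compute MTBF")
    return operating_hours / failures


def mttr(repair_hours: float, failures: int) -> float:
    """Mean time to repair: downtime hours per failure event."""
    if failures <= 0:
        raise ValueError("need at least one failure to compute MTTR")
    return repair_hours / failures


def projected_downtime(operating_hours: float, mtbf_h: float, mttr_h: float) -> float:
    """Expected downtime = expected number of failures x average repair time."""
    return operating_hours / mtbf_h * mttr_h


# Hypothetical one-year work-order summary for a critical line.
operating_hours, failures, repair_hours = 8_000, 25, 200

baseline_mtbf = mtbf(operating_hours, failures)
baseline_mttr = mttr(repair_hours, failures)
baseline = projected_downtime(operating_hours, baseline_mtbf, baseline_mttr)

# What-if: predictive work lifts MTBF by 10%, better readiness cuts MTTR by 10%.
improved = projected_downtime(operating_hours, baseline_mtbf * 1.10, baseline_mttr * 0.90)
reduction = 1 - improved / baseline

print(f"MTBF {baseline_mtbf:.0f} h, MTTR {baseline_mttr:.1f} h")
print(f"Downtime {baseline:.0f} h -> {improved:.1f} h ({reduction:.1%} lower)")
```

In this what-if the two modest improvements compound to roughly an 18% drop in downtime, which shows how a reduction percentage in the worksheet can be traced back to two numbers that are tracked every month.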
Data prerequisites: no clean data, no credible model
Predictive maintenance is only as good as the data feeding it, and this is where many KSA programs stumble. Before promising any reduction, confirm you can actually measure the baseline and the failure patterns. If you cannot tell finance what your current downtime is, you cannot honestly claim to reduce it.
Be candid in the business case about which prerequisites are already in place and which need investment — that honesty is what makes the rest of the model believable.
- Meter and sensor readings (runtime hours, vibration, temperature) on critical assets
- Consistent failure codes so you can analyze why assets fail, not just that they did
- Disciplined downtime tracking with start/stop times and cause classification
- An asset criticality ranking so effort targets the assets that actually hurt
- Accurate work-order history to calculate MTBF and MTTR reliably
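Before building the baseline, it can help to measure how much of the existing downtime log is actually usable. The sketch below assumes a hypothetical `DowntimeEvent` record; the field names and the gap rules are illustrative, not the schema of any particular EAM system.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class DowntimeEvent:
    asset_id: str
    start: datetime
    end: Optional[datetime]
    failure_code: Optional[str]
    cause_class: Optional[str]


def data_gaps(event: DowntimeEvent) -> list[str]:
    """List the reasons an event cannot feed a baseline or an MTBF/MTTR figure."""
    gaps = []
    if event.end is None:
        gaps.append("missing end time")
    elif event.end <= event.start:
        gaps.append("end time not after start time")
    if not event.failure_code:
        gaps.append("missing failure code")
    if not event.cause_class:
        gaps.append("missing cause classification")
    return gaps


def usable_share(events: list[DowntimeEvent]) -> float:
    """Fraction of logged events with no data gaps."""
    if not events:
        raise ValueError("no downtime events logged")
    return sum(1 for e in events if not data_gaps(e)) / len(events)


# Hypothetical log entries.
events = [
    DowntimeEvent("PUMP-01", datetime(2024, 3, 1, 8, 0), datetime(2024, 3, 1, 14, 0),
                  "BRG-WEAR", "mechanical"),
    DowntimeEvent("PUMP-01", datetime(2024, 4, 2, 9, 0), None, "SEAL-LEAK", "mechanical"),
    DowntimeEvent("COMP-07", datetime(2024, 4, 9, 22, 0), datetime(2024, 4, 10, 3, 0),
                  None, None),
    DowntimeEvent("COMP-07", datetime(2024, 5, 5, 6, 0), datetime(2024, 5, 5, 9, 30),
                  "VALVE-STK", "mechanical"),
]

print(f"Usable downtime records: {usable_share(events):.0%}")
for e in events:
    if data_gaps(e):
        print(f"  {e.asset_id} @ {e.start:%Y-%m-%d}: {', '.join(data_gaps(e))}")
```

Reporting the usable share in the business case is one concrete way to show which prerequisites are in place and which still need investment.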
Why "up to 30%" is a ceiling, not a promise
The widely cited "up to 30% downtime reduction" is best read as a ceiling observed in mature programs under favorable conditions — strong data, high-criticality rotating equipment, and disciplined execution sustained over years. It is not a starting point, and presenting it as one will cost you credibility the moment results land lower.
Several factors cap what you can realistically capture. Some failures are random and not detectable in advance. Some assets lack the failure modes that condition monitoring catches well. Early-stage data is noisy, and organizational change — getting crews to trust and act on alerts — takes time. A serious case models the achievable middle of the range, treats the ceiling as upside, and lets results exceed the plan rather than fall short of a promise.
This is also why the strongest programs sequence the work: fix PM compliance and data hygiene first to bank the easy wins, then layer condition-based monitoring on the most critical assets, and only then extend predictive analytics. Each phase produces evidence that funds the next.
Putting the model into practice
A predictive-maintenance ROI case is most persuasive when it is boring: measured baselines, conservative benchmarks cited as industry figures, transparent math, and a phased plan with checkpoints. Lead with PM compliance and data quality, prove movement in MTBF and MTTR, and let the downtime-cost savings follow.
Tooling matters mostly because it makes the data trustworthy and the levers measurable. A modern EAM platform that tracks meters, failure codes, downtime, and work-order history in one place — with analysis built into the maintenance workflow rather than bolted on afterward — gives you the auditable numbers a CFO will accept. NextEAM is one such platform built for KSA operators, but the discipline behind the model is what wins the budget. Get the baseline right, present a defensible range, and the business case will stand on its own.
Frequently asked questions
- How do I calculate the cost of one hour of unplanned downtime?
- Build it bottom-up: lost output valued at margin (not revenue), idle labor, expedited parts and logistics, secondary asset damage, any SLA or regulatory penalties, and restart losses. Lost throughput is usually the largest component, so itemize it rather than using a round number.
- Is the "30% downtime reduction" figure realistic for my plant?
- Treat 30% as a ceiling seen in mature programs with clean data and disciplined execution, not a starting assumption. Model a conservative base case (often 10–20%) drawn from industry benchmarks, present it as a range, and let actual results exceed the plan rather than fall short of a promise.
- What data do I need before starting predictive maintenance?
- At minimum: meter and sensor readings on critical assets, consistent failure codes, disciplined downtime tracking with cause classification, an asset criticality ranking, and accurate work-order history. Without these you cannot establish a baseline or calculate MTBF and MTTR honestly.
- Why focus on MTBF and MTTR instead of just downtime?
- MTBF (the average operating time between failures) and MTTR (the average time a repair takes) are the measurable levers behind any downtime reduction. Tracking them month over month lets you prove progress with hard numbers, whereas a headline downtime percentage is an outcome you can only confirm after the fact.