Reliability Fundamentals: The Building Blocks for a Healthy Facility

March 30th, 2018 | 6 min read

In any large factory, data center, or manufacturing facility, reliability and optimization are key. Without them, the people these facilities serve and the infrastructure they support may struggle to survive, let alone thrive. At a time when almost anything can be “smart,” automated, or simply infused with technology, the question is not whether to adopt technology, but which technologies can best benefit your business and how. Before you get that far, however, you must understand where your reliability program stands and what it is trying to achieve.

Understanding the Maturity of Your Reliability Program

At its core, a reliability strategy is a series of best practices intended to optimize the maintenance program of a company or a facility. A robust reliability program enables a company not only to respond quickly to machine issues when they occur, but also to prevent malfunctions in the first place.

Getting to the point of consistently preempting failures takes time. When considering implementing or growing a reliability program, it’s vital that you first take a step back to assess where you are on the reliability spectrum: undeveloped, middle of the road, or mature. Very few programs are considered mature, but very few are entirely undeveloped either. Let’s dive into what each of these categories means.

Undeveloped: The undeveloped reliability program has either no or insufficient data, leading to a lack of machine health understanding. Frequently, there is no dedicated reliability staff, no operator involvement in maintenance, and no predictive maintenance (PdM) technology. In many cases, this stems from a lack of knowledge and/or a lack of organizational buy-in.

Middle of the Road: The middle-of-the-road reliability program has some data and frequently performs maintenance tasks based on prior experience and OEM recommendations. Its installation and commissioning practices are not well standardized, and technology is generally deployed opportunistically. Some PdM is performed, but the results are neither well tracked nor quantified, and the reliability program is generally managed jointly by engineering and maintenance.

Mature: In a mature reliability program, predictive and preventive maintenance actions are directly correlated to known failure modes. Risks are consistently updated to improve efficiency and effectiveness, and new equipment is designed, installed, and commissioned according to best practices in order to prevent failure. Dedicated personnel manage and improve reliability programs, and operators take active ownership in equipment reliability and basic maintenance tasks. A mature reliability program leverages data to make any and all programmatic decisions.
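As a rough illustration, the three categories above can be turned into a quick self-assessment. The Python sketch below is a hypothetical checklist, not an established scoring standard: the field names, and the rule that every trait must hold for a program to count as mature, are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class ProgramTraits:
    """Hypothetical yes/no checklist drawn from the category descriptions."""
    has_machine_health_data: bool
    performs_some_pdm: bool
    pdm_results_tracked: bool
    tasks_tied_to_failure_modes: bool
    dedicated_reliability_staff: bool
    operators_own_basic_maintenance: bool


def classify_maturity(t: ProgramTraits) -> str:
    """Return 'undeveloped', 'middle of the road', or 'mature'.

    The thresholds are illustrative assumptions, not an industry standard.
    """
    # No usable data and no PdM at all: the undeveloped profile.
    if not t.has_machine_health_data and not t.performs_some_pdm:
        return "undeveloped"
    # Every trait must hold at once for the mature profile.
    if all([
        t.has_machine_health_data,
        t.performs_some_pdm,
        t.pdm_results_tracked,
        t.tasks_tied_to_failure_modes,
        t.dedicated_reliability_staff,
        t.operators_own_basic_maintenance,
    ]):
        return "mature"
    return "middle of the road"


# Some data and some PdM, but results are not tracked and nobody owns the program.
example = ProgramTraits(True, True, False, False, False, False)
print(classify_maturity(example))  # middle of the road
```

A checklist like this is only a conversation starter; the value is in agreeing, as a team, on which answers are honestly "yes."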

Choosing Your Technology

You’re not alone if you’ve found yourself struggling to identify the right technology to drive your reliability program forward. As with any decision that affects uptime and revenue, facts, not assumptions, should drive the conversation. How? Ensure that you’re doing the right task first, and then focus on pinpointing the technology to optimize it. Identifying the correct task means that decisions about investing in machine diagnostics technology can be made with greater confidence.

Once your tasks and goals are identified, there are three primary methods that you can employ to determine the best technology for your needs: referencing the OEM knowledge source, a Failure Modes and Effects Analysis, and the Reliability Centered Maintenance Assessment. Moving from the simplest to the most complex, let’s dive in:

Referencing the OEM Knowledge Source

Referencing the OEM knowledge source is the simplest of the three methods in that all it takes is checking the equipment manual. Most rotating equipment comes with a booklet that contains a list of preventive tasks, and occasionally some predictive tasks as well. The OEM’s goal in providing this is to ensure that their equipment doesn’t fail and need to be replaced.

The downside of relying on the OEM knowledge source is that some of the guidance may be generic, lacking the perspective specific to your use case or your facility’s reliability program.

While the OEM knowledge source varies in effectiveness, it’s frequently the quickest and easiest way to get started on building out a reliability program.

A Failure Modes and Effects Analysis (FMEA)

A Failure Modes and Effects Analysis is a much more targeted approach to reliability. This method focuses on specific pieces of rotating equipment, and generally does not assess the system as a whole. As with any strong reliability program, data is the crux of the FMEA.

Running an FMEA is helpful in that once you’ve identified the tasks surrounding one piece of equipment, those same tasks are frequently applicable to other machinery that resembles it. This method of determining reliability practices is fast and logical, though not nearly as comprehensive in determining proactive tasks as a full reliability audit. An FMEA identifies the risks that apply to individual assets, but the actions that would reduce or eliminate those risks in the first place are left to inference.
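One common way to rank the risks an FMEA surfaces is the Risk Priority Number (RPN): the product of severity, occurrence, and detection ratings, each typically scored from 1 to 10. This article does not prescribe a scoring scheme, so the Python sketch below is an assumption added for illustration, and the assets, failure modes, and ratings in it are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    asset: str
    mode: str
    severity: int    # 1 (negligible) .. 10 (catastrophic)
    occurrence: int  # 1 (rare) .. 10 (frequent)
    detection: int   # 1 (almost always caught) .. 10 (effectively undetectable)

    def rpn(self) -> int:
        """Risk Priority Number: severity x occurrence x detection."""
        for rating in (self.severity, self.occurrence, self.detection):
            if not 1 <= rating <= 10:
                raise ValueError("ratings must be between 1 and 10")
        return self.severity * self.occurrence * self.detection


def rank_by_risk(modes):
    """Return failure modes ordered from highest to lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn(), reverse=True)


# Invented example data for one pump and one fan.
modes = [
    FailureMode("exhaust fan", "imbalance", 4, 4, 2),      # RPN 32
    FailureMode("cooling pump", "bearing wear", 7, 5, 4),  # RPN 140
    FailureMode("cooling pump", "seal leak", 5, 6, 3),     # RPN 90
]

for m in rank_by_risk(modes):
    print(f"{m.asset:13} {m.mode:13} RPN={m.rpn()}")
```

Because similar machines tend to share failure modes, a ranked list built for one pump is often a reasonable starting point for its siblings, with the ratings adjusted for each asset's duty and criticality.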

The Reliability Centered Maintenance Assessment

The Reliability Centered Maintenance Assessment (RCM), while closely related to FMEA, is the most thorough and comprehensive way to determine the parameters of your reliability program. Employing a hierarchical method, an RCM produces a structured list of proactive tasks: actions that can be taken to predict and prevent a failure. An RCM provides a clear set of criteria to determine which proactive tasks are feasible and how frequently they should be performed.
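To make the hierarchical idea concrete, here is a minimal Python sketch of how an RCM's output might be stored and flattened into a task list. The class names, the `feasible` flag, the `interval_days` field, and the sample system are all hypothetical; a real RCM derives feasibility and intervals from its own criteria, which this sketch simply takes as given.

```python
from dataclasses import dataclass, field


@dataclass
class ProactiveTask:
    description: str
    feasible: bool      # outcome of the RCM feasibility criteria (assumed given)
    interval_days: int  # how often the task should be performed


@dataclass
class FailureModeNode:
    name: str
    tasks: list = field(default_factory=list)


@dataclass
class Asset:
    name: str
    failure_modes: list = field(default_factory=list)


@dataclass
class System:
    name: str
    assets: list = field(default_factory=list)


def task_list(system):
    """Walk system -> asset -> failure mode and collect the feasible tasks,
    most frequent first."""
    rows = []
    for asset in system.assets:
        for fm in asset.failure_modes:
            for task in fm.tasks:
                if task.feasible:
                    rows.append((task.interval_days, asset.name, fm.name, task.description))
    return sorted(rows)


# Invented example: one system, two assets.
loop = System("chilled water loop", [
    Asset("circulation pump", [
        FailureModeNode("bearing wear", [
            ProactiveTask("vibration reading", True, 30),
            ProactiveTask("full teardown inspection", False, 365),
        ]),
    ]),
    Asset("drive motor", [
        FailureModeNode("winding insulation breakdown", [
            ProactiveTask("thermal scan", True, 90),
        ]),
    ]),
])

for interval, asset, mode, task in task_list(loop):
    print(f"every {interval:3d} days: {task} ({asset}, {mode})")
```

Keeping each task attached to the failure mode it addresses preserves the traceability that distinguishes an RCM from a flat maintenance checklist.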

The main drawback of performing an RCM is that it is extremely time-intensive and requires a significant amount of accurate data. Unfortunately, this isn’t something that every facility has access to – tracking data on entire systems down to individual assets requires consistent, dedicated effort that many facilities don’t have the bandwidth for.

While not a drawback, significant cross-departmental cooperation is required to successfully perform an RCM. Finding the time to take systems engineers, operations engineers, and reliability engineers all off the floor and into a room to perform an RCM can prove to be a task in and of itself.

So What’s Next?

While it’s true that the most powerful thing you can do for your equipment is to implement a well-thought-out reliability program with complete organizational buy-in, Rome wasn’t built in a day. Start from step one: determine where your organization falls on the reliability maturity spectrum. From there, consider implementing or improving your reliability program. Is your program undeveloped? The OEM knowledge source is the quickest way to get started. Do you already have data at your disposal? Consider an RCM to take it to the next level.

At the end of the day, the point of a reliability program is to better care for your assets – no matter your next steps, there’s always room for improvement.

Not sure where to begin? Get in touch! One of Augury’s reliability experts is always available and happy to speak with you.
