A modern computerized maintenance management system should have the functionality to support a reliability-centered maintenance effort. Here is how the two can work together.
Asset optimization at minimum cost is a fundamental principle of modern business management. It translates into getting maximum uptime, and hence maximum value, from any asset or piece of equipment. Most plants have a preventive maintenance (PM) program in place. Usually, equipment condition is assessed each time PM is performed. After a number of PM actions have been performed and recorded on the same equipment, sufficient data can be available to determine whether the equipment needs more or less frequent PM work and how PM frequency should be adjusted.
Reliability-centered maintenance (RCM) is the method that best addresses the requirement for maximum reliability at minimum cost, or more pragmatically, doing the right maintenance at the right time. RCM results in a maintenance program that focuses PM on specific probable failure modes only. It has a strong bias toward condition monitoring and trend analysis of equipment performance.
Nonintrusive condition monitoring, such as vibration monitoring and oil analysis, can reveal deterioration in performance and warn of impending loss of equipment functionality or failure. When sufficient data are available, trending can be used and maintenance can be performed when measurements stray out of a predetermined safe operating range.
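The trending idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Reading` record, the function names, the straight-line trend, and the vibration figures are all hypothetical and are not taken from any particular CMMS or monitoring product.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    day: int       # days since monitoring began (assumed time base)
    value: float   # measured value, e.g. vibration velocity in mm/s (assumed unit)


def out_of_range(readings, low, high):
    """Return the readings that fall outside the predetermined safe range."""
    return [r for r in readings if not (low <= r.value <= high)]


def linear_trend(readings):
    """Least-squares slope of value against day (measurement units per day)."""
    n = len(readings)
    if n < 2:
        raise ValueError("need at least two readings to fit a trend")
    mean_x = sum(r.day for r in readings) / n
    mean_y = sum(r.value for r in readings) / n
    sxx = sum((r.day - mean_x) ** 2 for r in readings)
    if sxx == 0:
        raise ValueError("readings must span more than one day")
    sxy = sum((r.day - mean_x) * (r.value - mean_y) for r in readings)
    return sxy / sxx


def days_until_limit(readings, high):
    """Rough estimate of days until the upper limit is reached.

    Returns 0.0 if the latest reading is already at or above the limit,
    and None if the fitted trend is flat or falling.
    """
    latest = max(readings, key=lambda r: r.day)
    if latest.value >= high:
        return 0.0
    slope = linear_trend(readings)
    if slope <= 0:
        return None
    return (high - latest.value) / slope


# Hypothetical vibration history for one pump bearing
history = [Reading(0, 2.0), Reading(10, 3.0), Reading(20, 4.0)]
print(out_of_range(history, low=0.0, high=6.0))
print(days_until_limit(history, high=6.0))
```

A straight-line fit is the simplest possible trend model. Real deterioration is often nonlinear, so an estimate like this is better treated as a prompt for closer inspection than as a schedule.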
Benefits of RCM
RCM is an engineered approach for determining the right proactive maintenance to achieve design reliability at minimal direct maintenance cost. The process recognizes that some failure modes are preventable, some are predictable, and some are entirely random. RCM targets all these failure modes with proactive maintenance activities that prevent, predict, or watch for signs of incipient failures.
Traditional maintenance approaches rely on the recommendations of equipment suppliers or on historical precedent. Although well intentioned, both of these approaches have serious drawbacks. Suppliers usually write only one maintenance manual for their equipment and provide it to all users. Equipment failure can produce widely different environmental, safety, production, or maintenance consequences, depending on equipment location and application, and yet maintenance recommendations from the equipment manual will be the same for all.
The traditional “we’ve always done it this way” approach is similarly flawed for its lack of focus on failure mechanisms, causes, and effects. It also fails to eliminate historic maintenance activities that may be unsuited to the nature of the failure modes.
Failure of equipment such as a pump has both causes and failure mechanisms. The failure mechanisms are “how” the failure progresses to the point at which equipment function has stopped. These chains of events have an unfortunate and usually predictable conclusion, such as “stopped pumping.” If each component in the equipment is examined closely, it becomes apparent that it has a finite, and often small, number of failure mechanisms unique to the component or to its operating environment. RCM seeks to identify these mechanisms and then addresses either their cause through preventive maintenance or the conclusion of the failure mechanism by predicting when it will happen and taking appropriate steps to prevent the failure of the equipment function.
Performing the appropriate maintenance just in time can produce significant cost savings, in addition to increased equipment uptime. RCM is a powerful tool for optimizing just-in-time maintenance actions. It targets preventable failure causes with actions intended to prevent them, and predictable failure mechanisms with actions that either exploit that predictability or detect the tell-tale signs of a mechanism in its early stages, so that steps can be taken before functional failure occurs. RCM does not result in overmaintaining the equipment with actions that do not address specific failure modes.
RCM programs produce the greatest return when applied during the design stages for equipment additions, plant modifications, and new plants. Once a PM program has been established, there is often a great deal of reluctance to change it. In all fairness, RCM will result in a number of maintenance tasks that are identical to those traditionally performed. Nevertheless, many plants are overmaintained and many are undermaintained. If breakdowns account for more than 20 to 25 percent of the total maintenance workload, a plant can benefit from RCM. A plant that experiences very few breakdowns but suffers from what may be excessive downtime for maintenance or excessive maintenance costs is likely being overmaintained.
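The 20 to 25 percent rule of thumb is simple arithmetic on maintenance labor hours. The sketch below is a hypothetical illustration: the function names and the hour figures are assumptions, and only the 20 and 25 percent thresholds come from the text above.

```python
def breakdown_share(breakdown_hours, total_maintenance_hours):
    """Fraction of the total maintenance workload spent on breakdowns."""
    if total_maintenance_hours <= 0:
        raise ValueError("total maintenance hours must be positive")
    return breakdown_hours / total_maintenance_hours


def screening_note(share, lower=0.20, upper=0.25):
    """Compare the breakdown share with the 20 to 25 percent band."""
    if share > upper:
        return "above the band: the plant can benefit from RCM"
    if share > lower:
        return "within the 20 to 25 percent band: borderline"
    return "below the band: check instead for signs of overmaintenance"


# Hypothetical month: 1,200 breakdown hours out of 4,000 maintenance hours
share = breakdown_share(1200, 4000)
print(f"{share:.0%}", screening_note(share))
```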
RCM can result in significant reductions in direct maintenance costs. In one recent application of RCM, the number and frequency of maintenance tasks were reduced by almost 50 percent while availability increased 10 percent. In the aircraft industry, where RCM had its genesis, heavy inspection and overhaul workloads have been reduced by orders of magnitude while aircraft availability and safety performance have increased. In both cases, before RCM was applied, overmaintaining was the result of doing things the traditional way.
Achieving optimized maintenance
RCM combines a thorough evaluation of critical equipment to identify failure modes and their effects with a logical process of determining the right maintenance actions focused on the least intrusive methods first. RCM first focuses on determining what is most critical to an operation, assuring that the most benefits are realized as soon as possible. After equipment or systems have been identified and ranked by criticality, the rigorous failure modes and effects analysis process is applied to identify the failures, failure modes, and effects of those failures. Then the analyst has sufficient knowledge of each failure to apply RCM task identification logic and derive applicable and effective maintenance tasks and frequencies.
A rigorous RCM program can take considerable time and effort to put in place. Methods have been developed that guide the user through the steps to determine which equipment needs to be monitored or “registered” in the RCM program. Required data include the operating function and performance specifications. After the types of failures and failure modes that can occur have been determined and the effect of these failures has been evaluated, maintenance procedures to reduce the incidence of failure can be recommended.
The discussion that follows outlines the RCM approach and reviews the type of information support a properly configured computerized maintenance management system (CMMS) can bring to the process. It is based on the authors’ experience with the RCM process and CMMS software supplied by their respective companies.
1. Select equipment and locations to be reviewed
The CMMS database should be three dimensional, with information on the physical asset (or equipment or component), the equipment type, and the location of the asset.
The physical assets database should contain detailed equipment specification data that includes not only nameplate data but also any other physical attributes that the user requires. It could have a critical parts list as well as a list of standard PM tasks and other applicable jobs. A history of costs incurred, symptoms, cause codes, failure modes, and corrective action codes can be accumulated by the CMMS together with a description of the actual work performed, and parts and materials required to complete that maintenance. These data are then available to the planner, the craftsperson, and the manager.
The location database describes the plant’s facilities and physical process locations. Specifications, including operating conditions, can then be maintained for any location. All history data are stored by location as well as by equipment. Costs can then be rolled up the location hierarchy as required. In addition, downtime can be tracked for each location.
The equipment type database contains key data such as parts lists, specifications, and standard maintenance procedures to be stored only once for many pieces of equipment of that type.
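One way to picture the three databases described above, and the cost roll-up along the location hierarchy, is the minimal sketch below. The class names, fields, and tags are hypothetical and far simpler than any real CMMS schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EquipmentType:
    name: str
    # Stored once and shared by every asset of this type
    standard_pm_tasks: list = field(default_factory=list)


@dataclass
class Location:
    code: str
    parent: Optional["Location"] = None   # None marks the top of the hierarchy


@dataclass
class Asset:
    tag: str
    equipment_type: EquipmentType
    location: Location


@dataclass
class HistoryRecord:
    asset: Asset
    location: Location   # where the asset was installed when the work was done
    cost: float


def is_within(location, ancestor):
    """True if `location` is `ancestor` or sits anywhere beneath it."""
    node = location
    while node is not None:
        if node is ancestor:
            return True
        node = node.parent
    return False


def rolled_up_cost(history, ancestor):
    """Total cost of all history recorded at or below `ancestor`."""
    return sum(h.cost for h in history if is_within(h.location, ancestor))


plant = Location("PLANT")
unit_1 = Location("UNIT-1", parent=plant)
unit_2 = Location("UNIT-2", parent=plant)
pump_type = EquipmentType("centrifugal pump", ["check seal", "lubricate bearings"])
pump = Asset("P-101", pump_type, unit_1)

history = [HistoryRecord(pump, unit_1, 100.0), HistoryRecord(pump, unit_2, 50.0)]
print(rolled_up_cost(history, unit_1), rolled_up_cost(history, plant))
```

Recording the location on each history record, separately from the asset, is what lets history follow both the physical pump and the process position it occupied at the time.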
It is not necessary and it is often too expensive to set up an RCM program for all equipment and locations in the plant. An analysis of criticality of the equipment is necessary to determine which equipment need not be analyzed or can be deferred to later in the program.
The equipment and locations to be first registered in the program are those that are critical to safety, the environment, and the operation of the process or plant, and that contribute significantly to lost production. Also included are equipment with high maintenance cost, high repair frequency, or low mean time between failures, and equipment that has the most downtime. A modern CMMS should identify equipment or locations that meet any of these criteria and produce reports showing the top 20 items for review.
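A report of this kind amounts to ranking equipment on each criterion and merging the short lists. The sketch below is a hypothetical illustration: the `EquipmentStats` fields and the figures are assumptions, and mean time between failures is taken as operating hours divided by the number of recorded failures.

```python
from dataclasses import dataclass


@dataclass
class EquipmentStats:
    tag: str
    maintenance_cost: float
    failures: int
    operating_hours: float
    downtime_hours: float

    @property
    def mtbf(self):
        """Mean time between failures in hours (infinite if none recorded)."""
        if self.failures == 0:
            return float("inf")
        return self.operating_hours / self.failures


def top_n(stats, key, n=20, highest_first=True):
    """The n items ranked by `key`, highest first unless told otherwise."""
    return sorted(stats, key=key, reverse=highest_first)[:n]


def review_candidates(stats, n=20):
    """Tags that appear in any of the top-n lists for the selection criteria."""
    short_lists = [
        top_n(stats, lambda s: s.maintenance_cost, n),
        top_n(stats, lambda s: s.failures, n),
        top_n(stats, lambda s: s.downtime_hours, n),
        top_n(stats, lambda s: s.mtbf, n, highest_first=False),  # lowest MTBF
    ]
    return {s.tag for short_list in short_lists for s in short_list}


# Hypothetical annual figures for three pumps
stats = [
    EquipmentStats("P-101", 1000.0, 1, 8000.0, 2.0),
    EquipmentStats("P-102", 100.0, 8, 8000.0, 40.0),
    EquipmentStats("P-103", 50.0, 0, 8000.0, 0.0),
]
print(sorted(review_candidates(stats, n=1)))
```

Criticality to safety, the environment, and the process is a judgment that such a ranking cannot supply by itself; the lists only narrow the field for the review team.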
Once equipment or locations are selected, they can be registered in the reliability program and the database. Registration can then trigger the display of typical symptoms associated with that location or equipment when a work request is created. Additional reliability information can be entered when work orders for registered equipment or locations are completed.
2. Define equipment functions and performance standards
RCM requires definition of primary, secondary, and protective functions of registered equipment. A versatile CMMS should include a facility for defining specifications for equipment and locations. These specifications will typically include operating characteristics and performance standards such as temperatures and pressures, as well as other measurements. The combination of specifications and measurements defines the performance standards for the equipment.
Defining equipment functions requires a thorough review of equipment purpose, both in the process or primary function (such as to pump fluid at a specific range of flows, pressures, and temperatures under specified suction conditions) and in the secondary and protective functions (such as to contain fluid, to prevent reverse flow of fluid from discharge to suction, and to provide visual indication of seal failure). The definition of equipment functions should be specific and detailed and include limits on operating parameters for satisfactory equipment function in its intended role.
3. Determine functional failures
The reliability team evaluates all possible failure modes for each piece of equipment in the RCM program and the failure types for each failure mode. Once determined, these failure types can be entered in a CMMS data table as descriptive information along with an appropriate failure code. This procedure serves the dual function of moving the RCM analysis forward as well as providing valuable information for the ongoing reliability analysis. The failure codes are captured by the equipment and location history function in the CMMS, thus enabling statistical analysis.
Detailed operating parameters were specified when equipment functions were defined. Functional failures occur when equipment performs outside these operating parameters. These off-specification operations, often seen by equipment operators and used as complaints about the equipment or descriptions of equipment failure, are failure types.
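The link between specified operating limits and failure types can be illustrated with a small check. In this sketch the parameter names, limit values, and failure codes are hypothetical; a real plant would define its own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limit:
    parameter: str      # name of the measured parameter (hypothetical)
    low: float          # lower bound for satisfactory function
    high: float         # upper bound for satisfactory function
    failure_code: str   # failure type reported when outside the bounds


def functional_failures(limits, measurements):
    """Failure codes for every measured parameter outside its limits.

    Parameters with no measurement are skipped rather than flagged.
    """
    codes = []
    for limit in limits:
        value = measurements.get(limit.parameter)
        if value is None:
            continue
        if not (limit.low <= value <= limit.high):
            codes.append(limit.failure_code)
    return codes


# Hypothetical performance standard for pump P-101
p101_limits = [
    Limit("flow_m3_per_h", 40.0, 60.0, "FLOW-OUT-OF-SPEC"),
    Limit("discharge_pressure_bar", 4.0, 6.0, "PRESSURE-OUT-OF-SPEC"),
]
print(functional_failures(
    p101_limits,
    {"flow_m3_per_h": 35.0, "discharge_pressure_bar": 5.0},
))
```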
Determining all possible symptoms associated with each equipment type provides a further benefit. After the values are entered, they can be displayed when a work request is created for equipment or locations registered for reliability. Selecting these failure descriptions gives the work request description more meaningful information than simply “Fix pump P-101.” RCM will result in the identification of failure types that can be used as failure codes.
4. Identify failure modes and root causes
The failure mode and root cause of any failure are often identified only after a multidisciplinary review of failure data. The root cause could be poor maintenance, unstable operating conditions on startup, operation outside the design standards, incorrectly specified equipment, poor engineering design, etc.
The root cause determination is a key step in the RCM analysis for preventable failure modes. Knowledge of the failure modes enables the analyst to identify failure effects that may be detectable with condition assessment technologies such as vibration analysis, oil analysis, and infrared thermography.
In addition to the root cause, there may be contributory causes. Most failures are caused by a combination of events that trigger a specific failure mechanism. For example, a noncontact oil seal combined with the use of a water hose to clean process equipment will result in oil contaminated with water and eventual failure of the bearings. Contributing causes are often reviewed as part of the root cause determination.
A CMMS data table can store a number of “cause codes” for each equipment type. Although contributing causes are not necessarily root causes, they do allow the operator or mechanic to enter any number of known contributing causes for a work order. This information is then stored as part of the equipment and location history.
5. Assess failure effects and consequences
It is important to recognize the consequences of potential failures, and recognition implies an understanding of the criticality or significance of the failure.
When the failure mechanisms are analyzed, some of the failure effects become evident. Local or equipment effects are usually obvious. The analysis then progresses up the equipment hierarchy to determine effects at the system and plant levels.
For example, a failed relay contact in a cooling tower fan starter may prevent the fan from starting, which in turn prevents the cooling tower from providing sufficient process cooling, which in turn results in the shutdown of part of the plant because of overheating of the process at certain critical heat exchangers. In this case, in which cooling is critical, failure of a seemingly insignificant relay can have significant operating consequences. Knowledge of the relay’s condition is now recognized as being important, and monitoring relay condition through infrared thermography or regular cleaning of the contacts becomes a worthwhile task. Now assume that the relay in this example results in the shutdown of only one of two fans, each of which can supply sufficient tower capacity for the process. The relay is now far less important, and it may be appropriate to simply wait until it fails and then fix it. Knowledge of the effects of the failures can therefore be important in making decisions on appropriate proactive steps.
6. Define maintenance strategies
Maintenance strategies to reduce the probability of failure are determined on the basis of the foregoing analysis. These strategies have two components: what maintenance will be performed and when it will be performed. RCM provides a series of questions in a logical progression, using the knowledge of the failure effects as an input to determining appropriate maintenance actions.
The question sequence first looks for hidden failures, that is, failures with no obvious signs or effects that alert operators to their presence. If hidden failures have safety consequences, a failure-finding task is appropriate if the failure cannot be prevented or predicted. If failures are not hidden, those having safety, environmental, production loss, or maintenance cost impacts are defined, in descending order of importance. It is important to determine whether it is possible to:
Monitor for the presence of the failure before it has progressed to a total loss of functional performance
Prevent the failure through a time- or usage-based action such as an oil change
Prevent failure consequences by replacing the failing item at some predetermined time- or usage-based interval
Take default action
Default actions are based on the consequences of failures. A failure mode having unacceptable safety or environmental consequences should be designed out of the system or equipment, or it will be necessary to take contingency action to avert the consequences of the failure when it eventually occurs. Similarly, if the impact on production is severe, redesign may be warranted, but if the impact is minor, run to failure may be acceptable.
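The question sequence and default actions can be approximated as a short decision function. This is a simplified sketch, not the full RCM decision logic: the class and field names are hypothetical, each question is reduced to a yes or no flag supplied by the analyst, and treating maintenance-cost and minor production consequences alike as run to failure is an assumption made for brevity.

```python
from dataclasses import dataclass
from enum import Enum


class Consequence(Enum):
    SAFETY = 1
    ENVIRONMENTAL = 2
    PRODUCTION_SEVERE = 3
    PRODUCTION_MINOR = 4
    MAINTENANCE_COST = 5


@dataclass
class FailureMode:
    description: str
    consequence: Consequence
    hidden: bool = False                    # no obvious signs alert operators
    can_monitor: bool = False               # detectable before total loss of function
    can_prevent: bool = False               # time- or usage-based action prevents it
    can_replace_on_schedule: bool = False   # replacement at a set interval is feasible


def select_task(fm):
    """Walk the questions in order and return the first applicable task."""
    if fm.can_monitor:
        return "condition monitoring"
    if fm.can_prevent:
        return "time- or usage-based preventive task"
    if fm.can_replace_on_schedule:
        return "scheduled replacement"
    # Nothing predicts or prevents the failure, so fall through to the defaults
    if fm.hidden and fm.consequence is Consequence.SAFETY:
        return "failure-finding task"
    if fm.consequence in (Consequence.SAFETY, Consequence.ENVIRONMENTAL):
        return "redesign or contingency action"
    if fm.consequence is Consequence.PRODUCTION_SEVERE:
        return "consider redesign"
    return "run to failure"   # assumption: minor production or cost impact only


# The cooling tower relay from the earlier example, in its two situations
single_fan = FailureMode("starter relay contact fails (only fan)",
                         Consequence.PRODUCTION_SEVERE, can_monitor=True)
redundant_fan = FailureMode("starter relay contact fails (one of two fans)",
                            Consequence.PRODUCTION_MINOR)
print(select_task(single_fan))
print(select_task(redundant_fan))
```

In practice each flag hides a judgment about whether the task is both technically applicable and worth its cost, which is where most of the analyst's effort goes.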
Through its focus on specific failure modes, RCM avoids the use of maintenance tactics that do not address the failure modes the equipment experiences. Rarely will an overhaul result from RCM because up to 65 percent of all failure modes are random in nature and do not respond to time-based overhaul actions.
Overhauls are typically appropriate only when equipment has a large number of time- or usage-sensitive failure modes such as component wear caused by frequent start and stop cycles. In such cases, an overhaul is appropriate to collect component replacement actions into a single combination task. Overhauls that include much inspection work can usually be replaced with condition monitoring. Most overhaul work is followed by a series of smaller problems immediately after startup. These problems can be avoided.
7. Implement and refine the maintenance tactics
No maintenance system is effective without ongoing feedback and reviews. As work orders are completed, additional data can be collected. This practice enables capture of any failures that occurred, and codifies the corrective or preventive actions taken. Up to four codes are captured for all work orders on any equipment or location: failure code, symptom code, cause code, and action code. Useful statistical information can be calculated from these codes and their distribution over time.
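As an illustration of the kind of statistics these codes make possible, the sketch below tallies one code type overall and by month. The `WorkOrder` fields mirror the four codes named above, but the class itself, the function names, and the sample codes are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class WorkOrder:
    equipment: str
    completed: date
    failure_code: Optional[str] = None
    symptom_code: Optional[str] = None
    cause_code: Optional[str] = None
    action_code: Optional[str] = None


def code_counts(orders, code_field):
    """How often each value of one code type appears, ignoring blanks."""
    values = (getattr(order, code_field) for order in orders)
    return Counter(v for v in values if v is not None)


def counts_by_month(orders, code_field):
    """Distribution over time: (year, month) mapped to code counts."""
    result = {}
    for order in orders:
        code = getattr(order, code_field)
        if code is None:
            continue
        period = (order.completed.year, order.completed.month)
        result.setdefault(period, Counter())[code] += 1
    return result


# Hypothetical completed work orders for one pump
orders = [
    WorkOrder("P-101", date(2024, 1, 10), failure_code="SEAL-LEAK", cause_code="MISALIGNMENT"),
    WorkOrder("P-101", date(2024, 1, 25), failure_code="SEAL-LEAK"),
    WorkOrder("P-101", date(2024, 2, 3), failure_code="BEARING-HOT", cause_code="WATER-IN-OIL"),
]
print(code_counts(orders, "failure_code").most_common(1))
print(counts_by_month(orders, "failure_code"))
```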
RCM increases equipment and plant uptime while reducing costs through the performance of the right maintenance at the right time. It avoids the unnecessary and unwarranted use of well-intentioned but misplaced maintenance activities that do not improve reliability, or worse, that result in increased failures.
A modern CMMS can be the foundation for effective RCM. It also can be the foundation for managing equipment reliability as part of normal maintenance management.
RCM is a one-time project that defines maintenance strategies that lead to maximum uptime of equipment at minimum cost. Management of equipment reliability is an ongoing strategy that requires the use of a CMMS with broad functionality that can analyze vast amounts of data collected as part of the maintenance management program.
Jim Picknell is a manager at Coopers & Lybrand Consulting, 145 King St., W., Toronto, ON M5H 1V8; (416) 869-1130. His company can facilitate the installation of a reliability-centered maintenance program. Keith A. Steel is senior industry consultant at Revere Inc., 3500 Blue Lake Dr., Suite 400, Birmingham, AL 35243; (800) 411-6614; Internet http://www.immpowerinfo.com. His company supplies the Immpower computerized maintenance information and asset management system.