

6:46 pm
January 1, 2001

Is Streamlined RCM Worth the Risk?

New regulatory issues are fixing responsibility for equipment failure that results in loss of life. RCM analysis can help mitigate the risk. But streamlined methods fall short. Here is a look at the issues.

Reliability-centered maintenance (RCM) is a process used to determine what must be done to ensure that any physical asset or system continues to do whatever its users want it to do.

This process finds its roots in work done by the international commercial aviation industry. Driven by the need to improve reliability while containing the cost of maintenance, this industry developed a comprehensive process for deciding what maintenance work is needed to keep aircraft airborne. This process has evolved steadily since its beginnings in 1960. The early history is outlined in the section “Historical Overview.”

The SAE RCM Standard
Various derivatives of Nowlan and Heap’s original aviation-oriented RCM process have emerged since their report was published in 1978. Many of these derivatives retain the key elements of the original process. However, the widespread use of the term “RCM” led to the emergence of a number of processes that differ significantly from the original, but that their proponents also call RCM.

Many of these other processes either omit key steps of the process described by Nowlan and Heap, or change their sequence, or both. Consequently, despite claims to the contrary made by the proponents of these processes, the output differs markedly from what would be obtained by conducting a full, rigorous RCM analysis.

A growing awareness of these differences led to an increasing demand for a standard that set out the criteria any process must comply with in order to be called RCM. Such a standard was published by the Society of Automotive Engineers (SAE) in 1999 as Standard JA1011 Evaluation Criteria for Reliability-Centered Maintenance (RCM) Processes (Ref. 3). The evolution of the standard is outlined by Dana Netherton, chairman of the SAE RCM Committee, in the section “The Need for an RCM Standard.”

The elements of a true RCM process listed in the standard are presented here in the section “Key Attributes of Any RCM Process.” Subsequent sections of the standard list the issues that any true RCM process must address in order to answer each of the seven attribute questions “satisfactorily.”

According to the standard, “Reliability-Centered Maintenance (RCM)–Any RCM process shall ensure that all of the following seven questions are answered satisfactorily and are answered in the sequence shown below [emphasis added].” This means that a process that does not answer all the questions in the sequence shown, or that does not answer them satisfactorily in compliance with the rest of the standard, is not RCM.
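The "all seven questions, in sequence" criterion can be pictured as a simple ordering check. The sketch below is only an illustration: the question labels are paraphrased and assumed for this example (they are not the wording of the standard), the function name `check_process` is hypothetical, and the check says nothing about whether each question was answered "satisfactorily."

```python
# Paraphrased labels for the seven questions. The wording is an assumption
# made for this sketch, not the text of the SAE standard.
REQUIRED_SEQUENCE = [
    "functions",
    "functional_failures",
    "failure_modes",
    "failure_effects",
    "failure_consequences",
    "proactive_tasks",
    "default_actions",
]


def check_process(steps_performed):
    """Check only the 'all seven, in sequence' criterion, not answer quality."""
    # Ignore steps that are not one of the seven questions.
    answered = [s for s in steps_performed if s in REQUIRED_SEQUENCE]
    missing = [q for q in REQUIRED_SEQUENCE if q not in answered]
    # The order the answered questions would have if taken in sequence.
    expected_order = [q for q in REQUIRED_SEQUENCE if q in answered]
    in_sequence = answered == expected_order
    return {
        "missing": missing,
        "in_sequence": in_sequence,
        "passes": not missing and in_sequence,
    }


# A hypothetical process that starts from existing tasks, infers failure
# modes, and then works through the last three steps only.
retroactive = [
    "existing_tasks",
    "failure_modes",
    "failure_consequences",
    "proactive_tasks",
    "default_actions",
]
result = check_process(retroactive)
print(result["passes"], result["missing"])
```

A process that skips questions or reorders them fails this check even before the quality of its answers is considered.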

None of the streamlined processes comply fully with the requirements of section 5 of the SAE Standard. The implications of this point are discussed in more detail later.

Regulatory issues
Society has reacted to equipment failure and accidents producing serious consequences by enacting laws seeking to call individuals and corporations to account. An overview is presented in the section “Worldwide Regulatory Issues.”

Under these circumstances, everyone involved in the management of physical assets needs to take greater care than ever to ensure that every step taken in executing his or her official duties is beyond reproach. RCM processes that meet the SAE Standard provide a basis for prudent, responsible custodianship of physical assets.

Streamlined RCM
The author and his associates have helped companies to apply true RCM on more than 1200 sites spanning 41 countries and nearly every form of organized human endeavor. We have found that when true RCM has been correctly applied by well-trained individuals working on clearly defined and properly managed projects, the analyses have usually paid for themselves in between two weeks and two months. This is a very rapid payback indeed.

However, despite this rapid payback, some individuals and organizations have expended a great deal of energy on attempts to reduce the time and resources needed to apply the RCM process. The results of these attempts are generally known as “streamlined” RCM techniques.

The main features of some of the most widely touted streamlined approaches to RCM are outlined in the following sections. In all cases, the proponents of these techniques claim that their principal advantage is that they achieve similar results to something which they call “classical” RCM, but that they do so in much less time and at much lower cost. However, not only is this claim questionable, but all of the streamlined techniques have other drawbacks, some quite serious. These drawbacks are also highlighted in the following discussion of the various streamlining methods: retroactive approaches, use of generic analyses, use of generic lists of failure modes, skipping elements of the process, analysis of only certain functions or failures, and analysis of only certain equipment.

Retroactive approaches
The most popular method of streamlining the RCM process starts not by defining the functions of the asset (as specified in the SAE Standard), but with the existing maintenance tasks. Users of this approach try to identify the failure mode that each task is supposed to be preventing, and then work forward again through the last three steps of the RCM decision process to re-examine the consequences of each failure and (hopefully) to identify a more cost-effective failure management policy. This approach is what is most often meant when the term “streamlined RCM” (Ref. 10) is used. It is also known as “backfit RCM” (Ref. 11) or “RCM in reverse.”

Retroactive approaches are superficially very appealing, so much so that the author tried them himself on numerous occasions when he was new to RCM. However, in reality they are also among the most dangerous of the streamlined methodologies, for the following reasons:

  • Retroactive approaches assume that existing maintenance programs cover just about all the failure modes that are reasonably likely to require some sort of preventive maintenance (PM). In the case of every maintenance program that I have encountered to date, this assumption is simply not valid. If RCM is applied correctly, it transpires that nowhere near all of the failure modes that actually require PM are covered by existing maintenance tasks. As a result, a considerable number of tasks have to be added. Most of the tasks that are added apply to protective devices, as discussed below. (Other tasks are eliminated because they are found to be unnecessary, or the type of task is changed, or the frequency is changed. The net effect is usually an overall reduction in perceived PM workloads, typically by between 40 and 70 percent.)
  • When applying retroactive approaches, it is often very difficult to identify exactly what failure cause motivated the selection of a particular task, so much so that either inordinate amounts of time are wasted trying to establish the real connection, or sweeping assumptions are made that very often prove to be wrong. These two problems alone make this approach an extremely shaky foundation upon which to build a maintenance program.
  • In reassessing the consequences of each failure mode, it is still necessary to ask whether “the loss of function caused by the failure mode will become evident to the operating crew under normal circumstances.” This question can only be answered by establishing what function is actually lost when the failure occurs. This in turn means that the people doing the analysis have to start identifying functions anyway, but they are now trying to do so on an ad hoc basis halfway through the analysis (and they are not usually trained in how to identify functions correctly in the first place because this approach usually considers the function identification step to be unnecessary). If they do not, they start making even more sweeping—and hence often incorrect—assumptions that add to the shakiness of the results.
  • Retroactive approaches are particularly weak on specifying appropriate maintenance for protective devices. As stated by the author in his book Reliability-Centered Maintenance (Ref. 12), “at the time of writing, many existing maintenance programs provide for fewer than one third of protective devices to receive any attention at all (and then usually at inappropriate intervals). The people who operate and maintain the plant covered by these programs are aware that another third of these devices exist but pay them no attention, while it is not unusual to find that no one even knows that the final third exist. This lack of awareness and attention means that most of the protective devices in industry—our last line of protection when things go wrong—are maintained poorly or not at all.” So if one uses a retroactive approach to RCM, in most cases a great many protective devices will continue to receive no attention in the future because no tasks were specified for them in the past.

Given the enormity of the risks associated with unmaintained protective devices, this weakness of retroactive RCM alone makes it completely indefensible. (Some variants of this approach try to get around this problem by specifying that protective systems should be analyzed separately, often outside the RCM framework. This gives rise to the absurd situation that two analytical processes have to be applied in order to compensate for the deficiencies created by attempts to streamline one of them.)

  • More so than any of the other streamlined versions of RCM, retroactive approaches focus on maintenance workload reduction rather than plant performance improvement (which is the primary goal of function-oriented true RCM). Since the returns generated by using RCM purely as a tool to reduce maintenance costs are usually lower (sometimes one or two orders of magnitude lower) than the returns generated by using it to improve reliability, the use of the ostensibly cheaper retroactive approach becomes self-defeating on economic grounds, in that it virtually guarantees much lower returns than true RCM.

Use of generic analyses
A fairly widely used shortcut in the application of RCM entails applying an analysis performed on one system to technically identical systems. In fact, one or two organizations even sell such generic analyses, on the grounds that it is cheaper to buy an analysis that has already been performed by someone else than it is to perform your own. The following paragraphs explain why generic analyses should be treated with great caution.

  • Operating context. In reality, technically identical systems often require completely different maintenance programs if the operating context is different.

For example, consider three pumps A, B, and C that are technically identical (same make, model, drives, pipework, valvegear, switchgear, and pumping the same liquid against the same head). The generic mindset suggests that a maintenance program developed for one pump should apply to the other two.

However, pump A stands alone, so if it fails, operations will be affected sooner or later. As a result, the users and/or maintainers of pump A are likely to make some effort to anticipate or prevent its failure. (How hard they try will be governed both by the effect on operations and by the severity and frequency of the failures of the pump.)

Pump B, by contrast, is backed up by pump C: if B fails, the operators simply switch to C, so the only consequence of the failure of pump B is that it must be repaired. As a result, it is likely that the operators of B would at least consider letting it run to failure (especially if the failure of B does not cause significant secondary damage).

On the other hand, if pump C fails while pump B is still working (for instance if someone cannibalizes a part from C), it is likely that the operators will not even know that C has failed unless or until B also fails. To guard against this possibility, a sensible maintenance strategy might be to run C from time to time to find out whether it has failed.

This example shows how three identical assets can have three totally different maintenance policies because the operating context is different in each case. In the case of the pumps, a generic program would only have specified one policy for all three pumps.
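The pump example can be restated as a small sketch in which the policy depends on the operating context rather than on the make and model. The class, field names, and policy strings below are hypothetical, and the three-way rule is only a rule of thumb drawn from this example; it is not the RCM decision process.

```python
from dataclasses import dataclass


@dataclass
class Pump:
    name: str
    has_standby: bool  # another pump takes over if this one fails
    is_standby: bool   # normally idle, needed only when the duty pump fails


def suggested_policy(pump):
    """Illustrative rule of thumb only, not the RCM decision diagram."""
    if pump.is_standby:
        # A failure would stay hidden until the duty pump also fails.
        return "run periodically to find hidden failure"
    if pump.has_standby:
        # A failure only costs a repair, so run-to-failure is worth considering.
        return "consider run-to-failure"
    # Stand-alone pump: a failure affects operations sooner or later.
    return "anticipate or prevent failure"


# Same make and model, three different operating contexts.
pumps = [
    Pump("A", has_standby=False, is_standby=False),
    Pump("B", has_standby=True, is_standby=False),
    Pump("C", has_standby=False, is_standby=True),
]
policies = {p.name: suggested_policy(p) for p in pumps}
for name, policy in policies.items():
    print(name, "->", policy)
```

The technical description of the pump never enters the rule; only the context fields do, which is exactly the information a generic analysis cannot know.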

Apart from redundancy, many other factors affect the operating context and hence affect the maintenance programs that could be applied to technically identical assets. These include whether the asset is part of a peak load or base load operation, cyclic fluctuations in market demand and/or raw material supplies, the availability of spares, quality and other performance standards that apply to the asset, the skills of the operators and maintainers, and so on.

  • Maintenance tasks. Different organizations—or even different parts of the same organization—seldom employ people with identical skill sets. This means that people working on one asset may prefer to use one type of proactive technology (say high-tech condition monitoring), while another group working on an identical asset may be more comfortable using another (say a combination of performance monitoring and the human senses).

It is surprising how often this difference does not matter, as long as the techniques chosen are cost-effective. In fact, many maintenance organizations are starting to realize that there is often more to be gained from ensuring that the people doing the work are comfortable with what they are doing than from compelling everyone to do the same thing. (The validity of different tasks is also affected by the operating context of each asset. For instance, think how background noise levels affect checks for noise.)

Because generic analyses necessarily incorporate a “one size fits all” approach to maintenance tasks, they do not cater to these differences and hence have a significantly reduced chance of acceptance by the people who have to do the tasks.

These two points mean that special care must be taken to ensure that the operating context, functions and desired standards of performance, failure modes, failure consequences, and the skills of the operators and maintainers are all effectively identical before applying a maintenance policy designed for one asset to another. They also mean that an RCM analysis performed on one system should never be applied to another without any further thought just because the two systems happen to be technically identical.

Use of generic lists of failure modes
Generic lists of failure modes are lists of failure modes—or sometimes entire FMEAs—prepared by third parties. They may cover entire systems, but more often cover individual assets or even single components. These generic lists are touted as another method of speeding up or “streamlining” this part of the maintenance program development process. In fact, they should also be approached with great caution, for all the reasons discussed in the previous section of this article, and for the following additional reasons:

  • The level of analysis may be inappropriate. It is possible to “drill down” almost any number of levels when seeking to identify failure modes (or causes of failure). The point at which this process should stop is the level at which it is possible to identify an appropriate failure management policy, and this can vary enormously depending once again on the operating context of the system. In other words, when establishing causes of failure for technically identical assets, it may be appropriate in one context to ask “why” it fails once, and in another it may be necessary to ask “why” seven or eight times.

However, if a generic list is used, this decision will already have been made in advance of the RCM analysis. For instance, all the failure modes in the generic list may have been identified as a result of asking “why” four or five times, when all that may be needed is level 1. This means that far from streamlining the process, the generic list would condemn the user to analyzing far more failure modes than necessary.

Conversely, the generic list may focus on level 3 or 4 in a situation where some of the failure modes really ought to be analyzed at level 5 or 6. This would result in an analysis that is too superficial and possibly dangerous.

  • The operating context may be different. The operating context of your asset may have features which make it susceptible to failure modes that do not appear in the generic list. Conversely, some of the modes in the generic list might be extremely improbable (if not impossible) in your context.
  • Performance standards may differ. Your asset may operate to standards of performance which mean that your whole definition of failure may be completely different from that used to develop the generic FMEA.

These three points mean that if a generic list of failure modes is used at all, it should only ever be used to supplement a context-specific FMEA, and never used on its own as a definitive list.
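The effect of fixing the level of analysis in advance can be illustrated with a toy cause tree. Everything in the sketch below is invented for illustration (the tree, the cause names, and the function `causes_at_level`); it only shows how the number of failure modes to be analyzed changes with the number of times “why” is asked.

```python
# Hypothetical cause tree for one functional failure; every name is invented.
CAUSE_TREE = {
    "pump fails to deliver flow": {
        "bearing seizes": {
            "lubrication lost": {
                "oil line blocked": {},
                "wrong oil grade used": {},
            },
            "bearing misaligned": {},
        },
        "impeller worn": {},
    }
}


def causes_at_level(tree, level):
    """Return the causes reached after asking "why" `level` times.

    Branches that end before `level` contribute their deepest cause.
    """
    results = []

    def walk(name, children, depth):
        if depth == level or not children:
            results.append(name)
            return
        for child, grandchildren in children.items():
            walk(child, grandchildren, depth + 1)

    for root, children in tree.items():
        walk(root, children, 0)
    return results


for level in (1, 2, 3):
    modes = causes_at_level(CAUSE_TREE, level)
    print(level, len(modes), modes)
```

A generic list prepared at level 3 hands every user the longest list, even in a context where level 1 would have been enough to choose a failure management policy.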

Skipping elements of the process
Another common way in which the RCM process is “streamlined” is by skipping various elements of the process altogether. The step most often omitted is the definition of functions. Proponents of this methodology start immediately by listing the failure modes that might affect each asset, rather than by defining the functions of the asset under consideration.

They do so either because they claim that, especially in the case of a “non-safety-critical” plant, identifying functions does not contribute enough relative to the amount of time it takes (Ref. 13), or because they simply appear not to be aware that defining all the functions and the associated desired standards of performance of the assets under review is an integral part of the RCM process (Ref. 14).

In fact, it is generally accepted by all the proponents of true RCM that in terms of improved plant performance, by far the greatest benefits of true RCM flow from the extent to which the function definition step transforms general levels of understanding of how the equipment is supposed to work. So cutting out this step costs far more in terms of benefits forgone than it saves in reduced analysis time.

From a purely technical point of view, the identification of functions and associated desired levels of performance also makes it far easier to identify the surprisingly common situations (failure modes) where the asset is simply incapable of doing what the user wants it to do, and therefore fails too soon or too often. For this reason, eliminating the function definition step further reduces the power of the process.

The comments in the second bullet in the previous “retroactive approaches” section also apply here.

Analyze only “critical” functions or “critical” failures
The SAE Standard stipulates among other things that a true RCM analysis should define all functions, and that all reasonably likely failure modes should be subjected to the formal consequence evaluation and task selection steps.

The shortcuts embodied in some of the streamlined RCM processes try to analyze “critical” functions only, or to subject only “critical” failure modes to detailed analysis. These approaches have two main flaws:

  • The process of dismissing functions and/or failure modes as being “non-critical” necessarily entails making assumptions about what a more detailed analysis might reveal. In the personal experience of the author, such assumptions are frequently wrong. It is surprising how often apparently innocuous functions or failure modes are found on closer examination to embody elements that are highly critical in terms of safety and/or environmental integrity. As a result, the practice of prematurely dismissing functions or failure modes results in much riskier analyses, but because the analysis is incomplete, no one knows where or what these risks are.
  • Many of the streamlined processes that adopt this approach incorporate elaborate additional steps designed to “help” identify what functions and/or failure modes are critical or noncritical. In a great many cases, applying these additional steps takes longer and costs more than it would take to conduct a rigorous analysis of every function and every reasonably likely failure mode using true RCM, yet the output is considerably less robust.

Analyze only “critical” equipment
An approach to maintenance strategy formulation that is often presented as a streamlined form of RCM suggests that the RCM process should be applied to “critical” equipment only. This issue does not fall within the ambit of the SAE Standard, because the standard does not deal with the selection of equipment for analysis. It defines RCM as a process that can be applied to any asset, and it assumes that decisions about what equipment is to be analyzed and about system boundaries have already been made when the time comes to apply the RCM process defined in the standard. There were two reasons why the equipment selection process was omitted from the standard:

  • Different industries use widely differing criteria to judge what is critical. For instance, the ability of assets to produce products within given quality limits is a major issue in manufacturing operations, and hence features prominently in assessments of criticality. However, this issue barely figures at all with respect to equipment used by military undertakings. This means that there is an equally wide range of techniques used to assess criticality—so wide that it is impossible to encompass this issue in one universal standard.
  • There is a growing school of thought (with which the author has some sympathy) that there is no such thing as an item of plant, at least in an industrial context, that is noncritical or nonsignificant to the extent that it does not justify analysis using RCM. Two of the main reasons for believing that systems or items of plant should not be dismissed as noncritical prior to rigorous analysis are exactly the same as the reasons given in the previous section, on critical functions and critical failures, for not dismissing functions and failure modes in the same way. (In fact, many organizations that choose to start with a formal, across-the-board equipment criticality assessment seem to spend as much time deciding what assessment methodology they will use and then applying it as they would have spent using true RCM to analyze all the equipment in their facility.)

There is a great deal more that could be said both in favor of and against the idea of using equipment criticality assessments as a means of deciding whether to perform rigorous analyses using techniques such as RCM. However, since criticality assessment techniques are not an integral part of the RCM process, they will not be discussed here. Suffice it to say that it is incorrect to present such techniques as streamlined forms of RCM because they do not form part of the RCM process as defined by the SAE Standard.

Is streamlined RCM worth the risk?
In nearly all cases, the proponents of the streamlined approaches to RCM outlined previously claim that these approaches can produce much the same results as true RCM in about a half to a third of the time. However, the above discussion indicates that not only do they not produce the same results as true RCM, but that they contain logical or procedural flaws which increase risk to an extent that overwhelms any small advantage they might offer in reduced application costs. See “True RCM is Faster.”

It also transpires that many of these streamlined techniques actually take longer and cost more to apply than true RCM, so even this small advantage is lost. As a result, the business case for applying streamlined RCM is suspect at best.

However, a rather more serious point needs to be borne in mind when considering these techniques. The very word “streamline” suggests that something is being omitted, and as has been indicated, this is indeed so for the streamlined techniques described. In other words, there is to a greater or lesser extent a degree of suboptimization embodied in all of these techniques.

Leaving things out inevitably increases risk. More specifically, it increases the probability that an unanticipated failure, possibly one with very serious consequences, could occur. If this does happen, as suggested in the section on regulatory issues, managers of the organization involved are increasingly likely to find themselves called personally to account. If worse comes to worst, they will have to explain, often in an emotionally charged courtroom confronted by bitterly hostile legal Rottweilers, what went wrong and why.

They will also have to explain why they deliberately chose a suboptimal decision-making process to establish their asset management strategies in the first place, rather than using one that complies fully with a standard set by an internationally recognized standards-setting organization. They would have to convince not me, not their peers, and not their managers, but a judge and jury.

One rationale often advanced for using the streamlined methods is that it is better to do something than to do nothing. However, this rationale misses the point that all the analytical processes described above, streamlined or otherwise, require their users to document the analyses. This means that a clear audit trail exists showing all the key information and decisions underlying the asset management strategy, in most cases where no such audit trail has existed before. If a suboptimal approach is used to formulate these strategies, the existence of written records makes every shortcut much clearer to any investigators than it would otherwise have been. (This in turn may suggest that perhaps we should simply forget about all of these formal analytical processes. Unfortunately, the demand for documented analyses embodied in the second wave of safety legislation described in the section “Worldwide Regulatory Issues” does not allow us this option.)

A further rationale for streamlining says something like “we have been using this approach for a few years now and we haven’t had any accidents, so it must be all right.” This rationale betrays a complete misunderstanding of the basic principles of risk. Specifically, no analytical methodology can completely eliminate risk.

However, the difference between using a more rigorous methodology and a less rigorous methodology may be the difference between a probability of a catastrophic event of 1 in 1,000,000 versus 1 in 10,000. In both cases, the event may happen next year or it may not happen for thousands of years, but in the second case, it is a hundred times more likely. If such an event were to happen, the user of true RCM would be able to claim that he or she exercised prudent, responsible custodianship by applying a rigorous process that complies with an internationally recognized standard, and as such would be in a highly defensible position. Under the same circumstances, the user of streamlined RCM is on much, much shakier ground.
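The arithmetic behind this comparison can be checked with a short sketch. The two probabilities are the figures quoted above; treating them as annual probabilities in independent years, and the 30-year horizon, are assumptions made only for illustration.

```python
def chance_within(annual_probability, years):
    """Probability of at least one event in `years`, assuming independent years."""
    return 1.0 - (1.0 - annual_probability) ** years


rigorous = 1 / 1_000_000    # figure quoted in the article
less_rigorous = 1 / 10_000  # figure quoted in the article

# How many times more likely the event is under the less rigorous method.
ratio = less_rigorous / rigorous

# Assumed 30-year asset life, chosen only for illustration.
horizon = 30
p_rigorous = chance_within(rigorous, horizon)
p_less_rigorous = chance_within(less_rigorous, horizon)

print(f"ratio: {ratio:.0f}x")
print(f"{horizon}-year chance, rigorous: {p_rigorous:.6f}")
print(f"{horizon}-year chance, less rigorous: {p_less_rigorous:.6f}")
```

Under these assumptions, a few accident-free years are the expected outcome in both cases, which is why an accident-free record says little about which methodology was used.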

This version of the article includes bibliographic references. Otherwise, it is the same as that published in Maintenance Technology magazine.


1 Nowlan FS and Heap H: “Reliability-centered Maintenance”. Springfield, Virginia. National Technical Information Service, United States Department of Commerce

2 Maintenance Steering Group – 3 Task Force: “Maintenance Program Development Document MSG-3”. Washington DC: Air Transport Association (ATA) of America. 1993

3 International Society of Automotive Engineers: “JA1011 – Evaluation Criteria for Reliability-Centered Maintenance (RCM) Processes”. Warrendale, Pennsylvania, USA: SAE Publications

4 Netherton D: “SAE’s New Standard for RCM”. Maintenance (UK) 15 (1) 3 – 7, 2000

5 US Naval Air Systems Command: “NAVAIR 00-25-403: Guidelines for the Naval Aviation Reliability Centered Maintenance Process”. Philadelphia, Pennsylvania. US Department of Defense Publications

6 RCM Implementation Team, Royal Navy: “NES 45 Naval Engineering Standard 45, Requirements for the Application of Reliability-Centred Maintenance Techniques to HM Ships, Royal Fleet Auxiliaries and other Naval Auxiliary Vessels”. Foxhill, Bath, United Kingdom. UK Ministry of Defence Publications

7 UK Health & Safety Executive: “Train Accident at Ladbroke Grove Junction, 5 October 1999: Third HSE Interim Report”.

8 Bartram P: “What Price a Life?” Financial Director (UK), 2 August 2000

9 Various: “The Longford Royal Commission”:

10 Bookless C & Sharkey M: “Streamlined RCM in the Nuclear Industry”. Maintenance (UK) 14 (1) 27 – 30, 2000

11 Jacobs KS: “Reducing Maintenance Workload Through Reliability-Centered Maintenance Processes”: ASNE Fleet Maintenance Symposium. October 1997. San Diego, California

12 Moubray JM: “Reliability-centered Maintenance”: New York, New York USA: Industrial Press

13 Dixey M & Gallimore J: “Fast Track RCM – Getting Results from RCM”. Maintenance (UK) 15 (1) 8 – 11, 2000

14 Mundy S D: “Completing the Reliability Centered Maintenance Loop at a New Process Facility”. Reliability (USA) 7 (3) 30 – 33, 2000

Historical Overview

Reliability-centered maintenance (RCM), a process used to determine what must be done to ensure that any physical asset or system continues to do whatever its users want it to do, finds its roots in work done by the international commercial aviation industry. Driven by the need to improve reliability while containing the cost of maintenance, the aviation industry developed a comprehensive process for deciding what maintenance work is needed to keep aircraft airborne. This process has evolved steadily since its beginnings in 1960.

In 1978, the report “Reliability-Centered Maintenance” (Ref. 1) was prepared for the U.S. Department of Defense by F. Stanley Nowlan and Howard Heap of United Airlines. It described the then current state of the process and formed the basis of the maintenance strategy formulation process called MSG3 (Ref. 2) after the document produced by the Maintenance Steering Group of the Air Transport Association of America. MSG3 was first promulgated in 1980, and in slightly modified form, it is used to this day by the international commercial aviation industry. In the early 1980s, RCM as described by Nowlan and Heap also began to be used in industries other than aviation.

It soon became apparent that no other comparable technique exists for identifying the true, safe minimum of what must be done to preserve the functions of physical assets. As a result, RCM has now been used by thousands of organizations spanning nearly every major field of organized human endeavor. It is becoming as fundamental to the practice of physical asset management as double-entry bookkeeping is to financial asset management.

The growing popularity of RCM has led to the development of numerous derivatives. Some of these derivatives are refinements and enhancements of Nowlan and Heap’s original RCM process. However, less rigorous derivatives have also emerged, most of which are attempts to “streamline” the maintenance strategy formulation process.


The Need For An RCM Standard

The evolution of the SAE Standard JA1011 Evaluation Criteria for Reliability-Centered Maintenance (RCM) Processes was described by Dana Netherton, chairman of the SAE RCM Committee, in the article “SAE’s New Standard for RCM” (Ref. 4) in the March 7, 2000 issue of Maintenance (U.K.), as follows.

Since the early 1990s, a great many organizations have developed variations of the RCM process. Some, such as the U.S. Naval Air Command with its Guidelines for the Naval Aviation Reliability Centered Maintenance Process (NAVAIR 00-25-403) (Ref. 5) and the British Royal Navy with its RCM-oriented Naval Engineering Standard (NES45) (Ref. 6), have remained true to the process originally expounded by Nowlan and Heap. However, as the RCM bandwagon has started rolling, a whole new collection of processes has emerged that are called “RCM” by their proponents, but that often bear little or no resemblance to the original meticulously researched, highly structured, and thoroughly proven process developed by Nowlan and Heap. As a result, if an organization said that it wanted help in using or learning how to use RCM, it could not be sure what process would be offered.

Indeed, when the U.S. Navy recently asked for equipment vendors to use RCM when building a new ship class, one U.S. company offered a process closely related to the 1970 MSG-2 process. It defended its offering by noting that its process used a decision-logic diagram. Since RCM also uses a decision-logic diagram, the company argued, its process was an RCM process.

The U.S. Navy had no answer to this argument, because in 1994 William Perry, the U.S. Secretary of Defense, had established a new policy about U.S. military standards and specifications, which said that the U.S. military would no longer require industrial vendors to use the military’s standard or specific processes. Instead it would set performance requirements, and would allow vendors to use any processes that would provide equipment that would meet these requirements.

The policy voided the U.S. military standards and specifications that defined “RCM.” The U.S. Air Force standard was cancelled in 1995. The U.S. Navy has been unable to invoke its standards and specifications with equipment vendors (though it continues to use them for its internal work) and it was unable to invoke them with the U.S. company that wished to use MSG-2.

This development happened to coincide with the interest in RCM in the industrial world. During the 1990s, magazines and conferences devoted to equipment maintenance multiplied, and magazine articles and conference papers about RCM became more and more numerous. These have shown that very different processes are being given the same name, “RCM.” So both the U.S. military and commercial industry saw a need to define what an RCM process is.

In his 1994 memorandum, Perry said, “I encourage the Under Secretary of Defense (Acquisition and Technology) to form partnerships with industry associations to develop nongovernment standards for replacement of military standards where practicable.” The Technical Standards Board of the Society of Automotive Engineers (SAE) has had a long and close relationship with the standards community in the U.S. military, and has been working for several years to help develop commercial standards to replace military standards and specifications, when needed and when none existed.

So in 1996 the SAE began working on an RCM-related standard, when it invited a group of representatives from the U.S. Navy aviation and ship RCM communities to help it develop a standard for Scheduled Maintenance Programs. These U.S. Navy representatives had already been meeting for about a year in an effort to develop a U.S. Navy RCM process that might be common between the aviation and ship communities, so they had already done a considerable amount of work when they began to meet under SAE sponsorship. In late 1997, having gained members from commercial industry, the group realized that it was better to focus entirely on RCM. In 1998, the group found the best approach for its standard, and in 1999 it completed its draft of the standard, and the SAE approved it and published it.

After a brief discussion about the practical difficulties associated with attempting to develop a universal standard of this nature, Netherton went on to say:

The standard now approved by the SAE does not present a standard process. Its title is, “Evaluation Criteria for Reliability-Centered Maintenance (RCM) Processes (SAE JA1011).” This standard presents criteria against which a process may be compared. If the process meets the criteria, it may confidently be called an “RCM process.” If it does not, it should not. (This does not necessarily mean that processes that do not comply with the SAE RCM standard are not valid processes for maintenance strategy formulation. It simply means that the term “RCM” should not be applied to them.)

Netherton then quoted Section 5 of the standard published here in the section “Key Attributes of Any RCM Process.”

Key Attributes Of Any RCM Process

Section 5 of SAE Standard JA1011 Evaluation Criteria for Reliability-Centered Maintenance (RCM) Processes summarizes the key attributes of any RCM process as follows:

Reliability-Centered Maintenance (RCM)–Any RCM process shall ensure that all of the following seven questions are answered satisfactorily and are answered in the sequence shown below:

a. What are the functions and associated desired standards of performance of the asset in its present operating context (functions)?

b. In what ways can it fail to fulfill its functions (functional failures)?

c. What causes each functional failure (failure modes)?

d. What happens when each failure occurs (failure effects)?

e. In what way does each failure matter (failure consequences)?

f. What should be done to predict or prevent each failure (proactive tasks and task intervals)?

g. What should be done if a suitable proactive task cannot be found (default actions)?

To answer each of the above questions “satisfactorily,” the following information shall be gathered, and the following decisions shall be made. All information and decisions shall be documented in a way which makes the information and the decisions fully available to and acceptable to the owner or user of the asset.

Worldwide Regulatory Issues

The reaction of society as a whole to equipment failures is an aspect of physical asset management that is changing at warp speed.

The changes began with sweeping legislation governing industrial safety, mainly in the 1970s. Among the best known examples of such legislation are the Occupational Safety and Health Act of 1970 in the United States and the Health and Safety at Work Act of 1974 in the United Kingdom. Laws have been passed in nearly all major industrialized countries. Their intent is to ensure that employers provide a generally safe working environment.

These Acts were followed by a series of more specific safety-oriented laws such as OSHA 1910.119: “Process Safety Management of Highly Hazardous Chemicals” in the United States and the “Control of Substances Hazardous to Health Regulations” in the United Kingdom. Both of these regulations were first promulgated in the early to mid-1990s. They are noteworthy examples of a then-new requirement for the users of hazardous materials to perform formal analyses or assessments of the associated systems, and to document the analyses for subsequent inspection if necessary by regulators.

These two sets of developments represent a steady increase in legal requirements to exercise—and to be able to demonstrate that we are exercising—responsible custodianship of the assets under our control. They reflect the steadily rising expectations of society in terms of industrial safety and we have no choice but to comply as best we can.

The late 1990s have seen even more changes, this time concerning the sanctions that society now wishes to impose if things go wrong. Until the mid-1990s, if a failure occurred whose consequences were serious enough to warrant criminal proceedings, these proceedings usually ended at worst with a substantial fine imposed on the organization found to be at fault, and the matter, at least from the criminal point of view, usually ended there. (Occasionally, the organization’s permit to operate was withdrawn, as in the case of the ValuJet airline after the crash in Florida on May 11, 1996. This effectively put the airline out of business in its then-current form.)

However, following recent disasters, a movement is now developing not only to punish the organizations concerned, but also to impose criminal sanctions on individual managers. In other words, under certain circumstances, individual managers can be sent to prison in connection with equipment failures that have sufficiently nasty consequences.

For instance, in the United Kingdom, John Prescott, the minister of transport, has stated that in light of the official inquiry into the Paddington rail crash (Ref. 7) that occurred in 1999, he will introduce a law for a crime to be called “corporate killing,” part of which will entail prison sentences for specific executives (Ref. 8). In the United States, following the outcry about the accidents involving tire tread separation on SUVs, section 30170 of the “Motor Vehicle and Motor Vehicle Defect Notification Act” was revised in October 2000 to include prison sentences of up to 15 years for “directors, officers or agents” of vehicle manufacturers who commit specified offenses in connection with vehicles that fail in a way that causes death or bodily injury.

There is considerable controversy about the reasonableness of these initiatives, and even some doubt about their ultimate enforceability. However, from the point of view of people involved in the management of physical assets, the issue is not what is reasonable, but that we are increasingly being held personally accountable for actions that we take on behalf of our employers. Not only that, but if we are called to account in the event of a serious incident, it will be in circumstances that could culminate in jail sentences.

Perhaps the most startling legislative developments of all were triggered by an industrial accident that occurred in Australia. Following the Longford gas plant explosion (Ref. 9) in September 1998 in the state of Victoria, the Victorian State Parliament on November 13, 1998 added a new section to the State of Victoria Evidence Act of 1958 which reads as follows:

19D. Legal professional privilege

(1) Despite anything to the contrary in this Division, if a person is required by a commission to answer a question or produce a document or thing, the person is not excused from complying with the requirement on the ground that the answer to the question would disclose, or the document contains, or the thing discloses, matter in respect of which the person could claim legal professional privilege.

(2) The commissioner may require the person to comply with the requirement at a hearing of the commission from which the public, or specified persons, are excluded in accordance with section 19B.

In essence, this amendment suspended attorney/client confidentiality for the purposes of the Longford—and subsequent—official inquiries.

Not only this, but the state governments of Victoria and Queensland are considering legislation to deal with “Industrial Manslaughter (Vic)” and “Corporate Culpability (Qld),” as both governments believe that their current legislation does not deal adequately with industrial incidents causing death or serious injury. Victoria is leading the way after the Longford incident. These proposed laws go further than the laws in the U.K. and the U.S., in that the concept of “aggregation of negligence” is introduced. This allows the aggregation of actions and omissions of a group of employees and managers to establish that an organization is negligent. Both governments have made it clear that if managers and/or a management system fails to prevent workplace death or serious injury, then the responsible manager and/or management team is likely to face criminal prosecution. If the legislation proceeds, penalties of over $500,000 and 7 years imprisonment are proposed.

The message to us all is that society is getting so sick of industrial accidents with serious consequences that not only is it seeking to call individuals as well as corporations to account, but that it is prepared to alter well-established principles of jurisprudence to do so. Under these circumstances, everyone involved in the management of physical assets needs to take greater care than ever to ensure that every step they take in executing their official duties is beyond reproach. It is becoming professionally suicidal to do otherwise.

True RCM Is Faster

An interesting footnote to the debate about streamlined RCM concerns what exactly it is that is ostensibly being streamlined. Nearly all the advocates of streamlined processes compare their offerings to something they call “classical” RCM. However, closer study of what they mean by “classical” RCM reveals that it is often a monstrously complicated process or collection of processes that bears little or no resemblance to RCM as defined in the SAE Standard. In these cases, it is hardly surprising that streamlined RCM is cheaper and quicker than these so-called “classical” fantasies. In reality, if true RCM is applied by well-trained individuals to properly defined and managed projects, it is nearly always quicker and cheaper than the streamlined versions, in addition to being far more defensible and producing far greater returns.




2:58 pm
January 1, 2001

Golden Oldies



Robert C. Baldwin, CMRP, Editor

Not long ago, a long-time reader called to ask how to find a certain maintenance management article on our Web site. He was unable to find it and told me the search capabilities on our site left a lot to be desired. As it turned out, the article was published before our Web site went up, so the article wasn’t posted.


His situation illustrates one of the problems of getting information. Resources are extensive, but not always available via the Internet, and your thought process may be different from that of the people doing the indexing and assigning keywords for the retrieval system. And that system may not be intuitive.

As far as our site is concerned, we simply group articles by subject categories: maintenance management, predictive maintenance, shaft alignment, etc. A visitor enters the site, clicks “Articles Online” on the left column, selects the general topic, and scans the titles and descriptive paragraphs to select the article.

As an example, in “Is Streamlined RCM Worth the Risk?” in this issue, author John Moubray refers to an article in another magazine written by Dana Netherton about SAE’s RCM Standard. If you remembered that MAINTENANCE TECHNOLOGY magazine had carried two articles by Netherton shortly before the ratification of the RCM Standard, you could read them online by visiting the “Articles Online” section of our site, clicking “maintenance management,” scanning down until the article information comes into view, and then clicking on the article title.

The article also can be found using Internet search facilities. The one I use most often is Alta Vista. Entering reliability centered maintenance netherton brings up 3,264,604 hits. Using Alta Vista’s advanced search capability and entering text:reliability centered maintenance and text:netherton brings up 5,961 hits, the top three being Netherton’s company home page and the two articles.

The search can be limited to the MAINTENANCE TECHNOLOGY site by adding and as an additional search parameter. Now, only the two articles appear.

Not everything is available on the Internet. In our case, we have lots of older articles, including three by Moubray on “Redefining Maintenance” listed in our index. To get the index, go to our site and click on “Index” in the box in the upper right corner of the home page. The index is available as a Microsoft Word file or an Adobe Acrobat file.

Index entries, grouped by subject, include title, author, and issue, and cover a host of “Golden Oldies.” MT




2:56 pm
January 1, 2001

Maintenance is NOT a Department


Robert M. Williamson, Strategic Work Systems, Inc.

Modern business has organized itself into a very noncompetitive position. In most manufacturing plants and facilities, there are numerous “departments” with their own organizations and budgets. A maintenance department is often one of those. Unfortunately, maintenance is not a department. Maintenance means “sustaining; preserving good working order, optimum condition, or a level of performance,” not being on call to fix things. Too often, the maintenance department is looked upon as the sole maintainers of equipment, facilities, processes, and buildings. They cannot do it alone anymore.

Equipment has maintenance problems, and the company has a department. Now is the time to re-focus modern business on addressing maintenance problems regardless of the department structures, sharing responsibility for maintaining equipment, facilities, and buildings.

One of the plants we are observing has operated for more than 20 years with a “fix-it” mindset and a maintenance department with a tight budget. Today, for example, it has three air compressors, two of which operate to supply air to the thousands of small air leaks throughout the plant. This condition didn’t just happen overnight. It took 20 years of a typical maintenance approach—fixing things that break and tightening the budget—to get there. The reliability engineer estimates that they budget and spend more than $200,000 each year to operate and maintain these compressors just to supply air to the leaks. Add to that cost the initial capital investment for those extra compressors and the ongoing electrical usage. This represents a controllable expense that the maintenance department was helpless to address because the air leaks are located in the production departments and were not seen as a maintenance problem.

But the plant mindset is beginning to change. The maintenance department was restructured with “reliability leaders” responsible for each of the primary manufacturing areas. Maintenance management and skilled planners and crews now have responsibility for defined areas of the plant. Fourteen months ago, they engaged production and maintenance management along with maintainers, operators, and process quality people to focus on improving the performance and reliability of one of the plant’s most critical constraint processes. It worked! (See Viewpoint 3/00.) Performance and reliability improved significantly, and the gains have been sustained. These new maintenance methods also have begun to spread to other similar processes. The results:

  • Availability continues to climb.
  • Production throughput has more than doubled.
  • Fewer operators are required.
  • Maintenance costs declined nearly 16 percent.
  • A capital project to add another machine was cancelled.
  • Maintainers and operators have more time to focus on preventive maintenance of critical equipment.

Surprisingly, these improvements are not the most significant. The plant now has production management in four different areas applying the same team-based maintenance techniques to their critical processes. A sense of ownership is emerging because production and maintenance are working together to eliminate the causes of poor performance and reliability in a sustainable manner. The plant manager is whole-heartedly endorsing, encouraging, and in 2001 holding the production department leaders responsible for this new maintenance and reliability strategy as part of their business and performance objectives. Their work culture is changing. Wonderful things begin to happen in a work culture when maintenance ceases to be a “department” and emerges as a “responsibility” that everyone shares.

In addition to the significant tangible results, there are numerous intangibles:

  • Communication improves between the maintainers and the operators.
  • Better understanding of the equipment functions develops.
  • More minds look for ways to make equipment easier to operate, inspect, and maintain with fewer problems.

These intangibles obviously lead to more tangible performance results. This is proof enough that operating costs will decline and performance will improve when more business leaders learn that maintenance is not a department but a shared responsibility to preserve equipment, building, and facility condition. MT


2:10 pm
January 1, 2001

Hand-Helds Improve Maintenance Productivity and Data Management

Strong demand for Hewlett-Packard’s (HP) personal computers, printers, high-tech health care instruments, and other products has made the company a leader in the computing industry and one of the country’s most successful corporations. HP’s inkjet manufacturing plant in Corvallis, OR, has played a key role in that growth. The booming demand, however, raised significant challenges for plant infrastructure support, particularly for the 45 facilities technicians at the site.

As the plant expanded rapidly with HP’s success, its management team in May 1998 moved to develop a Total Productive Maintenance (TPM) program to handle the additional work and focus on plant asset management. The TPM program would rely on more accurate and timely data to keep the plant running at peak performance and minimize downtime from system and equipment maintenance. But because of the plant’s quick growth and increasing age, the TPM program began well behind the curve of a world-class maintenance program.

“Our vision was that improved maintenance practices would measurably contribute to reduced costs and better quality of the finished product,” said Thomas J. Woginrich, the plant’s maintenance and reliability program manager. “Our goal was to become an organization continually learning about itself, its customers, and its customers’ needs.”

HP had selected the MAXIMO computerized maintenance management system (CMMS) from PSDI, Bedford, MA, to improve operations, but the plant still suffered from work-order backlogs and inefficiencies due to its reliance on paper work orders. To help build the TPM program and confront these productivity issues, Woginrich and Corvallis management turned to SMART for Maintenance, a handheld computing solution developed by Syclo, Barrington, IL. The solution allows technicians to use HP’s own Jornada Windows CE-based handheld computers as their electronic clipboards, automating every aspect of data collection and dissemination while providing technicians with real-time information from the CMMS.

The decision to deploy SMART as part of the TPM program came after years of staggering growth at Corvallis. In just six years, the plant expanded from four to 11 buildings, covering 2.1 million sq ft. But the plant’s focus was on maintaining manufacturing production throughout the growth, not on cost-competitive maintenance procedures or life-cycle management of plant assets. Corvallis was using its CMMS for project management, which is not its true purpose.

HP realized its paper-based maintenance system was slowing productivity by making inefficient use of its skilled technicians’ time. Tradesmen were spending valuable hours handling work orders and data entry. To combat growing work-order backlogs, HP upgraded its CMMS and rebuilt its workflow processes. Then the company deployed the hand-helds and used its support for Ethernet communication to synchronize the connection to the HP network.

Technicians are required to transmit completed work information from the hand-helds twice a day—which immediately updates the CMMS—and then receive any new assignments or changes. With the off-line capability, technicians are able to interact with the CMMS untethered, delivering mobile access for complete automation of maintenance processes.

Rapid deployment of the hand-helds and their easy-to-use technology helped the plant meet its TPM goals, realizing swift productivity gains by giving its technicians more wrench time. Tradesmen benefited from immediate access to critical data to handle both critical-response tasks and preventive maintenance. After implementation, each of the plant’s 45 technicians is saving an average of 43 min per day—the equivalent of adding five technicians per day. Those savings have led to the elimination of mounting work-order backlogs.
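
The technician-equivalent figure can be checked with a quick calculation. The sketch below is illustrative only: the function name and the productive-hours values are assumptions, not numbers supplied by HP or Syclo. Only the 45 technicians and the 43 min saved per day come from the article.

```python
def technician_equivalents(technicians, minutes_saved_each, productive_hours_per_day):
    """Convert daily time savings into an equivalent number of added technicians.

    productive_hours_per_day is an assumed value; the article does not state
    how many productive hours a technician works per day.
    """
    total_hours_saved = technicians * minutes_saved_each / 60.0
    return total_hours_saved / productive_hours_per_day


# Figures from the article: 45 technicians, 43 min saved per technician per day.
hours_saved = 45 * 43 / 60.0  # total hours recovered across the crew each day

# Assumption: a full 8-hour day of productive time per technician.
full_day = technician_equivalents(45, 43, 8.0)

# Assumption: about 6.5 productive hours per day (breaks and travel excluded).
wrench_time_day = technician_equivalents(45, 43, 6.5)

print(f"Hours saved per day: {hours_saved:.2f}")
print(f"Equivalent technicians at 8.0 h/day: {full_day:.1f}")
print(f"Equivalent technicians at 6.5 h/day: {wrench_time_day:.1f}")
```

Under these assumptions, the crew recovers a little over 32 hours per day. That is about four technicians at a full 8-hour day, and it rounds to the five quoted above if each technician is taken to have roughly 6.5 productive hours per day.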

HP has been able to eliminate the inevitable errors that accompany paper-based work order systems, allowing the company to keep more accurate records on its parts and inventories. With the success of the TPM initiative, asset life reliability at the plant increased by 47 percent, while costs associated with operations and maintenance of the plant infrastructure dropped by 25 percent. In addition, support staff that once handled paper work orders were reassigned to more productive administrative functions.

Woginrich noted that the flexible, collaborative approach to providing HP with a strong maintenance management solution was crucial to the success of its new TPM program.

“Syclo was focused on developing the right solution for our situation,” Woginrich said. “With its help, we have fulfilled our commitment to becoming agile and mobile in proactively meeting the challenges of our ever-changing business environment.” MT

Information supplied by Syclo, 1250 S. Grove Ave., Suite 304, Barrington, IL 60010; telephone (847) 842-0320
