

2:46 am
September 2, 2004

Asset Management Approach Transforms Maintenance

From its base in Evansville, IN, Vectren Resources, a $2 billion utility, provides electricity or natural gas to nearly two-thirds of Indiana and 16 counties in Ohio, servicing more than 1 million energy users.

On the electric power side of its business, Vectren operates two power plants, which together use five coal-fired units and six gas turbines to provide about 1400 MW of generating capacity. Keeping these plants running continuously and efficiently is vital. Vectren also sells power on the wholesale market, where availability has tremendous financial implications.

“We spend a lot of money maintaining our plants to be sure that they are available when the demand calls for it. Our challenge is to reduce our overall operations, maintenance, and capital spending while keeping the availability high,” said Vectren reliability engineer, David Reherman.

Streamlining the maintenance process
To meet this challenge, Vectren management set out to eliminate inefficiencies in scheduling maintenance, ordering parts, and keeping track of completed work. The process of revamping maintenance operations at its electrical power facilities began in mid-1999 with an evaluation of procedures. To complicate matters, data, labor, and parts are managed in two separate locations about 30 miles apart in the Evansville area. The A.B. Brown and F.B. Culley facilities combined have more than 6000 unique assets and 33,000 individual spare parts. Changing the management of these assets and parts impacts the daily work activities for about 225 maintenance personnel.

According to company officials, the evaluation process was driven primarily by the need to streamline maintenance work and equipment processes more than by any explicit requirements for software functionality. “We weren’t necessarily looking to switch from our old software. It was really more a case of asking ourselves how we could develop a maintenance model that would help us drive down the cost of doing unplanned work and allow us to improve the ratio of planned to unplanned work,” said Gary McCarty, maintenance supervisor at the A.B. Brown facility. “At the end of the process, though, it did become clear that a new maintenance software solution was in order. The trick was to find a tool to help us do this without compromising the process.”

New program benefits
After evaluating several offerings, the Vectren management team chose Avantis software and the Avantis InRIM (Industrial Rapid Implementation Methodology) from Invensys Avantis, Burlington, ON. The software provides an enterprise database that enables Vectren to capture and analyze data about current and historical maintenance work. It also helps keep track of the cost of maintaining any piece of equipment, work orders and labor time, and key performance indicators (KPIs) and benchmarks throughout the maintenance operation.

The software also enables maintenance personnel to interface with other key programs—notably Oracle on the financial reporting, procurement, and accounts-payable side, and with the workforce time and resource planning/utilization software that the company uses.

The Windows look and feel of the software was a plus and helped Vectren to implement the rapid acquisition of replacement parts. “Previously, we would buy a part and it would just sort of disappear. Now, using the Purchase Item Catalog feature, it’s easy to find that part, to purchase it if it is not in our stock, and to keep track of it afterward,” said McCarty.

The software also helped the plants achieve a breakthrough in tracking data at the work order level. Previously, separating expense data from information about which planned and unplanned work had been performed was a source of considerable frustration.

“Being able to track information about planned and unplanned work was one of the key performance indicators we were trying to improve on,” said Reherman. “Because of the way the software interfaces with the workforce time tracking program, we are able to get to this data more easily.”

Continuous improvement
After nine months of preparation, the system went live in July 2002.

“Using the program, we publish key performance indicators every two months. We look at work task backlog trends. We track the priority of work completed, so we know whether it was emergency, break-in, or scheduled and whether the work was preventive or corrective. We also get the top 10 system costs year to date and top 10 entity costs year to date that help us determine how to spend our operations and maintenance and capital dollars,” said Reherman.

With this type of information, Vectren knows that since July 2002, for example, it completed 12,000 work tasks; 7300 were corrective repairs and 3500 were preventive maintenance work tasks. In addition 250 safety work orders were completed.
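As a rough illustration of how such counts turn into a work-mix indicator, the sketch below computes each category's share of the completed work tasks quoted above. The function name and the "unclassified" bucket are assumptions made for this example; the article does not break down the tasks that are neither corrective nor preventive, and it does not describe how the software itself calculates its KPIs.

```python
def task_mix(total_tasks, counts):
    """Return each category's share of total_tasks, plus the unclassified remainder."""
    if total_tasks <= 0:
        raise ValueError("total_tasks must be positive")
    shares = {name: count / total_tasks for name, count in counts.items()}
    # Whatever the named categories do not account for is reported separately.
    shares["unclassified"] = (total_tasks - sum(counts.values())) / total_tasks
    return shares


# Figures quoted in the article for work completed since July 2002.
mix = task_mix(12_000, {"corrective": 7_300, "preventive": 3_500})
for name, share in mix.items():
    print(f"{name}: {share:.1%}")
```

Tracked period over period, the same shares would show whether the ratio of preventive to corrective work is moving in the intended direction.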

“As planners and supervisors, we want to provide to the workforce the means to efficiently perform the task at hand by giving them all of the resources and information needed to do the job. The new software facilitates this and allows us to track what work was done, the costs involved, failure analyses, and even statistical information for us to do this job smarter if it shows up again. As the reliability engineer, I will be looking at and doing failure analyses to improve equipment performance and life expectancy,” said Reherman.

The success at Vectren is strong testimony to the value of approaching asset management solutions methodically with adequate planning, collaboration, and the right software solutions. Vectren has now completely transformed the maintenance operation at its electrical facilities. Its 225 maintenance workers have changed the way they work and are now actively engaged in the process of continuous improvement. They have better information about what work they have done, what needs to be done, and what can be done to make the process better. MT

Information supplied by Invensys Avantis, 880 Laurentian Dr., Burlington, ON L7N 3V6; telephone (905) 632-6015; e-mail



9:15 pm
September 1, 2004

Drive Package Cuts Auto Assembly Conveyor Downtime


The new drive package at Ford’s Michigan Truck Plant includes (front to back) Stearns 333-3 armature-actuated electric disc brake and a 2 hp right-angle helical bevel Rexnord gearmotor with High Efficiency EPAct inverter duty electric motor. Guards are removed to show the sprocket and chain.

A versatile drive replacement package that handles a variety of applications has reduced downtime, extended service life, and cut replacement costs at a major automotive assembly plant.

The package includes a helical-bevel gearmotor with a hollow shaft and motor brake. It replaces 10 different drive configurations on power roll bed conveyors and related material handling equipment. This project illustrates the ongoing effort to empower the company’s United Auto Workers (UAW) skilled trades personnel to partner with Tier 2 suppliers to solve maintenance problems, thereby increasing mean time between failures (MTBF) on production equipment.

Skids hold vehicles on conveyors
Ford Motor Co.’s Michigan Truck Plant, Wayne, MI, produces the Expedition and the Lincoln Navigator. Skids holding the vehicles in various stages of manufacture are transported on a series of power roll bed conveyors throughout the entire plant, from their start in the body shop, through the paint shop to their completion in the final assembly shop.

Drive systems for these conveyors consist of electric motors, speed reducers, chain and sprocket drives, and electric motor brakes that hold the skid-mounted bodies in place while assembly operations are performed. Similar gearmotor packages are used on Marmac lift tables in the paint shop as well as on pivot tables.

Problems developed
Previously, the number and variety of these gearmotor packages made spare parts stocking and replacement difficult, expensive, and time consuming. The plant used several different speed reducers, each in right- and left-hand drive configurations with different shaft and sprocket sizes, as well as different brands and sizes of gearmotors for roll beds, with a variety of horsepowers, ratios, and frame sizes. All of this added to the possibility of confusion when a maintenance worker went to get a replacement.

Another problem encountered with these conveyors was worn keyways, not only on the output shaft where the sprockets are mounted, but also on the input where the motor is mounted. These had been the weakest link. Less than half of the power roll bed conveyors are installed with a variable frequency drive (VFD) that gives them a soft-start capability. Without a VFD, the cycling on and off caused these keyed shafts to take a beating.

The accumulated stresses took a toll on the shafts and keyways of the gearboxes, and because many were behind braces or otherwise difficult and time consuming to inspect, frequent failures occurred.

The increased service factor of the new style gearmotor has solved this early-failure problem on the input shaft. On the output shaft, the problem has been solved by using keyless locking devices that hold the sprockets to the shaft with a fit in excess of a normal press fit.

The extra service factor capacity of the new drive package is more than enough to handle even the most severe applications. The worst cases were on the roll beds with a 1000 lb truck on urethane rollers and no VFD. Steel rollers would allow a load to slide and dampen the effect on the gearmotors, but with the urethane rollers, the load just stopped hard without sliding.

Another problem with the existing worm gear reducers occurred when a malfunction prevented the conveyor from running under its own power or caused a skid to be positioned incorrectly. In these cases, interaction of the worm gear sets made it impossible to push the loaded skid by hand in either direction. Backdriving, changing the reducer shaft position by rotating the output shaft, is simply not a capability of worm gear reducers. Often, it took four or five people to lift the skids up and pull them back into position.

In a few cases, when shafts failed, people simply removed the drive chain and pushed the vehicle bodies along the conveyor by hand temporarily, rather than shut down the line. The problem was especially acute in robotic areas, where light curtains prevented a maintenance worker from being in the area unless the equipment was shut down.



The adapter base speeds installation and alignment of the new gearmotor package in place of several others. Slotted holes make aligning sprockets easier.

Developing a universal package
After some research, it was determined that a single basic package would fit a broad range of applications by standardizing on a single hp size, ratio, and gearbox size, with only a few modifications in mounting. The package is built around a 2 hp right-angle helical bevel gearmotor from Rexnord, Milwaukee, WI, which incorporates a 2 hp High Efficiency EPAct inverter duty electric motor equipped with a Stearns 333-3 armature-actuated electric disc brake. The output shaft of the reducer drives a sprocket and chain that provides further reduction to the proper speed for each conveyor and pivot or lift table.

Among potential solutions, almost all manufacturers that mount integral motors directly onto their gear reducers use a key and pinion on the motor shaft. This new gearmotor uses a press fit to secure the pinion gear on the motor shaft which eliminates the possibility of motor shaft keyway failure. On the output shaft, the sprocket uses a Ringfeder shaft locking device, which eliminates output shaft key failures.

In order to support the “low or no” maintenance concept for this new drive package, the Stearns dc brake that was selected requires virtually no maintenance for 3 million cycles, an estimated 12-year life in this application. The brake is direct acting, with only two moving parts.

In operation, when electric power is applied, the armature is pulled by the electromagnetic force in the magnetic body, which overcomes spring action. This allows the brake’s friction disc to rotate freely. When power is interrupted, the electromagnetic force is removed and the pressure spring mechanically forces the armature plate to clamp the friction disc between itself and the pressure plate.

This develops the force necessary to overcome any inertia that could cause the loaded conveyor to continue to move. In this application, the brake’s primary function is to hold the skid-mounted body in position until the operation at that location is complete, then release it so it can move on to the next station.

One problem with the previous motor brake was the failure of rectifiers, which were difficult to replace. Because they were in the end bell of the brake, and a brace was in the way, workers could not get to them. With the Stearns brake, the rectifier can be located either in the cabinet with the VFD or in the motor terminal box.

Now, if a rectifier has to be replaced, it takes only about 5 minutes. The brake originally was designed for high-cycling applications in the food and beverage industry and is one size larger than the application normally would require, which ensures service-free operation for 3 years under the plant’s demanding production conditions.


Benefits of new package
Among the benefits of the gearmotor are its greater torque and horsepower capacity, which provide a higher service factor and longer life, and its hollow double-tapered bushing output shaft design. It is easy to convert to either right- or left-hand mounting, and interchanges with previous drives by using adapter plates that are furnished by the gearmotor manufacturer. The Class 12 helical bevel gearing reduces energy costs significantly and, unlike worm gear drives, can be back-driven manually when necessary.

With the previous worm gear drives, there was no easy way to push a skid backwards if it had to be moved. Now with the helical gearing, all the workers have to do is pull the brake release, and the skid can be moved freely in either direction. This saves time and eliminates waiting for four or five people to come and help move it.

Although solid-shaft gearboxes are less expensive initially and therefore are used for original installations, the versatility and ease of replacement make the hollow-shaft gearboxes less expensive overall as a retrofit item. With this hollow-shaft output feature, maintenance workers have the ability to change shaft sizes, from 1 in. to 1 1/2 in., so they do not have to stock multiple shafts.

For applications that require a specific shaft size, it is not necessary to buy a different gearbox but only an inexpensive bushing kit. The new gearmotor package accommodates seven different double-tapered bushing sizes and shaft diameters. While gearmotors with hollow output shafts are traditionally shaft-mounted, this application is unique because this gearmotor can be foot-mounted.

Previously, the company had at least 10 different combinations of gearbox brands, motor horsepowers, ratios, and mounting configurations, making it necessary to keep 10 spare gearmotors on hand at all times to be covered in case of failure. The hollow output shaft can be used for either right or left hand, so it cuts the required stock in half.


The new drive package has allowed the plant to reduce its inventory. By standardizing on a single gearmotor design, Ford can maintain a small inventory of different shaft sizes (inset) and two adapter plates that make it easy to replace gearmotors when needed. The versatility of the drive package allows stocking only a small number of replacements that will fit all applications, using different size bushings to accommodate various shaft sizes.

Now maintenance personnel can create their own shaft, slide it in, and put a sprocket on it. They can still foot-mount the gearbox rather than shaft-mount it, but now they can use an interchangeable shaft. They also can do away with the different ratios by changing the sprockets and keeping the speed the same within a few feet per minute. The steel rollers used on most conveyors allow enough slip to take up the slight speed differences.


By standardizing on a 1 1/4 in. shaft diameter hollow shaft for all replacements, the maintenance department now can inventory a small quantity of replacement shafts and sprockets and a few replacement gearboxes in the same configuration, all with a 30:1 ratio.
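The sprocket substitution described above can be sketched as a small calculation: with the gearbox ratio fixed at 30:1, a different driven sprocket makes up for the ratio of the drive being replaced. The motor speed, roller diameter, old 40:1 ratio, tooth counts, and function names below are all hypothetical values chosen for illustration; none of them come from the plant.

```python
import math


def conveyor_speed_fpm(motor_rpm, gearbox_ratio, drive_teeth, driven_teeth, roller_diameter_in):
    """Surface speed of a chain-driven roller in feet per minute."""
    roller_rpm = motor_rpm / gearbox_ratio * drive_teeth / driven_teeth
    return roller_rpm * math.pi * roller_diameter_in / 12.0


def closest_driven_sprocket(target_fpm, motor_rpm, gearbox_ratio, drive_teeth,
                            roller_diameter_in, stocked_teeth):
    """Pick the stocked driven sprocket whose resulting speed is nearest the target."""
    return min(
        stocked_teeth,
        key=lambda teeth: abs(
            conveyor_speed_fpm(motor_rpm, gearbox_ratio, drive_teeth, teeth, roller_diameter_in)
            - target_fpm
        ),
    )


MOTOR_RPM = 1750          # assumed motor speed
ROLLER_DIAMETER_IN = 2.5  # assumed roller diameter

# Hypothetical existing drive: 40:1 reducer with 20-tooth sprockets on both shafts.
old_speed = conveyor_speed_fpm(MOTOR_RPM, 40, 20, 20, ROLLER_DIAMETER_IN)

# Replacement: the standardized 30:1 gearmotor, keeping the 20-tooth drive sprocket.
teeth = closest_driven_sprocket(old_speed, MOTOR_RPM, 30, 20, ROLLER_DIAMETER_IN,
                                [24, 26, 27, 28, 30])
new_speed = conveyor_speed_fpm(MOTOR_RPM, 30, 20, teeth, ROLLER_DIAMETER_IN)
print(f"driven sprocket: {teeth} teeth, speed change: {new_speed - old_speed:+.2f} ft/min")
```

With these assumed numbers, the nearest stocked sprocket keeps the conveyor within a foot per minute of its original speed, the kind of small difference the article says roller slip can absorb.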

Because the new reducers are a helical bevel gear design, they will transmit torque more efficiently, with a higher output capacity than the previously used worm gear drives. This allowed Ford to standardize on this same 2 hp motor and brake, instead of the 3 hp motor used previously, on a 90 deg pivot table. The helical bevel reducers have a substantially higher efficiency than the previous worm gear reducers, which translates into a significant annual energy savings.

In addition, many gearboxes were failing every 3-6 months, incurring both the expense of a replacement unit and the 2-hour downtime cost every time one needed to be replaced. By contrast, some of the 29 new drives installed to date have been in service for as long as a year and a half without a problem. One unit was taken apart twice and inspected but did not show any measurable wear.

Easy installation
To make installation of the new drive packages easy, adapter plates with slotted mounting holes are provided. Only two different plates cover all 10 previous configurations, and the slotted holes simplify alignment. When the plant was built, it used two different roll bed designs, each with a different type of drive mounting. Now, a maintenance worker needs to know only the type in order to replace it quickly.

Formerly, aligning the sprocket took another 20-30 minutes. It was necessary to loosen the set screws and move the sprocket back and forth, and the set screws usually ended up on the bottom where they could not be reached easily. Now, the slotted adapter allows the worker to leave the mounting bolts loose until everything is lined up and then tighten the bolts down. To ensure uniformity, Rexnord sent a team to the Ford plant and trained maintenance workers and millwrights on all three shifts in the most efficient way to replace existing drives with the new design.

As a result of the success at the Michigan Truck Plant, two other Ford plants also are implementing the use of similar gearmotor packages. In addition, it is being shared with Ford plants worldwide as a Ford Best Practice. Although there may be some slight differences in weight or configuration, the body shops at all plants use the same type of roll bed and perform similar operations, such as installing doors. MT

Information supplied by Gene W. Pokes and Donna Akers, Rexnord, 4701 Greenfield Ave., Milwaukee, WI 53214; (414) 643-3000



7:04 pm
September 1, 2004

Assessing Your Training Needs

Facing the Facts About Maintenance Skills

• Most companies do not have fully skilled maintenance personnel.
• It is hard to fire everyone who is incompetent.
• Hiring skilled maintenance personnel is difficult.
• Most repetitious equipment problems that cost companies billions of dollars a year are a direct result of skill deficiencies.
• A person that feels competent is a better worker and more easily motivated.
• Often maintenance personnel are disciplined by managers because of skill deficiency, not because of a lack of concern or commitment.
• People become frustrated or stressed when they do not know the proper way to do a specific task.
• Companies spend millions of dollars a year on maintenance training without regard to the results expected from it or without a way of measuring results. (Money spent does not always equal value received.)



How do you know where to start with maintenance skills training? For many of us, that’s the million-dollar question. That training is needed is usually self-evident. But what kind of training, in which areas, and how much training are questions not easily answered. That’s what a needs assessment is about.

In the beginning
The first step in a needs assessment is to identify the problem. Then a needs assessment can determine if training will provide the answer.

As management looks at all aspects of its maintenance organization, it needs to find the answers to some basic questions:
• Will training resolve my problem?
• How much money will I save by implementing this training program?
• How much will the training cost?
• Is there a payback on this training?

Some hints at the answers to these questions can be found in a study, funded by the U.S. Department of Education with the Bureau of the Census, to determine how training impacts productivity. Some of the eye-opening results were:
• Increasing an individual’s educational level by 10 percent increases productivity by 8.6 percent.
• Increasing an individual’s work hours by 10 percent increases productivity by 6.0 percent.
• Increasing capital stock by 10 percent increases productivity by 3.2 percent.

Of course, training alone is not sufficient. In most cases, a lack of training is only part of the real problem: the lack of an organized and disciplined maintenance process. The development and implementation of a maintenance skills training program must be part of a well-developed strategy. Skill increases that are not utilized properly will result in no changes. Once an individual is trained in a skill, he must be provided with the time and tools to perform this skill and must be held accountable for his actions.

Will training solve my problem?
To answer the question, we must look into the problem. We know from research that 70 percent of equipment failures are self-induced—that is, caused by the introduction of human error.

Not all self-induced equipment failures are maintenance related. Some will be induced by operator error, by being bumped by vehicles or other equipment, etc.

Work orders are the best source of information to determine self-induced equipment failures. We must identify the true cause of the failures by randomly sampling the work orders of equipment breakdowns over a three-month period. The question to be answered: Was lack of skill (self-induced failure) the problem?

If lack of skill was the major problem, then you can easily estimate the losses due to lack of skill. First, add together the cost of production losses, the cost of maintenance labor, and the cost of repair parts. Then multiply this sum by the percentage of maintenance labor hours attributable to emergency self-induced breakdown work orders. The final figure will be a rough indication of what your plant skills deficit is costing you.
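The estimate above is simple enough to sketch in a few lines. In this example, the dollar amounts and the sampled work orders are hypothetical, the function names are invented for illustration, and the sample is assumed to be representative of all maintenance labor hours.

```python
def self_induced_share(sampled_work_orders):
    """Fraction of sampled labor hours spent on emergency self-induced breakdowns.

    sampled_work_orders: list of (labor_hours, is_emergency_self_induced) pairs.
    """
    total_hours = sum(hours for hours, _ in sampled_work_orders)
    if total_hours <= 0:
        raise ValueError("sample contains no labor hours")
    flagged_hours = sum(hours for hours, flagged in sampled_work_orders if flagged)
    return flagged_hours / total_hours


def skills_deficit_cost(production_losses, maintenance_labor, repair_parts, share):
    """(production losses + maintenance labor + repair parts) x self-induced share."""
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    return (production_losses + maintenance_labor + repair_parts) * share


# Hypothetical random sample of work orders drawn from a three-month period.
sample = [(4.0, True), (2.0, False), (6.0, True), (8.0, False)]
share = self_induced_share(sample)

# Hypothetical annual costs in dollars.
cost = skills_deficit_cost(400_000, 250_000, 150_000, share)
print(f"self-induced share: {share:.0%}, estimated skills deficit cost: ${cost:,.0f}")
```

The result is only as good as the sample; a larger random sample of work orders gives a more trustworthy share.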

Perform a skills assessment
The skill level of the maintenance personnel in most companies is well below what the industry would say is acceptable. The technical training division of Life Cycle Engineering has assessed the skill level of thousands of maintenance personnel in the U.S. and Canada. The assessments indicate that 80 percent of these maintenance personnel scored less than 50 percent in the basic technical skills required to perform their jobs.

A maintenance skills assessment is a valuable tool in determining the strengths and weaknesses of a given group of employees in order to design a high-impact training program that targets those documented needs. The skills assessment should be based on the critical skills.

Maintenance personnel have often found it difficult to upgrade their technical skills because much that is available is redundant or does not take their current skill level into consideration. The assessment is designed to eliminate those problems by facilitating the construction of customized training paths for either individuals or groups based on demonstrated existing knowledge and skills.

When the assessment is used in conjunction with a job task analysis, a gap analysis can be performed to determine both what skills are needed in order to perform the job effectively and what skills the workforce presently has. All training must be based on a job task analysis.

You must then fill the gaps with training that is performance based. This detailed analysis identifies the exact tasks needed in each skill area so that all training is developed based on the actual job requirements. Gap analysis also ensures that training is Equal Employment Opportunity Commission compliant.
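A gap analysis of this kind amounts to comparing two lists: what the job task analysis requires and what the assessment shows. The sketch below assumes both are expressed as percentage scores per skill area; the skill names, thresholds, and function name are hypothetical.

```python
def skill_gaps(required, demonstrated):
    """Return the shortfall for every skill scored below its required level.

    required: skill -> minimum score from the job task analysis.
    demonstrated: skill -> score from the skills assessment (missing means 0).
    """
    gaps = {}
    for skill, minimum in required.items():
        score = demonstrated.get(skill, 0)
        if score < minimum:
            gaps[skill] = minimum - score
    return gaps


# Hypothetical requirements and assessment results for one mechanic.
required = {"shaft alignment": 80, "bearing installation": 80, "hydraulics": 70}
demonstrated = {"shaft alignment": 45, "bearing installation": 85}

gaps = skill_gaps(required, demonstrated)
for skill, shortfall in sorted(gaps.items(), key=lambda item: -item[1]):
    print(f"{skill}: {shortfall} points below requirement")
```

Sorting by shortfall puts the largest gaps first, which is one simple way to prioritize a training path for an individual or a group.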

Three aspects of assessment
Each skill area in a skills assessment should have three components:
• Written: Identifies the knowledge required for a specific skill. Theories, principles, fundamentals, vocabulary, and calculation should be among the skills tested.
• Identification: Assesses knowledge in specific skill areas. Employees are asked to name components and explain their uses in this oral assessment.
• Performance: Assesses the critical skills required. To analyze this aspect, employees carry out typical maintenance tasks in accordance with generally accepted work standards.

The written assessment may be proctored by the plant’s own personnel, but certified assessors from an outside agency or a local technical school should perform the identification and performance portions of the skills assessment. This practice ensures that the assessor does not have preconceived notions about what someone knows. Here is an example of why this precaution is important: During an assessment at a paper mill, the maintenance manager pointed to one of his employees and said, “See that man? He is the dumbest mechanic I have.” The results proved otherwise. Out of 250 mechanics, he rated as the fifth most skilled.

The assessment data should be analyzed and compiled into a series of reports that depict scores in three ways:
• Company summary, showing a composite of all personnel tested
• Skill results, showing the scores of all personnel by subject area
• Individual results, showing scores of all tests by each person
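The three report views can be produced from a single table of results. The sketch below assumes each result is a (person, skill area, score) record and treats the company composite as the average score per skill area; that averaging rule, the names, and the scores are assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean


def build_reports(results):
    """Build company, skill, and individual views from (person, skill, score) records."""
    by_skill = defaultdict(dict)
    by_person = defaultdict(dict)
    for person, skill, score in results:
        by_skill[skill][person] = score
        by_person[person][skill] = score
    # Company summary: one composite (average) score per skill area.
    company_summary = {skill: mean(list(scores.values())) for skill, scores in by_skill.items()}
    return company_summary, dict(by_skill), dict(by_person)


# Hypothetical assessment results.
results = [
    ("Mechanic A", "hydraulics", 40),
    ("Mechanic B", "hydraulics", 60),
    ("Mechanic A", "electrical", 70),
]
company_summary, skill_results, individual_results = build_reports(results)
print(company_summary)
print(skill_results["hydraulics"])
print(individual_results["Mechanic A"])
```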

The results should be shared with company management as well as with the individuals tested.

The assessment report becomes a benchmark study on the status of your existing maintenance workforce and is useful as the tool against which to measure progress or as the profile against which to hire new employees in order to round out the department.

After completion of the assessment process, you can begin to establish performance standards for each employee or for the group, develop a training plan to address the identified needs, develop curriculum to meet training goals, and deliver training in the targeted skill areas.

Increasing pressure to improve productivity and reduce costs is forcing organizations to search for innovative solutions. Targeted training is both effective and efficient, regardless of whether the goal is to design a full apprentice-to-journeyman program or just identify skills for high-impact brushing up.

Time and money spent on a training needs assessment will help you get the most out of the limited training dollars available by helping identify the training opportunities, allowing money to be allocated effectively. MT

Ricky Smith is the executive director of Maintenance Strategies for Life Cycle Engineering Inc. For additional information, contact Richard Jamison at 4360 Corporate Rd., Charleston, SC 29405; (843) 744-7110



7:02 pm
September 1, 2004

Five Steps to an Online Off-the-Shelf Portfolio of E-learning

Creating an e-learning program for your company can be difficult, time-consuming, and expensive. However, if you make the right choices up front, you can minimize hassle and still realize substantial return on investment.

Today there are hundreds of e-learning companies offering everything from off-the-shelf computer productivity courses to custom online universities. Just sorting through all of the options can eat up a substantial portion of your training resources.

Since we are focused on saving you time and money, our five-step process sticks to three major principles: online, off the shelf, and a portfolio from multiple vendors.

Online
All of the courses for your portfolio will be delivered over the Internet. This makes it easier for users to learn from home or the road.

It also reduces the amount of cooperation you will need from the IT department. While looking at vendors, be sure that the purchasing and tracking also occur online.

Off the shelf
Custom-made courses are fantastic. However, they are also time-consuming, risky, and expensive. If you do not need industry- or organization-specific content, then you are better off going to the off-the-shelf offerings.

While off-the-shelf courses may not be precisely tailored to your company, they can be very cost-effective. The most common types of off-the-shelf courses cover computer skills, management, and regulatory compliance. If someone has written a book on the topic, there is probably an online course for it.

Multiple vendors
Since the e-learning industry is relatively young and changing rapidly, it is unlikely that you will find one vendor that meets all of your needs. One vendor may have fantastic computer skills courses but nothing for managers; another may offer stunning simulations but nothing for project management.

Our approach allows you to cherry-pick the best from each vendor while keeping the total number of vendors down.

The process
Our five-step process will help you identify what you are looking for and get the right courses at the right price while ensuring that your organization actually uses the e-learning once it is implemented.

1. Laying the groundwork. Determine what you are trying to achieve with e-learning and ensure that e-learning is in fact the most appropriate solution. E-learning works best with self-directed learners with intermediate-level Internet skills who have a clear training need. Before researching vendors, ensure the following:
• Your learning objectives can be met by off-the-shelf courses.
• Your users can (and will) use the Internet to not only complete the courses but in some cases purchase them.

It is also important to identify and communicate with your stakeholders at this stage. Be sure that you have discussed your initiative with your HR/training department, IT, the supervisors of the learners, and some of the learners themselves. By including your stakeholders at the beginning of the process, you make it much easier to gain their commitment later.

2. User profiles and selection criteria. The easiest way to come up with your selection criteria is to imagine that you are a potential learner or end-user. Identify the types of learners who will be using the content and create a sample profile for each one. These are called user profiles. A user profile will contain all of the relevant characteristics of that type of learner. It should include:
• Level of computer proficiency
• Software/hardware that they have access to
• Learning needs
• Motivation
• Logistical details (access to company credit card or expense account for online purchases)

Combine all of your user profiles to create selection criteria. This is a list of must-have characteristics that include technical, logistical, and financial considerations. Having an objective list of criteria in hand will make your decisions much easier and faster throughout this process. It is also a useful tool for communicating to your stakeholders that you understand their needs. The selection criteria may contain your learning objectives or you may already have a list of courses that you need to purchase.

3. Supplier short list. Start your search at or a similar Web site. Do not forget to examine your software, hardware, and equipment vendors as they may offer product training online.

At this point you want to eliminate as many vendors as possible, so do not waste time going into the courses. Either the vendor can meet your selection criteria or it cannot. Keep track of who you have eliminated and why to prevent backtracking later on in the process.
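Steps 2 and 3 can be pictured as a simple filter: merge the must-haves from every user profile into one set of selection criteria, then keep a record of why each vendor was dropped. The sketch below is one way to do that; the profile names, criteria, and vendors are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Vendor:
    name: str
    features: set


def selection_criteria(user_profiles):
    """Combine the must-have needs of every user profile into one set."""
    return set().union(*user_profiles.values())


def shortlist(vendors, must_have):
    """Split vendors into those kept and those eliminated, recording what was missing."""
    kept, eliminated = [], {}
    for vendor in vendors:
        missing = must_have - vendor.features
        if missing:
            eliminated[vendor.name] = sorted(missing)
        else:
            kept.append(vendor.name)
    return kept, eliminated


# Hypothetical user profiles and vendors.
profiles = {
    "shift mechanic": {"runs in a browser", "online purchasing"},
    "planner": {"runs in a browser", "online tracking"},
}
vendors = [
    Vendor("Vendor A", {"runs in a browser", "online purchasing", "online tracking"}),
    Vendor("Vendor B", {"runs in a browser", "online purchasing"}),
]

criteria = selection_criteria(profiles)
kept, eliminated = shortlist(vendors, criteria)
print("short list:", kept)
print("eliminated:", eliminated)
```

Keeping the reasons alongside the eliminated names is what prevents backtracking later: anyone who asks why a vendor was dropped gets an answer from the list rather than from memory.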

4. Test the offerings. Now go to each supplier and ask for demonstration accounts so that you can examine the courses in more detail. Once the sales representatives understand what you are looking for, they may be able to make your search more efficient.

When examining the courses, ask yourself if they meet the learning objectives and are appropriate to your organization and audience. A vendor may provide great content but its style may turn off your users. Ask selected end-users and stakeholders to review a few of the demonstration courses and provide feedback. Not only will this ensure that you do not select the wrong content, but it will generate additional buy-in down the road.

Try to use as few vendors as possible; however, an exceptional course may be worth the added hassle of managing one more vendor.

5. Deliver and upgrade. It is crucial to have an implementation plan. A solid implementation plan ensures that users get the benefit of e-learning with as little difficulty as possible. No matter how brilliant your courses are, they are a waste of money if no one uses them. Your implementation plan should account for the following:
• Presentation. How will you present information about the courses (how to order, who should use it, etc.)?
• Motivation. How will you get users in front of the courses?
• Instructions. What instructions will they need to purchase and complete the course? Will they need to keep track of user names and passwords?
• Feedback. How will you know if the program is working (surveys, interviews, supervisor feedback)?

You will need to have your IT department onside, as they will inevitably get calls from confused users. Help them by providing a support contact and communicating the process to your end-users up front.

Creating your own online university is time-consuming and requires all of your organizational and management skills. However, the payoff is in the delivery of the right training to the right people at the right price. MT

Jason Lewis is a consultant at ExperiencePoint and has designed and developed online learning and management training. He can be reached at (602) 488-7786



5:35 pm
September 1, 2004

Maintenance Outsourcing Is the Answer, Or Is It?

There is no single position regarding maintenance outsourcing that is correct for all organizations. With this in mind though, I have always believed that there is tremendous value in retaining core maintenance competencies in capital-intensive industrial environments and developing internal maintenance expertise on equipment that is key to the manufacturing process.

To successfully support physical assets, a high level of knowledge and skill needs to be present, as well as a strong sense of ownership for the performance of the assets. To have an environment that emphasizes positive thinking and to incorporate continuous improvement into the way things are done, it is imperative to look at the pros and cons of maintenance outsourcing before making your decision.

Work identification suggests that there is a minimum maintenance workload associated with the management of every physical asset. If this minimum workload were defined for all assets and an attempt were made to balance the timing of this work, an optimal maintenance resource level for the organization could be determined. From a skill and ownership point of view, it makes sense to have a properly sized workforce to address the workload associated with these assets, generally leading one to recommend that these resources be internal. Peak workloads and noncore maintenance activities, such as carpentry and painting, can be contracted out, providing flexibility and allowing organizations to focus on building a knowledgeable and committed maintenance workforce.

In this case, maintenance contracting refers to an organization hiring external resources to perform maintenance, while the maintenance being performed remains under the direction of the corporation. Maintenance outsourcing means handing over accountability and responsibility for the entire maintenance process to a third party.

In the above scenario, I subscribe to maintenance contracting to address peak workloads and to gain access to specialized expertise. A maintenance outsourcing model could successfully be used to accomplish the same end, where a dedicated group of individuals from an external organization are assigned to the maintenance of specific assets. Outsource personnel could be provided incentives to develop a sense of ownership, training and, in time, the experience to develop expert knowledge and skill.

The caution, however, is that if the personnel are externally owned and controlled, they, and therefore their expertise, could quite probably be lost if and when the supply contract comes up for renegotiation. It is important to note that outsourced resources also typically have a high turnover rate, making it difficult to create continuity with the equipment they are looking after.

There are key factors that are essential to consider prior to making any decisions:

• A major part of the outsourcing controversy stems from a lack of in-depth understanding by management of the many contributions maintenance can make to the success of the enterprise.

• Maintenance is too often thought of as a liability and not an asset; a service and not a business partner. As a result, there is limited investment in people providing the maintenance function. If companies do not invest in their people, they will not achieve a proficient, stable workforce and it will be next to impossible to develop personnel with the knowledge and skills required that allow companies to differentiate themselves from the competition.

• Many companies believe that the value of outsourcing lies in bringing process, technology, and practices to their plant. In reality, most outsourcing firms do not have a true understanding of the concepts of proactive, predictive, and process-based maintenance.

Successful companies that invest in sustainable growth recognize the strategic impact maintenance can have on their business and are prepared to invest in their own people, not abdicate responsibility to a third party.

My core ideology is against maintenance outsourcing. However, I would be the first to admit that in some situations, a business case may exist to outsource. Some examples might include:

• A lack of sufficient skilled trades in a geographic area (ensure the outsourcing organization is not also restricted by geography; otherwise, it is unlikely that you will get a better caliber of professionals for maintenance improvement initiatives).

• The nature of equipment maintenance required is highly cyclical with extended periods of low maintenance demand.

• The nature of the equipment is highly specialized where an external organization has the expertise and it is not cost effective to build it internally.

• The physical size of the facility is too small to invest in a world class maintenance function. MT


5:25 pm
September 1, 2004

Olympic Notes

I had been looking forward to the Olympic Games of 2004 with the hope of being able to see some of the fencing, the sport I’ve been competing in, off and on, for about 50 years. This year’s games were expected to be special because for the first time, the United States was going into battle with a strong chance of winning several medals over the traditional powerhouses of Italy, France, and Hungary.

Unfortunately, I missed most everything because I was traveling and the hotel did not have the channels I needed for middle-of-the-night viewing of an obscure sport. However, I thought I could at least check out some of the action over the high-speed Internet connection in my room, but the Olympic feed did not take American Express, only Visa, which I do not have. Unable to identify myself as an American eligible to view the Olympic Internet feed licensed to the U.S., I had to settle for basic broadcast coverage, and very little fencing.

Now, let the sports metaphors, similes, analogies, and comments begin:

  • Success is not always proportional to available resources. Although the U.S. garnered more medals (103) than any other country, fourth place Australia was able to take home half as many medals (49) despite having less than one-tenth the population. Australians are effective in the maintenance and reliability arena, too. Monash University, Australia’s largest, offers a number of distance learning opportunities in maintenance and reliability engineering at the graduate level in cooperation with the University of Tennessee.
  • Past success is no guarantee of future success. The favored American men’s basketball team went into the competition with a 109-2 record and came out with a 114-5 record, holding on to a bronze medal. If you have a great maintenance and reliability program and your equipment is in great shape, don’t assume it will continue as such. Performance will degrade unless you invest enough energy to resist the downward trend (check out “Time’s Arrow” on page 11).
  • Stars typically start early and train hard. Mariel Zagunis, 19, who won the women’s saber event, earning the first gold medal by an American fencer in 100 years, started fencing when she was 10 years old. Although we cannot begin training our maintenance stars that early, we can certainly train members of our current team, including ourselves, to Olympic standards. Tom Byerley’s comments on page 26 provide some suggestions of where to begin.

What is your training plan for 2008? How about the rest of 2004? MT




5:24 pm
September 1, 2004

Get Your Own E-Mail Address

The Internet has changed how information is accessed and e-mail has changed how information is delivered. These developments have been nothing short of revolutionary, and many people have derived tremendous benefit from this new medium.

In my business as an e-format business information publisher, we are facing huge challenges delivering our message to more than 30 percent of those subscribers who have actually requested it.

It is not hackers, viruses, or worms that block our e-mail messages. It is not the government or any law enforcement agency. Who is blocking e-mails from being delivered to more than 30 percent of the people who request it? It is the kid who runs your corporate computer network. He is empowered to decide what information gets through and what does not.

He may not like e-mail messages that include graphics and photos, so will block all HTML (Web-page type) formatted e-mails. He may not like plain text e-mails that have Web links, so will allow the message through, but will disable the hyperlinks.

Some corporate networks are so ultra-secure that even when a willing recipient and a willing sender cooperate directly to solve the problem, e-mail delivery is still not possible.

Not all the people who run corporate networks are kids with pocket protectors and X-Box fever but they all share a disdain for the flood of junk e-mail, or spam, and are doing the best they know how to control and contain it. Until an effective spam solution is found (don’t hold your breath), these network guardians will likely tighten access—not loosen it.

So what is a person who wants to be plugged into industry information and timely newsletters to do?

It is time to strike out on your own and get a free Web-based e-mail account from Yahoo! or Hotmail by Microsoft, or perhaps Gmail by Google. A list of dozens of free Web-based e-mail accounts is available at

By signing up for your own e-mail account you are now the master of your own communication domain. You get 1 GB of storage, spam filter, anti-virus scanning, and access to your e-mail from any Internet-connected computer from anywhere in the world.

Remember, everything you send or receive on your company’s e-mail system belongs to your employer and can (and will) be used against you if it suits the owner. In addition, you are a de facto company spokesman when using your employer’s domain e-mail. With a public e-mail like Yahoo! or Gmail, you speak for yourself and can expect privacy from your employer and others. When posting questions or comments on public message boards (see Online Maintenance Discussion Forums Offer Peer Advice) use your public e-mail account to avoid attaching your opinions and advice to your employer and speak for yourself.

You also can use your new e-mail address when requesting sales literature or filling out Web forms to avoid vendor contact at work. You can use it to store e-mails you received at work that you wish to keep for the future. If you happen to change jobs, you still have your public address and can maintain seamless communication. You also can use your public address on your updated resume to find a better job, one at a company that does not restrict Internet and e-mail functionality.

There are dozens of industry-based e-mail newsletters that deliver valuable advice and information that can help you do your job better and easier.

At Yahoo! you can even get your own name as an e-mail address for just $30 per year. These premium accounts allow single e-mail messages up to 10 MB in size, whereas most corporate networks lock out any messages over 5 MB, a relatively small file these days.

We generally disagree with anything that restricts communication and we support anything that empowers it. A public e-mail allows you to break the bonds that your employer feels are required to protect the company network. It also allows you to be who you have always been first and foremost—an individual. MT



3:59 pm
September 1, 2004

Time’s Arrow

Insights into maintenance practice seen through the lens of the Second Law of Thermodynamics—a scientific rationale for the existence of the maintenance function.

JOHN MOUBRAY 1949–2004

Author John Moubray died suddenly January 14, 2004 in England where he was to conduct training in RCM II, the comprehensive approach to reliability centered maintenance (RCM) that he developed for determining the maintenance strategy for industrial equipment and systems.

Moubray was a giant in the RCM field, forever championing its deployment in full, with no shortcuts. He was always in the vanguard, pushing the envelope of maintenance and reliability theory. MAINTENANCE TECHNOLOGY is fortunate to have been able to publish a number of his articles and editorials, including “Redefining Maintenance” and “21st Century Maintenance Organization.”

In his Viewpoint editorial “The Maintenance Mission”, he offered the ideal mission statement:

“To preserve the functions of our physical assets throughout their technologically useful lives to the satisfaction of their owners, of their users, and of society as a whole by selecting and applying the most cost-effective techniques for managing failures and their consequences with the active support of all the people involved.”

He served on the committee that developed SAE standard JA1011, “Evaluation Criteria for RCM Processes,” and the committee revising the MSG3 standard under the auspices of the American Air Transport Association. More than 50,000 copies of his book “Reliability-centred Maintenance” are now in print in several languages.

Aladon, the company Moubray founded in 1986, continues to specialize in the application of RCM. Together with a worldwide network of licensees, it has helped clients to apply RCM on more than 1500 sites in 44 countries.

As an evangelist for RCM, Moubray made his presence felt in every meeting he attended, and his legacy will continue to be felt by the RCM community and the maintenance and reliability community at large.

All of Moubray’s articles published by MAINTENANCE TECHNOLOGY are available online

Maintenance has been evolving steadily as a separate management discipline for the past 60 years or so. A remarkable feature of this evolution has been the absence of a clear understanding of or exposition of any sort of scientific basis for maintenance.

As a result, it could be said that right now maintenance is literally “without foundation.” This may be one of the major reasons why so many maintenance departments still struggle to find their true place in organizations that regard maintenance as an expensive overhead that does not provide a satisfactory return on what it costs, and treat it accordingly.

In fact, such a scientific basis does exist. Not only does it exist, but it has been called “the biggest, most powerful, most general idea in all of science.” It is the Second Law of Thermodynamics.

Here is a brief overview of this scientific principle and an explanation of how it clarifies many of the apparently contradictory and sometimes counter-intuitive issues now facing people who wish to formulate cost-effective maintenance strategies. The Second Law demonstrates that, far from being an expensive irritation to be “designed out” wherever possible, maintenance is and will remain a vital and fundamental part of the fabric of modern industrial management for the foreseeable future.

The Second Law of Thermodynamics

The Second Law of Thermodynamics can be defined as follows:

Energy spontaneously tends to flow only from being concentrated in one place to becoming diffused and spread out.*


Fig. 1. Energy from hot bar is diffused into the atmosphere as it cools to lower energy state.

For example, when a red hot steel bar is removed from a furnace, it will cool down (Fig. 1). This happens because the thermal energy in the bar flows out into the atmosphere.

The energy in all types of systems tends to dissipate in this way (unless, as we see shortly, something prevents it from doing so). For instance, when the fuel supply to a gas turbine is shut off, the rotor slows to a stop as its kinetic energy dissipates. Parachutists drift to earth much more slowly than they would without a parachute, because much of the potential energy that parachutists start with is dissipated as the parachute pushes air aside during the descent.

Most people would regard these phenomena as perfectly “natural” because they fit in with what we observe for ourselves. In our daily existence, we observe thousands of such examples of energy being dissipated in accordance with the Second Law.

Our intuitive grasp of the “rightness” of the Second Law—based on endless amounts of personal experience—underpins our psychological sense of time. We sense time passing in the same “direction” as energy spreads out, which is why the Second Law is often referred to as “Time’s Arrow.”

(Note that we would regard it as completely “unnatural” if any of the events discussed above happened in reverse—if the iron bar spontaneously started drawing heat from the atmosphere until it became red hot, or if the turbine started spinning without fuel, or if the parachute spontaneously lifted the parachutist back to the aircraft. These events would all entail energy spontaneously becoming more concentrated rather than spreading out, and we could only imagine this happening if time ran backwards.)


Fig. 2. Energy from paper and oxygen is diffused into the atmosphere as it burns.

Activation energy
Another example of the Second Law in action occurs when anything burns (for instance, paper). Fires dissipate a great deal of energy in the form of heat and some as light. When paper burns, the cellulose in the paper reacts with oxygen to form carbon dioxide and water. The fact that this reaction produces so much energy suggests that the cellulose and the oxygen separately contain more energy than carbon dioxide and water (Fig. 2).

In general, chemicals tend to react if their molecules contain more energy before the reaction than the molecules formed as a result of the reaction.

If the Second Law is true, one might ask why paper does not just catch fire spontaneously when it is exposed to the atmosphere. It does not because the Second Law states that “energy tends to flow spontaneously … .” The key word in this definition is the word “tends.” Energy will succumb to the tendency to spread out only if nothing stands in its way. In practice, something nearly always stands in the way, at least initially.

In the case of the paper, this “something” is the chemical bonds holding the molecules of cellulose together. Similarly, mountains do not just collapse into a heap of sand, because chemical bonds hold the rocks together. Industrial machines do not just fall to pieces spontaneously partly because chemical bonds hold the individual components together, and partly because fastenings—nuts and bolts, screws, welds, rivets, etc.—connect the components to each other.

So how does paper start burning? The answer, of course, is by applying a naked flame. The energy in the flame is sufficient to break the bonds holding some of the cellulose molecules together, in such a way that oxygen is able to combine with the hydrogen and carbon atoms to form carbon dioxide and water. This reaction in turn generates more heat, enough for the paper to continue burning on its own.

The energy needed to trigger the reaction is called activation energy (Fig. 3).

Most systems need some sort of activation energy to trigger the shift from a higher to a lower energy state. The need for this trigger protects these systems from change. (For example, if a mild steel bar is kept in a perfect vacuum and stays absolutely motionless, it will remain unchanged until the end of time. The bar will begin to change only if it is exposed to the external stresses, or activation energies, that are part of the real world.)
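The two conditions described above (a lower energy state must be available, and enough activation energy must be supplied to get over the barrier) can be illustrated with a short Python sketch. The function name and the energy figures are hypothetical and in arbitrary units; they are assumptions for illustration, not measured values.

```python
def describe_change(initial_energy, final_energy, activation_energy, supplied_energy):
    """Say whether a system shifts to its lower energy state (arbitrary units)."""
    if final_energy >= initial_energy:
        # No lower energy state is available, so there is no tendency to change.
        return "no tendency to change"
    if supplied_energy < activation_energy:
        # The barrier (for example, chemical bonds) keeps the system intact.
        return "stable: barrier not overcome"
    released = initial_energy - final_energy
    return f"change proceeds, dissipating {released} units"


# Hypothetical figures: paper plus oxygen at 100 units, combustion products at
# 20 units, and a barrier of 5 units that a naked flame is able to supply.
print(describe_change(100, 20, 5, 0))  # paper lying in air, no flame applied
print(describe_change(100, 20, 5, 8))  # flame applied
```

The sketch treats the barrier as a simple threshold, which is a simplification, but it captures why the paper sits unchanged until the flame arrives.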


Fig. 3. Activation energy to start paper burning is supplied by an open flame.

Another example of the Second Law in action occurs when a person breaks a wooden stick. As force is applied, the stick bends, and the energy level inside it builds up. In this case, energy is being transferred from the person to the stick. The stick breaks when this energy reaches a level sufficient to rupture the bonds in the stick at the breakpoint. This is the activation energy.

As soon as the stick has broken, the energy level in the stick drops back to its previous level; after all, two halves of a stick will make just as much of a fire as the whole stick. (Almost no energy is lost in the stick itself because very few bonds are broken at the breakpoint relative to the total number of bonds in the stick.) However, the energy level of the person will have declined by the amount needed to break the stick. This in turn means that the energy level of the whole system (person plus stick) will have declined by a similar amount because it has dissipated in accordance with the Second Law.

So what has all this got to do with maintenance?

The maintenance function exists because things fail. In other words, if things did not fail, there would be no need for a maintenance function. So in order to establish the connection between the Second Law and maintenance, we first need to consider the relationship between the Second Law and the concept of failure.

The Second Law shows us that failures consist of three elements: a failure process, a failure trigger, and a failed state. It also reveals the importance of the relationship between the initial state of a system and its failed state.

The failure process
The processes by which failures occur involve the dissipation of energy. This is illustrated by the following examples:

• Chemical reactions. Energy is dissipated when failures occur that entail chemical reactions, such as burning or rusting. For example, if the paper mentioned previously was (say) a map that someone needed to read and it was reduced to ashes, it would of course become totally illegible. This would make it a complete failure as a source of information, and we have already seen that the process by which it failed entailed moving from a higher to a lower energy state.

• Breaking. We have seen how energy is dissipated when things break as a result of the application of an external force, such as the act of breaking a stick. If a forklift truck smashes into a pump, the truck slows down at the moment of impact while some of its kinetic energy is used to rupture the metallic bonds that hold the pump together. During the impact, much energy is dissipated in the form of noise and heat. The failure process ends with a stationary (probably damaged) truck and a shattered pump, a system that contains less energy than a moving truck and an intact pump.

• Wear. Wear entails breaking groups of atoms off solid objects. This is essentially the same process as that described in the previous paragraph, except that wear takes place a few atoms at a time. As a result, very many more bonds are broken relative to the total number of bonds in the system than is the case when something breaks at a single point. So unlike the broken stick and the bits of the shattered pump, the energy level of the wear particles and of the worn component will decline quite significantly in addition to the energy level in whatever is causing the wear.

• Falling apart. Energy is also dissipated when things fall apart. For instance, when flanged pipe lengths are bolted together, the act of tightening the nuts induces tension in the bolts, stretching them very slightly and clamping the flanges together. If a nut comes loose, the tension in the bolt is released (energy is dissipated) and the bolt contracts slightly. The clamping system—nut and bolt—is now failed because it no longer exerts the force that clamps the flanges together, and in failing it moves from a higher to a lower energy state.

In all the above cases, the affected systems become disorganized, and in doing so, energy is dissipated while the systems drop from a higher level of energy to a lower level of energy in accordance with the Second Law of Thermodynamics.

Note that the Second Law does not necessarily mean that in making the transition from higher to lower energy levels, systems are always broken down into smaller elements. In some cases, the operation of the Second Law entails simple systems becoming more complex, and dropping from a higher to a lower energy level in the process. For example, this occurs when the free elements of hydrogen and oxygen combine to form water, which is a more complex molecule. It could be argued that the result of this process is a more “organized” system.

However, for the purpose of this discussion, let us apply the term “disorganized” to a system that is organized in some way other than it needs to be in order to function correctly (bearing in mind that this always entails the affected system moving from a higher to a lower energy state in accordance with the Second Law). This leads to two general conclusions:

• The processes by which failures occur entail the disorganization of systems that should be organized in a way that enables them to perform satisfactorily

• The process of disorganization entails the dissipation of energy either on the part of the system that becomes disorganized, or the system that causes it to become disorganized, or both, in accordance with the Second Law of Thermodynamics.

Failure trigger
Although the Second Law states that concentrated energy tends to spread out spontaneously, it was explained previously (1) that more often than not, this tendency is blocked by barriers, usually bonds of some sort, that keep existing systems intact, and (2) that some kind of activation energy is needed to overcome these barriers and cause such systems to start dropping from a higher energy level to a lower energy level.

This is true of all the failure processes discussed. The paper needed a naked flame to start burning. The pump had to be hit by a solid moving object in order to shatter. Two objects need to come into sliding contact in order to initiate wear. Some force needs to be applied to the nut in order to start loosening it (such as vibration or alternating expansion and contraction due to periods of high and low temperature).

Let us call activation energy that initiates a failure process a “failure trigger.”

Failure triggers manifest themselves in countless other ways, such as alternating compressive and tensile stresses breaking the bonds in metallic components, causing fatigue fracture, or water freezing and thawing in the cracks in rocks, forcing the cracks to grow and the rocks, slowly but surely, to disintegrate.

Another common failure trigger is human intervention. For example, an operator might select reverse gear while a vehicle is moving forward, applying sufficient activation energy to break the metallic bonds in the gear teeth and shear them off the hub. A mechanic might apply too much torque to a nut, causing the threads to shear or the bolt to break.

Sometimes the barriers to the dissipation of energy are low enough for failure processes to take place spontaneously, without the application of any sort of activation energy. For instance, the electrical energy in standby batteries dissipates while they are on standby, albeit very slowly, until they reach a failed state. In these cases, the “failure trigger” would simply be listed as “spontaneous.”


Fig. 4. Failure mechanism is comprised of two elements: a failure trigger and a failure process.

Failed state
The above paragraphs suggest that when any system fails, what causes it to do so consists of two elements: a failure trigger and a failure process. Let us call the combination of these two phenomena, a failure trigger followed by a failure process, a “failure mechanism” (Fig. 4). But what exactly is meant by the term “failed”?

Theoretically, the dissipation of energy will end when all substances in the universe have been reduced to their lowest energy state, nothing is moving relative to anything else, and everything is at a uniform temperature. However, that point will be reached only a long way into the future and is hardly relevant right now.

Conversely, most systems can tolerate a small amount of disorganization without causing any problems. An axe can be covered with a thin layer of rust and still chop wood just fine. Components such as turbine blades, piston rings, bearings, pump impellers, drill bits, and crusher liners can tolerate a small amount of wear and still perform quite satisfactorily.

So at what point does the process of disorganization actually become relevant?

The answer to this question depends on what each system is meant to do. What any system is meant to do is determined by the people who own and/or operate it. These people will consider any such system to be failed if it gets into a state where it cannot do whatever they want it to do.

Technically, what the users of any system want it to do is defined as its “functions.” As long as a system continues to perform these functions to a standard considered acceptable by the users, the users will consider the system to be “OK.” If the performance drops below this level, they will consider it to be “failed.” So a failed state is defined as one in which the performance of a system drops below a standard that is acceptable to the users of that system, and the process of disorganization becomes relevant when it reaches this point. This point is also known as a “functional failure.”
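This definition of a failed state amounts to a simple comparison between actual performance and the standard the users require. The Python sketch below illustrates it with a hypothetical pump and made-up flow figures; the function name is also an assumption.

```python
def is_functionally_failed(performance, required_standard):
    """True when performance has dropped below the standard the users accept."""
    return performance < required_standard


# Hypothetical pump that its users need to deliver at least 800 litres per minute.
required_flow = 800.0
for measured_flow in (1000.0, 850.0, 790.0):
    state = "failed" if is_functionally_failed(measured_flow, required_flow) else "OK"
    print(measured_flow, state)
```

Note that the pump at 850 litres per minute has clearly deteriorated from 1000, yet it is still "OK" because the users' standard, not the as-new condition, sets the failure point.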


Fig. 5. The failed state usually lies above the lowest (final) energy state of the system or component.

The failed state usually lies above the lowest (final) energy state that the system or component reaches when the failure process is complete (Fig. 5). For example, the burning map discussed earlier will become illegible (failed from the viewpoint of anyone who wants to read it) long before the paper is reduced to ashes.

Since failure can be defined only in terms of the required functions of a system, a clear understanding of these functions and their associated desired standards of performance is essential before any sensible attempt can be made to analyze failures. This is illustrated by the following example.

The most obvious function of a filter element in a circulating oil system is to remove particles above a certain size from the oil. The oil pressure in this system could fail (drop below acceptable limits) because the filter has removed so many particles that it blocks up.

It could be argued that the particles blocking the filter are actually in a more organized state in the filter membrane than if they were floating around freely in the oil. This might lead to the conclusion either that this failure has occurred as a result of a system becoming more organized, or even that the filter is not failed because its function is to remove the particles.

However, the filter has an equally important second function, which is “to allow at least a certain rate of clean oil to pass through the filter membrane.” To do so, the membrane must have sufficiently large gaps to allow clean oil through without causing an excessive pressure drop. As these gaps get blocked by trapped particles, the differential pressure across the filter rises until the downstream pressure drops below acceptable limits.

In the context of this second function, an organized system is one with large enough gaps to allow the clean oil through, and a disorganized system is one where the gaps have been blocked. (As always, this failure process involves the dissipation of energy, because the particles move from a higher energy state—moving—to a lower energy state—stationary—as they get trapped.)
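The idea that a system is failed only with respect to a specific function can be sketched in a few lines of Python. This is a minimal illustration, not a method described in the article: the names `Function` and `failed_functions` are hypothetical, and the limits of 10 microns and 50 liters per minute are assumed figures chosen only to make the example run.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Function:
    """One thing the users want the system to do, with its performance standard."""
    name: str
    is_acceptable: Callable[[float], bool]  # True while performance meets the standard


def failed_functions(functions, measurements):
    """Return the name of every function whose measured performance is below standard."""
    return [f.name for f in functions if not f.is_acceptable(measurements[f.name])]


# Hypothetical standards for the oil filter (assumed values, not from the article).
filter_functions = [
    # Largest particle allowed through, in microns.
    Function("remove_particles", lambda microns: microns <= 10.0),
    # Minimum flow of clean oil, in liters per minute.
    Function("pass_clean_oil", lambda flow: flow >= 50.0),
]

# A blocked filter still traps particles but no longer passes enough oil.
blocked_filter = {"remove_particles": 5.0, "pass_clean_oil": 30.0}
clean_filter = {"remove_particles": 5.0, "pass_clean_oil": 80.0}

print(failed_functions(filter_functions, blocked_filter))
print(failed_functions(filter_functions, clean_filter))
```

Judged against the first function alone, the blocked filter looks healthy. Only the second function exposes the failed state.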


Fig. 6. The initial energy state must be above the failed state for any system to function.

Initial state
Figure 4 raises two further key points about the behavior of systems. The first point is that for it to be possible for any system to function, the initial energy state must be above the failed state when the system is put into service. Otherwise, the system will be in a failed state right from the outset (Fig. 6).

For example, consider a steel rope on a hoist intended to lift (say) 20 tons. When the rope enters service, it must be capable of lifting more than 20 tons to allow for the deterioration that will inevitably occur once it is in service (mainly due to wear, a process that was discussed earlier). The attribute of the rope that enables it to be used for lifting is the chemical bonds holding the atoms of the metal together. These bonds enable the rope to transmit tension from the hoist to the hook.

If there is sufficient energy in the bonds to withstand a tension of 20 tons, the rope will perform satisfactorily. As the rope wears, its cross sectional area declines until eventually there will be too few bonds left to carry the full load and the rope will snap. However, if the cross sectional area of the rope is too small to contain enough bonds to transmit the tension when the rope enters service—in other words, if it is undersized—then it will fail immediately.

In essence, the Second Law is telling us that when any system is put into service, its initial internal energy levels must be sufficient for it to be able to fulfill the required functions. This may seem to be blindingly obvious, yet it is astonishing how often systems are encountered—or more often, subsystems or even a single component—that are simply incapable of doing what they are supposed to do from the moment they enter service. (In the light of our earlier discussion, it could be said that such systems are not “organized” in a way that permits them to function as intended.)

When this happens, something has usually gone wrong (become disorganized?) during the system design, manufacturing, or installation process. Such systems break down as soon as they are called upon to operate (or shortly thereafter), at which point the defect becomes a maintenance issue because maintenance people usually have to rectify it. In other words, they have to reorganize the disorganized system.

The second point raised by Fig. 4 concerns the size of the gap between the initial state of the system and the failed state. Clearly, a big gap means that more energy has to be dissipated before the system descends into a failed state, so (in very general terms) it will last longer and/or fail less often than a similar system with a small gap.

For instance, in the example discussed above, a rope that is initially capable of lifting 22 tons will be able to tolerate much more energy dissipation (will last longer) than a rope made of the same material whose bonds are capable of withstanding a tension of only 21 tons to begin with.

Here the Second Law is telling us that there must be an adequate gap between the initial energy state of any system and whatever constitutes the failed state. This too may seem to be almost embarrassingly obvious. However, when systems fail “too soon” or “too often,” it transpires again and again that the initial gap is simply too small, so comparatively little activation energy (whether applied in a single dose or a series of tiny doses) is needed to put the system into a failed state.
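The rope example can be put into rough numbers. The sketch below assumes, purely for illustration, that the rope loses lifting capacity at a constant rate of 0.5 tons per year. The function name `service_life` and that rate are assumptions, not figures from the article, and real deterioration is seldom this linear.

```python
def service_life(initial_capacity, failed_capacity, loss_per_year):
    """Years until capacity falls to the failed state, assuming a constant rate of loss."""
    margin = initial_capacity - failed_capacity
    if margin <= 0:
        return 0.0  # initial state is at or below the failed state: failed from the outset
    return margin / loss_per_year


REQUIRED_TONS = 20.0       # the load the users want the hoist to lift
LOSS_TONS_PER_YEAR = 0.5   # assumed rate of deterioration

for initial_tons in (19.0, 21.0, 22.0):
    years = service_life(initial_tons, REQUIRED_TONS, LOSS_TONS_PER_YEAR)
    print(f"initial capacity {initial_tons} tons: {years} years of service")
```

Under this assumption the 22-ton rope lasts twice as long as the 21-ton rope, because its margin above the 20-ton failed state is twice as large, while a 19-ton rope is in a failed state from the moment it enters service.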

In addition to the points discussed above, looking through the lens of the Second Law provides a number of further insights into the world of maintenance.

Age and failure
In the early days of maintenance, it was generally believed that most systems (or at least, most components) could be expected to operate reliably for a period of time, and then fail. Furthermore, it was believed that identical items performing more-or-less the same duty could be expected to fail at more or less the same age. In fact, some failures do indeed show a clear relationship between age and the likelihood of failure.

However, there is now overwhelming evidence that age-related failures are the exception rather than the rule. For example, as discussed earlier, many failures manifest themselves as soon as the affected system is put into service or very shortly thereafter because of design or manufacturing defects. Another large group of failures shows no relationship at all between how long the items concerned have been in service and the likelihood of failure (so-called “random” failures). Yet in spite of the evidence, many people inside and outside the world of maintenance still have great difficulty accepting the concept of random failure.

In fact, the concept of “activation energy” readily explains both random and age-related failures.

In the case of age-related failures, the activation energy is applied in a series of small doses. Each application lowers the internal energy of the system (what might be called its “resistance to failure”) until it reaches a failed state.

For example, when a metallic component that is susceptible to fatigue is subjected to cyclic stresses above a certain level, each stress cycle applies activation energy that weakens the bonds holding the metal together until enough of them break to cause the component as a whole to break. Identical components exposed to similar cyclic stresses (activation energies) are likely to fail after more or less the same amount of exposure—in other words, at more or less the same age.

Similar logic can be applied to failure processes like wear and corrosion, except that each small application of activation energy removes a small amount of material from the affected component until it too reaches a failed state.

On the other hand, many (most?) failures occur when a single dose of activation energy (or failure trigger) is large enough to cause the affected component to fail immediately, or very soon afterwards. The point in time at which this activation energy is applied may have nothing to do with when the affected system was put into service. So if a number of otherwise identical items is exposed to such a trigger, the likelihood that failure will occur in any one period will be the same as in any other period. This gives rise to what is known as a “random” failure pattern. (Think of the forklift truck smashing the pump.)
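The contrast between the two patterns can be sketched with a toy model. Everything here is an assumption made for illustration: the function names, the margin and dose figures, and the 10 percent chance per period of meeting a large trigger. The first function treats age-related failure as equal small doses eroding a fixed margin. The second simulates a population in which each surviving item faces the same chance of a single large trigger in every period.

```python
import math
import random


def age_at_failure(margin, dose_per_period):
    """Age-related pattern: equal small doses erode the margin until none is left."""
    return math.ceil(margin / dose_per_period)


def random_failure_rates(items, trigger_probability, periods, seed=1):
    """Random pattern: fraction of surviving items that fail in each period.

    Each surviving item has the same chance of meeting a single large
    trigger in every period, regardless of how long it has been in service.
    """
    rng = random.Random(seed)
    survivors = items
    rates = []
    for _ in range(periods):
        if survivors == 0:
            break
        failures = sum(
            1 for _ in range(survivors) if rng.random() < trigger_probability
        )
        rates.append(failures / survivors)
        survivors -= failures
    return rates


# Identical items receiving identical doses all fail at the same age.
print(age_at_failure(margin=2.0, dose_per_period=0.25))

# With the same trigger probability in every period, the failure rate among
# survivors stays close to that probability, whatever their age.
print([round(rate, 3) for rate in random_failure_rates(50_000, 0.10, 5)])
```

In the second case, knowing how long an item has been in service says nothing about how likely it is to fail in the next period, which is what makes the pattern "random."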

However, some random failures do occur in situations where failure is caused by repeated small doses of activation energy. For example, properly lubricated rolling element bearings tend to fail at random, despite the fact that the failure trigger is usually cyclic stresses imposed by rollers passing over the main load-bearing section of the outer race, leading to subsurface fatigue failure. Intuition suggests that this failure should be age related. However, large samples of identical bearings performing more-or-less the same duty usually show little or no relationship between age and the likelihood of failure. Three of the main reasons for this are as follows:

• Small defects and/or minor damage prior to or during installation lower the initial state of one bearing relative to another, which means that a slightly damaged bearing has a smaller, sometimes a much smaller, margin for deterioration, and hence will fail much sooner than a bearing with little or no damage.

• Small variations in radial load, alignment, concentricity, the presence of particles, and so on greatly affect the magnitude of the activation energy that is applied in each cycle, which in turn dramatically affects the rate of deterioration of one bearing relative to another.

• Serial triggers which are similar in magnitude could cause one bearing to suffer much bigger changes in state than another because of minor differences in the bearing materials.
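One way to see why small variations in load matter so much is the cube-law relationship commonly used for the basic rating life of ball bearings, in which life is proportional to (C/P)^3, where C is the rated load and P is the applied load. That relationship comes from standard bearing practice rather than from this article, and the function name `relative_life` is hypothetical, so the sketch below should be read only as an illustration of the second point above.

```python
def relative_life(load_ratio, exponent=3.0):
    """Expected life relative to nominal when the applied load is load_ratio times nominal.

    Assumes life is proportional to (C / P) ** exponent, with exponent 3 as
    commonly used for ball bearings (an assumption brought in from outside
    the article).
    """
    return load_ratio ** -exponent


for ratio in (0.9, 1.0, 1.1, 1.2):
    print(f"load x{ratio:.1f} -> life x{relative_life(ratio):.2f}")
```

On this assumption, a load only 10 percent above nominal cuts the expected life by about a quarter, so modest differences between nominally identical installations can produce widely scattered ages at failure.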

Finally, situations where the initial state is below the failed state when the item is new or recently overhauled (as shown in Fig. 6) give rise to the failure pattern known as infant mortality.

At present, a major difficulty that afflicts many attempts to manage failures coherently concerns terminology. This difficulty manifests itself in two ways: words used to describe failures and words used to analyze failure.

The words we use to describe specific failures often refer to quite different aspects of failure. For example, in the context of equipment failure, the word “fatigue” calls to mind both the failure trigger (cyclic stress) and the failure process (separation of metallic bonds, leading to fracture). The same applies to words like “wear” and “corrosion.”

Other words like “break,” “shear,” or “shatter” describe only the failure process (separation of bonds again), without giving any hint about the failure trigger. For instance, the pump casing could shatter because it was hit by the forklift truck as discussed earlier, or because the pump was massively over-pressurized for some reason, or because a manufacturing defect in the pump casing made it incapable of containing normal pressures from the moment it entered service.

Yet another group of words tends to describe a failure mechanism as a whole, or even a group of failure mechanisms, without providing any information about either the failure trigger or the failure process. This group includes words like “seizes” and “fails.”

All of these terms are legitimate, so this discussion is not meant to suggest that we should stop using them. However, the Second Law provides a framework that brings much greater clarity to the meaning of the terms themselves, and also to what they mean relative to each other.

At this point in time, a great many terms are used in discussions about failure, such as “failure mode,” “failure cause,” “failure mechanism,” “root cause of failure,” “functional failure,” “failed state,” “potential failure,” and so on. Different schools of thought sometimes give different meanings to these terms, which adds to the confusion. And this is before we start talking about what could be called the by-products of failure, such as failure effects and failure consequences.

This confusion makes it very difficult for members of the physical asset management community, whether they specialize in maintenance or reliability, or both, to discuss specific incidents without getting lost in thickets of confusing or conflicting verbiage and to adopt universally accepted methodologies for developing failure management strategies. Perhaps the biggest single reason for this situation has been the lack of a coherent, scientific framework for considering the whole subject of failure.

In fact, we have seen that such a framework does exist, in the form of the Second Law. Not only that, but looking at failures in the context of the Second Law suggests a more precise list of terms: failure process, failure trigger, failure mechanism, failed state, failure mode, initial state, potential failure, failure effect, and failure consequence, all of which are defined in the accompanying section “Failure Analysis Terminology.”

One widely used term that does not appear in the list is “root cause of failure.” This is so for two reasons.

First, the term “root cause” implies that it is possible to “drill down” to a final and absolute level of causation when analyzing failures, usually by asking “why” a number of times. In reality, finding the ultimate cause usually turns out to be impossible. What is more, it is usually unnecessary.

For instance, one might ask why the forklift truck discussed earlier hit the pump. The answer would almost certainly be because the driver drove it in that direction. Note that this action also consists of a failure trigger, turning the steering wheel, and a failure process, truck dissipating energy by moving in a direction that performs no useful work. Asking why the driver drove in the wrong direction could yield any number of answers, some relating to the state of mind of the driver, or to the configuration of the plant, or poor lighting, or whatever. Each of these answers also would involve failure triggers and failure processes. And so on and on.

In fact, we are not actually drilling down, but moving sideways, from one system (smashed pump) to another (forklift truck) to another (the driver) to another (say the lighting). The point at which we stop this analytical process is not actually the root cause, but the point at which it is possible to identify a cost-effective failure management policy. In the case of the pump, this policy might simply be to move the pump to a location where forklift trucks cannot reach it, in which case, from the viewpoint of the pump, further analysis of the antics of forklift trucks and their drivers would be a waste of time.

Second, from a completely different perspective, it could be argued that all failures do indeed have a “root cause.” The discussion in this article suggests that it is the Second Law of Thermodynamics.

Managing failures
Organizations acquire physical assets because they expect them to do something—in other words, to perform a specific function or functions. We have seen that failures interfere with functions, and hence with the business processes of which the assets form part. In doing so, failures destroy value, usually by disrupting value-adding processes. Some of them breach environmental standards. A few even kill people. It also costs money to anticipate, detect, prevent, or correct failures. So one way or another, failures consume time, effort, and money without contributing anything.

The general uselessness of failures means that most people in industry—certainly most operations people—are hostile to them. There seems to exist a vague hope—sometimes even a belief—that if only the organization could do something slightly different, failures would somehow go away.

In fact, failures are not going to go away, because failures are a product of the Second Law, and the Second Law is a fundamental part of the way the known universe operates. It applies to systems of every magnitude, ranging from systems of molecules, such as the sheet of paper, through industrial undertakings and geological formations to galaxies. So dealing with the results of the operation of the Second Law is a fundamental part of our existence in that same universe.

In most cases, the part of the organization that has to deal with the impact of the Second Law on physical assets is the maintenance function. We have seen that in this context, the Second Law manifests itself as equipment failures, so the management of maintenance is all about the management of failure.

Of course, before we can set up a successful failure management program, we must determine what failures are reasonably likely to affect the physical assets in our care. As discussed earlier, an analysis of the failures that could affect any system should start with a clear definition of its functions together with the associated desired standards of performance. Defining functions clearly enables us to define how each function can fail (failed states), which then puts us in a position to identify what failures can cause each failed state. As discussed below, these “failures” should be identified in enough detail for it to be possible to identify a suitable failure management policy.

The next steps are to identify failure effects, then to assess the consequences of each failure (how failures affect safety, the environment, the business process, etc.). The final step is to determine the failure management policy that deals most cost-effectively with the failure consequences.
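The sequence just described (functions, failed states, the failures that cause them, effects, consequences, and finally a policy) can be sketched as a small data structure. This is one possible arrangement under stated assumptions, not a procedure prescribed by the article: the class names, the cost figures, and the simple rule of picking the cheapest policy that adequately deals with the consequences are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    name: str
    annual_cost: float              # assumed figure, for comparison only
    deals_with_consequences: bool   # does it reduce the consequences to an acceptable level?


@dataclass
class FailureMode:
    description: str
    effect: str        # what happens when the failed state is reached
    consequence: str   # how, and how much, the failure matters
    candidate_policies: list = field(default_factory=list)

    def best_policy(self):
        """Cheapest candidate that adequately deals with the consequences, or None."""
        workable = [p for p in self.candidate_policies if p.deals_with_consequences]
        return min(workable, key=lambda p: p.annual_cost, default=None)


@dataclass
class FailedState:
    description: str
    failure_modes: list = field(default_factory=list)


@dataclass
class Function:
    description: str
    failed_states: list = field(default_factory=list)


# Hypothetical analysis of the pump and forklift truck example.
relocate = Policy("relocate pump out of reach of forklift trucks", 500.0, True)
train = Policy("train forklift drivers to drive more carefully", 2000.0, True)
do_nothing = Policy("no scheduled maintenance", 0.0, False)

mode = FailureMode(
    description="pump casing smashed by forklift truck",
    effect="pump stops and the process it serves comes to a halt",
    consequence="lost production while the pump is replaced",
    candidate_policies=[train, relocate, do_nothing],
)
state = FailedState("unable to transfer any liquid", [mode])
function = Function("transfer liquid at the required rate", [state])

for failed_state in function.failed_states:
    for failure_mode in failed_state.failure_modes:
        print(failure_mode.description, "->", failure_mode.best_policy().name)
```

The cheapest option overall is rejected because it does not deal with the consequences, which is why the selection filters on adequacy before comparing cost.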

In the minds of some people, failure management is all about failure prevention, which in turn means fixed interval overhauls or fixed interval replacements. In fact, it is now generally understood that there is far more to the management of failures than these hard-time interventions (although they are still sometimes appropriate). Other options are outlined in the accompanying section “Failure Management Options.”

When establishing failure management policies, it is not necessary to identify every failure process and every failure trigger that might cause every system to get into a failed state. Trying to do so would not only be ruinously expensive but would also go well past the point at which the law of diminishing returns begins to apply. It is not even necessary to identify every failure mechanism.

The key to cost-effective analysis is to identify all the phenomena which could put the system into a failed state, at a level of detail which makes it possible to identify suitable failure management policies. (“Phenomena” identified in this way were defined earlier as “failure modes.”)

Sometimes this level will be individual failure mechanisms or even failure triggers. At other times it will be groups of different failure mechanisms that could all contribute to the same failed state.

The Second Law of Thermodynamics clarifies many of the apparently contradictory and sometimes counter-intuitive issues facing people who wish to formulate cost-effective maintenance strategies. Three key points are summarized in the section “Maintenance and the Second Law of Thermodynamics.” MT

* F. L. Lambert, “The Second Law of Thermodynamics,” March 2003. (The author would like to acknowledge the extent to which Professor Lambert’s paper influenced his thinking on this subject, especially the first two sections of this article. Professor Lambert’s paper is strongly commended to anyone who wishes to start finding out more about the Second Law.)




Maintenance and the Second Law of Thermodynamics

1. The Second Law of Thermodynamics provides much greater clarity than hitherto about the concept of “failure.” Specifically, it shows that any failure is not a single incident, but is actually a surprisingly complex system that embodies two steps which add up to a third (a failure trigger and a failure process, which together amount to a failure mechanism) and three states (initial, end, and failed).

2. The Second Law offers a foundation that could be used to consolidate the wide range of overlapping and at times conflicting failure analysis techniques currently in use around the world (RCM, RCFA, HAZOP, FMEA, FMECA, RBI, and so on) into far fewer, more coherent, and universally understood processes, perhaps even just one process.

3. The Second Law provides a solid, scientific rationale for the existence of the maintenance function. The extent to which this Law governs the behavior of the known universe in general and of physical assets in particular means that maintenance is and will remain as much a part of the fabric of organizations that use physical assets as the assets themselves.


Failure Analysis Terminology

The Second Law of Thermodynamics suggests a number of precise terms for discussing failure:

Failure process: the process by which a system makes the transition from an initial state to a failed state

Failure trigger: the phenomenon that initiates a specific failure process

Failure mechanism: a combination of a failure trigger and the resulting failure process

Failed state: a state in which a system is unable to fulfill a function to the satisfaction of its users

Failure mode: a failure mechanism or group of failure mechanisms identified at a level of detail that makes it possible to identify a suitable failure management policy

Initial state: the state of a system viewed from the perspective of a specific failure process, either when that system is new or immediately after it has been restored to like-new condition.

Three additional terms apply to the analysis of failures:

Potential failure: a clearly identifiable phenomenon which indicates that a failure process is reaching or is about to reach a failed state

Failure effect: what happens when a system reaches a failed state as a result of a specific failure mode

Failure consequence: how and how much a specific failure mode matters.



Failure Management Options

In addition to the obvious failure prevention approach that focuses on fixed interval overhauls or fixed interval replacements, there are a number of other failure management options available.

1. Predictive or condition-based maintenance, which entails checking for potential failures. Where appropriate, predictive techniques include the application of the human senses, the use of specialized condition monitoring equipment, product quality monitoring, and the direct monitoring of equipment performance.

2. Failure-finding, which entails checking whether hidden functions (protective devices which can fail in such a way that no one knows they have failed) are still working.

3. Change the physical configuration of a system to reduce the probability of or to eliminate a specific failure process or failure trigger (substitute stainless steel for mild steel to eliminate rust, cut a radius in a corner to reduce the likelihood of fatigue, relocate the pump to a place where the forklift truck cannot hit it, etc.).

4. Change the behavior of the people who interact with the system by training them and/or by strengthening procedures (ensure that maintainers use torque wrenches where appropriate, train forklift drivers to drive more carefully, etc.).

5. Change the design of a physical system or the way in which it is operated in order to reduce or eliminate the consequences of the failure.

