Archive | October, 2006


1:54 am
October 2, 2006

Viewpoint: Taking Aim On Your Business

I still find it truly amazing that aircraft high above the ground or racecars hurtling around a racetrack can have their engines monitored, diagnosed, tuned and possibly repaired from a remote location, often many thousands of miles away. Although this technology has been available for several years, the same is now happening more often with process equipment. Many types of equipment are monitored and controlled directly using wired or bus-based intelligent sensors and controllers, or indirectly by wireless devices. In addition, it is becoming more commonplace to operate sites or units within a site remotely, especially “skid-mounted” units.

But how does a site start on the journey to using some of these leading-edge technologies? How does a company know whether it will benefit or not? Does this level of remote interaction with process equipment require a new and different level of maintenance technician, or can your existing technicians bridge the gap? How would you know? How would you learn?

One way of getting started is to compare your current metrics and practices with those of companies and sites at the leading edge, using benchmarking.

Benchmarking can be done on an ad-hoc basis between two or more companies, by using a consulting company or by joining a consortium, similar to the one started by ARC Advisory Group. In this type of consortium, companies and sites from a number of different industries share metric data to see how they compare against each other. Example metrics include:

  • Number of Process Control Personnel per I/O
  • Number of Field Instrumentation Personnel per I/O
  • Maintenance Efficiency
  • Key Control Loop Performance
  • Instrument & Analyzer MTBF
  • Instrument & Analyzer MTTR
  • Number of Bad Instrument Measurements
  • Number of Bus-based Devices
  • Use of Asset Management Applications
  • Hours of Maintenance Training
  • Type of Maintenance Training

Metrics-based benchmarking, along with associated best practices, allows these companies to focus on important issues, to really “take aim” on their business. Showing where the gaps are and how significant they are allows a site to focus on the important ones and also to see where metrics interact. So, for instance, having a low number of technicians per I/O may be a good thing if the training and support regimen on the site is at or above best-in-class. Otherwise, it could be just a cost-saving measure that will eventually become a major issue.

The ARC Benchmarking Consortium recently released its Benchmarking Report comparing data from 51 different plants. Staffing levels are just one of the metrics these manufacturing companies are using to compare themselves to others, both cross-company and cross-industry. Each metric is composed of several measures approved by the consortium as being valuable, with clearly defined calculation methods in order to remain consistently measured from plant to plant.
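The consortium's approved calculation methods are not reproduced here. As a minimal sketch of what a clearly defined calculation might look like, the Python below uses common textbook definitions of two of the metrics listed earlier: MTBF as operating hours divided by the number of failures, and MTTR as the average repair time per failure. The function names and the sample figures are illustrative assumptions, not consortium definitions or data.

```python
def mtbf_hours(operating_hours, failure_count):
    """Mean time between failures: operating time per failure (textbook definition)."""
    if failure_count == 0:
        return float("inf")  # no failures observed in the period
    return operating_hours / failure_count


def mttr_hours(repair_hours):
    """Mean time to repair: average duration of the recorded repairs."""
    if not repair_hours:
        return 0.0
    return sum(repair_hours) / len(repair_hours)


# Hypothetical instrument history, for illustration only.
repairs = [2.0, 4.0, 3.0]                 # hours spent on each repair
print(mtbf_hours(8760.0, len(repairs)))   # operating hours per failure
print(mttr_hours(repairs))                # average hours per repair
```

Whatever definitions a group of plants agrees on, the point made above still holds: the same function has to be applied to every plant's data for the comparison to mean anything.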

By using benchmarking to take aim on their businesses, consortium members are also focusing on areas of most concern. Moving to the next step, these companies can now put improvement plans into place and adopt best practices that will enable them to move closer to best-in-class. MT

About ARC: Founded in 1986, ARC Advisory Group is a thought leader in manufacturing and supply chain solutions.



1:51 am
October 2, 2006

Uptime: Fundamentally Rethinking Maintenance And Reliability


Bob Williamson, Contributing Editor

For decades, our “industry” has been bringing in innovations to improve maintenance and reliability (M&R) processes. The list goes on and on: preventive maintenance (PM), computerized maintenance management systems (CMMS), planning and scheduling, various predictive/condition-based maintenance methods (PdM/CBM), reliability-centered maintenance (RCM), total productive maintenance (TPM), autonomous maintenance, life-cycle cost (LCC) decision-making and more.

We also have learned, as many manufacturers, facilities and utilities have, that “programs-of-the-month” come and go in regular cycles, each one promising to be the “silver bullet” that will outdate all other practices. Unfortunately, as common as these programs are, they seldom work and are rarely sustainable unless they intentionally focus on compelling business results and provide a tangible return on investment (ROI) to the bottom line.

Sure, these programs typically promise an ROI based on proven, logical strategies. What many don’t address, however, are the requisite work culture changes needed not only to embrace the new methods, but to sustain and then improve on them. Stay tuned.

Current and future workplace demographics suggest a challenging work culture at best. Consider the growing discussion about “maintenance skills shortages” and the need to train more maintenance and reliability technicians and professionals. We DO need to train more, and use new and proven maintenance methods leading to lower-cost operations with more reliable equipment.

We also need to fundamentally rethink our M&R strategies as we approach this “perfect storm” of increasing retirements, growing labor shortages, lack of vocational/technical training programs and the “college-educated workforce” promoted by our social/academic community. Business decision-makers who perpetuate the myth that “maintenance” is little more than an overhead cost will increasingly struggle to remain competitive. This is (and will continue to be) especially true in equipment-intensive, capital-intensive businesses.

Where to start our rethinking
First. . .In most cases maintenance is not an “industry,” nor should it be expected to improve its performance to grow business, profits or customers, or to prevent lost revenues. Yes, maintenance does produce capacity for the operation to generate revenues at lowest possible cost, but it can’t do that alone. Yet, to view “maintenance-as-an-industry” sets the stage for a blocking assumption: that we can operate fairly autonomously to improve our performance.

Many, if not most, of the reasons equipment does not do what it is supposed to do are outside the direct control and responsibility of the maintenance organization. For example, we have all seen very effective PM programs die on the vine because of no access to the equipment at the right time for the right duration with the proper spare parts.

Second. . .We must admit that we actually are “partners” or “joint owners” of asset reliability because (again) “maintenance” cannot do that alone. The maintenance group is generally part of a larger business organization, not an autonomous, stand-alone business.

For a manufacturing-, utility-, transportation- or facility-type of business to be successful (market-responsive, agile, low-cost and profitable) its assets (equipment and facilities) must perform as intended first-time, every-time. This means the business must focus on improving ALL groups that affect asset performance and reliability.

Consider the impact of other groups on your M&R efforts: design engineering; installation, startup and commissioning; procurement/purchasing; process engineering/control; quality control/inspection; MRO parts & supplies; operations; human resources/training; safety & environmental and others.

How can a maintenance organization be responsible for improving equipment performance and reliability without fully engaging these other groups? Does this explain why many maintenance improvement programs have failed to deliver sustainable results?

Gaining a new understanding
The sooner that our business decision-makers truly understand how equipment-intensive operations generate revenue and profits, the more competitive their operations will be. On the surface it does not seem too difficult to understand. But, it’s easy to see why there is a disconnect when you consider the amount of time these decision-makers typically spend dealing with equipment performance cause-and-effect improvement compared to the “glitzy” programs that continue to swirl around out there.

It’s time for decision makers to unite! Let’s get our plant managers, general managers, executives, boards of directors and company owners to “think inside the box” for a change, and discover what truly affects asset performance and reliability. Then, let’s encourage them to take decisive leadership action to focus the typically separate groups’ activities on eliminating equipment losses and problems in cross-functional team approaches.

The leadership behaviors we see in NASCAR Nextel Cup teams should serve as a model. If a team’s equipment (its racecars) is poor-performing and unreliable, not only do its costs increase, it loses races and sponsors, the equivalent of losing markets and revenues because of higher costs and unreliable on-time delivery.

An equipment-intensive operation must have reliable equipment to compete. Maintenance, being less than 10% (or so) of the organization, cannot overcome equipment problems that emanate from the other 90% of the organization. If we expect maintenance to do it alone, we are liable to become a highly reactive, repair-based operation with increasing interruptions, costs and lost revenues. If we want to make our plant (or facility or utility) a more desirable place to work, we all MUST focus on eliminating equipment problems. Ponder that for a while. . .

In a work culture where everyone who directly and indirectly affects equipment performance and reliability focuses on preventing, even eliminating, equipment problems, there is less finger-pointing, less blame, less frustration. And fewer maintenance technicians, maintenance specialists and reliability technicians will fall prey to the “fixing things fast” syndrome. In reality, with fewer equipment problems and more reliable equipment, more real maintenance work can be accomplished with fewer people than in a highly reactive maintenance environment.

Procedure-based maintenance training
It’s now time to beat the TRAINING drum as loud as I can! Without formal structured training, workers at any level are left to their own devices or assumptions to figure “it” out. This is NOT a way to operate a competitive, safe, environmentally-friendly, profitable business, be it a manufacturing plant, a commercial, residential, resort, medical or academic facility, or a utility (e.g., electricity generation, water treatment, wastewater treatment, telephone). Sadly, many companies have given training short shrift; years of down-sizing and cost cutting have taken a real toll.

For example, experience has shown that detailed, procedure-based operations training results in error-free production. Maintenance training, though, is based on the assumption of proficiency in a skilled trade or craft, with little use of detailed procedures. Back in the days of sound apprenticeship training under the guidance of a Master Craftsman, this type of strategy worked. Today, however, without apprenticeship training and without mentoring under the tutelage of Master Craftsmen, how can we expect our maintenance workforce ever to be proficient and effective using outdated craft-based approaches to completing their assigned tasks?

Now is the time to embrace procedure-based maintenance and to use those same procedures to train and qualify our maintenance technicians and mechanics. We need to move people away from simply “figuring things out” into the mode of “following the proper procedure.” In an advanced manufacturing environment, in a reliable utility, in a first-class facility this makes sense. Do this and we can open up the door to many more people to enter maintenance and reliability as a career.

Public schools
Now, too, is the time to again focus on two tracks in our public schools: academic/college bound and career/technical education. Both can be accomplished in our school systems, just as they were in the past. Educating and training students for post-secondary success can be done at a college, university, technical school or on the job.

Teachers, counselors and academic leaders should be encouraged to reflect on the success rate of their graduates. What’s wrong with 50% of high-school seniors going on to four-year colleges or universities, 40% going to post-secondary technical schools and 10% going directly into the workforce?

Business and industry must implement various programs or initiatives to attract students’ attention while they are still in high school. Co-op programs, apprenticeship programs and school/work programs introduce students to the world of work while they are in a position to be thinking about career decisions. Business and industry must actively share behind-the-scenes activities with the community, schools, teachers, students and parents.

Partnerships for reliability
M&R professionals must master “partnering” skills in the workplace. Communicating the causes of poor equipment performance and equipment-related losses without “blaming” can go a long way toward improving organizational performance. Collaborating on countermeasures that eliminate the root causes of poor equipment performance and contributing to best-practices procedures will lead to world-class levels of reliability.

At the core
Finally, it is time to fundamentally rethink maintenance and reliability as a core business process in equipment-intensive operations. The key is to create partnerships, or teams, that abhor unreliable and poor-performing equipment and facilities.

Much of our future pivots on a precarious pinpoint axis of reliability. How much longer can the maintenance organization alone control this balance? MT



1:47 am
October 2, 2006

Predictive Maintenance Strategy from Rockwell Automation Helps Energy Supplier Maximize Output and Meet Growing Demand


Genesis Energy’s predictive maintenance strategy targets most critical equipment to ensure optimal availability

With breathtaking terrain ranging from snowcapped mountains to lush lowland plains, New Zealand is often described as a paradise by those who have experienced its unique beauty. Located approximately 2,000 kilometers east of Australia across the Tasman Sea, New Zealand’s isolated location and rich natural resources have fostered a self-reliant culture.

Unable to tap into the power generated by neighboring countries, New Zealand must locally produce the electricity to meet its consumer and industrial needs, which in 2001 totaled approximately 34.88 TWh. As the country’s industrial sector continues to develop and the population continues to grow, so does the demand for electricity. In fact, New Zealand’s power generation capacity is continuously strained by ever-increasing demand.

Tasked with keeping the supply side of this equation in proper balance is Genesis Energy, New Zealand’s largest provider of natural gas and electricity. By investing in new facilities and technology upgrades for existing facilities to increase capacity, Genesis is addressing the long-range needs of its island nation. However, that strategy doesn’t address the challenge the energy provider currently faces. If a major interruption in production occurs due to equipment failure at any one of its facilities, Genesis could be forced to purchase energy from other suppliers at the current spot price to make up the shortfall, which also puts the company at risk of financial penalties imposed by the system. Loss of a typical hydro unit could mean a loss in revenue of between $40,000 and $1,000,000 per day, depending on the time of year and the spot price. As a result, maintaining power availability and optimizing the generation process is a core business goal. Through a reliability-centered maintenance (RCM) program supported by Rockwell Automation, Genesis can predict and prevent failures and extend the life of capital assets.

No Room for Error

One of three state-owned enterprises, Genesis supplies 20 percent of the country’s electrical needs through a diverse electricity generation portfolio, which includes Genesis’ flagship thermal facility, the Huntly Power Station, five hydro power plants, and various wind farms and cogeneration facilities at large industrial sites. With the majority of Genesis’ output generated from the Huntly Power Station and the hydro plants (some of which have been operating for more than 60 years), keeping these facilities properly maintained and operating at full capacity is key to achieving its business goals.

Huntly: With a current output capacity of 1,040 MW, Huntly is New Zealand’s largest power station. The facility consists of four separate conventional boiler and steam turbine generation units, capable of burning coal, natural gas or a combination of the two. In 2005, the 22-year-old facility recorded 84 percent availability, but as the plant continues to age, higher levels of maintenance are expected to be needed to sustain a sufficient level of production output. Recently installed on the same site is a 40 MW simple-cycle gas turbine generator.

As part of its growth strategy, Genesis is building a high-efficiency combined-cycle gas turbine power plant, which will increase production capacity at the site to 1,425 MW. It is also retrofitting the existing control and instrumentation system, which involves migrating one unit from analog to digital controls during the 2005/2006 shutdown and the remaining three units over the next three years.

Hydro: Approximately 60 percent of New Zealand’s electricity is generated by hydro production. Genesis’ hydro generation capacity consists of five power plants operating from three remote sites within the country. Commissioned between 1923 and 1983, and with a production capacity of 498 MW, these plants continue to serve as a vital source of electricity for the country. Because of their geographic isolation, several of the hydro power plants are controlled and monitored from other locations.

Formulating a Maintenance Strategy

In 1999, when Genesis was formed out of the Electricity Corp of NZ, New Zealand was experiencing an energy surplus, so the need to prevent downtime wasn’t as critical for Genesis. As a result, the majority of the company’s maintenance efforts were focused on preventing major catastrophes. However, as demand changed in subsequent years, so did the role of maintenance. Today, across the organization, Genesis engineering and maintenance personnel are focused around the clock on ensuring maximum plant availability.

“At Genesis, improving performance is not just the responsibility of the maintenance personnel but also engineers and operational staff,” explained Simon Hurricks, machine dynamics engineer, Genesis Energy. “We work together to share information, prioritize activities and identify potential issues. As a result, the decisions we make have a greater impact on production capacity and performance.”

Genesis is investing heavily in maintenance tools, technologies and personnel. For the greatest impact and return on investment, the company has adopted a maintenance strategy that seeks to maximize asset performance by applying the right activity to the right asset at the right stage in its lifecycle.

“Because maintenance activities can be tied directly to production output, our goal is to identify and plan for maintenance needs in a way that best optimizes production and extends equipment life,” said Hurricks.

In developing its maintenance strategy, the company sought to incorporate an optimum mix of predictive, preventive and reactive activities that corresponds to the criticality of the equipment, the failure modes and the costs associated with failure. Using a reliability-centered approach to maintenance, the type of maintenance activity is determined based on the overall impact and cost of downtime resulting from a failure. (During winter, the high-demand period, there is virtually no spare generation capacity in New Zealand, so loss of a generator has an immediate consequence for the whole country. The generators must be available and reliable.)

This strategy places an increased focus on using predictive and preventive techniques on core production assets and their supporting auxiliaries, many of which have 100% duplication, although a failure still increases the risk of production loss. For small, low-cost, non-critical plant, a run-to-fail approach can be adopted.
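The selection logic described above can be sketched as a small decision function. This is only an illustration of the reasoning in the text, not Genesis’ actual method or the logic of its software: the function name, the three yes/no inputs and the order of the checks are all assumptions made for the example.

```python
def choose_strategy(is_critical, low_replacement_cost, failure_detectable):
    """Pick a maintenance approach for one asset (illustrative rules only)."""
    # Small, low-cost, non-critical plant: accept failure and replace on demand.
    if not is_critical and low_replacement_cost:
        return "run-to-fail"
    # If condition monitoring can detect the failure mode, monitor it.
    if failure_detectable:
        return "predictive"
    # Otherwise fall back on time-based preventive work.
    return "preventive"


# Hypothetical assets, for illustration only.
print(choose_strategy(True, False, True))    # e.g. a main generator
print(choose_strategy(False, True, False))   # e.g. a small non-critical pump
```

In practice the inputs would come out of a structured assessment rather than three booleans, but the ordering shows the idea: criticality and cost decide whether monitoring is worth doing at all, and detectability decides which kind.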

Combined Effort

Within Genesis, Hurricks is part of a core group of engineers and maintenance personnel intimately involved in the development and implementation of the company’s maintenance strategy. Before any maintenance activities are determined, a team of Genesis engineers and maintenance personnel evaluates each phase and element of the production process at each of its facilities to determine the criticality and the probability of failure. Using a combination of technologies, including vibration and oil analysis, Genesis conducts an exhaustive evaluation of each piece of equipment.

The team looks at all potential failure modes to determine the risks for each, possible downtime costs, and potential safety concerns to outline failure scenarios. It then determines whether failure detection is possible and the types of technology necessary for detection. The most critical element of this risk assessment process is estimating the cost of failure, the replacement cost of the equipment, the potential damage to other equipment, and the financial ramifications of lost power generation.
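As a rough worked example of the cost estimate described above, the sketch below adds up the three cost elements named in the text (replacement, damage to other equipment, lost generation) and weights the total by a probability of failure. The formula, the function names and every number are assumptions for illustration; the article does not give Genesis’ actual model or figures.

```python
def cost_of_failure(replacement_cost, collateral_damage,
                    lost_mw, outage_hours, spot_price_per_mwh):
    """Total cost of one failure: equipment, knock-on damage, lost generation."""
    lost_generation = lost_mw * outage_hours * spot_price_per_mwh
    return replacement_cost + collateral_damage + lost_generation


def expected_annual_risk(probability_per_year, failure_cost):
    """Probability-weighted cost, used here to rank assets against each other."""
    return probability_per_year * failure_cost


# Hypothetical figures, for illustration only.
cost = cost_of_failure(
    replacement_cost=50_000,
    collateral_damage=20_000,
    lost_mw=40,               # generation lost while the unit is down
    outage_hours=48,
    spot_price_per_mwh=50,
)
print(cost)
print(expected_annual_risk(0.1, cost))
```

Because the lost-generation term scales with the spot price, the same failure costs far more in a high-price period, which is consistent with the wide daily revenue range quoted earlier in the article.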

“The wide range of people involved helps ensure we have a balanced perspective in terms of how we address and respond to different scenarios,” Hurricks said. “This cross-team collaboration and input helps to balance our decision making so that we’re considering both our immediate and short-term needs, as well as our long-term production requirements.”

Once the assessment is completed, the data points are entered into a reliability-centered software program (available commercially and installed by Genesis) for more detailed analysis.

Hurricks estimates that predictive activities that measure the condition of equipment, such as vibration analysis, oil analysis and thermal imaging, represent nearly 60 percent of Genesis’ overall maintenance activities. The predictive techniques are primarily focused at the Huntly power station where approximately 400 pieces of equipment (mostly rotating equipment) are monitored, including boiler fans, boiler feed pumps and auxiliary generation units. At the hydro plants, predictive technology is used to monitor the main generators.

“Before, there was a lot of unnecessary routine strip-down (preventive) maintenance carried out, which is both a waste of resources and does not prevent failures,” said Hurricks. “Today, the predictive tools enable us to be more strategic and planned in our approach. The beauty of predictive maintenance is that you’re no longer caught napping when disaster is rapidly approaching. The value this technology provides is tremendous, particularly when the fault has the potential to reduce the generation capacity at a time when the spot price is high.”

Solving the Issue of Isolation

The remote location of the company’s various hydro plants posed a unique challenge for Hurricks and his team. If a failure occurred at one of these plants, it could take up to six hours to drive to the location and assess the situation. In some cases, production at the facility could be down for days before the problem is corrected.

With more than 34 years of experience in the field, Hurricks has dedicated his career to understanding the science of predictive maintenance and is well-versed in the latest technologies and strategies for keeping a plant running at peak performance. After reviewing all available options, he determined that an online vibration monitoring and protection system would best meet Genesis’ needs. More specifically, the monitoring system needed to be user-configurable and able to store data for post-event analysis. It also needed to be compact and easy to install and expand.

At first, Hurricks wasn’t sure if the technology was available that could meet his specific condition monitoring requirements. That was until he discussed what he needed with Colin Gracie, president of Inspyre Reliability Solutions, an independent sales engineer, who told him about the unique capabilities of the Allen-Bradley XM Series monitoring and protection system from Rockwell Automation.

“When I first heard about the unique attributes of the XM Series, I immediately saw the possibilities for the technology to address our needs,” said Hurricks. “Of particular interest was its ability to provide diagnostic protection and real-time data, as well as its ability to be easily integrated into our existing infrastructure.”

Equally important in this case is the ability to monitor the equipment from the various isolated locations. By connecting the equipment to a wide area network, Hurricks and his team would be able to analyze data from these remote plants and identify problems far in advance of a failure. And as an added benefit, the time normally spent driving to the individual plants to gather vibration readings could be better used for other maintenance activities.

Installation of the XM Series is scheduled to be completed in early 2006 on 13 generators at the company’s five hydro power plants. At the Huntly power station, the XM modules are monitoring 11 cooling tower fan drives, two 1.3-MW pump motors and the 40 MW gas turbine generator.

The modules will also monitor the larger BOP (Balance of Plant) system on the plant’s new 385 MW combined-cycle gas turbine unit. On the hydro plant equipment alone, the system will collect more than 800 points of data in a fraction of the time needed to collect the information manually.

As part of the upgrade, the company replaced its analog network with a digital network, which allows for more cost-effective remote analysis and lets Genesis expand more easily to additional plants using only one server and database. A server installed at the Huntly facility communicates with the XM modules via a wide area network. The data in the modules is downloaded according to a programmed schedule: every five minutes for normal data (within specifically defined parameters), every ten minutes for triggered data and every 24 hours for transient data.
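The download schedule described above can be written down as a small table of intervals. The sketch below is generic Python, not the XM Series configuration format or any Rockwell Automation API; the dictionary, the data-type labels and the helper function are hypothetical names chosen for the example. Only the three intervals come from the article.

```python
from datetime import datetime, timedelta

# Intervals taken from the article; the keys are illustrative labels.
DOWNLOAD_INTERVALS = {
    "normal": timedelta(minutes=5),
    "triggered": timedelta(minutes=10),
    "transient": timedelta(hours=24),
}


def is_download_due(data_type, last_download, now):
    """Return True once the interval for this data type has elapsed."""
    return now - last_download >= DOWNLOAD_INTERVALS[data_type]


# Hypothetical timestamps, for illustration only.
last = datetime(2006, 1, 1, 0, 0)
now = last + timedelta(minutes=5)
for data_type in DOWNLOAD_INTERVALS:
    print(data_type, is_download_due(data_type, last, now))
```

Keeping the intervals in one table makes it easy to see, and to change, how often each kind of data crosses the wide area network.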

However, just because a problem is diagnosed doesn’t necessarily mean that there is a need for immediate action. The predictive technology enables Genesis to identify a potential failure before the problem affects productivity or performance of equipment. It can then track progression of the fault and schedule the repair or replacement when it is convenient.
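One simple way to picture “tracking progression of the fault” is to fit a straight line to successive readings and estimate when the trend would reach an alarm level. The sketch below does that with an ordinary least-squares fit. It is an assumption-laden illustration: real fault growth is often not linear, the alarm threshold and readings are invented, and nothing here reflects how the XM Series or Genesis’ analysis software actually trends data.

```python
def fit_line(days, readings):
    """Ordinary least-squares line through (day, reading) points."""
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(readings) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    if sxx == 0:
        raise ValueError("need readings from at least two different days")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, readings))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def days_until_threshold(days, readings, threshold):
    """Days after the last reading until the fitted trend hits the threshold.

    Returns None when the trend is flat or improving.
    """
    slope, intercept = fit_line(days, readings)
    if slope <= 0:
        return None
    crossing_day = (threshold - intercept) / slope
    return max(0.0, crossing_day - days[-1])


# Hypothetical vibration readings (arbitrary units), for illustration only.
print(days_until_threshold([0, 10, 20], [2.0, 3.0, 4.0], threshold=7.0))
```

An estimate like this is what lets a repair be moved to a convenient window, such as a planned shutdown, rather than being carried out the moment a fault is first seen.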

As part of its maintenance strategy, Genesis also performs preventive maintenance on a time-based or convenience basis, depending on the type of equipment, performance specifications and operating conditions. Hurricks uses traditional predictive maintenance techniques (vibration and oil analysis, thermal imaging and ultrasound signature analysis) to monitor various parameters on a preventive basis. These tools complement the predictive maintenance tools that Genesis employs.

For example, oil analysis checks the percentage of metal in the oil used to lubricate gearbox bearings — a symptom of metal fatigue or excessive wear. If metal is reported in the oil, maintenance can more closely monitor and trend equipment operation to determine the root cause and take corrective action before affecting production. Hurricks uses thermal imaging to detect hot spots in rotating equipment and ultrasound monitoring to detect changes from the norm, which would trigger the need for closer analysis.

“Using a combination of predictive and preventative maintenance, we can more accurately target the work that needs to be done during the annual shutdown,” said Hurricks. “With the trending data we collect, we can strategically go in and make the corrections or change out equipment. This allows us to make more effective use of our time during the shutdown.”

With the reliability-centered approach to maintenance, Genesis has greatly reduced the amount of reactive maintenance performed. Today reactive maintenance represents only 10 percent of activities. For equipment with low criticality and low replacement costs, Genesis does not perform routine maintenance but simply repairs or replaces the equipment when obvious problems occur.

“With 70 maintenance personnel covering six major energy production facilities, along with numerous cogeneration facilities at industrial sites scattered across the region, we have to prioritize our activities,” explained Hurricks. “We’ve calculated that the capital expense of replacing non-critical equipment when it fails is evenly balanced against the cost of implementing a predictive or preventive program for this equipment.”

Even before the company’s latest predictive equipment was completely installed, the XM Series modules demonstrated their ability to quickly detect and diagnose equipment failures.

“Shortly after we installed the 40 MW gas turbine unit, it unexpectedly tripped on high vibration,” said Hurricks. “Since it was still under warranty, the manufacturer insisted that a full inspection, taking several days, be undertaken. While waiting, we decided to install the XM Series system as an informal test of the technology. Following the inspection, which found no obvious problems, the machine was returned to service. The high vibration was still apparent. Looking at the spectra available from the XM120, it was immediately obvious that the high vibration was in fact a transducer fault. Further investigation showed that one of the vibration transducers had a broken connection, and furthermore it was found that the transducers on the turbine were cross-connected. If the XMs had been installed at the onset, we would have saved several days of downtime and paid for the XM installation.”

As the XM Series continues to prove its value, Hurricks anticipates that there will be other opportunities to apply the technology throughout the company’s various power plants. If early indications mean anything, the XM Series will prove to be a valuable tool in Genesis’ highly effective predictive maintenance program. MT



1:43 am
October 2, 2006

Steps To Alter Our Manufacturing Culture And Solve The Reliability Paradox


Robert Taylor, SAPPI Fine Paper North America

An interview with Robert “Bob” Taylor of SAPPI Fine Paper North America…

We had an opportunity to catch up recently with Robert Taylor of SAPPI. He had authored a remarkably candid analysis of the state of reliability in manufacturing in the December 2003 issue of MAINTENANCE TECHNOLOGY and we thought it was about time for an update.

MT: Bob, three years ago (12/03) you published an article in MAINTENANCE TECHNOLOGY entitled ‘The Reliability Paradox’ based on a presentation you made at that year’s SMRP Conference. In your article you outlined 10 reasons why there is a gap between what we know and what we do, in terms of reliability at our manufacturing sites. It was quite an impassioned articulation of our resistance to change and a real call to arms for all reliability professionals.
RT: It has been my experience that we are failing to recognize or we are overlooking the potential competitive advantage offered by reliability—not only in improving the capacity of our assets, but also in operating them at a significantly lower cost. At our company, SAPPI Fine Paper North America, we have moved beyond this diagnosis phase and have identified actions to address the ‘Reliability Paradox.’

MT: Would you mind sharing those actions with the rest of us? One or two of us might also be passionate about improving reliability.
RT: Of course, as I stated in the root cause analysis phase, leadership is key to making the necessary changes that are needed to guide an organization to improvement. I’ve called these the ‘Five Basic Leadership Steps to Alter Our Manufacturing Culture and Solve the Reliability Paradox.’

Step #1 is “Be Humble – And Learn!” I quote the Greek-born, Roman philosopher, Epictetus, to help illustrate this key to success. “It is impossible for anyone to begin to learn that which he thinks he already knows.” This may be particularly insightful here in North American manufacturing where we tend to believe we have all the answers to all of the problems. Humility, learning and a willingness to accept reality go hand in hand.

One well-documented and widely publicized case history involves a large NA metals manufacturer. The company was facing rising costs due to inflation combined with lower prices from global competitors. Shareholder value was eroding. The formula for success that the company had followed for years, Price = Cost + Profit Margin, no longer applied. That was the reality they faced so they decided to learn. They conducted global benchmarking research focused upon industry maintenance and reliability practices, predictive maintenance technologies, information systems and reliability methodologies. From this research they developed an improvement plan and they acted. Today, that company has moved from a very reactive maintenance response to a much more proactive response. They have increased their equipment reliability from an average of 78% to over 91%. They also increased their quality from 76% to 91% while reducing their workforce (through attrition) by 44%.

This company was humble, they accepted reality and they learned from others.

MT: That’s a great example Bob. So benchmarking plays a big role in the improvement process?
RT: Actually, Step #2 is “Know What Good Looks Like” and very much depends upon benchmarking. I like to call it ‘benchmarking plus’ because the research should identify results, practices and processes so that leaders of the reliability improvement process can learn and visualize what ‘good looks like’.

MT: Do you have some examples of results, practices and processes for our readers?
RT: Results are the easily obtained indicators. You have seen them in magazine articles, from consultants, and from various available databases. Table I reflects some examples.

In terms of identifying poor practices, that takes a little more effort and observation. Some examples would be obvious deterioration in pump bases; excessively leaking seals; covers missing on lube containers; craftspersons queuing up at a stores counter; bearings running hot from poor alignment/balancing/lubrication; operations where emergency work is the norm and many, many others. Taking pictures now and then of examples of poor practices is a good visualization tool.

Perhaps the most difficult part of understanding what good looks like involves the business processes, or the lack thereof. One example is extending reliability beyond the maintenance department, so that there is a collaborative environment to improve reliability among all of the functions in the mill, including operations, HR, engineering and procurement. Another is justifying the reliability efforts in financial and business terms: the risk and reward story.

MT: Okay, so now that we have established the need and the objective, what’s next?
RT: I like to think that Step #3 is to “Maintain High Expectations.” It is vitally important that the reliability leaders must know what good looks like, so that they can challenge ‘less than good.’ To relate this concept to something we are all doing today much better than we did in the past, consider your safety improvement program. You do not allow ‘poor’ safety practices or conditions in your mill today. You maintain high expectations and communicate those expectations consistently and universally. It’s all about leadership and the unwillingness to accept less than what ‘good’ looks like. We should adopt the same expectations on reliability.

MT: Excellent point. We’ve heard of one company CEO who, when presented with goals concerning improved reliability, responds as follows: “What is your safety and environmental conformance? Don’t talk to me about improving reliability until you’ve proven you can improve and maintain your safety and environment.” This sounds as though that challenge is related to ‘expectations’ and that he establishes his acceptance level right away.
RT: Yes, what I’d call ‘never ignore a poor practice, whether it’s safety or reliability, because doing so will immediately lower the standard’.

The fourth Step (#4) is “Be Passionate About Reliability.” Leadership has to be felt, it has to be animated and it has to be enduring. There’s a saying I like to use as part of this discussion. “What gets talked about, what gets measured, what gets recognized and rewarded, what gets personally demonstrated, IS what gets done!”

MT: We can’t imagine that anyone who knows you questions your passion concerning reliability.
RT: This passion for reliability is simple. If you truly believe in something you have to crank it up because ‘normal’ just does not get the job done. Leaders have to live the vision, 24/7/365. It has to be foremost in their minds, in their manners and their actions.

MT: What’s the next step?
RT: Last, and by no means least, is Step #5: “Be Courageous.” The opposition will come from every quarter, as it does with any change. You can expect the ‘not-invented-here’ crowd to state their position, not to be outdone by the ‘we’ve-done-it-before’ skeptics, who are closely followed by organized resistance from represented workers and the individual employees displaying and voicing caution. Perhaps the strongest resistance will come from those who you believe to be in your own camp, the apprehensive supervisor, who, as yet, fails to see the benefit from all the troubles he or she imagines. And, always there is the pressure from above, the impatient executive barely able to wait for results before questioning the wisdom of staying the course. So, yes, expect opposition and resistance from several varied sources, and remember that it takes courage to stay your course. This is perhaps the most challenging time. It is this phase of the improvement program that requires perseverance and communication. You, the passionate champion, have to be accessible, logical, unwavering and, yes, courageous under what may be withering attacks.

MT: You established the root cause reality in your earlier ‘Reliability Paradox’ article for us, and now you’ve established a five-step plan for resolving that paradox. Clearly, you’re living the reliability improvement plan at SAPPI. Can you share with us the timeline you have followed, as well as any of the benefits you have seen thus far?
RT: We have spent considerable effort within SAPPI establishing a solid business case for reliability improvement, as past experience has demonstrated that without this business case (i.e., the impact of reliability, or lack of it, calculated as to P&L impact from both lost sales and extra cost of manufacturing) it is impossible to rally the full organization behind the effort. Having this firmly in place, we are now fervently in the process of establishing and communicating what good looks like at all levels of the organization. We are having some initial success on improving key reliability process measurements, as well as P&L impact, from this effort but will need another 12-18 months to establish firm trends. We have seen enough, however, to realize that the benefits are real and they can and will be realized.

MT: Bob, thank you so much for your guidance in the five basic steps of leadership needed to solve the reliability paradox. This can serve as a road map for improvement for any manufacturer. MT

(Editor’s Note: Bob Taylor is VP-Manufacturing for SAPPI Fine Paper North America, a leading manufacturer of coated fine paper at four mills located in Maine, Michigan and Minnesota. A long-time advocate of reliability, Taylor has spoken on the topic before many organizations over the years. MAINTENANCE TECHNOLOGY thanks asset management expert John Yolton, maintenance strategy consultant for SKF’s Global Pulp & Paper Segment, for his assistance with this interview.)



1:41 am
October 2, 2006

Increasing Plant Uptime

Are you actually measuring your downtime? Even if you are, you might be missing opportunities that help beyond the correction of individual downtime events.

It’s 2 o’clock on a Saturday afternoon. You’re the shift production supervisor, and you get a call from the press operator station. “We just had an overload trip on number four press pump,” the voice says. You respond: “We had this problem yesterday, too. Let’s get maintenance down there and look at the pump.” As you hang up, you’re thinking to yourself that the plant sure has a lot of problems with pumps.

Downtime information is essential to correct ongoing machinery problems and deficiencies, and to fine-tune maintenance and operations management systems. Many facilities, though, still do not measure downtime. Even if they do, they often miss opportunities that can help the plant in larger ways than correcting one downtime event.

Downtime costs plants millions of dollars each year in lost production, downgrade and loss of customers. That’s why it is so important to know what’s causing the downtime and how to use this information to correct the problem.

Downtime collection
There are various methods used to track downtime. The simplest is where an operator merely fills in a log book, noting what happened, what was done about it and how long production was down. This is where many older plants started–and where some of them have remained. Many of them still are not measuring their downtime.

As industrial plants started to mature in the 1970s, many switched from using logbooks to adding downtime details on production forms that were collected at the end of the shift. These forms were kept on clipboards and made available for plant personnel to read. When a clipboard got full, the forms were filed. When using logbooks and forms, trending typically was not reviewed except for a month-end report that listed the total hours down. Sometimes, plants would separate the maintenance from operational downtime, maybe even by craft (electrical vs. mechanical) and, if they were clever enough, by equipment area such as press, former, drying, etc. Scheduled and unscheduled downtime would be tracked as well. It is important to evaluate both scheduled and unscheduled downtime to attempt to reduce each.

In the 1980s, the beginning of the computer era, plants started to use spreadsheets and databases to track downtime. In the first “computerized maintenance” years, many plants collected downtime from forms filled out by operators. Administrative personnel would fill in “electronic” spreadsheets and databases from these handwritten forms. This allowed for misinterpreted information, which often resulted in misrepresentation of root cause.

By the 1990s, computers had become much faster and less expensive. The spreadsheets improved and some plants had operators inputting data into home-brewed downtime databases or were using software sold by various companies. It wasn’t until the mid-’90s, when computers were extremely fast and had large memories, that plants really started to understand the importance of good downtime data. We then saw plants use more sophisticated databases to track downtime.

Many plants subsequently began evaluating Overall Equipment Effectiveness or OEE, which is the true cost to the plant. The overall performance of a single piece of equipment (or even an entire plant) is governed by the cumulative impact of the three OEE factors:

  1. Availability (or downtime)
  2. Performance rate (or optimum production rates)
  3. Quality rate (or downgrade)

OEE is a percentage derived by multiplication of these three factors.
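
As a rough illustration of that multiplication, the sketch below computes OEE from the three factors expressed as fractions. The function name `oee` and the sample values are assumptions made for this example only; they do not come from any plant or software package.

```python
def oee(availability: float, performance: float, quality: float) -> float:
    """Return OEE as a fraction; each factor is a fraction between 0 and 1."""
    factors = {
        "availability": availability,
        "performance": performance,
        "quality": quality,
    }
    for name, value in factors.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return availability * performance * quality


# Hypothetical factor values chosen only to show the arithmetic.
example_oee = oee(availability=0.95, performance=0.90, quality=0.98)
print(f"OEE: {example_oee:.1%}")
```

Because the factors multiply, three individually respectable numbers (95%, 90% and 98% here) combine into an OEE of roughly 84%.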

Plants now can buy computerized maintenance management software, CMMS/EAM or other Web-based and HMI systems that can report real-time OEE or downtime information for instant management control. Software packages are available to connect to equipment controls to indicate precise time and device information. These same controls also can track OEE. While some maintenance software systems now have downtime tracking capability, some plants still rely on their own database packages. There also are plants that collect no downtime information at all.

Benefits of downtime analysis
Downtime tracking and analysis is reactive. Something happens, and we do something about it after the fact. On the other hand, through the development of good maintenance and operations programs, downtime can be reduced.

Many industries, however, have not advanced their maintenance technology to the point where it is possible to operate without downtime. As an example, there are plants that are working toward a yearly goal of 97% uptime over 365 days. For some industries and plants this may seem impossible, while for others 97% is terrible, and any unscheduled outage simply cannot happen.

Currently, a wood products plant is considered to be running well if it has 95% total uptime. That includes all scheduled and unscheduled downtime. A 95% rate is 22.8 hours per day. Even at 95% uptime, the lost time during the year represents a substantial decrease in possible profit margin. Short duration, repeat offenders will cause downgrade of product. Most continuous process plants are meant to operate all the time with scheduled, proactive maintenance. If the plant is up and down all the time, not only is there loss in production, but there also can be product quality, safety and environmental issues coming into play. Good downtime analysis will help both maintenance and operations in determining the root cause of nagging problems.
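
The arithmetic behind these uptime figures can be checked with a short sketch. It assumes a plant intended to run 24 hours a day for 365 days; the function names are illustrative only.

```python
HOURS_PER_DAY = 24
HOURS_PER_YEAR = 24 * 365  # 8,760 hours in a non-leap year


def uptime_hours_per_day(uptime_fraction: float) -> float:
    """Average running hours per day at the given uptime fraction."""
    return uptime_fraction * HOURS_PER_DAY


def downtime_hours_per_year(uptime_fraction: float) -> float:
    """Total hours lost over a 365-day year at the given uptime fraction."""
    return (1.0 - uptime_fraction) * HOURS_PER_YEAR


for uptime in (0.95, 0.97):
    print(
        f"{uptime:.0%} uptime: "
        f"{uptime_hours_per_day(uptime):.1f} h/day running, "
        f"{downtime_hours_per_year(uptime):.0f} h/year down"
    )
```

Under these assumptions, 95% uptime corresponds to the 22.8 running hours per day quoted above and to about 438 hours of lost time over the year.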

Effective downtime collection and analysis
For downtime information to be effective, the data must be easy to enter and understand, and must include enough detail to allow good root cause analysis. The latest automated systems will collect accurate information if enough effort is placed into monitoring the correct components and if the operations and maintenance personnel add their comments.

Be aware that the right information needs to be collected and entered. The operator needs to add the time that production stopped and started. If this goes past a shift, the next operator should enter the startup time. Items that should be recorded include:

  1. Stop/start times to the minute
  2. Operator name, shift and crew
  3. Plant area(s) affected, such as: Lathes
  4. Plant equipment shut down, such as: Lathe #1
  5. Equipment area such as: Lathe Spindle, and the equipment identification code or number
  6. Sub-equipment, if known, such as: Lathe Hydraulic Pump #2, and the equipment identification
  7. The component that failed, if known, such as: The pump itself or, better yet, the pump front bearing
  8. Failure code, such as: Tripped, Stopped and Jammed
  9. Reported problem, such as: “The pump overloaded and kicked out.”
  10. Action, such as: Welded, Replaced, Filled and Cleaned
  11. Shift maintenance review approval. Someone from shift maintenance during which the event occurred reviews the downtime entry and approves the details, or has further comments.
  12. Maintenance comment example: “After resetting this motor overload and restarting, we noticed high vibration from the front pump bearing. We checked the pump bearing temperatures and vibration level and they exceeded safe operating conditions, so we shut down and replaced the pump.”
  13. Shift supervisor review approval
  14. Shift supervisor comments
  15. Maintenance management approval
  16. Production management approval
  17. Work order number for this event
  18. Root cause: Lack of lubrication
  19. Root cause program failure: PDM
  20. Root cause program failure note: “This pump had not been identified as requiring vibration analysis.”
  21. Follow-up required: “Add pump to vibration analysis route. Repair pump removed.”
  22. Follow-up work orders: There may be more than one.
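
One way to picture such a record is as a small data structure. The sketch below is only an assumption about how a subset of the items above might be held in a home-grown database or script; the class name, field names and sample values are hypothetical and do not reflect any particular CMMS/EAM product.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class DowntimeEvent:
    """A subset of the listed items; all field names are hypothetical."""
    stopped_at: datetime            # stop time, to the minute
    started_at: datetime            # restart time, to the minute
    operator: str
    shift: str
    crew: str
    plant_area: str                 # e.g. "Lathes"
    equipment: str                  # e.g. "Lathe #1"
    sub_equipment: Optional[str] = None
    component: Optional[str] = None
    failure_code: Optional[str] = None
    reported_problem: str = ""
    action: Optional[str] = None
    root_cause: Optional[str] = None
    program_failure: Optional[str] = None
    work_order: Optional[str] = None
    follow_up_work_orders: List[str] = field(default_factory=list)
    scheduled: bool = False

    def __post_init__(self) -> None:
        # Reject entries where the restart is recorded before the stop.
        if self.started_at < self.stopped_at:
            raise ValueError("restart time precedes stop time")

    @property
    def minutes_down(self) -> float:
        """Duration of the event in minutes."""
        return (self.started_at - self.stopped_at).total_seconds() / 60.0


# Sample entry with made-up times, names and work order number.
event = DowntimeEvent(
    stopped_at=datetime(2006, 10, 7, 14, 0),
    started_at=datetime(2006, 10, 7, 15, 25),
    operator="J. Smith",
    shift="Day",
    crew="B",
    plant_area="Lathes",
    equipment="Lathe #1",
    sub_equipment="Lathe Hydraulic Pump #2",
    component="Pump front bearing",
    failure_code="Tripped",
    reported_problem="The pump overloaded and kicked out.",
    action="Replaced",
    root_cause="Lack of lubrication",
    program_failure="PDM",
    work_order="WO-0001",
)
print(f"{event.equipment} was down for {event.minutes_down:.0f} minutes")
```

Storing stop and start times rather than a typed-in duration lets the downtime be calculated consistently, including for events that run past a shift change.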

What a list! And to think we started downtime tracking by simply entering some details in a logbook. Not all of these listed items are required, but the more you document, the easier it will be to determine the root cause. When using a CMMS/EAM or database, drop-down choices can be selected to speed up the process of selecting the various options. Options should be parent/child driven, such that when you select “pump” as the component, there are limited choices for pump failures. The same holds true for equipment. When the press area is selected, only the press equipment and its sub-equipment should be listed as drop-down choices.
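
A minimal sketch of parent/child driven choices is shown below, assuming the hierarchy is kept as a nested dictionary. The area and equipment names are borrowed from examples in this article, the sub-equipment under "Press Loader" is invented, and the structure itself is hypothetical rather than how any specific CMMS/EAM stores its data.

```python
# Hypothetical plant hierarchy: area -> equipment -> sub-equipment.
EQUIPMENT_BY_AREA = {
    "Lathes": {
        "Lathe #1": ["Lathe Spindle", "Lathe Hydraulic Pump #2"],
    },
    "Press": {
        "Press Loader": ["Loader Drive", "Loader Hydraulic Pump"],
    },
}


def equipment_choices(area: str) -> list:
    """Equipment to offer in the drop-down once an area is selected."""
    return sorted(EQUIPMENT_BY_AREA.get(area, {}))


def sub_equipment_choices(area: str, equipment: str) -> list:
    """Sub-equipment to offer once area and equipment are selected."""
    return list(EQUIPMENT_BY_AREA.get(area, {}).get(equipment, []))


print(equipment_choices("Press"))
print(sub_equipment_choices("Lathes", "Lathe #1"))
```

Each selection narrows the next list, so an operator who picks the press area is never offered lathe equipment.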

What to do with the data
Now that you have collected data, what do you do with it? Hopefully not what we did with the paper forms we collected before: when the clipboard got too full, we tossed the forms or, maybe, put them in a box to store somewhere.

It is good to review downtime daily and assign someone to correct the issue. Unfortunately, we often get so tied up in looking at the day-to-day issues and not finishing what we started yesterday that we lose track and never get back to solving the root cause of larger problems. With work orders, we at least have a better way of tracking these opportunities. But, what do we do with the history?

The key to preventing downtime lies in its history, as long as the right data has been collected. We need to know the following from the data collected:

  1. When it happened: Date and time
  2. How long production was down
  3. What plant area, equipment, sub-equipment and component failed?
  4. Who was involved?
  5. What was the root cause and solution?
  6. What type of program failed? Was it due to PM, training, management decision, improper engineering, improper installation or poor design? What caused the component to fail and forced the plant to shut down?
  7. Is this a repeat offender? Have there been multiple events of the same problem? How many times? Is there a trend?
  8. Is it happening at a certain time or season? Is there some typical frequency?

By using a CMMS/EAM or database, charts can be developed to show trends that can lead to root cause analysis and solutions. Don’t overlook scheduled downtime when analyzing downtime, either. Chart downtime as follows:

  1. By year
  2. By month
  3. By day
  4. By crew
  5. By shift
  6. By plant location, such as: Log Yard, Press Line #2, Finishing, etc.
  7. By equipment, such as: Press Loader, Core Flaker, etc.
  8. By component, such as: pump, motor, switch, gearbox, conveyor belt, etc.
  9. By failure code, such as: overload, tracked off, spark detect, etc.
  10. Root cause program failure: PDM, Training, Resources, Engineering/Design, etc.
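
The charting dimensions above all come down to totaling downtime hours by one field at a time. The sketch below assumes each event is a plain dictionary with an `hours` value; the field names and sample events are invented for illustration, and the resulting totals could be fed to whatever charting tool the plant already uses.

```python
from collections import defaultdict


def downtime_by(events, key):
    """Total downtime hours grouped by one field, e.g. 'shift' or 'component'."""
    totals = defaultdict(float)
    for event in events:
        totals[event[key]] += event["hours"]
    return dict(totals)


# Made-up events for illustration only.
sample_events = [
    {"month": "2006-09", "shift": "Day", "area": "Press Line #2", "component": "pump", "hours": 1.5},
    {"month": "2006-09", "shift": "Night", "area": "Press Line #2", "component": "pump", "hours": 0.5},
    {"month": "2006-10", "shift": "Day", "area": "Log Yard", "component": "conveyor belt", "hours": 2.0},
]

for dimension in ("month", "shift", "area", "component"):
    print(dimension, downtime_by(sample_events, dimension))
```

Running the same function over several fields is a cheap way to see whether the hours cluster by crew, by area or by component before building any charts.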

With good information on downtime, problems can be solved and downtime reduced. When a downtime event happens, such as a pump bearing failure because of misalignment, you need to not only resolve that pump/motor issue, but also look at other alignment issues with similar pump/motors, and the alignment program for the complete plant. By identifying a potential program failure, such as the PM procedure of checking for misalignment, and correcting that for the entire plant, not just the one pump that failed, you will solve many more problems and reduce downtime more quickly. It is far better to solve the overall maintenance program problem of this PM than to keep fire-fighting and chasing misalignments each time they occur.

You can’t work on everything at once, so don’t try. You must prioritize those items that cause the most downtime hours and the most events. Select the top three for each, then solve them. Select the top three downtime events by hours, and the top three for frequency for the plant, by plant area, by component and program failure. You may have other downtime problems you resolve right away to keep running, but you must have an ongoing list of priority downtime-related projects in front of you to reduce downtime. Then, when you have solved one, add another to the list.
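
The top-three selection described above can be sketched as two rankings over the same events, one by total hours and one by number of events. The function name, field names and sample data are assumptions made for this illustration.

```python
from collections import Counter


def top_three(events, key):
    """Return (top three by total hours, top three by event count) for one field."""
    hours = Counter()
    counts = Counter()
    for event in events:
        hours[event[key]] += event["hours"]
        counts[event[key]] += 1
    return hours.most_common(3), counts.most_common(3)


# Made-up events for illustration only.
sample_events = (
    [{"component": "pump", "hours": h} for h in (4.0, 1.0, 0.5)]
    + [{"component": "motor", "hours": 6.0}]
    + [{"component": "conveyor belt", "hours": 0.25} for _ in range(4)]
    + [{"component": "gearbox", "hours": h} for h in (2.0, 1.0)]
    + [{"component": "switch", "hours": 0.5}]
)

by_hours, by_frequency = top_three(sample_events, "component")
print("Top three by hours:    ", by_hours)
print("Top three by frequency:", by_frequency)
```

In this sample the motor leads on hours from a single long outage, while the conveyor belt leads on frequency with short repeat stops, which is why the two rankings are kept separate.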

Gaining a thorough understanding of its downtime can help your company increase uptime and profit margin. MT

George Meek began his career in the ’70s as an electrician. Today, he is a process specialist focusing on hands-on engineering and maintenance projects with Evergreen Engineering, Inc. Headquartered in Eugene, OR, Evergreen specializes in industrial engineering and maintenance consulting for industries worldwide. Telephone: (541) 484-4771.



1:39 am
October 2, 2006

Practicality In Plants, At A Time When It's More Important Than Ever Before

For a certainty, physical asset management is one of the keys to plant profitability, perhaps even the most important one. Much has been written about the topic, but John Mitchell’s Fourth Edition Physical Asset Management Handbook again follows the author’s tradition by giving both focus and direction to a seemingly overwhelming and broad subject.

Mitchell’s handbook does not fit the common finding “when it’s all said and done, more has been said than done.” By showing what the best of the best have been doing, this outstanding text goes well beyond the general philosophizing and adds value by presenting considerable experience-based detail. In doing so, the book again exceeds expectations by presenting facts rather than mere general concepts. It thereby allows others to become doers instead of philosophizers.

In this text, definition, objectives, benefits and opportunities are first outlined and then explored, explained, and analyzed for the serious maintenance manager and reliability professional. With outstanding clarity, Mitchell’s handbook explains program necessity and optimization principles, major program elements and current best practices, implementation examples and financial results. Probably the foremost expert on asset management, Mitchell selected 10 contributors who clearly understand both theory and practice. He then organized much of their collective work into a true handbook that delineates benchmarking and data gathering issues, condition assessment technologies, fundamentals of lubricating fluid analysis and asset optimization case histories from companies representing petrochemicals, power generation, pharmaceuticals, and others. Time and again, this neatly revised and thoroughly updated handbook emphasizes the practical aspects of asset management, and does so at a time when practicality is more important than ever before.

In essence, John Mitchell again proves that he’s uniquely equipped to combine decades of practical experience with feedback from the participants of his popular public and in-plant asset management workshops in the United States and many other countries. He obviously achieved his goal of incorporating this feedback into the Fourth Edition text. As was the case with previous editions, Mitchell’s handbooks have become the “Gold Standard” in the field of asset management.

We found the text a true pleasure to read. It will certainly add understanding to managers and reliability professionals interested in optimizing profits by navigating safely through the challenges and trials facing industry today. That being the case, the book should not be missed by anyone aiming for optimized asset performance and sustained profits. MT



1:32 am
October 2, 2006

Lessons From The Crime Scene

When an equipment malfunction occurs, you need to do much more than simply “sweep up the glass” and get back to work. Expert systems can vastly improve your troubleshooting efforts.

Law enforcement organizations have this down to a science. Arrive at any crime scene, and you’ll find yourself immediately in the midst of a flurry of activity. After the Under Pressure Food Mart is burgled, the area is roped off, witnesses are gathered together and segregated from other onlookers, fingerprints are being lifted, and suspects may already be in custody. More cops are there to guard the area from accidental or purposeful intrusion.

The amount of resources expended on a major (or even many minor) crime scene can be truly mind-boggling. You’ll find the team leader, who directs general responsibilities. The photographer documents visual evidence, a sketch artist takes descriptions and draws the crime scene, and a number of officers guard the area. Investigators interview people at the scene, while more patrolmen canvass the local residents for more data. Specially-trained evidence gathering personnel process the evidence and ensure the documentation is foolproof. Investigators immediately start researching the backgrounds on suspects, looking for clues in past history.

Why is this immediate effort so massive? When actually analyzed, the number of man-hours invested, equipment expended and depreciated, and the inter-departmental coordination required add up to a hefty wad of cash that the taxpayers must pony up. Of course, this must have been determined to be appropriate, or local law-enforcement efforts would be shut down. Is this initial level of investigation really necessary? In fact, why not wait a few days for everyone to calm down, let the emotions die off? After all, we are hurting the business owner by restricting access to the shop, bothering his customers, even appropriating pieces of his store or inventory. Let him get back on his feet. What makes this worth the effort?

What makes this acceptable is the fact that there is really no other method available that can reliably produce the required results. If the photographer were not there, there would be no record of the actual environment at the scene. Evidence that is not quickly and accurately recorded will be lost or modified, with no hope of retrieval. We could wait to begin researching background information, but this will just prolong the successful completion of the investigation beyond reasonable time limits. Sweeping up and throwing away the broken glass gets the business up and running, but for how long? Without this process in place, the crime is almost guaranteed to happen again. The stricken store may install bars on the windows, but the criminal still at large will just find another way in, or move on to the next store down the street.

The process of determining the cause of an equipment malfunction can often seem as daunting as a major crime scene investigation. It often appears to require expert knowledge about how the equipment was operated, how it was installed, the original design specs, changes in the environment, how it was actually being used, etc. Luckily, with just the right combination of repair expertise, root cause analysis, and corrective action implementation, the process does not necessarily have to be harder to get more productive and lasting results. The right systems have usually already been purchased and put in place at most production facilities to get the data required for an accurate and detailed failure analysis. Unfortunately, the employment of these resources is not always optimum. A smarter approach to the gathering of evidence, the correct interpretation of what that evidence is telling you and the judicious application of corrective actions will put those expensive monitoring systems to work for you.

The evidence gathering process
Most companies already have many systems in place that can help the troubleshooter narrow down his focus, but oftentimes the data is no longer available. The act of repairing the gear has already modified, moved, or destroyed key pieces of evidence. Although the failure appeared to be minor at first, these data points can be crucial to finding an actual root cause of equipment damage. Where do you get the evidence you need to determine the actual root cause of the failure?

A good place to start is with the equipment operators. How often have you heard (AFTER the gear is down), “Oh, yeah, it’s been doing that for a while,” or “It’s always been that way.” This can be one of the most frustrating times in the life of the maintenance manager, listening to an operator describe in detail the telltale signs that his gear is about to fail. However, at this point in the failure analysis, this is just INFORMATION TO BE GATHERED. The fact that the operator did not inform anyone about the previous abnormalities is yet another data point. Again, this is only data that can be used later for root cause analysis and corrective actions. Do not draw any conclusions at this time.

Some companies have trained their operators to immediately document the conditions encountered at the time of a failure. The data is often written on a standard form or in the operator’s log using an approved format. In either event, the report should include some basic information:

  • Time and date
  • The initial indication of the failure (loud vibration, initial alarm, etc.)
  • Operator’s name
  • Operation being performed (start-up, shutdown, capacity test, etc.)
  • Any alarms, indicators, warnings, or other installed indications, including pressure and temperature of the process
  • Environmental conditions (air conditioning secured for 3 hours)
  • Physical conditions noted (smoke, noise, smell, hot to touch)
  • Actions taken in response to the failure

This data must be captured immediately after the failure is recognized and any required actions are completed. The operator may be one of your best sources of information, but here caution is required. Although he may have the data:

  • He may not know that he has it. You may have to ask the right questions to get the information you are looking for.
  • He may think he has it. In reality, he may have misinterpreted an indication, missed another indicator, or just not understood what he saw.
  • He may not want you to have it. This is an angle on the investigation that I will not focus on at this point. Just be aware that the motivations of the people you are interviewing for data may not be known, and the answers you get may or may not match up with what really happened.

Equipment monitoring records and recordings contain a wealth of information. Vibration monitoring recordings, thermal images, and oil analysis results can all be used to determine the timeline of events leading up to the failure. You may not know what to do with the data yet, but have it available and ready for further scrutiny.

Machinery history and repair records are invaluable. These records can be on paper or in electronic format. They can be used to discover long-term trends in equipment operational status and down-time analysis. Has this happened before? What caused it that time? How did we fix it last time? Did that fix work?

At this point, the usefulness of these records is established by past maintenance practices. An entry in these records that says (more or less), “Process pump #3 down due to pump failure” is much less useful than, “Process pump #3 secured (run hours 2910). Smoke noted issuing from mechanical seal upon initial start-up. Discovered clogged flush line. Line cleared, flow verified, seal replaced and retested.” The second entry contains a wealth of information that can be used for a much better analysis of the reason it failed versus just a single failure data point. This entry would probably take the maintenance supervisor an extra three minutes to complete.

When should entries be made in the machinery history log? Best practice is to make a minimum of two entries: one immediately at the initial failure, and one following repair and retest. If further indications were found, special troubleshooting methods were employed, or the troubleshooting was very complex, more entries can be made as required. Bottom line: for electronic recording systems, there cannot be too much data. Paper systems may require a more judicious use of space to prevent an unmanageable clutter, but can still contain a good amount of information.

Another important information resource is the broken piece of equipment itself. It is critical that the troubleshooter look at the failed part to determine not only what broke, but how it broke. The failure mode and failure agents must be determined to find and eliminate the actual cause of the failure.

Sequencing the analysis
The sequence of the data-gathering steps is actually quite important. The operator should immediately write down his indication. The troubleshooter should talk to the operator early on to get his thoughts while it is still fresh in his mind. But when can equipment repair begin? After all, working in parallel to find the cause, while simultaneously preparing for the repair, just seems like good sense. However, this is where an enormous amount of information is often lost, destroyed, or altered. The following example illustrates how working ahead of the analysis can lead to frustrating re-work.

A plant was having its entire main condensate system overhauled. New piping was being installed, and the condensate pumps were to be rebuilt. Work began on the system by removing the pumps and hauling them to the pump shop for refurbishment. Piping in the system was cut out and replaced to correct below-spec minimum wall thicknesses.

The pumps were spec’ed out, rebuilt, and hydrostatically tested in the shop. No issues were found.

Two months after their removal, the pumps were re-installed in the system. The system was filled, vented, and tested one pump at a time. After running for 20 hours, the lower pump bearing failed, as indicated by excessive vibration.

The pump was removed from the system and inspected. The lower pump bearing was found to have failed. The bearing was replaced and the pump re-installed. Twelve hours after start-up for run-in, the bearing again failed.

This time, the ace pump rebuilder was called in. Obviously, someone was not installing the bearing correctly. He had been doing this for years, and would make sure the job was done correctly this time. He personally supervised the rebuilding and retesting of the pump. It was run on a test fixture for 80 hours, with all vibration measurements well within spec. Everything looked fine from his perspective. He saw nothing that he recognized as a problem from his experience.

The pump was again re-installed and retested. The bearing failed for the third time after 20 hours of operation. Each bearing replacement cost over $23,000 just in parts and labor. So far, this equated to nearly $70,000, not including the slip in delivery date, extra time and effort expended by the expert pump supervisor, and extensive pre-installation vibration testing on the third go-around. Yet, the pump was in worse shape than before the overhaul.
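The direct cost quoted above is simple arithmetic, shown here as a quick check. The per-replacement figure is treated as a lower bound because the text says "over $23,000"; the variable names are only for illustration.

```python
# "Over $23,000" per bearing replacement, so this is a lower bound.
COST_PER_REPLACEMENT = 23_000
replacements = 3

direct_cost = COST_PER_REPLACEMENT * replacements
print(f"Direct parts-and-labor cost: at least ${direct_cost:,}")
# Direct parts-and-labor cost: at least $69,000
```

That total covers parts and labor only; the schedule slip and the extra testing effort come on top of it.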

Finding the culprit
From this example, with the data you have been given, the cause of the bearing failure will not be obvious. Even the expert is left scratching his head. How do you go about finding the cause of this type of failure?

The sequence of evidence gathering listed above was followed for all three bearing failures. Obviously, there must be something else going on that even the “pump guru” was not aware of or hadn’t thought of. What do you do?

This facility fell into one of the traps that many companies stumble into. Repairs were commenced before the failure analysis was complete. Companies want to get ahead and disassemble the pump, but this can lead to the disruption (or destruction) of evidence needed to determine the cause. But, wait a minute. We determined earlier that one of the most important pieces of the puzzle is the failed component. How can we analyze the bearing if we don’t first disassemble the pump? We seem to need to know the possible causes before we even start the disassembly!

This is a great question. It runs to the core of why many troubleshooting and repair scenarios end with a rework of the same failure.

Having an advantage
It’s human nature to seek an advantage when dealing with a problem. So, let’s walk through the above example, using TapRooT®’s Equifactor® Equipment Troubleshooting module to help narrow down the cause of the failure, even before disassembling the equipment. (Equifactor is a system that has incorporated the troubleshooting expertise of Heinz Bloch into easy-to-use tables that allow the troubleshooter to narrow down the causes of equipment and component failures during the early stages of the troubleshooting effort.)

As shown in Fig. 1, the first step is to diagram exactly what happened. This is done using TapRooT’s SnapCharT® function.

By using this system, a timeline is set up with all the known data incorporated into an easy-to-understand format. It may be tempting to skip this part (“I know what happened!”), but this is a crucial step in understanding exactly what happened when.

Now, since this is an equipment-related failure, the Equifactor module is brought to bear. Using its logical tables, most causes can quickly be ruled out, and causes previously not thought of are brought to light. You can eliminate many of these causes right away (the pump had been verified in balance, the shaft was not bent, etc). The possible remaining causes are now known, and valuable data can be brought to the jobsite to find the actual cause. You now know the right questions to ask during the equipment teardown:

1. Is there a misalignment between the pump and motor?

2. Is there casing distortion due to excessive pipe strain?
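The elimination logic behind those two questions can be sketched in a few lines. This is only an illustration of the idea: the candidate list below is a hypothetical, shortened stand-in and is not the content of the Equifactor tables.

```python
# Hypothetical candidate causes for a repeat bearing failure.
# Illustrative only; NOT the actual Equifactor table entries.
candidates = {
    "rotor imbalance",
    "bent shaft",
    "pump/motor misalignment",
    "casing distortion from pipe strain",
}

# Causes already ruled out, each with the evidence that eliminated it.
ruled_out = {
    "rotor imbalance": "pump verified in balance",
    "bent shaft": "shaft checked and found not bent",
}

# Whatever is left defines the questions to ask during teardown.
remaining = sorted(candidates - set(ruled_out))
for cause in remaining:
    print(f"Still to check: {cause}")
```

Recording the evidence alongside each eliminated cause keeps the reasoning auditable if the failure repeats.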

At this point, you can continue the investigation just like any other. Since you know what to ask, you know what to look for. You can go to the job site and gather the extra data that you need. In this case, before the pump is unbolted from the foundation, you notice the riggers are connecting chain falls to the discharge piping and the pump. When questioned, the riggers tell you that it took chain falls to get the piping aligned during installation, and there will be quite a bit of tension as the flange bolts are loosened.

The root cause
“We found the root cause!” “Those mechanics obviously don’t know what they’re doing and are flexing the pipe (and the pump casing) too much.” “Tell those mechanics to line it up right next time!” Do those remarks sound reasonable? Of course not. Unfortunately, they are the type of responses that are heard over and over again throughout industry.

“Tell those guys to be more careful.” This has the same effect as telling your son (after he’s run over the family mailbox) to drive more carefully in the future. You’ll get a half-hearted “OK,” and still nothing changes.

While the root cause analysis is not over, we finally have the information we need to start the analysis. In the TapRooT system, the data gained from this investigation is now fed back into the SnapCharT, and problem factors are highlighted, as shown in Fig. 2.

The highlighted problems are not the root causes, but they are the major indicators that will now be used by the rest of the TapRooT system to find the actual root causes. After completing the investigation and running all these indicators completely through the system, several root causes may be found. For example:

  1. The prints used to fabricate the piping contained a typographical error, causing the incorrect piping length to be used.
  2. Riggers were not trained on the correct method of rigging in pumps.
  3. A procedure for rigging in the pump was already written, but it was buried in the notes section of the piping print.
  4. No audits had ever been conducted on rigging large pumps and valves into position.
  5. Supervisors were not available during the rigging.
  6. The personnel in the pump shop did not communicate effectively to the riggers.
  7. After the first failure, there was no process in place to determine the actual root cause. (In actuality, this incident was discovered by an independent supervisor working another job watching the riggers install the chain falls).

Corrective actions
This is another point in the incident investigation process that often fails. Corrective actions must now be assigned that are meaningful, achievable, and the results measurable. For example, it does no good to tell the workers to be more careful. Each of the root causes must be addressed on its own merit, with corrective actions assigned, carried through, and audited.

Best practice
Who has time for this type of analysis? In reality, all best-in-class companies have found the time. The time spent properly following up on equipment failures is rarely wasted time. In fact, the savings are compounded two-fold. In this particular case, the time spent conducting a proper equipment failure analysis would have saved the shipyard the three weeks and over $150,000 in delays after the first bearing failure. In addition, if the corrective actions are not implemented, this same issue is almost guaranteed to happen again, causing repeat equipment failures and delays further down the road.

Unfortunately, this scenario is not an isolated case. Every plant has at least one of these stories to tell. Not every plant can say it has come up with a proven system that has averted further repeat problems. As reflected in Table I, studies have shown that industry is not meeting the best practice mix of maintenance resource strategies:

Industry seems to be spending large sums of money on predictive maintenance systems, allowing users to know WHEN the gear is about to fail, but none of these systems can tell you WHY. It is up to the trained investigator, with the right tools, to be able to avoid the costly repeat failures that continue to plague the manufacturing field. MT

Kenneth Reed is a senior associate at System Improvements, Inc. in Knoxville, TN, and the program manager for the Equifactor® equipment troubleshooting module of the TapRooT® Root Cause Analysis system.



1:29 am
October 2, 2006

7 Steps To Ensure Equipment Purchases Will Deliver

Better planning could help you get out of the “doing whatever it takes to make it work” mode. Wouldn’t that be nice?

It’s an all-too-familiar scenario. A month ago, your head engineer purchased and installed a new widget press for the plant that you maintain. It’s the best in the business, they claim, capable of forming 75 widgets per minute (wpm) with virtually no downtime or maintenance required. Hey, it says so right there in the big, glossy brochure. So, how come this press is only packing 50 wpm in the fifth week of commissioning, you’re spending a fortune to expedite replacement parts from Germany, and you’re keeping the vendor on speed-dial, demanding assembly drawings for this thing on a daily basis?

In the best of these cases, the project leader will work with you and the vendor to get this thing ramped up to deliver what was originally required. In far too many cases, though, the project leader is off to the next project or projects and simply does not have the time to help you get things going. They “sign off” the project as delivered, and now the maintenance department has to scramble, revise and modify, chase vendors, make excuses, curse, and work the usual miracles to please the operations crowd. Honestly, how many of us have been in this situation too many times? And expect to be in this situation again? How can we make sure that for the next new equipment purchase, we will get what we ask for up front and avoid the bottomless pit of “doing whatever it takes to make it work?”

Typically, there are many parties involved in bringing a new manufacturing process on-line. Let’s look at a few broad groups, namely operations, maintenance, engineering and the vendors. A summary of their needs may look like Table I.

Now, in a well-run project, these parties will all come together to discuss the scope and deliverables of the project. At some point quotations will be requested, and the team may review proposals. Once a proposal is selected, the dialogue tends to shift to delivery and installation details. This is actually a very critical point in the project, at which the maintenance and operations leaders need to make sure that the project is really set up for success. That sounds easy, yet it’s not often done. How many of us have asked the question, “What can I do to make sure I’ll really get what I need?”

Toward a better way
The answer to the question is not simple. You can, however, significantly diminish this nightmare scenario through the use of seven basic steps. In this article we’ll look at the basic requirements for purchasing industrial equipment, from start to finish. These requirements are quite familiar to you, yet we’re going to approach them in a new way, so that we can create a set of guidelines.

These will be guidelines that you as a maintainer or project leader will be able to refer to for future projects. Once again, our goal here is to build a simple, straightforward template that can be adapted to new or rebuilt equipment purchases as the need arises.

Our guiding principle, familiar to all maintainers, is to apply more preventive effort to reduce later corrective effort. There are diminishing returns to preventive work, however, so we seek balance, as illustrated in Fig. 1.

You’ve heard of “uptime” and you certainly worry about “downtime.” So, to guide us through our seven steps, let’s try “UPP-TIME”. . .

  1. U – Understand the need: Clearly understand the required process capability.
  2. P – Prove the claims: Figure out how to measure that capability, and “score” the performance.
  3. P – Policies and standards: Determine what regulations must be met and then plan up front to meet them.
  4. T – Talk it through: Ensure thorough cross-functional reviews of the detailed design.
  5. I – make It win-win: Ensure that the vendor gets all of the information or assistance needed to succeed.
  6. M – Make it work: Be well equipped and organized for the installation, commissioning and handover.
  7. E – use the right Eyes: Involve maintainers and operators in design meetings, checkouts and acceptance tests.

The benefits to you and to your operations should be pretty evident at this point, and will become more so as we explore these steps in greater detail.

Step 1: Understand the need. . .
If you want to make sure that you get what you want, then make sure that you know what you want. Your whole team should agree upon what the end result will be. By drilling deep into your requirements, quantitative (process rate, capacities, etc.) and qualitative (materials, layout, etc.) descriptions can be developed. These can range from brief statements (e.g., must run at or above 100 F), to detailed descriptions (e.g., must maintain mean process temperature of 105 F with std. dev. no greater than 1.5 F, as measured in the center of the piece at four intermediate points in the oven). If you are dealing with an established supplier of equipment in your industry, from whom you have purchased before, perhaps your requirements can be limited to just a few key needs.
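A detailed requirement like the oven example only helps if it can be checked against readings. The sketch below shows one way to do that, under stated assumptions: the text gives the 105 F target and the 1.5 F standard-deviation limit, but the allowed deviation of the mean (`mean_tol`), the use of the sample standard deviation, and the readings themselves are made up for illustration.

```python
from statistics import mean, stdev


def meets_temperature_spec(readings_f, target_mean=105.0, max_std=1.5, mean_tol=0.5):
    """Check readings (in F) against a mean target and a std. dev. limit.

    mean_tol is an assumed tolerance on the mean; the requirement in the
    text does not state one.
    """
    if len(readings_f) < 2:
        raise ValueError("need at least two readings to compute a std. dev.")
    m = mean(readings_f)
    s = stdev(readings_f)  # sample standard deviation (an assumption)
    return abs(m - target_mean) <= mean_tol and s <= max_std


# Four intermediate measurement points in the oven (made-up values).
good = [104.2, 105.6, 105.1, 104.9]
bad = [101.0, 108.5, 105.0, 105.5]

print(meets_temperature_spec(good))  # True
print(meets_temperature_spec(bad))   # False
```

Note that the second set of readings averages exactly 105 F and still fails, which is why the detailed form of the requirement is worth the extra words.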

Step 2: Prove the claims. . .
Stating your performance needs is really only going to matter if you can truly measure them. This is the first of many times when your negotiating skills must be sharp. You have identified several needs that must be met, and the vendor will offer you equipment that he/she claims will meet those needs. Now you must figure out how to actually measure the quantitative and qualitative capabilities that you are about to purchase. This is how you will ultimately score the project, so be detailed. Typically this scoring is done while commissioning the equipment. If you can push the scoring forward, say to equipment checkouts or trials at the vendor’s factory, then that is even better. To help guide you through building the scorecard, first check to make sure that your objectives are SMART (Specific, Measurable, Achievable, Realistic and Time-Bound). Think through the Five Ws and make sure the team knows who is responsible for measuring what, how, under what conditions, and with what instruments. Final acceptance trials are typically done with the equipment installed and running in your operation, so remember that people will have other demands on their time. One option is to have final performance measures done by operations personnel in concert with the vendor, so that all parties can be assured that things are above-board.
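A scorecard can be as simple as a list of requirements, each with a required value, a measured value, and a named owner for the measurement. The following is a rough sketch, not a prescribed format: the `ScorecardItem` fields are hypothetical, the 75 wpm and 50 wpm figures come from the widget press scenario, and the measured standard deviation is a made-up value.

```python
from dataclasses import dataclass


@dataclass
class ScorecardItem:
    """One measurable requirement on the acceptance scorecard (illustrative)."""
    name: str
    required: float
    measured: float
    unit: str
    measured_by: str               # who is responsible for the measurement
    higher_is_better: bool = True

    def passed(self) -> bool:
        if self.higher_is_better:
            return self.measured >= self.required
        return self.measured <= self.required


items = [
    # Brochure claim vs. commissioning result from the widget press scenario.
    ScorecardItem("Forming rate", 75, 50, "wpm", "operations with vendor"),
    # Made-up measurement against the 1.5 F std. dev. limit.
    ScorecardItem("Temperature std. dev.", 1.5, 0.6, "F",
                  "operations with vendor", higher_is_better=False),
]

for item in items:
    status = "PASS" if item.passed() else "FAIL"
    print(f"{item.name}: measured {item.measured} {item.unit}, "
          f"required {item.required} {item.unit} -> {status}")
```

Agreeing on a list like this before the purchase order is signed gives both sides the same definition of "delivered."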

Step 3: Policies and standards. . .
Don’t assume that the vendor will see to it that your machinery is built to meet all of the codes and regulations with which your company must comply. The project leader, as your company representative, has the responsibility to do that, as well as the obligation to assure compliance with your own corporate policies. If you haven’t already done so, it’s a good idea to build relationships with the local inspectors responsible for electrical safety, fire and building codes, regulated industries and materials, combustion or pressure vessels, etc. It’s also a good idea to get those inspectors involved early in the design stage. If the project manager is reluctant, then why not pick up the initial visits as a maintenance expense–consider it as an insurance payment! This is also the stage where you can make the case for maintainability. Actively seek out and push for those improvements that will minimize future maintenance and repair efforts. Again, put on the negotiator’s hat and get the vendor thinking about improvements that can be incorporated into future designs (and no doubt, proudly point out in his/her big glossy brochure).

Step 4: Talk it through. . .
Having a cross-functional team is more than a step in the process; it is an approach that helps throughout every other step. If you’re at the table with the project leader and no one from operations, quality, safety, sanitation, etc. is there, then you need to help get them there. These are your customers, so make sure they have input up front–you will save yourself grief later. This is quite important when automating older processes–you want things built, named and indicated so they make sense to the operators and maintainers who will live with them. The vendor’s young design engineers in a far-off office simply won’t know the technical language, slang, work habits and methods used in your facility.

Step 5: Make It win-win. . .
Now that the vendor has a good idea of what is needed, make sure that he/she gets all of the information or assistance needed to succeed. Don’t withhold information for an “ego win” now that may hurt you later. Some of the things to consider:

  • Plant utilities that will be needed, in what quantities and from what sources
  • Piping and conduit/wiring runs
  • Current drawings to help ensure that the equipment is built to fit
  • Details of interlocks with existing production equipment and control/data acquisition systems
  • Means of getting the equipment into the facility and into place
  • Any other information that will help make sure that when things arrive at your facility, they fit right the first time

A strong project leader will pull all of this together, and a strong project team will guarantee that nothing major is missed. Remember, you need to help them in order to help yourself.

Step 6: Make it work. . .
Well, after all of this work up front, it’s time to dig deep and summon your boundless reserves of maintenance energy for the final push. Discuss the Five Ws again with the project leader and installation teams. Are you using the vendor, contractors and in-house people? How can you support the installation and commissioning? This is a great time to make sure things are done the way you want them, and also a golden opportunity for your maintainers to get their hands in right from the start. What a bonus for future troubleshooting! Not that you’ll have many problems since you’ve followed these seven steps all the way along.

Step 7: Use the right Eyes. . .
Again, this is not so much a distinct step as it is a way of doing everything. Let’s take cross-functional involvement a step further, and not just get employees from the plant floor in as seat warmers. Involve mechanics who can picture how they need to crawl under a machine to access a part, or operators whose average age and eyesight may dictate the design of the new touch screens. Definitely get them involved in checkouts and acceptance tests when you go out to the vendor’s facility to “kick the tires” prior to delivery. Experience shows that they tend to be more critical than the project leader may be in ensuring that the agreed-upon build standards are met.

(A note on managing people’s expectations: If your organization doesn’t typically involve hourly workers in projects, be careful about involving them late in the process, when the major decisions are already made and the scope is frozen. If people see that their involvement is superficial or symbolic, then the effort that they give and the support that you get will drop accordingly. Early and sincere involvement is the key!)

While there is so much more to executing successful projects, hopefully the seven steps of UPP-TIME will keep some of the key principles front and center during your next large equipment purchase. If there is any one overarching principle, it is to maintain focus on the set of capabilities that you require. Don’t let all of the details surrounding the new shiny equipment distract you or your project team from the fact that you are really buying an outcome for the business. This drives the equipment specifications first and foremost, and should guide all of the team’s decisions. You can take control and avoid the pitfalls that may have haunted past purchases. Good luck, and remember to think win-win! MT

Jerry Dover, P.Eng., is a plant maintenance manager for Canada’s Etobicoke Bakery. He has an extensive background in maintenance, food and beverage engineering and project management. Dover’s interest and long career in the maintenance and reliability field was launched at the age of 17, when he bought a nice red Mustang from his uncle for only $700. E-mail: DoverJE@Mapleleaf.Ca
