

7:39 pm
May 5, 2009

Part I… Building Cultures Of Reliability-In-Action

Development of effective decision-making skills and behaviors is the foundation of human reliability. This human element is crucial to your equipment and process reliability.


Process-oriented organizations drive value by improving their business processes and equipment performance. At the same time, however, a number of applications, including asset management, work process improvement, defect elimination and preventive maintenance, among others, can be powerful but incomplete applications when seeking to sustain a competitive edge.

To implement and sustain high-performing, reliable cultures, managers need to be as rigorous about diagnosing, designing and implementing changes to the human decision-making process as they are with their business and equipment processes. Equipment and process reliability ultimately rest with human reliability. Thus, cultural change at its deepest level requires examining human reasoning and its resulting decisions.

To establish a culture-of-reliability requires going beyond the traditional stew of copycat approaches and learning how to: (1) use actionable tools to implement and sustain reliability improvements and bottom-line impact by (2) collecting cultural action data and (3) learning how to use that data to uncover hidden bottlenecks to performance.

In the quest for high performance, well-intentioned managers often launch cultural change efforts using what they believe to be applied methods, like employee surveys, team building, empowerment, leadership style, systems thinking, formal performance appraisal, 360° feedback, you name it, only to be disillusioned in the end by the fact that more change efforts fail than succeed. Although they may be well-accepted, traditional change methods are not precise enough to create and sustain cultures-of-reliability and typically evolve into the next flavor of the month.

The learning exercise
For the past 16 years I have been conducting a specific learning exercise related to cultural change. The purpose is to help participants understand why implementation is so hard. There are five objectives for the session:

  1. To discover root cause of implementation barriers;
  2. To illustrate the interdependent relationship between learning and error;
  3. To determine how participants personally feel when they make mistakes;
  4. Based on their experience of error, to understand how humans design a culture-in-action to avoid errors and mistakes; and
  5. To determine the costs of error avoidance to business and human dignity.

To start, participants construct a definition of competitive learning which, at its root, is defined as the detection and correction of mistakes, errors, variance, etc., at ever-increasing rates of speed and precision—the heart of reliability. Through poignant illustrations, they learn that their organizations tend to focus on making fast decisions (“time is money”), timelines, milestones etc., but at a cost to precision, the quality of the decision.

Based on that definition, the participants are asked to reflect on a recent performance mistake they have made on the job or in life. The responses from hundreds of them (male and female, Fortune 500 executives, managers, supervisors, engineers, technicians and craftsmen) are very consistent. When they make an error they feel shame, anger and frustration; they feel stupid, embarrassed and inadequate, with an impulse to hide the error and, at the same time, a desire to fix it. The result is an emotionally charged picture of wanting to fix mistakes coupled with an overwhelming impulse to hide them for fear of blame.

As the exercise unfolds, participants gain insight into how learning and mistakes, trial and error shape performance and how ineffective learning patterns persist for years. For example, individuals from process industries have revealed they’ve known that less-than-effective outages and turnarounds have existed for years; that “lessons-learned” sessions don’t successfully address operations and maintenance infighting and squabbles over what quality work means and the validity of data; that stalled work management initiatives or reprisals for management decisions are a fact of life; etc. The list goes on and on. Discovering why his division had not been able to penetrate a market for over 20 years, one vice president-level participant summed up the dilemma this way: “The costs [of ineffective learning] are so high, they are un-estimateable.”

Through collective reflection in a larger group, participants come to realize that they all experience learning in very similar ways. They also come to learn that their reasoning is very similar. They typically espouse that continuous learning is important and mistakes are OK, but, in the final analysis, mistakes are categorized as critical incidents on performance appraisals or simply seen as ineffectiveness.

When performance appraisal is tied to pay, rewards and promotion, participants indicate that they would have to be foolish, if they “didn’t put the best spin” and save face at any cost. “I have a mortgage to pay” is how many respondents put it. At the same time, they acknowledge learning does occur, but at a rate that leaves much to be desired. “It’s not all bad,” is how many participants put it. Yet, this is not really a case of being bad. Rather, it is a case of sincere, hard-working people unknowingly designing a culture with a set of unintended outcomes.

At this point, participants begin to gain insight: they say one thing and do another. Moreover, they come to understand that it is easy to see defensive patterns in others, but not so easy to see defensive patterns in themselves. Not surprisingly, being defensive is espoused as not OK. Hence, good team players should be open to feedback. Not being open would be admitting a mistake, the very essence of pain.

In the final phase of the learning exercise, participants come to recognize that they have a strong desire to learn and they seek noble goals, but that fears of retribution for telling the truth, blame, fear of letting someone down or fear of failure, whether in substance or perception, contribute to a sense of loss of control. Unfortunately, this situation violates the first commandment of management: BE IN CONTROL.

The need for control translates into a hidden performance bottleneck, given the complexity of job interdependencies and systemic error. As one individual noted, “I can’t control what I can’t control, but I am held accountable. Accountability translates into who to blame.” Participants acknowledge that they subtly side-step difficult issues and focus on the more routine, administrative issues, thereby reducing emotional pain and conflict in the short term. They acknowledge that they bypass the potential for higher performance by not reflecting on gaps in decision-making.

Ironically, as these decision bottlenecks limit performance, expectations for better performance increase, often resulting in unrealistic timelines and more stress. Executives complain they just don’t get enough change fast enough, and middle managers and individual contributors complain of “micro-management.” Sound familiar?

The end result is that sincere attempts to improve the status quo are slowly and co-creatively undermined, and inadequate budgets and unrealistic timeframes are set. Good soldiers publicly salute the goals, but privately resist because their years of experience have taught them to think in terms of “what’s the use of telling the truth as I see it; this, too, will pass.” Ultimately, many see the “other guy(s)” or group as the problem and wonder why we can’t “get them” in line. This is the heart of an organizational fad, something that often is labeled as the lack of accountability.

Based on participants’ data generated from this learning exercise and action data recorded and collected from the field (see Part III of this series for the data collection method), a culture-in-action model, similar to that shown in Fig. 1, is created and verified with illustrations. Participants consistently agree this type of model is accurate and reflects their own current cultures-in-action.

Underlying assumptions…
The culture-in-action model is rooted in human reasoning. Given the assumptions of avoiding mistakes and being in control to win and look competent in problem resolution, the reasoning path is clear. The behaviors make perfectly good sense.

When seeking solutions, multiple perspectives will proliferate on which solution is best, some with more risk, some with less. Think of it as inference stacking. A complex web of cause and effect, solutions and reasons why something will or will not work are precariously stacked one upon the other, up to a dizzying height.

Determining whose perspective is right is problematic (“Your guess is as good as mine”). Hence, controlling the agenda to reduce frustration either by withholding information (“Don’t even go there”) or aggressively manipulating people to submit or comply with someone else’s views to get things done is a logical conclusion based on the underlying assumptions.

It is not surprising that executives seek to control their organizations and focus on objectives—and when they do this that middle managers privately feel out of control because they think they are not trusted to implement initiatives or handle day-to-day routines. This leads to the following managerial dilemma: If I voice my real issues, I will not be seen as a good team player. If I stay silent, I will have to pretend to live up to unrealistic expectations. Either way is no win (a real double bind).

To overcome this dilemma, people verify and vent their emotions one-on-one, i.e. in hallways, restrooms and offices. This way, they avoid confronting the real issue of how they are impacted by others, which is difficult to discuss in a public forum (“Don’t want to make a career-threatening statement”). Instead, they seek third-party validation that their beliefs are the right ones to hold (“Hey, John, can you believe what just happened in that meeting? I don’t think that strategy is going to work; didn’t we try it 10 years ago?”). Even the best-performing teams demonstrate some of these performance-reducing characteristics. The culture becomes laden with attributions about others’ motivation, intent and effectiveness and it is labeled “politics.”

Routine problems often are uncovered, organizations do learn, but the deeper performance bottlenecks, hidden costs, sources of conflict and high-performance opportunities are missed because the focus is on putting the “best spin” on “opportunities for improvement” with a twist of language to avoid the “mistake” word. That’s because mistakes are bad and people don’t like to discuss them. Interestingly enough, there are even objections to using the word “error” during the process of the exercise. It is not surprising that when trying to learn and continuously improve a turnaround, business process or project, for example, people privately will conclude “Oh, boy, here we go again. Another wasted meeting debating the same old issues.” Negative attributions proliferate (“They don’t want to learn”) and underlying tension grows.

At this stage of the process, the pattern begins to repeat itself. As the project effort falls behind, expectations build. Typically, someone will be expected to “step up” and be the hero. With eyes averted, looking down, uncomfortable silence, someone “steps up” and often gets rewarded. Yet this heroic reward doesn’t address root cause (i.e. what accounted for the errors and frustration in the first place). Side-stepping or avoiding the more difficult-to-discuss issues doesn’t help uncover root cause but, rather, leads to fewer errors being discovered. As a result, the business goal is pushed a little further out and economic vulnerability is increased.

If the market is robust, errors and mistakes may mean little to a business. The demand can be high if you have the right product, at the right time. As competition increases, however, or the market begins to falter, the ability to remain competitive and achieve what the organization has targeted is crucial. Competitive learning is the only weapon an organization has to maintain its edge in the marketplace.

Major culture-in-action features
In summary, the major features of a true culture-in-action are:

  • Avoidance of mistakes and errors at all cost;
  • Little active inquiry to test negative attributions;
  • Little personal reflection (i.e. “How am I a part of the problem?”);
  • Little discussion of personal performance standards by which we judge others; and
  • Little agreement on what valid data would look like.

As the exercise winds down, it’s not long before someone asks, “So how do you get out of this status quo loop?” When this question comes, because it always does, I turn it back to the group and ask how they would alter this cultural system. The reaction is always the same: silence and stares. No wonder. The answer is not intuitively obvious, even to the most seasoned of practitioners and theorists.

The short answer is rather than “get” anyone anywhere, change has to be based on individual reflection and actionable tools driven through collaborative design and invitation. These actionable tools balance the playing field, at all levels, by helping create informed choice through daily decision-making reflection. Traditional intervention methods focus on changing behavior, learning your style or type, building a vision, etc. There are any number of approaches, all very powerful but incomplete without addressing the underlying reasoning (root cause) that is informing the behavior in the first place.

Coming next month
In Part II, a culture of reliability will be defined, as well as the role of reflection in organizational performance and the actionable tools of collaborative design. MT

Brian Becker is a senior project manager with Reliability Management Group (RMG), a Minneapolis-based consulting firm. With 27 years of business experience, he has been both a consultant and a manager. Becker holds a Harvard doctorate with a management focus. For more information, e-mail:



8:09 pm
April 29, 2009

Going Wireless: Wireless Technology Is Ready For Industrial Use

Wireless works in a plant, but you’ll want to be careful regarding which “flavor” you choose

Wireless Technology now provides secure, reliable communication for remote field sites and applications where wires cannot be run for practical or economic reasons. For maintenance purposes, wireless can be used to acquire condition monitoring data from pumps and machines, effluent data from remote monitoring stations, or process data from an I/O system.

For example, a wireless system monitors a weather station and the flow of effluent leaving a chemical plant. The plant’s weather station is 1.5 miles from the main control room. It has a data logger that reads inputs from an anemometer to measure wind speed and direction, a temperature gauge and a humidity gauge. The data logger connects to a wireless remote radio frequency (RF) transmitter module, which broadcasts a 900MHz, frequency hopping spread spectrum (FHSS) signal via a YAGI directional antenna installed at the top of a tall boom located beside the weather station building. This posed no problem.

However, the effluent monitoring station was thought to be impossible to connect via wireless. Although the distance from this monitoring station to the control room is only one-quarter mile, the RF signal had to pass through a four-story boiler building. Nevertheless, the application was tested before installation, and it worked perfectly. The lesson here is that wireless works in places where you might think it can’t. All you have to do is test it.

There are many flavors of wireless, and an understanding is needed to determine the best solution for any particular application. Wireless can be licensed or unlicensed, Ethernet or serial interface, narrow band or spread spectrum, secure or open protocol, Wi-fi… the list goes on. This article provides an introduction to this powerful technology.

The radio spectrum
Frequencies from approximately 9 kilohertz (kHz) up into the gigahertz (GHz) range can be used to broadcast wireless communications. Frequencies higher than these are part of the infrared spectrum, light spectrum, X-rays, etc. Since the RF spectrum is a limited resource used by television, radio, cellular telephones and other wireless devices, the spectrum is allocated by government agencies that regulate what portion of the spectrum may be used for specific types of communication or broadcast.

In the United States, the Federal Communications Commission (FCC) governs the allocation of frequencies to non-government users. The FCC has limited Industrial, Scientific, and Medical (ISM) equipment to operation in the 902-928MHz, 2400-2483.5MHz and 5725-5875MHz bands, with limitations on signal strength, power, and other radio transmission parameters. These bands are known as unlicensed bands, and can be used freely within FCC guidelines. Other bands in the spectrum can be used with the grant of a license from the FCC. (Editor’s Note: For a quick definition of the various bands in the RF spectrum, as well as their uses, log on to: http://encyclopedia.thefreedictionary.com/radio+frequency )

Licensed or unlicensed
A license granted by the FCC is needed to operate in a licensed frequency. Ideally, these frequencies are interference-free, and legal recourse is available if there is interference. The drawbacks are a complicated and lengthy procedure in obtaining a license, not having the ability to purchase off-the-shelf radios since they must be manufactured per the licensed frequency, and, of course, the costs of obtaining and maintaining the license.


License-free implies the use of one of the frequencies the FCC has set aside for open use without needing to register or authorize them. Based on where the system will be located, there are limitations on the maximum transmission power. For example, in the U.S., in the 900MHz band, the maximum power may be 1 Watt or 4 Watts EIRP (Effective Isotropic Radiated Power).
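The two power figures above are easier to compare in decibel units. The following is a minimal sketch, assuming the 1 Watt figure is read as transmitter output power and the 4 Watt figure as the EIRP ceiling; the function names are illustrative and not taken from any radio vendor's tooling.

```python
import math

def watts_to_dbm(watts):
    """Convert power in watts to dBm (decibels relative to 1 milliwatt)."""
    return 10 * math.log10(watts * 1000)

def eirp_dbm(tx_power_dbm, antenna_gain_dbi, feedline_loss_db=0.0):
    """EIRP is transmitter power plus antenna gain minus feed-line loss."""
    return tx_power_dbm + antenna_gain_dbi - feedline_loss_db

tx_dbm = watts_to_dbm(1.0)      # assumed transmitter output: 1 W
limit_dbm = watts_to_dbm(4.0)   # assumed EIRP ceiling: 4 W
headroom_db = limit_dbm - tx_dbm

print(f"1 W = {tx_dbm:.1f} dBm, 4 W = {limit_dbm:.1f} dBm")
print(f"Antenna gain headroom at full transmitter power: {headroom_db:.1f} dB")
```

Under that reading, a 1 Watt radio has roughly 6 dB of net antenna gain available before it reaches the 4 Watt EIRP figure; a higher-gain antenna would require the transmitter power to be backed off accordingly.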

The advantages of using unlicensed frequencies are clear: no cost, time or hassle in obtaining licenses; many manufacturers and suppliers who serve this market; and lower startup costs, because a license is not needed. The drawback lies in the idea that since these are unlicensed bands, they can be “crowded” and, therefore, may lead to interference and loss of transmission. That’s where spread spectrum comes in. Spread spectrum radios deal with interference very effectively and perform well, even in the presence of RF noise.

Spread spectrum systems
Spread spectrum is a method of spreading the RF signal across a wide band of frequencies at low power, versus concentrating the power in a single frequency as is done in narrowband channel transmission. Narrowband refers to a signal that occupies only a small section of the RF spectrum, whereas a wideband or broadband signal occupies a larger section of the RF spectrum. The two most common forms of spread spectrum radio are frequency hopping spread spectrum (FHSS) and direct sequence spread spectrum (DSSS). Most unlicensed radios on the market are spread spectrum.

As the name implies, frequency hopping changes the frequency of the transmission at regular intervals of time. The advantage of frequency hopping is obvious: since the transmitter changes the frequency at which it is broadcasting the message so often, only a receiver programmed with the same algorithm would be able to listen and follow the message. The receiver must be set to the same pseudo-random hopping pattern, and listen for the sender’s message at precisely the correct time at the correct frequency. Fig. 1 shows how the frequency of the signal changes with time. Each frequency hop is equal in power and dwell time (the length of time to stay on one channel). Fig. 2 shows a two dimensional representation of frequency hopping, showing that the frequency of the radio changes for each period of time. The hop pattern is based on a pseudo random sequence.
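The shared hopping pattern can be sketched in a few lines. This is only an illustration under stated assumptions: the channel plan (50 channels spaced 0.5MHz apart inside 902-928MHz), the seed value and the function name are hypothetical, and Python's general-purpose generator stands in for whatever proprietary algorithm a real radio uses.

```python
import random

# Hypothetical channel plan: 50 channels, 0.5 MHz apart, inside 902-928 MHz.
CHANNELS_MHZ = [902.5 + 0.5 * i for i in range(50)]

def hop_pattern(seed, channels=CHANNELS_MHZ):
    """Return a pseudo-random ordering of the channels derived from a shared seed."""
    rng = random.Random(seed)      # same seed gives the same sequence
    pattern = list(channels)
    rng.shuffle(pattern)
    return pattern

transmitter = hop_pattern(seed=1234)
receiver = hop_pattern(seed=1234)   # configured with the same seed
eavesdropper = hop_pattern(seed=99) # does not know the seed

print("Receiver follows transmitter:", transmitter == receiver)
print("Eavesdropper follows transmitter:", transmitter == eavesdropper)
print("First five hops (MHz):", transmitter[:5])
```

A receiver that derives the pattern from the same seed lands on the same channel at every hop, while one with a different seed is on the wrong frequency almost all of the time.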


DSSS combines the data signal with a higher data-rate bit sequence (also known as a “chipping code”), thereby “spreading” the signal over greater bandwidth. In other words, the signal is multiplied by a noise signal generated through a pseudo-random sequence of 1 and -1 bits. The receiver then multiplies the signal by the same noise to arrive at the original message (since 1 x 1 = 1 and -1 x -1 = 1).
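The multiply-then-multiply-again idea can be tried directly. The sketch below is a simplified baseband illustration, not a radio implementation: it assumes an 11-chip Barker sequence as the chipping code and represents data bits as +1 and -1; the function names are illustrative.

```python
# 11-chip Barker sequence used here as an example chipping code.
CHIP_CODE = [1, 1, 1, -1, -1, -1, 1, -1, -1, 1, -1]

def spread(bits, code=CHIP_CODE):
    """Multiply each data bit (mapped to +1/-1) by every chip of the code."""
    symbols = [1 if b else -1 for b in bits]
    return [s * c for s in symbols for c in code]

def despread(chips, code=CHIP_CODE):
    """Multiply by the same code and sum per bit; the sign gives the data bit."""
    n = len(code)
    bits = []
    for i in range(0, len(chips), n):
        correlation = sum(x * c for x, c in zip(chips[i:i + n], code))
        bits.append(1 if correlation > 0 else 0)
    return bits

message = [1, 0, 1, 1, 0]
on_air = spread(message)
print("Chips sent per data bit:", len(on_air) // len(message))
print("Recovered message:", despread(on_air))
```

Because each chip multiplied by itself is 1, the per-bit sum comes out at plus or minus the code length, which is what lets the receiver recover the original bit.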

When the signal is “spread,” the transmission power of the original narrowband signal is distributed over the wider bandwidth, thereby decreasing the power at any one particular frequency (also referred to as low power density). Fig. 3 shows the signal over a narrow part of the RF spectrum. In Fig. 4, that signal has been spread over a larger part of the spectrum, keeping the overall energy the same, but decreasing the energy per frequency. Since spreading the signal reduces the power in any one part of the spectrum, the signal can appear as noise. The receiver must recognize this signal and demodulate it to arrive at the original signal without the added chipping code.

FHSS and DSSS both have their place in industry and can both be the “better” technology based on the application. Rather than debating which is better, it is more important to understand the differences, and then select the best fit for the application. In general, a decision involves:

  • Throughput
  • Collocation
  • Interference
  • Distance
  • Security

Throughput is the average amount of data communicated in the system every second. This is probably the first decision factor in most cases. DSSS has a much higher throughput than FHSS because of a much more efficient use of its bandwidth and employing a much larger section of the bandwidth for each transmission. In most industrial remote I/O applications, the throughput of FHSS is not a problem.

As the size of the network changes or the data rate increases, this may become a greater consideration. Most FHSS radios offer a throughput of 50-115 kbps for Ethernet radios. Most DSSS radios offer a throughput of 1-10 Mbps. Although DSSS radios have a higher throughput than FHSS radios, one would be hard pressed to find any DSSS radios that serve the security and distance needs of the industrial process control and SCADA market. Unlike FHSS radios, which operate over 26MHz of the spectrum in the 900MHz band (902-928MHz), and DSSS radios, which operate over 22MHz of the 2.4GHz band, licensed narrow band radios are limited to 12.5kHz of the spectrum. Naturally, as the width of the spectrum is limited, the bandwidth and throughput will be limited as well. Most licensed frequency narrowband radios offer a throughput of 6400 to 19200 bps.
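To put those throughput figures in perspective, the sketch below estimates how long one polling message would take at the top rate quoted for each radio type. The 1,000-byte payload is a hypothetical example, and protocol overhead, retries and hop time are deliberately ignored.

```python
# Top throughput figures quoted above, in bits per second.
RATES_BPS = {
    "licensed narrowband": 19_200,
    "FHSS": 115_000,
    "DSSS": 10_000_000,
}

def transfer_seconds(payload_bytes, rate_bps):
    """Idealized transfer time: payload bits divided by the link rate."""
    return payload_bytes * 8 / rate_bps

PAYLOAD_BYTES = 1000  # hypothetical remote I/O update

for radio, rate in RATES_BPS.items():
    ms = transfer_seconds(PAYLOAD_BYTES, rate) * 1000
    print(f"{radio:>20}: {ms:8.2f} ms")
```

Even at the narrowband rate such a message moves in well under a second, which is consistent with the point that throughput is rarely the limiting factor for remote I/O.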

Collocation refers to having multiple independent RF systems located in the same vicinity. DSSS does not allow for a high number of radio networks to operate in close proximity as they are spreading the signal across the same range of frequencies. For example, within the 2.4GHz ISM band, DSSS allows only three collocated channels. Each DSSS transmission is spread over 22MHz of the spectrum, which allows only three sets of radios to operate without overlapping frequencies.
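The three-channel limit follows from simple division. This sketch uses the band edges and the 22MHz width given in the article; it assumes channels are packed edge to edge with no guard band.

```python
BAND_START_MHZ = 2400.0
BAND_END_MHZ = 2483.5
DSSS_WIDTH_MHZ = 22.0

band_width = BAND_END_MHZ - BAND_START_MHZ
non_overlapping = int(band_width // DSSS_WIDTH_MHZ)
leftover = band_width - non_overlapping * DSSS_WIDTH_MHZ

print(f"Band width: {band_width} MHz")
print(f"Non-overlapping 22 MHz DSSS channels: {non_overlapping}")
print(f"Unused spectrum: {leftover} MHz")
```

The leftover spectrum is narrower than one DSSS transmission, so a fourth network would have to overlap one of the first three.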

FHSS, on the other hand, allows for multiple networks to use the same band because of different hopping patterns. Hopping patterns which use different frequencies at different times over the same bandwidth are called orthogonal patterns. FHSS uses orthogonal hopping routines to have multiple radio networks in the same vicinity without causing interference with each other. That is a huge plus when designing large networks, and needing to separate one communication network from another. Many lab studies show that up to 15 FHSS networks may be collocated, whereas only 3 DSSS networks may be collocated. Narrowband radios obviously cannot be collocated as they operate on the same 12.5kHz of the spectrum.
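One simple way to build orthogonal patterns is to give every network the same pseudo-random channel order, each starting at a different offset. The sketch below assumes the networks hop in lockstep (a simplification; real products handle timing in their own ways), and the channel count, seed and function names are hypothetical.

```python
import random

def orthogonal_patterns(num_channels, num_networks, seed=0):
    """Build one hop pattern per network so no two share a channel in any time slot."""
    if num_networks > num_channels:
        raise ValueError("more networks than channels")
    rng = random.Random(seed)
    base = list(range(num_channels))
    rng.shuffle(base)
    step = num_channels // num_networks   # distinct starting offset per network
    return [base[k * step:] + base[:k * step] for k in range(num_networks)]

def count_collisions(patterns):
    """Count time slots in which at least two networks sit on the same channel."""
    return sum(1 for slot in zip(*patterns) if len(set(slot)) < len(slot))

patterns = orthogonal_patterns(num_channels=50, num_networks=15)
print("Networks:", len(patterns))
print("Collisions per full cycle:", count_collisions(patterns))
```

Because every network reads the same shuffled list from a different position, two networks can never be on the same channel in the same time slot as long as their clocks stay aligned.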

Interference is RF noise in the vicinity and in the same part of the RF spectrum. A combining of the two signals can generate a new RF wave or can cause losses or cancellation in the intended signal. Spread spectrum in general is known to tolerate interference very well, although there is a difference in how the different flavors handle it. When a DSSS receiver finds narrowband signal interference, it multiplies the received signal by the chipping code to retrieve the original message. This causes the original signal to appear as a strong narrow band; the interference gets spread as a low power wideband signal and appears as noise, and thus can be ignored.

In essence, the very thing that makes DSSS radios spread the signal to below the noise floor is the same thing that allows DSSS radios to ignore narrowband interference when demodulating a signal. Therefore, DSSS is known to tolerate interference very well, but it is prone to fail when the interference is at a higher total transmission power, and the demodulation effect does not drop the interfering signal below the power level of the original signal.
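Both halves of that statement can be seen in a toy baseband model. The sketch below assumes an 11-chip Barker code and models the narrowband interferer as a constant offset added to every chip, which is the simplest possible narrowband signal; real interference and real receivers are considerably more complex.

```python
CODE = [1, 1, 1, -1, -1, -1, 1, -1, -1, 1, -1]  # 11-chip Barker sequence

def transmit(bits):
    """Spread each bit (as +1/-1) across the chipping code."""
    return [(1 if b else -1) * c for b in bits for c in CODE]

def add_interferer(chips, amplitude):
    """Add a constant-level interferer to every received chip."""
    return [x + amplitude for x in chips]

def receive(chips):
    """Despread by correlating each 11-chip block with the code."""
    n = len(CODE)
    bits = []
    for i in range(0, len(chips), n):
        correlation = sum(x * c for x, c in zip(chips[i:i + n], CODE))
        bits.append(1 if correlation > 0 else 0)
    return bits

bits = [1, 0, 1, 1, 0]
moderate = receive(add_interferer(transmit(bits), amplitude=5.0))
overpowering = receive(add_interferer(transmit(bits), amplitude=20.0))

print("Interferer at 5x chip amplitude, message intact:", moderate == bits)
print("Interferer at 20x chip amplitude, message intact:", overpowering == bits)
```

In this model the despreading step scatters the interferer so that it contributes far less to each bit decision than the signal does, until the interferer is strong enough that even its scattered remainder outweighs the signal.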

Given that FHSS operates over 83.5MHz of the spectrum in the 2.4GHz band, producing high power signals at particular frequencies (equivalent to having many short synchronized bursts of narrowband signal), it will avoid interference as long as it is not on the same frequency as the narrowband interferer. Narrowband interference will, at most, block a few hops, which the system can compensate for by moving the message to a different frequency. Also, the FCC rules require a minimum separation of frequency in consecutive hops, and therefore the chance of a narrowband signal interfering in consecutive hops is minimized.

When it comes to wideband interference, DSSS is not so robust. Since DSSS spreads its signal out over 22MHz of the spectrum all at once at a much lower power, if that 22MHz of the spectrum is blocked by noise or a higher power signal, it can block 100% of the DSSS transmission, although it will only block 25% of the FHSS transmission. In this scenario, FHSS will lose some efficiency, but not be a total loss.
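The percentages above can be checked with the band widths already given. This is a back-of-the-envelope sketch that assumes FHSS hops are spread evenly across the full 83.5MHz and that the interferer sits entirely on top of the DSSS channel.

```python
def fraction_blocked(occupied_mhz, interferer_mhz):
    """Share of the occupied spectrum covered by the interferer (capped at 100%)."""
    return min(1.0, interferer_mhz / occupied_mhz)

INTERFERER_MHZ = 22.0
dsss_blocked = fraction_blocked(22.0, INTERFERER_MHZ)   # DSSS occupies 22 MHz
fhss_blocked = fraction_blocked(83.5, INTERFERER_MHZ)   # FHSS hops over 83.5 MHz

print(f"DSSS transmission blocked: {dsss_blocked:.0%}")
print(f"FHSS hops blocked: {fhss_blocked:.0%}")
```

The FHSS figure works out to a little over a quarter of the hops, in line with the roughly 25% cited; the remaining hops still get through.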

In licensed radios the bandwidth is narrow, so a slight interference in the range can completely jam transmission. In this case, highly directional antennas and band pass filters may be used to allow for uninterrupted communication, or legal action may be pursued against the interferer.

802.11 radios are more prone to interference since there are so many readily available devices in this band. Ever notice how your microwave interferes with your cordless phone at home? They both operate in the 2.4GHz range, the same as the rest of 802.11 devices. Security becomes a greater concern with these radios.

If the intended receiver of a transmitter is located closer to other transmitters and farther from its own partner, it is known as a Near/Far problem. The nearby transmitters can potentially drown the receiver in foreign signals with high power levels. Most DSSS systems would fail completely in this scenario. The same scenario in a FHSS system would cause some hops to be blocked but would maintain the integrity of the system. In a licensed radio system, it would depend on the frequency of the foreign signals. If they were on the same or close frequency, it would drown the intended signal, but there would be recourse for action against the offender unless they have a license as well.

Distance is closely related to link connectivity, or the strength of an RF link between a transmitter and a receiver, and at what distance they can maintain a robust link. Given that the power level is the same, and the modulation technique is the same, a 900MHz radio will have higher link connectivity than a 2.4GHz radio. As the frequency in the RF spectrum increases, the transmission distance decreases if all other factors remain the same. The ability to penetrate walls and objects also decreases as the frequency increases. Higher frequencies in the spectrum tend to display reflective properties. For example, a 2.4GHz RF wave can bounce off reflective walls of buildings and tunnels. Based on the application, this can be used as an advantage to take the signal farther, or it may be a disadvantage causing multipath, or no path, because the signal is bouncing back.
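The frequency effect can be quantified with the standard free-space path loss formula. The sketch below is an idealized line-of-sight estimate only (it says nothing about walls, reflections or terrain), and the 1.5-mile distance is borrowed from the weather station example purely for illustration.

```python
import math

def fspl_db(distance_km, freq_mhz):
    """Free-space path loss in dB for a distance in km and a frequency in MHz."""
    return 20 * math.log10(distance_km) + 20 * math.log10(freq_mhz) + 32.44

KM_PER_MILE = 1.609344
distance_km = 1.5 * KM_PER_MILE

loss_900 = fspl_db(distance_km, 900.0)
loss_2400 = fspl_db(distance_km, 2400.0)

print(f"900 MHz path loss over 1.5 miles: {loss_900:.1f} dB")
print(f"2.4 GHz path loss over 1.5 miles: {loss_2400:.1f} dB")
print(f"Extra loss at 2.4 GHz: {loss_2400 - loss_900:.1f} dB")
```

With everything else equal, the 2.4GHz link loses about 8.5 dB more than the 900MHz link at any distance, which is the arithmetic behind the shorter range at higher frequencies.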

The FCC limits the output power on spread spectrum radios. DSSS consistently transmits at a low power, as discussed above, and stays within the FCC regulation by doing so. This limits the distance of transmission for DSSS radios, and thus this may be a limitation for many of the industrial applications. FHSS radios, on the other hand, transmit at high power on particular frequencies within the hopping sequence, but the average power on the spectrum is low, and therefore they can meet the regulations. Since the actual signal is transmitting at a much higher power than the DSSS, it can travel farther. Most FHSS radios are capable of transmitting over 15 miles, and longer distances with higher gain antennas.

802.11 radios, although available in both DSSS as well as FHSS, have a high bandwidth and data rate, up to 54Mbps (at the time of this publication). But it is important to note that this throughput is for very short distances, and downgrades very quickly as the distance between the radio modems increases. For example, a distance of 300 feet would drop the 54Mbps rate down to 2Mbps. This makes this radio ideal for a small office or home application, but not for many industrial applications where there is a need to transmit data over several miles.

Since narrowband radios tend to be a lower frequency, they are a good choice in applications where FHSS radios cannot provide adequate distance. A proper application for narrow band licensed radios is when there is a need to use a lower frequency to either travel over a greater distance, or be able to follow the curvature of the earth more closely and provide link connectivity in areas where line of sight is hard to achieve.

Since DSSS signals run at such low power, the signals are difficult to detect by intruders. One strong feature of DSSS is its ability to decrease the energy in the signal by spreading the energy of the original narrowband signal over a larger bandwidth, thereby decreasing the power spectral density. In essence, this can bring the signal level below the noise floor, thereby making the signal “invisible” to would-be intruders. On the same note, however, if the chipping code is known or is very short, then it is much easier to detect the DSSS transmission and retrieve the signal since it has a limited number of carrier frequencies. Many DSSS systems offer encryption as a security feature, although this increases the cost of the system and lowers the performance, because of the processing power and transmission overhead for encoding the message.

For an intruder to successfully tune into a FHSS system, he needs to know the frequencies used, the hopping sequence, the dwell time and any included encryption. Given that for the 2.4GHz band the maximum dwell time is 400ms over 75 channels, it is almost impossible to detect and follow a FHSS signal if the receiver is not configured with the same hopping sequence, etc. In addition, most FHSS systems today come with high security features such as dynamic key encryption and CRC error bit checking.
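A rough sense of the search space comes from the two figures just quoted. The sketch below assumes, purely for illustration, that a pattern visits each of the 75 channels exactly once per cycle; real products typically ship a much smaller set of predefined patterns, so this is an upper bound on orderings rather than a security rating.

```python
import math

CHANNELS = 75
MAX_DWELL_S = 0.4   # 400 ms maximum dwell time per channel

cycle_seconds = CHANNELS * MAX_DWELL_S
possible_orderings = math.factorial(CHANNELS)
digits = len(str(possible_orderings))

print(f"One pass through all channels at maximum dwell: {cycle_seconds:.0f} s")
print(f"Possible channel orderings: a number with {digits} digits")
```

An eavesdropper without the hopping sequence would have to guess the right channel at the right moment, hop after hop, with at most 400ms on each before the transmitter moves on.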

Today, Wireless Local Area Networks (WLAN) are becoming increasingly popular. Many of these networks use the 802.11 standard, an open protocol developed by IEEE. Wi-fi is a standard logo used by the Wireless Ethernet Compatibility Alliance (WECA) to certify 802.11 products. Although industrial FHSS radios tend to not be Wi-fi, and therefore not compatible with these WLANs, there may be a good chance for interference due to them operating in the same bandwidth. Since most Wi-fi products operate in the 2.4 or 5GHz bands, it may be a good idea to stick with a 900MHz radio in industrial applications, if the governing body allows this range (Europe allows only 2.4GHz, not 900MHz). This will also provide an added security measure against RF sniffers (a tool used by hackers) in the more popular 2.4GHz band.

Security is one of the top issues discussed in the wireless technology sector. Recent articles about “drive-by hackers” have left present and potential consumers of wireless technology wary of possible infiltrations. Consumers must understand that 802.11 standards are open standards and can be easier to hack than many of the industrial proprietary radio systems.

The confusion about security stems from a lack of understanding of the different types of wireless technology. Today, Wi-fi (802.11a, b, and g) seems to be the technology of choice for many applications in the IT world, homes and small offices. 802.11 is an open standard in which many vendors, customers and hackers have access to the standard. While many of these systems have the ability to use encryption like AES and WEP, many users forget or neglect to enable these safeguards which would make their systems more secure. Moreover, features like MAC filtering can also be used to prevent unauthorized access by intruders on the network. Nonetheless, many industrial end users are very wary about sending industrial control information over standards that are totally “open.”

So, how do users of wireless technology protect themselves from infiltrators? One almost certain way is to use non-802.11 devices that employ proprietary protocols that protect networks from intruders. Frequency hopping spread spectrum radios have an inherent security feature built into them. First, only the radios on the network that are programmed with the “hop pattern” algorithm can see the data. Second, the proprietary, non-standard, encryption method of the closed radio system will further prevent any intruder from being able to decipher that data.

The idea that a licensed frequency network is more secure may be misleading. As long as the frequency is known, anyone can dial into the frequency, and as long as they can hack into the password and encryption, they are in. The added security benefits that were available in spread spectrum are gone since licensed frequencies operate in narrowband. Frequency hopping spread spectrum is by far the safest, most secure form of wireless technology available today.

Mesh radio networks
Mesh radio is based on the concept of every radio in a network having peer-to-peer capability. Mesh networking is becoming popular since its communication path has the ability to be quite dynamic. Like the World Wide Web, mesh nodes make and monitor multiple paths to the same destination to ensure that there is always a backup communication path for the data packets.

There are many concerns that developers of mesh technology are still trying to address, such as latency and throughput. The concept of mesh is not new. The internet and phone service are excellent mesh networks based in a wired world. Each node can initiate communication with another node and exchange information.

In conclusion, the choice of radio technology to use should be based on the needs of the application. For most industrial process control applications, proprietary protocol license-free frequency hopping spread spectrum radios (Fig. 5) are the best choice because of lower cost and higher security capabilities in comparison to licensed radios. When distances are too great for a strong link between FHSS radios with repeaters, then licensed narrowband radios should be considered for better link connectivity. The cost of licensing may offset the cost of installing extra repeaters in a FHSS system.

As more industrial applications require greater throughput, networks employing DSSS that enable TCP/IP and other open Ethernet packets to pass at higher data rates will be implemented. This is a very good solution where PLCs (Programmable Logic Controllers), DCS (Distributed Control Systems) and PCS (Process Control Systems) need to share large amounts of data with one another or upper level systems like MES (Manufacturing Execution Systems) and ERP (Enterprise Resource Planning) systems.

When considering a wireless installation, check with a company offering site surveys that allow you to install radios at remote locations to test connectivity and throughput capability. Often this is the only way to ensure that the proposed network architecture will satisfy your application requirements. These demo radios also let you look at the noise floor of the plant area, signal strength, packet success rate and the ability to identify if there are any segments of the license free bandwidth that are currently too crowded for effective communication throughput. If this is the case, then hop patterns can be programmed that jump around that noisy area instead of through it. MT

Gary Mathur is an applications engineer with Moore Industries-International, in North Hills, CA. He holds Bachelor’s and Masters degrees in Electronics Engineering from Agra University, and worked for 12 years with Emerson Process Management before joining Moore. For more information on the products referenced in this article, telephone: (818) 894-7111; e-mail:



6:00 am
December 1, 2007

Why Some Root-Cause Investigations Don't Prevent Recurrence

It doesn’t matter what type of industry you’re in, if failure isn’t an option at your plant, you’ll want to understand why these investigations sometimes fail their mission.

In the nuclear power industry, the primary mission of a root-cause investigation is to understand how and why a failure or a condition adverse to quality has occurred so that it can be prevented from recurring. This is a good practice for many reasons—and a lawful requirement mandated by 10CFR50, Appendix B, Criterion XVI.

To successfully carry out this mission, a root-cause investigation needs to be evidence-driven in accordance with a rigorous application of the bedrock of all root-cause methodologies: the Scientific Method. Consistent with the Scientific Method, underlying assumptions have to be questioned and conclusions have to be consistent with the available evidence, as well as with proven scientific facts and principles.

Sometimes root-cause investigations fail to fulfill their primary mission and the failure recurs. In that regard, diagnosing the root cause of root-cause investigation failures is, in itself, an interesting topic. Here are three common reasons why some root-cause investigations fail their mission.

Reason #1: The Tail Wagging the Dog
As a root-cause investigation proceeds and information about the failure event accumulates, some initial hypotheses can be readily falsified by the preliminary evidence and dismissed from consideration. The diminished pool of remaining hypotheses will likely have some attributes in common. More work is then usually needed to uncover additional evidence to discriminate which of the remaining hypotheses specifically apply.

At this point in the investigation, it may become apparent what the final root cause might be—especially if the remaining pool of hypotheses is small and they all share several important attributes. At the same time, it also becomes apparent what the corresponding corrective actions might be.

By anticipating which corrective actions are more palatable to the client or management, the investigator may begin to unconsciously—or perhaps even consciously—steer the remainder of the investigation to arrive at a root cause whose corresponding corrective actions are less troublesome.

Evidence that appears to support the root cause and lead to more palatable corrective actions is actively sought, while evidence that might falsify the favored root cause is not actively sought. Evidence that could falsify a favored root cause may be dismissed as being irrelevant or not needed. It may be tacitly assumed to not exist, to have disappeared or to be too hard or too expensive to find. It may even just be ignored because so much evidence already exists to support the favored root cause that the investigator presumes he already has the answer.

In logic, this is defined as an a priori methodology. This is where an outcome or conclusion is decided beforehand, and the subsequent investigation is conducted to find support for the foregone conclusion. In this case, the investigator has decided what corrective actions he wants based on convenience to his client or management. Subsequently, he uses the remainder of the investigation to seek evidence that points to a root-cause that corresponds to the corrective actions he desires.


What Really Happened: Failure Of A Zener Diode

This X-ray radiograph shows a 1N752A-type Zener diode that was manufactured without a die-attach at one end of the die, and with only marginal die-attach at the other end. This die-attach deficiency caused the component to fail unexpectedly in an intermittent fashion. In turn, this led to a failure in the voltage regulator system of an emergency diesel generator system, causing it to be temporarily taken out of service.

The failure of this Zener diode occurred in a circuit board that had seen less than 40 hours of actual service time, although the circuit board itself was over 27 years old. It had been a spare board kept in inventory.

Going to this level of detail to gather evidence might seem extreme. This particular evidence, however, was fundamental to validating the hypothesis that the root cause in this case was a random failure due to a manufacturing defect, and falsifying the hypothesis that the failure was caused by an infant mortality type failure. In the nuclear power industry, this distinction is significant.

Here is an example: A close-call accident involved overturning a large, heavy, lead-lined box mounted on a relatively tall, small-wheeled cart. The root-cause investigation team found that the box and wheeled cart combination was intrinsically unstable. The top-heavy cart easily tipped when the cart was moved and the front wheels had to swivel, or when the cart was rolled over a carpet edge or floor expansion joint.

The investigation team also found that the personnel who moved the cart in the course of doing cleaning work in the area had done so in violation of an obviously posted sign. The sign stated that prior to moving the cart a supervisor was to be contacted. The personnel, however, inadvertently moved the cart—without contacting a supervisor—in order to clean under and around it.

The easy corrective actions in this case would be to chastise the personnel for not following the posted rules and to strengthen work rule adherence through training and administrative permissions. There is ample evidence to back-fit a root cause to support these actions. Also, such a root-cause finding—and its corresponding corrective actions—are consistent with what everyone else in the industry has done to address the problem, as noted in ample operational experience reports. In the nuclear power industry, the “bandwagon” effect of doing what other plants are doing is very strong.

In short, the aforementioned corrective actions are attractive because they appeal to notions of personal accountability, are cheap to do and can quickly dispose of the problem. Consequently, the root cause of the close-call accident was that the workers failed to follow the rules.

Unfortunately, when the cart and box combination is rolled to a new location, the same problem could recur. The procedure change and additional training might not have fixed the instability problem. While the new administrative permissions and additional training could reduce the probability of recurrence, they would not necessarily eliminate it. When the cart is rolled many times to new locations, it is probable that the problem will eventually recur and perhaps cause a significant injury. This situation is similar to the hockey analogy of “shots on goal.” Even the best goalkeeper can be scored upon if there are enough shots on goal.

Reason #2: Putting Lipstick on a Corpse
In this instance, a failure event has already been successfully investigated. A root cause supported by ample evidence has been determined. Vigorous attempts to falsify the root-cause conclusion have failed. Ok…so far, so good.

On the other hand, perhaps the root-cause conclusion is related to a deficiency involving a friend of the investigator, a manager known to be vindictive and sensitive to criticism or some company entity that, because of previous problems, can’t bear criticism. The latter could include an individual that might get fired if he is found to have caused the problem, an organization that might be fined or sued for violating a regulation or law or a department that might be re-organized or eliminated for repeatedly causing problems. In other words, the root-cause investigator is aware that the actual consequences of identifying and documenting the root cause may be greater than just the corrective actions themselves.

When faced with this dilemma, some investigators attempt to “word-smith” the root-cause report in an effort to minimize perceived negative findings and to emphasize perceived positive findings. Instead of using plain, factually descriptive language to describe what occurred, less precise and more positive-sounding language is used. This is called “word-smithing” a report.

“Word-smithed” reports are relatively easy to spot. Instead of using plain modifiers like “deficient” or “inadequate” to describe a process, euphemistic phrases like “less than sufficient” or “less than adequate” are used. Instead of reporting that a component has failed a surveillance test, the component is reported to have “met 95% of its expected goals.” Likewise, instead of reporting that a fire occurred, it is reported that there was a “minor oxidation-reduction reaction that was temporarily unsupervised.”

In such cases, the root-cause report becomes a quasi-public relations document that sometimes has conflicting purposes. Since it is a root-cause report, its primary purpose is supposed to be a no-nonsense, fact-based document that details what went wrong and how to fix it. However, a secondary, perhaps conflicting, purpose is introduced when the same document is used to convince the reader that the failure event and its root cause are not nearly as significant or serious as the reader might otherwise think.

With respect to recurrence, there are two problems with “word-smithing” a root-cause report. The first is that corrective actions work best when they are specific and targeted. A diluted or minimized root-cause, however, is often matched to a diluted or minimized corrective action. There is a strong analogy to the practice of medicine in this instance. When a person has an infection, if the degree of infection is underestimated, the medicine dose may be insufficient and the infection may come back.

The second problem is that by putting a positive “spin” on the problem, management may not properly support what needs to be done to fix the problem. Thus, the report succeeds in convincing its audience that the failure event is not a serious problem.

Reason #3: Elementary My Dear Watson
In some ways, root-cause investigations are a lot like “whodunit” novels. Some plant personnel simply can’t resist making a guess about what caused the failure in the same way that mystery buffs often try to second guess who will be revealed to be the murderer at the end of the story. It certainly is fun for a person—and perhaps even a point of pride—if his/her initial guess turns out to be right. Unfortunately, there are circumstances when such a guess can jeopardize the integrity of a root-cause investigation.

The circumstances are as follows:

  • The guess is made by a senior manager involved in the root-cause process.
  • The plant has an authoritarian, chain-of-command style organization.
  • The management culture puts a high premium on being “right,” and has a zero-defects attitude about being “wrong.”

The scenario goes something like this:

  • A failure event occurs or a condition adverse to quality is discovered.
  • Some preliminary data is quickly gathered about conditions in the plant when the failure occurred.
  • From this preliminary data, a senior manager guesses that the root-cause will likely be x, because:
    • (1) he/she was once at a plant where the same thing occurred; or
    • (2) applying his/her own engineering acumen, he/she deduces the nature of the failure from the preliminary data, like a Sherlock Holmes or a Miss Marple.
  • Not being particularly eager to prove their senior manager wrong and deal with the consequences, the root-cause team looks for information that supports the manager’s hypothesis.
  • Not surprisingly, the team finds some of this supporting information; the presumption is then made that the cause has been found and field work ceases.
  • A report is prepared, submitted and approved, possibly by the same senior manager that made the Sherlockian guess.
  • The senior manager takes a bow, once again proving why he is a senior manager.

The deficiency in this scenario that can lead to recurrence is the fact that falsification of the favored hypothesis was not pursued. Once a cause was presumed to have been found, significant evidence gathering ceased. (Why waste resources when we already have the answer?) As a result, evidence that may have falsified the hypothesis, or perhaps supported an alternate hypothesis, was left in the field. Again, this is another example of an a priori methodology: where the de facto purpose of the investigation is to gather information that supports the favored hypothesis.

In this regard, there is a famous experiment about directed observation that applies. Test subjects in the experiment were told to watch a volleyball game carefully because they would be questioned about how many times the volleyballs would be tipped into the air by the participants. This they did.

In fact, the test subjects did this so well, they ignored a person dressed in a gorilla suit who sauntered through the gaggle of volleyball players as they played. When the test subjects were asked about what they had observed, they all reported dutifully the number of times the ball was tipped but no one mentioned the gorilla. When they were told about the gorilla, they were incredulous and did not believe that they had missed seeing a gorilla…until they were shown the tape a second time. At that point, they all observed the gorilla. MT

Randall Noon is currently a root-cause team leader at Cooper Nuclear Station. A licensed professional engineer in both the United States and Canada, he has been investigating failures for 30 years. Noon is the author of several articles and texts on failure analysis, including the Engineering Analysis of Fires and Explosions and Forensic Engineering Investigations. He also has contributed two chapters to the popular college text, Forensic Science, edited by James and Nordby. E-mail:



6:00 am
December 1, 2007

Polishing A Contracted Maintenance Strategy

Maintenance was never a core competency for this Swedish manufacturer. Now working with an outside service provider, the company truly understands the meaning of “win-win.”

Stainless steel is the fastest growing metal market in the world, not only for its popularity in kitchen appliances but industrial applications as well. The Outokumpu Stainless Hot Rolled Plate (HRP) factory in Degerfors, Sweden serves the latter.

Outokumpu Stainless is one of the world’s four largest producers of hot rolled plate, with one of the widest ranges of products and steel grades within the stainless steel industry. Our Degerfors factory alone produces 120 thousand tons per year. The plates are extremely resistant to corrosion and wear, making them popular in challenging applications and environments including pulp & paper, oil & gas, chemicals and power generation.

Because our customers depend on us to keep our production lines running, we looked outside the company for maintenance assistance. Gradually, we increased our reliance on contracted maintenance services (“outsourcing”) and raised the bar to higher standards. The strategy has led to our current full-service, performance-oriented, maintenance-management agreement.

Outsourcing evolution
When the plant opened in 1996, we had extensive knowledge of stainless steel production, but little in terms of equipment maintenance. To alleviate the burden, some maintenance tasks were managed internally and others were contracted out on an hourly basis to various service providers. At its peak, about 100 individuals were involved in plant maintenance activities.

For three years, our operational effectiveness (OE) and production availability were high, yet our maintenance costs were prohibitive. The break/fix approach was expensive, and tensions ran high between maintenance and production personnel.

By 1999, maintenance was still not a core competency for us. Thus, we resolved to forgo all maintenance responsibility and consolidate it under a single, more conducive contract. We chose to contract 100% of our corrective and preventive maintenance activities in Degerfors under a jointly developed, hourly-based ABB Full Service maintenance agreement.

The agreement established performance objectives that subjected the service provider to bonuses or penalties depending on its performance. This approach allowed the contractor to share the risks and rewards of plant maintenance, and provided the incentive to continuously improve performance. Soon, we had approximately 65 ABB Service employees working at the plant.

In 2001, the arrangement was transitioned from hourly rates to a fixed price so that we could have more predictable budgets. Performance incentives still provided rewards or penalties depending on the results achieved.

By 2006, an enhanced four-year contract was negotiated. Plant management, production and maintenance personnel were all involved in developing the new agreement, setting target performance levels and specifying when and how long the machines would be stopped should corrective maintenance be required. More services were added to the agreement and caps were established on certain service costs.

We began conducting weekly management meetings with the provider to assess equipment status, production schedules and maintenance priorities. In our plant, production is moving all the time and production priorities change every week. When corrective action is required, maintenance personnel are reassigned to the highest priority tasks based on equipment criticality and bottleneck location. Our priority classifications are as follows:

  • Level One – Accident risk: Equipment problems that pose a potential danger for the operator are of first concern. All other maintenance is stopped.
  • Level Two – Outage in the hot part of production: Equipment trouble in the hot rolling mill can destroy a lot of material and incur the greatest costs.
  • Level Three – Process transition: Bottlenecks in moving from one machine to another affect production throughput and must be minimized.
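As a rough illustration of how the three levels above could drive the reassignment of maintenance personnel, the sketch below sorts open jobs by priority level. The class names, field names and sample jobs are hypothetical and are not taken from the plant's maintenance system. The sketch also ignores the rule that all other maintenance stops when a Level One problem arises.

```python
from dataclasses import dataclass
from enum import IntEnum

class Priority(IntEnum):
    """The three priority levels described in the article (lower number = more urgent)."""
    ACCIDENT_RISK = 1
    HOT_PART_OUTAGE = 2
    PROCESS_TRANSITION = 3

@dataclass
class WorkOrder:
    description: str    # hypothetical free-text job description
    priority: Priority

def order_work(work_orders: list[WorkOrder]) -> list[WorkOrder]:
    """Return jobs with the most urgent level first; ties keep their original order."""
    return sorted(work_orders, key=lambda job: job.priority)

# Hypothetical open jobs for one week
open_jobs = [
    WorkOrder("Conveyor jam between two machines", Priority.PROCESS_TRANSITION),
    WorkOrder("Hot rolling mill drive fault", Priority.HOT_PART_OUTAGE),
    WorkOrder("Damaged guard near operator station", Priority.ACCIDENT_RISK),
]

for job in order_work(open_jobs):
    print(job.priority.value, job.description)
```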

Operational benefits
One of the greatest advantages of our maintenance outsourcing agreement is having another company at the table. It provides a new way of thinking about maintenance and a new perspective on problems. We can be experts at producing plates, while our contracted service provider can focus on keeping our machines running. Moreover, we can put much greater pressure on an outside party than we would on our own employees.

When the Maximo system was brought in, we saw a tremendous improvement. Our previous maintenance management system was wholly inadequate, and work instructions were often written on paper. Now, all of ABB’s maintenance practices and records are tracked in the new system. Outokumpu also uses the system to manage spare parts.

Our costs have decreased as a result of streamlined operations and better maintenance planning, giving us the ability to do more with less. Maintenance costs now are on par with other departments, while OE and production availability remain high.

The four-year agreement duration also has given our service provider greater incentive to invest more in its maintenance processes, since it now can be assured of seeing the return on its investments before the contract expires.

Convincing results
Among other things, since 2001, our full service maintenance agreement has helped us:

  • Decrease our total maintenance cost by 24%
  • Reduce our maintenance cost per produced ton by 58%
  • Achieve our current customer satisfaction score of 91.2%

What’s most impressive is that, in the same timeframe, we’ve raised our production volume by 80%—to 120 thousand tons. In 2006, as part of our agreement, we added overall equipment effectiveness (OEE) as an additional metric. Much more preventive work is being done now, and the work is being completed more quickly and efficiently.
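The cost and volume figures quoted above are consistent with one another, as a quick calculation shows. The sketch assumes that the 24% cost decrease and the 80% volume increase are measured against the same 2001 baseline, which the text implies but does not state outright. The function name is hypothetical.

```python
def cost_per_ton_change(total_cost_change: float, volume_change: float) -> float:
    """Relative change in cost per ton, given relative changes in total cost and volume."""
    return (1 + total_cost_change) / (1 + volume_change) - 1

# Figures from the article: total maintenance cost down 24%, production volume up 80%
change = cost_per_ton_change(-0.24, 0.80)
print(f"Change in maintenance cost per ton: {change:.1%}")

# Implied baseline volume if 120 thousand tons represents an 80% increase
baseline_kton = 120 / 1.80
print(f"Implied baseline volume: {baseline_kton:.1f} thousand tons")
```

The result is a reduction of roughly 58% in cost per ton, which matches the figure reported in the list.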

Ongoing improvement
The performance incentives in the full-service agreement benefit Outokumpu through ongoing operational improvements and the service provider through financial rewards. As such, we are always trying to do things better. Utilizing the industry’s best maintenance practices and systems will facilitate our mutual desire for continuous improvement.

Our greatest test was convincing the corporate office of our strategy’s value. Because Outokumpu’s vision is to be number one in stainless, with success based on operational effectiveness, management questioned whether maintenance outsourcing fit with our corporate goals. Once we explained the arrangement, including the benchmarking, the best practices and the bottom-line benefits, management supported our approach. By entrusting an outside service provider with all our maintenance requirements under the full-service, performance-driven agreement, Outokumpu corporate and the Degerfors plant can look forward to further cost reductions and operational improvements. MT

Mladen Perkovic is production manager for the Outokumpu Stainless Hot Rolled Plate (HRP) Plant.

About ABB Full Service

After years of downsizing and emphasizing core competencies, manufacturers can no longer rely solely on internal staff to meet the demands of designing, implementing, maintaining and optimizing their manufacturing infrastructure. Innovative partnerships that emphasize shared risk, common objectives, and business benefits tied to operating results are emerging to redefine supplier/client relationships.

An ABB Full Service® partnership is a long-term, performance-based agreement in which ABB commits to maintain and improve the production equipment. With a Full Service agreement, ABB takes over responsibility for the engineering, planning, execution and management of an entire plant’s maintenance activities.

Bringing together world-class maintenance and reliability methodologies, parts and logistics management, online tools, and domain expertise, ABB Full Service increases asset effectiveness while keeping tight control of costs.

Each contract is measured against Key Performance Indicators (KPIs) developed with the client. To demonstrate its commitment to the client’s success, ABB includes risk/reward sharing in its Full Service contracts, linking ABB’s financial outcome directly to the client’s performance.


  • Improve plant performance
  • Increase reliability and life cycle of production equipment
  • Manage maintenance as a business
  • Manage change and create a service culture
  • Access to resources and knowledge of ABB’s global network



6:00 am
December 1, 2007

Executive Perspective: Thank You!


Arthur L. Rice, President

That’s right. I want to thank our loyal readers, contributors and partners for a great run. This issue marks the end of Maintenance Technology’s special year-long 20th Anniversary Celebration. It also marks the beginning of our next 20 successful years of publishing. Projecting our future (and also being a grandfather), I think the words of Buzz Lightyear sum it up best: “To infinity, and beyond…”

Maintenance Technology was founded 20 years ago by a dedicated team of individuals who saw a need to serve maintenance practitioners by promoting Best Practices throughout industry. For the past two decades, that’s exactly what we’ve been doing—delivering the best-read, most-preferred, monthly, independent and audited publication in the market to ever-savvier, increasingly hard-working maintenance and reliability professionals across virtually all industry sectors. Supported by practitioners, industry experts and suppliers who are willing to share their knowledge, skills, experience and technologies/methodologies with you, this powerful, high-quality editorial is now—and always will be—designed to help our readers successfully meet their capacity assurance needs.

Although many things have changed over the past 20 years, Maintenance Technology has stayed the course, never deviating from our primary mission and strategies. We serve our readers. We engage our readers. We listen to our readers. Doing so has led us to grow in some unexpected and exciting ways.

Five years ago, we developed and began presenting Maintenance & Reliability Technology Summit (MARTS), an annual professional development program that has become one of the premier learning and networking events for the maintenance and reliability community. In 2004, we began publishing another standalone magazine, now known as Lubrication Management & Technology, dedicated to improving industrial lubrication programs. More recently, we have begun producing regular quarterly supplements like Utilities Manager and The Fundamentals, focusing, respectively, on energy efficiency and a back-to-basics approach to maintenance and reliability. These are just a few of the many things that have helped Maintenance Technology maintain its position as the leading publication in our market. Along with other yet-to-be-determined offerings, they will be among the things that help us grow and better serve you and future generations of maintenance and reliability professionals over the next 20 years.

Because we could not have gotten where we are today without the help of many individuals and organizations, we put a lot of stock in giving something back “to the good of the order.” For example, while building Maintenance Technology into the publication that it is today, we were one of the founding entities of the Society for Maintenance and Reliability Practitioners (SMRP). We also continue to be strongly involved in industry activities such as MER (the Maintenance Excellence Roundtable), NAME/FIME (the North American Maintenance Excellence Award), STLE, ARC, MIMOSA and FSA (the Fluid Sealing Association), among others. We view our participation in these diverse types of initiatives as something that truly helps set a reader-driven publication such as Maintenance Technology ahead of the pack—and that’s a place we always want to be!

It’s been a tremendous 20 years. All of those involved with Maintenance Technology, including past and present staff, contributors, associations, valued advertising partners and you—our loyal readers—deserve my heartfelt appreciation. Again, thank you all! MT



6:00 am
December 1, 2007

Maintenance Quarterly: Cleaning Up A Maintenance Nightmare

A hydropulper is an industrial blender used in the pulp and paper industry to process fibrous materials into a useable slurry. As shown in Fig. 1, the main parts include: a vessel, a lower chamber containing an agitator or impeller, a rotary drive, motor, gearbox, a tube to re-circulate the slurry and some type of sealing system that works to prevent water and other kinds of contamination from damaging the equipment.

In simplest terms, a hydropulper’s tanks are filled with water where agitators mix material into a homogeneous slurry. Sensors gauge the slurry consistency, and the water content is adjusted to thin or thicken the mix. A rotor or agitator inside the chamber vigorously pulps the fiber while an impeller moves the flow through an outlet and tube back to the vessel. Once the desired consistency is reached, the slurry is pumped out, while a de-watering screen saves the water for re-use.
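The thinning adjustment can be sketched as a simple mass balance. The example assumes that consistency is expressed as the mass fraction of dry fiber in the slurry and that only water is added; the function name and sample figures are hypothetical and do not describe any particular hydropulper control system.

```python
def dilution_water_kg(slurry_mass_kg: float, consistency: float,
                      target_consistency: float) -> float:
    """Water (kg) to add so the fiber mass fraction falls to the target.

    Returns 0.0 when the slurry is already at or below the target, since
    adding water cannot thicken the mix.
    """
    if not 0 < target_consistency <= 1:
        raise ValueError("target_consistency must be in (0, 1]")
    if not 0 <= consistency <= 1:
        raise ValueError("consistency must be in [0, 1]")
    fiber_mass_kg = slurry_mass_kg * consistency          # fiber is conserved
    required_total_kg = fiber_mass_kg / target_consistency
    return max(0.0, required_total_kg - slurry_mass_kg)

# Hypothetical batch: 1000 kg of slurry at 5% consistency, thinned to 4%
print(dilution_water_kg(1000.0, 0.05, 0.04))
```

Thickening the mix works the other way, by removing water or adding fiber, which this sketch does not model.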

About the Kruger Organization

The Kruger Organization is a 100-year-old global company operating under five business units. It manufactures and markets a variety of products related to pulp and paper, including: newsprint, specialty grades, lightweight coated paper, directory paper, tissue, recycled linerboard, corrugated containers, lumber and other wood products.

According to company literature, Kruger is the only manufacturer in the world to offer cellulose-based specialty products made from both wood and cotton.

K.T.G. (USA) LP – Memphis
The Kruger facility in Memphis, TN, the setting for the accompanying article, was once part of Scott Paper. When Scott eventually merged with Kimberly-Clark, the mill was idled. In 2002, when Kruger acquired the operation, it became known as K.T.G. (USA) LP, part of the Kruger Tissue Group (KTG), which manufactures premium tissue products, under its own brand names and private labels, for retail, industrial, business and institutional use.

The Memphis mill was restarted in 2003 and a major modernization plan was implemented. Today, with over 40 acres under roof, it is the largest structure in the city of Memphis and employs some 175 people. Main products manufactured here are bath and facial tissue. Equipment includes four paper machines and 10 converting lines.

Hydropulper sealing
During the pulping process, when material comes in contact with the rotor, a tremendous shock load is transferred to the shaft, flexing it against whatever sealing system is being used (contact, lip, labyrinth, etc.). To maintain its integrity and keep contamination out, among other things, the seal must be able to accommodate shaft movement. Over time, this action can break down even the best of seals. When a seal breakdown occurs, water runs past the component and down the shaft, where it enters and contaminates the gearbox housing. Sealing options that have been tried on hydropulpers include [Ref. 1]:


  • Lip seals: these dry-running devices can wear out, break down or fall apart. Their service life can be as short as 1800 hours. They can actually do damage by cutting into the shaft at the sealing point. Double lip seals can do twice the damage.
  • Sealed bearings: so-called “lubricated-for-life” bearings do not seal out moisture or water.
  • Fibrous packing: degrades and can fret the shaft.
  • Close clearance designs: still allow for humidity egress/ingress.
  • Contact face seals: once the faces stop contacting, gaps form that allow air and water to move across the bearing.
  • Flingers: these rings, which deflect leakage away from packed or sealed equipment, are basically ineffective.

In time, with any of these methods, enough water will get past the seal and into the gearbox bearings to cause them to fail. In other words, the root cause of the failure is not addressed.

The Tissue Making Process In Brief

Tissue paper is a nonwoven fabric made from cellulose fiber pulp. (The Memphis KTG plant uses northern softwood and eucalyptus as the main fibers.) In the manufacturing process, fibers are broken up in a hydropulper and mixed in a cooking liquor with water and chemicals, usually calcium, magnesium, ammonia or sodium bisulfite.

The mixture is cooked into a viscous slurry. To whiten and brighten the pulp, bleaching agents such as chlorine, peroxides or hydrosulfites are added. The pulp is washed and filtered multiple times until the fibers are completely free from contaminants. This blend of water and pulp is called the “furnish” stage.

The slurry then flows into a head box that spreads it out on a continuous wire mesh belt or Fourdrinier. As the fibers travel down the Fourdrinier, much of the water is drained out through the holes in the wire mesh. A series of other steps further compress the fibers and continue to remove water to a point where the sheet is strong enough to be transferred to a specially adapted tissue or Yankee dryer.

The highly polished Yankee dryer takes the wet sheet over a series of rollers until it is adequately dried. Along the way, raised supports on the line create bumps and valleys on the now completed fabric or “web.” The web passes through a series of rotating knives that cut it to the desired width; the cut sheets are then folded and packaged in boxes or cellophane wrap.

The hydropulper’s problem
KTG’s Memphis plant operates five (Voith) hydropulpers that have been in service for approximately 40 years. The units were all retrofitted and modernized in the 1990s, including the gearboxes and motors. Still, they continued to experience ongoing breakdowns—and it was never a pretty sight (see Fig. 2 and Fig. 3).

According to Dave Knox, the KTG maintenance planner who oversees maintenance on the plants, refiners, pulpers and paper machines, the main cause of the failure was water contamination in the gearbox. Because the gearbox is mounted directly under the hydropulper tank, water traveled down the output shaft and entered the bearing housing, where it contaminated the bearings and caused the gearbox to fail. The problem had been recurring for years and was not solved by the previous owners.

When the mill restarted in 2003, so did the equipment failures. Although the maintenance team knew that the root cause of the failure was water contamination in the bearing housing, it felt that it just had to live with it. To complicate matters even further, because of the hard-to-access nature of the components, it was difficult to determine exactly when a contact seal might fail.

The problem continued because the standard overhaul procedure included the use of lip seals. While these components might have been brand new, right out of the box, the gearboxes would be doomed to fail again—it was just a matter of time. In fact, the problem of water contamination hindering the entire system was to continue until the true root cause of the failure was attacked two years ago—when the Memphis facility began to install bearing isolators in its hydropulpers.

Why lip seals fail
To understand the problem at KTG, one has to look at the history of lip seals. When they were first made available some 70 years ago, they were the only choice for general-use sealing devices. Because of their low cost, over the years they became the number one choice for sealing industrial rotating equipment.

Today, according to their own manufacturers, even the best lip seals have a mean life to failure of only 1844 hours, or about 77 days of continuous operation. Half last longer than that and half fail sooner. This means that lip seals have a guaranteed failure rate of 100%.
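As a quick check on the service-life figures quoted in this article, the short sketch below converts operating hours into days and years. It assumes round-the-clock operation (24 hours a day, 365 days a year), which the article implies but does not state outright; the function names are illustrative only.

```python
# Convert quoted service lives from operating hours to calendar time.
# Assumption: continuous operation, 24 hours/day and 365 days/year.
HOURS_PER_DAY = 24
HOURS_PER_YEAR = 24 * 365


def hours_to_days(hours):
    """Return the number of days of continuous operation."""
    return hours / HOURS_PER_DAY


def hours_to_years(hours):
    """Return the number of years of continuous operation."""
    return hours / HOURS_PER_YEAR


# 1844 hours: the mean lip seal life quoted above.
lip_seal_days = hours_to_days(1844)
# 150,000 hours: the protected-bearing life quoted in the sidebar
# at the end of this article.
isolator_years = hours_to_years(150_000)

print(f"Lip seal mean life: {lip_seal_days:.1f} days")
print(f"Protected bearing life: {isolator_years:.1f} years")
```

Under that assumption, the lip seal figure rounds to the 77 days given in the text, and 150,000 hours works out to a little over 17 years.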

As they experienced at KTG, no one can determine when the time is up for a lip seal. There simply is no advance warning. The only way to tell is after the equipment stops working and the lip seal has burned to a crisp and probably grooved the shaft.

Contact vs. non-contact
While a lip seal or contact seal operates with contact, the bearing isolator, a non-contacting labyrinth-type seal, makes no contact. It never wears out and can be used over and over for many years. With this in mind, it may not make sense to protect rotating equipment that is designed to run uninterrupted for years, with a product that could experience a 100% failure rate in a relatively short period of time.


Bearing isolators
In the late 1970s, an alternative to contact/lip seals was made available with the invention of the Bearing Isolator, a non-contact, non-wearing, permanent bearing protection device [Ref. 2].

The bearing isolator consists of two parts, a rotor and stator, which are unitized so they don’t separate from one another while in use. Typically, the rotor turns with a rotating shaft, while the stator is pressed into a bearing housing. The two components interact to keep contamination out of the bearing enclosure and the lubricant in—permanently.

Today, bearing isolators are used to protect motor and pump bearings, machine tool spindles, turbines, fans, gearboxes, paper machine rolls and many other types of rotating and related equipment. Though the end-user has a choice, the best bearing isolators are made of metal, usually bronze, manufactured to specification, with a vapor-blocking feature to inhibit the free transfer of contamination (see Fig. 4).

The hydropulper solution
When Dave Knox approached Mike Perkins, his Chesterton distributor, about the Memphis mill’s ongoing hydropulper breakdown problem, Perkins suggested trying Inpro/Seal branded bearing isolators. Following this recommendation, Knox met with Joe Klein, Inpro/Seal’s regional manager. Working together, Knox and Klein developed a plan of attack.

Bearing isolators were engineered and manufactured to the hydropulper drives’ exact needs and specifications. Between 2005 and 2006, these new devices were installed on two of the five hydropulpers as part of the overhaul program. For the last two years, the Memphis KTG site has not experienced a single hydropulper failure. That’s because the reason for their previous ongoing failures, water entering the gearbox housing, was totally eliminated. In the future, this type of bearing protection is expected to be applied to the remaining three hydropulpers.

The rest of the story
In addition to bearing isolators on its hydropulper drives, KTG also uses PMR bearing isolators on its paper machines. An acronym for paper machine roll, the PMR bearing isolator was specially engineered for the size, speed, alignment and operating conditions of wet and dry ends of machine rolls.

As with the hydropulpers, before the availability of bearing isolators, end users had to contend with sealing methods that allowed roll bearings to fail. The leading cause of this failure also was contamination entering the bearing housing—contamination from heat, humidity, paper stock, water and oil leakage.

The bottom line
K.T.G. (USA) LP – Memphis cleaned up the problem with its hydropulper breakdowns by keeping water, the root cause of the failures, out of the units’ bearing housings. Key to this was replacing outdated sealing methods with state-of-the-art non-contacting technology.

Since it began installing Inpro/Seals two years ago, the Memphis operations have yet to experience a single breakdown on any bearing isolator-equipped hydropulper. Once the facility installs these devices on its other hydropulpers, breakdown due to water contamination should be totally eliminated.

One thing is certain—the installed bearing isolators will not experience unexpected breakdown [Ref. 3]. These well engineered components should run maintenance-free throughout their intended design life, which could be 20 years or more.

1. Before the advent of the bearing isolator.
2. David C. Orlowski holds the patent for the “bearing isolator,” a term he coined when he founded Inpro/Seal Company in 1976.
3. The first bearing isolators, installed in a process plant in Iowa over 20 years ago, are still operating. In addition, Inpro/Seal offers a full no questions asked warranty.
4. Based on available statistics.

Bearing Isolators Widely Accepted Worldwide

Almost three million Inpro/Seal-branded bearing-isolator designs are in operation in process plants around the globe, where end users continue to report significantly reduced operating costs with increased productivity and reliability. Protected bearings have proven to run 150,000 hours or more (17+ years), eliminating the need for continual maintenance and repair. Documented cases show that a plant can easily double the mean-time-between failure (MTBF) and reduce maintenance costs by at least half, with users reporting an extremely high Return On Investment (ROI).

Inpro/Seal Company, the product’s originator, recently announced that its production capacity has increased to accommodate 40,000 bearing isolators per month, making it the largest producer of bearing isolators in the world [Ref. 4]. To supply this demand, the Rock Island, IL-based company’s campus, the largest of its kind, encompasses engineering, research, development, testing and manufacturing facilities operating on a 24/7 basis.

Dave Orlowski is founder, president and CEO of Inpro/Seal Company. E-mail:



6:00 am
December 1, 2007

Maintenance Quarterly: Do You Really Know Where Your Machines Are?

Becoming a “Reliable Plant” and staying there requires keeping abreast of constantly changing and improving technologies and practices.

In today’s leaner maintenance departments, companies rely heavily on the reliability of their machinery. While the practice of reliability engineering has been around for many years, it has never been focused on as much as it is now. In today’s maintenance world, reliability engineering positions—not to mention entire departments—have been created to put 100% of their time and effort toward the prevention of unscheduled machinery downtime and critical failures.

Even though the goal of a “Reliable Plant” remains much the same as it has for years, methods and practices for getting to that state are constantly changing and improving with the development of new technologies and practices. A case in point is proper shaft alignment of rotating machinery in the running condition, through the derivation and application of proper coupling target values.

With today’s laser alignment tools and proper training, alignment of machinery has become an easier task than in years past. However, in some cases, companies are finding that even while machines are within excellent alignment tolerances, they still have problems associated with misalignment. This often is a result of thermal growth issues with the machine, dynamic loads, downstream (or upstream) piping movement and other variables.

Many manufacturers supply their equipment with thermal expansion data and recommended alignment targets. The idea is to purposely misalign a machine when the alignment is done “cold,” or offline, so that when the machine reaches its normal running condition the machine is aligned. Compensating with target values is one step closer to proper alignment, but often these values are not as accurate as they were originally intended to be, due to flaws in the methods of their calculation.

Hypothetical applications
Two identical steam turbine-hot water pump machine trains are sold and supplied with factory-calculated target values. It is late October. One unit is installed in a Louisiana refinery at 90 F, the other in an identical plant in Washington State at 40 F. Both operate at the same temperature, but which machine will be in alignment when it reaches its normal running condition? Consider that the factory calculated the target values using an arbitrary cold temperature of 70 F. Because of the temperature differences, it is possible that both units may be out of alignment at running condition using the factory supplied alignment targets.


Using the “TLC” thermal growth calculation method we can see how much the growth can differ depending on what the ambient temperature is when the alignment is performed. The TLC method is the product of the change in Temperature, the Length of material from base of machine to the centerline of rotation and the Coefficient of expansion for the material involved. Each support foot of each machine needs to be calculated. The calculations for one of the feet at each location are shown in Fig. 1.
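The TLC product described above can be sketched in a few lines. The figures here are illustrative assumptions, not the values from Fig. 1: a carbon-steel support with an expansion coefficient of roughly 6.5 × 10⁻⁶ in/in/°F (an approximate handbook value), a hypothetical 20 in. from base to shaft centerline and a hypothetical running temperature of 250 F. Only the three ambient temperatures (90 F, 70 F and 40 F) come from the example in the text.

```python
# Thermal growth by the "TLC" method: change in Temperature x Length x Coefficient.
# All constants below are illustrative assumptions, not the values from Fig. 1.
STEEL_COEFF_PER_F = 6.5e-6   # assumed expansion coefficient, in/in/deg F (approx. carbon steel)
LENGTH_IN = 20.0             # hypothetical distance from machine base to shaft centerline, inches
RUNNING_TEMP_F = 250.0       # hypothetical temperature of the support at running condition


def thermal_growth(delta_t_f, length_in, coeff_per_f):
    """Return the growth in inches for one support foot."""
    return delta_t_f * length_in * coeff_per_f


# Ambient temperatures from the example in the text.
ambient_temps_f = {"Louisiana": 90.0, "factory assumption": 70.0, "Washington": 40.0}

growth_mils = {}
for site, ambient_f in ambient_temps_f.items():
    growth_in = thermal_growth(RUNNING_TEMP_F - ambient_f, LENGTH_IN, STEEL_COEFF_PER_F)
    growth_mils[site] = growth_in * 1000.0  # 1 mil = 0.001 in
    print(f"{site}: aligned at {ambient_f:.0f} F, growth = {growth_mils[site]:.1f} mils")
```

With these assumed inputs, the foot aligned at 40 F grows several mils more than the factory calculation allows for, and the foot aligned at 90 F grows several mils less, which is why a single factory target cannot suit both sites.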


These variations at the feet could mean an even greater misalignment at the coupling center, or point of power transmission. The graph in Fig. 2, which is based on the thermal growth values shown in Fig. 1, illustrates this effect.
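One way to see why unequal growth at the feet is amplified at the coupling is to treat the shaft centerline as a straight line through the two foot positions and extend that line to the coupling center. The sketch below does this for a single machine in the vertical plane. The function name, the foot distances and the growth values are all hypothetical, and a real alignment calculation would consider both machines in the train and both planes.

```python
def movement_at_coupling(front_growth, back_growth, front_dist, back_dist):
    """Project vertical growth at two feet to the coupling center.

    Distances are measured along the shaft from the coupling center to the
    front and back feet (back_dist > front_dist). The shaft centerline is
    assumed to stay straight and rigid.

    Returns (offset, angularity): offset in the same unit as the growth
    values, angularity in growth units per distance unit.
    """
    slope = (back_growth - front_growth) / (back_dist - front_dist)
    offset = front_growth - slope * front_dist
    return offset, slope


# Hypothetical example: the front foot grows 27.3 mils, the back foot 20.8 mils.
# The feet sit 10 in. and 30 in. from the coupling center.
offset_mils, angularity_mils_per_in = movement_at_coupling(27.3, 20.8, 10.0, 30.0)

print(f"Offset at coupling center: {offset_mils:.2f} mils")
print(f"Angularity: {angularity_mils_per_in:.3f} mils/in")
```

In this made-up case the offset at the coupling center is larger than the growth at either foot, because the coupling lies outside the span between the feet and the tilt of the shaft carries the movement further.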

Dealing with “problem” machines
Many companies seem to have some “problem” machines that they too often accept as being uncorrectable. Extra spare parts become part of the yearly budget and it’s no surprise to anyone when those particular machines break a bearing or lose a seal every few months—while similar machines run without a problem for years.

This type of situation became clear for a Southern California refinery several years ago. As part of its growing reliability program, the refinery decided to do something about the site’s “problem” machines, as well as those machines without accurate target values. The company utilizes the best laser alignment tools and trains its employees to do correct alignment incorporating target values wherever necessary. Even with these good practices in place, however, some of the machines still have high failure rates.


Whenever refinery personnel identify a machine that is still having problems with failures associated with misalignment, they install a system called PERMALIGN® to accurately measure any relative movement between the machines from cold to hot or normal running condition. This laser-based system measures and records any movement, whether across a coupling or an absolute movement relative to Earth, and is accurate to 1 micron. (It is the only linearized laser monitoring system with a resolution of 1 micron throughout the entire 0.630″ detector range.) The system measures any offset and angular movement over separations of up to 30′, so it can also record data on the site’s large cooling tower fans. Even in the harsh environment that the refinery offers, temperature variations and vibration do not diminish accuracy.

The data collected by the PERMALIGN system can be trended, analyzed and archived using software called WINPERMA®. This software translates the relative machine movement into movement at the coupling center in both planes, calculating vertical offset, vertical angularity, horizontal offset and horizontal angularity. A baseline established at the ambient temperature becomes the zero point; then the machines are turned on and allowed to reach their normal running condition. The graph in Fig. 3 shows all four components of movement so the new alignment targets are easily read. Flags can be marked on the graph to record system events, such as when the system was brought on-line, a change in running load or a valve opening. Let’s look at a recent example of a “problem” machine where the California refinery utilized the PERMALIGN system to measure the movement across the coupling.

In one of its distillation units, the refinery has a set of residuum pumps that are vital to the continuous operation of the unit. If the pumps were to shut down unexpectedly, the whole process would follow suit—leading to a major shutdown, resulting in significantly higher repair cost than just replacing a bearing on a pump. Since these pumps are redundant, if one fails the other picks up the load. On the other hand, when one “problem” pump is out of commission for repair, there is no backup. Of the two pumps, only one of them has a very high failure rate. They are identical pumps and the reason(s) why one of them has a high failure rate and the other does not remains a mystery. They both are aligned using the factory recommended targets, yet only one pump continues to have bearing failures. Vibration readings also are significantly higher on the one pump compared to the other, and vibration analysis points to misalignment. While there are myriad possible causes for this problem, correcting it is the priority. Thus, the PERMALIGN system was installed on the unit to measure the relative movement of the pump and motor.

Once the system was installed on the unit and started recording data, a baseline was established. Since these pumps operate at a very high temperature, they are slowly brought up to operating temperature, as marked on the graph with an event flag. A second flag was placed to note when the pump was brought on line. Once the pump reaches its normal operating condition and the data level out (in this case, after about eight hours), it can be shut down and allowed to cool.

The data shown in the box near the center of the graph in Fig. 3 are the new target values used for the alignment. These targets were input into the refinery’s ROTALIGN® ULTRA shaft alignment system and the alignment was performed once the unit cooled to ambient temperature. The unit was then put back on line.
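The idea of turning a cold-to-hot trend into alignment targets can be sketched as follows. This is a minimal illustration under stated assumptions, not the method used by the monitoring software: the data layout, the function name and the sample readings are hypothetical, steady state is simply taken as the average of the last few readings, and the cold target is taken as the negative of the measured movement so that the machine moves into alignment as it warms up. The real sign convention depends on which machine is treated as the reference.

```python
from statistics import mean

# The four components of movement at the coupling center.
AXES = (
    "vertical_offset",
    "vertical_angularity",
    "horizontal_offset",
    "horizontal_angularity",
)


def cold_targets(readings, baseline, settle_count=3):
    """Derive cold alignment targets from a cold-to-hot trend.

    readings: chronological list of dicts, one value per axis (hypothetical layout).
    baseline: dict of readings taken cold, at ambient temperature (the zero point).
    settle_count: number of final readings averaged to represent steady state.
    """
    settled = readings[-settle_count:]
    targets = {}
    for axis in AXES:
        # Movement relative to the cold baseline once the data have leveled out.
        hot_movement = mean(r[axis] - baseline[axis] for r in settled)
        # Assumed convention: misalign by the opposite amount when cold.
        targets[axis] = -hot_movement
    return targets


# Hypothetical trend data: offsets in mils, angularity in mils/in.
baseline = {"vertical_offset": 1.0, "vertical_angularity": 0.0,
            "horizontal_offset": 0.5, "horizontal_angularity": 0.0}
readings = [
    {"vertical_offset": 3.0, "vertical_angularity": 0.25,
     "horizontal_offset": 0.5, "horizontal_angularity": 0.0},
    {"vertical_offset": 9.0, "vertical_angularity": 0.5,
     "horizontal_offset": -1.5, "horizontal_angularity": 0.25},
    {"vertical_offset": 9.0, "vertical_angularity": 0.5,
     "horizontal_offset": -1.5, "horizontal_angularity": 0.25},
    {"vertical_offset": 9.0, "vertical_angularity": 0.5,
     "horizontal_offset": -1.5, "horizontal_angularity": 0.25},
]

targets = cold_targets(readings, baseline)
for axis in AXES:
    print(f"{axis}: {targets[axis]:+.2f}")
```

Averaging only the final readings mirrors the practice described above of waiting for the data to level out before reading off the targets.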


A four-month trend of the overall velocity levels measured on the pump using the VIBXPERT® vibration data collector is shown in Fig. 4. The final reading on the trend was taken several days after the alignment was performed using the new target values.

After further investigation into the root cause of the problem pump, it was found that the concrete base had been cracked during a repair on an adjacent machine several years earlier. After the base was repaired, the “cold” position had apparently moved from its original setting, causing the targets to change. Luckily, the cause came to light when a senior millwright, having overheard a conversation about the investigation, reported the earlier repair. There was no documentation of the accidental damage or of the repair, so this information might never have been known had the millwright not come forward.

Utilizing the latest technologies, the refinery was able to identify a piece of critical machinery that had uncommon characteristics and quickly apply an accurate solution. A complete maintenance history of the machines is now stored in the site’s alignment and condition monitoring software. Proper use of these tools has put this refinery one step closer to what it truly wants to be—a Reliable Plant!

Deron Jozokos is an engineer with LUDECA, INC. Telephone: (305) 591-8935; e-mail:



6:00 am
December 1, 2007

Viewpoint: Achieving Excellence


Richard L. Dunn, Executive Director, Foundation for Industrial Maintenance Excellence

Two U.S. plants have been selected to receive the 2007 North American Maintenance Excellence (NAME) Award presented by the Foundation for Industrial Maintenance Excellence. The Alcoa Mt. Holly plant, Goose Creek, SC, and the Baldor Dodge Reliance – Dodge Marion plant, Marion, NC, were selected as award winners after evaluation of their applications and onsite audits of their operations by the NAME Award Board of Directors.

Now in its seventeenth year, the NAME Award is widely regarded as the most prestigious recognition in the maintenance function. Awards are presented to individual plants on the basis of their maintenance departments’ ability to provide “capacity assurance for operational excellence” in the areas of organization, work processes and materials management.

In many ways, the two winners represent the breadth of the possible paths to maintenance excellence. One is a large plant, the other small; one a large maintenance organization, the other not. One plant is primarily a round-the-clock continuous process operation, the other a manufacturer of discrete products. One has a long tradition of striving for and exemplifying maintenance excellence, the other has come to this level only recently.

Alcoa Mt. Holly is a 1.5 million-square-foot aluminum smelter that produces about 500 million pounds of aluminum ingots annually. Its 160 maintenance employees support the 24/7 operation of the plant through a wide variety of preventive and predictive maintenance activities, major equipment overhauls and operation and maintenance of the plant’s substation. In recommending the plant for the NAME Award, evaluators noted its long history of outstanding work planning and scheduling, as well as its excellent communications and cooperation with all production areas.

Dodge Marion manufactures mounted tapered/spherical roller bearings in its 174,000-square-foot facility. Its nine-person maintenance department has developed a strong preventive and predictive maintenance program using various total productive maintenance (TPM) processes.

Both plants have demonstrated enviable records for reliability. Furthermore, both demonstrate that a foundation of sound preventive maintenance practices coupled with a plant-wide respect for the value of maintenance is essential to overall excellence.

Established in 1990 as a way to encourage best maintenance practices and a way to honor those who achieve them, the NAME Award program has presented 20 awards over the years with several awards in some years and none in others. In 2000, the volunteers who administer the award program incorporated as the not-for-profit Foundation for Industrial Maintenance Excellence (FIME) to ensure the program’s continuance and independence from commercial influence. The Board of Directors is made up of past award winners and others with a demonstrated devotion to the values the award represents.

To be eligible, a plant must submit a comprehensive application by June 30 in the year of entry. This application is reviewed by the Board of Directors to determine eligibility for an onsite audit. Following this audit, the Board of Directors again meets to decide if the applicant qualifies in all respects for the award.

The NAME Award recognizes that the Alcoa Mt. Holly and Dodge Marion plants have demonstrated their maintenance competence at a world-class level. The Foundation for Industrial Maintenance Excellence is proud to honor their achievements.

Rick Dunn participated in the establishment of the North American Maintenance Excellence Award and has been active in its activities since inception. He was appointed Executive Director when the NAME Award program was incorporated as the Foundation for Industrial Maintenance Excellence. Information on the NAME Award program is available online at
