“We have too many equipment problems: Too much downtime, too many breakdowns. Maintenance is out of control around here!” What does that mean?
Why is it that, in many cases, “downtime” is perceived as “maintenance downtime”—or in other words, time for maintenance to come and fix something that should not have failed? Is it “normal” for maintenance to be blamed while the actual causes of the downtime and other equipment-related losses go unseen? When maintenance is incorrectly blamed, it may keep the true causes and other losses hidden inside an equipment performance-history time warp.
In many cases, maintenance by itself is unable to eliminate the problems because the causes are simply outside its control. While breakdowns and downtime may be clear, all too often, many chronic interruptions (and causes) remain concealed from view. Despite the best of intentions, they go undetected or overlooked and continue to plague an operation’s reliability. In our never-ending quest for improved reliability and competitiveness, this costly game of “hide-and-seek” must end.
Let’s begin our game of hide-and-seek by defining types of equipment-related losses. First, the obvious ones:
- Scheduled downtime
- Scheduled maintenance shutdowns
- Equipment failures or breakdowns
- Startup and adjustment
- Routine tooling or part changes
Then there are the often not-so-obvious losses:
- Waiting for operators
- No incoming raw materials (kanban empty)
- No room for output (kanban full)
Equipment losses frequently result from running at less than capacity:
- Minor stops (jams, breaker trips, idling)
- Reduced speed or cycle time
- Operational interruptions
Even when equipment appears to be running just fine, it can be causing product losses:
- Scrap or damaged output
- Off-spec product that is reworked
- Yield losses due to startups and changeovers
The equipment could be considered 100% reliable when you factor the losses listed above out of the equation. Is that possible? It sure is! The common gap is NOT having the data that shows where the downtime is and which losses are the most penalizing. Or the opposite: having TOO MUCH DATA, which makes it practically impossible to sort things out. The most important consideration is to concentrate on the most critical processes and the problem-prone, constraint equipment within them. Stay focused…
OK, so we’re staying focused, collecting and analyzing all the data. We still have to deal with “hidden losses,” though (i.e., losses that are silently eroding equipment utilization and increasing operating costs). Let’s start with a basic hour-by-hour equipment-operating timeline covering 422 hours:
Hour 000: Equipment startup
Hour 285: Equipment failure
Hour 299: Equipment startup
Hour 300: Operational
Hour 348: Operational
Hour 396: Operational
Hour 420: Planned maintenance (PM)
Hour 422: Operational
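As a rough illustration, the timeline above can be reduced to a handful of intervals and totaled by state. This is only a minimal sketch: the segment boundaries follow the article's own reading of the timeline (failure from hour 285 to 299, operation from 299 to 420, PM from 420 to 422), and treating hours 0 to 285 as fully operational is an assumption. The names `SEGMENTS` and `hours_by_state` are hypothetical.

```python
from collections import defaultdict

# (start_hour, end_hour, state) for the 422-hour timeline.
# Assumption: hours 0-285 are treated as operational.
SEGMENTS = [
    (0, 285, "operational"),
    (285, 299, "equipment failure"),
    (299, 420, "operational"),
    (420, 422, "planned maintenance"),
]


def hours_by_state(segments):
    """Sum the duration of each state across all segments."""
    totals = defaultdict(int)
    for start, end, state in segments:
        totals[state] += end - start
    return dict(totals)


totals = hours_by_state(SEGMENTS)
total_hours = sum(totals.values())
apparent_availability = totals["operational"] / total_hours

print(totals)
print(f"Apparent availability: {apparent_availability:.1%}")
```

On these figures the equipment looks roughly 96% available, which is exactly the kind of summary number that can hide what happened inside each interval.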
Equipment failure and repair…
The obvious equipment loss occurred between the scheduled operating hours of 285 and 299. This 14-hour downtime was attributed to an “equipment failure.” What really happened during those hours? We need to drill a bit deeper into the hidden-loss time warp.
- Call maintenance
- Fill out downtime report
- Maintenance dispatch looks for mechanic
- Mechanic closes up current work (to be completed later)
- Mechanic and planner assess the problem
- Electrician called to assess control issues (potential problem)
- Electrician tests controls, looks for faults (finds none)
- Lock, tag, try
- Secure parts and tools
- Call for “rush” part delivery from local supplier
- Clear product out of the equipment
- Repairs actually begin
- Process technician tests equipment
- Mechanic makes fine adjustments
- Instrument is re-calibrated
- Repair completed
Hours 299 to 420:
- Operational (121 hours)
Can you spot the hidden losses? What happened between hours 285 and 288? There were three hours of “stuff” being done before the repairs actually began. How much of that is preventable? Without more data and further analysis we could only guess.
Hide-and-seek: Document what actually happened. At a minimum, the maintenance work order should capture man-hours worked, who worked, parts and supplies used, description of the actual problem, likely cause and corrective action taken. The production report should capture the downtime event start- and end-time, products being run, process settings, reasons or causes for downtime, maintenance work-order number, etc. Here are some improvements to consider:
- Keep spare parts IN the plant rather than with your suppliers.
- Keep critical spare parts and special tools in a secure area AT the equipment location.
- Encourage operators and front-line supervisors to more accurately identify the problem(s).
What actually happened between hours 288 and 298? This was the actual “repair time” (wrench time), which also included “test” time by the process technician, re-calibration and fine-tuning. Again, without more data and further analysis we can only guess. Ask these questions:
- How much time was spent “waiting” for parts?
- How much time was spent waiting for the process technician?
- How much time was spent looking for information (prints, manuals)?
Hide-and-seek: What actually happened in that hour from 298 to 299? Were folks looking for the operators who went on break? Was it shift-change time? Was more paperwork being done?
During the total 14 hours of downtime, could there have been three hours of unproductive activity or wasted time? If so, the loss could have been three hours shorter. Those additional hours of operating time could be a sizable advantage for the business.
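The split of the 14-hour event can be written out as simple arithmetic. This is a minimal sketch that uses only the hour boundaries given in the discussion (285 to 288, 288 to 298, 298 to 299). The phase labels are hypothetical, and the three avoidable hours are the "what if" posed in the text, not measured data.

```python
# Phases of the 14-hour "equipment failure" event, as (start_hour, end_hour).
PHASES = {
    "before repairs begin": (285, 288),
    "repair, test, calibrate": (288, 298),
    "return to operation": (298, 299),
}

durations = {name: end - start for name, (start, end) in PHASES.items()}
total_downtime = sum(durations.values())

# Hypothetical scenario from the text: three hours were unproductive.
avoidable_hours = 3
best_case_downtime = total_downtime - avoidable_hours

for name, hours in durations.items():
    print(f"{name}: {hours} h ({hours / total_downtime:.0%} of the event)")
print(f"Recorded downtime: {total_downtime} h")
print(f"Downtime without the avoidable hours: {best_case_downtime} h")
```

Recording the start and end of each phase on the work order is what makes this kind of breakdown possible; without those timestamps the three phases collapse back into a single 14-hour "failure."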
An even bigger question: Could the cause of the downtime have been prevented altogether? Consider what might have helped: a more effective preventive maintenance program, more robust replacement parts, better-trained operators, or tighter control of variations and flaws in raw materials.
Planned maintenance losses
I tend to think of “planned maintenance” like a pit stop in racing. The priority has to be “right the first time, every time.” And the speed—or efficiency—has to be as fast as possible (as long as effectiveness or accuracy is not compromised). Look again at the foregoing operational timeline (the PM from hours 420 to 422). Suppose this is the typical two-hour PM performed by maintenance.
Hour 420: Shut down for two-hour PM
- PM paperwork
- Inspect equipment
- Lock, tag, try
- Get parts
- PM tasks on the equipment
- Remove lock and tag
- Clean up
- Wait for operator
- Check out and test the equipment
- Wait for supervisor to sign off the paperwork
Hour 422: Operational
Hide-and-seek: Are there any hidden losses in this two-hour PM? Here are a few improvements to consider:
- Inspect the equipment BEFORE the PM actually begins.
- Get the PM paperwork set BEFORE the PM begins.
- Get the parts before starting the PM (kitting helps here).
- Involve the operator WITH the PM tasks.
- Allow the operator to sign off on the PM completion.
In many plants, the BIGGEST hidden losses are lurking just under the surface in “Operational Availability.” They’re hidden so deep in the data (or absence of data) that they’re almost undetectable—but they are there all the same.
Hide-and-seek: In our operational timeline, what actually happened between operating hours 299 and 420? Was the equipment truly running, or “utilized,” for all 121 hours? That’s 15.125 eight-hour shifts (days) of uninterrupted operation. Here’s what we might find if we were to dig deeper:
Hidden Interruptions within Operational Availability:
- Operator-performed maintenance (OPM) @ 0.25 hrs/day x 15 days = 3.75 hours
- Daily startup @ 0.25 hrs/day x 15 days = 3.75 hours
- Daily breaks/meals @ 1.0 hrs/day x 15 days = 15.00 hours
- Daily safety talk @ 0.25 hrs/day x 15 days = 3.75 hours
- Waiting: kanban full and kanban empty 20 times = 1.40 hours
- QC/lab checks, 4x per day @ 0.1 hrs each x 15 days = 6.00 hours
- Benefits meeting by HR = 1.00 hour
- TPM meeting (monthly) = 2.00 hours
- TOTAL OPERATIONAL DOWNTIME = 36.65 HOURS
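The tally above can be reproduced and ranked with a few lines of code. This is a minimal sketch using only the figures in the list; the dictionary name `hidden_losses` and the shortened labels are hypothetical, and ranking the losses largest-first is one simple way to see which are the most penalizing.

```python
OPERATIONAL_HOURS = 121  # hours 299 to 420 on the timeline
DAYS = 15

# Hidden interruptions, in hours, using the figures listed above.
hidden_losses = {
    "Operator-performed maintenance": 0.25 * DAYS,
    "Daily startup": 0.25 * DAYS,
    "Daily breaks/meals": 1.0 * DAYS,
    "Daily safety talk": 0.25 * DAYS,
    "Waiting (kanban full/empty)": 1.40,
    "QC/lab checks": 4 * 0.1 * DAYS,
    "Benefits meeting": 1.00,
    "TPM meeting": 2.00,
}

total_hidden = sum(hidden_losses.values())
running_hours = OPERATIONAL_HOURS - total_hidden

# Rank the losses largest first.
ranked = sorted(hidden_losses.items(), key=lambda item: item[1], reverse=True)
for name, hours in ranked:
    print(f"{name:<32}{hours:>6.2f} h")
print(f"{'Total':<32}{total_hidden:>6.2f} h")
print(f"Share of the 121 hours: {total_hidden / OPERATIONAL_HOURS:.1%}")
print(f"Hours actually running: {running_hours:.2f}")
```

The total matches the 36.65 hours in the list, which is about 30% of the 121 "operational" hours, leaving roughly 84 hours of actual running time.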
Hide-and-seek: Look at the hidden losses above (likely unseen because nobody ever looked for them). What’s NOT measured here are the “minor stops” caused by jams, circuit-breaker trips, control trips and resets—or waiting on something, someone or some information to proceed with operations. These types of delays disrupt steady-state, reliable operation. In many cases, such interruptions are rationalized or explained away as “normal” operating situations. If this equipment is not a process constraint, why focus on these types of losses? They don’t really penalize the business. Or do they? What if some of these losses could be reduced or eliminated? The business would produce more in a shorter operating time as increasing numbers of needless hidden losses are eliminated. That is a real business advantage.
If this equipment IS a process constraint, however, these hidden losses could have a significant impact on the overall process flow. Because such losses accumulate as “just-a-little-time-here” and “a-little-time-there,” this form of downtime rarely appears as one lump. Yet if the same 36.65 hours of downtime occurred all at once, everyone would rally around eliminating the problem: “Call maintenance!” Such problems clearly are NOT maintenance problems.
As Rick Hendrick said to me years ago about his NASCAR race teams, “We win or lose together.” It’s the same for those of us working in industrial facilities: The little “operational” problems—the little “operational” losses—ARE our problem. They belong to everyone in the plant. It’s not about pointing fingers and blaming others, either, but rather about getting to the causes of the most penalizing problems and eliminating them in a cost-effective manner.
We must seek out new ways to make our equipment and processes more reliable if we are to maintain a competitive advantage or grow our businesses. In many operations, a “hidden capacity” often goes untapped. If that capacity isn’t exploited, investment in new equipment may be required (at a high cost).
The quest for hidden losses was a core principle taught in the early days of Total Productive Maintenance (TPM). No matter what we call our efforts, rooting out hidden losses must become a top priority. What was of key importance in the early days of TPM was the emphasis on TOTAL (i.e., everyone involved in the elimination of the major losses and causes of poor performance). Keep in mind, maintenance alone can’t make equipment reliable if the causes of “unreliability” are outside its control.