I’ve written before about my fascination with the CSI (Crime Scene Investigation) television franchise. Recognized as some of the most-watched TV dramas in the world in 2011, the series model their plots on the classic “whodunit” format popularized in Sherlock Holmes stories.
Week after week, teams of forensic specialists from Las Vegas, Miami and New York investigate human failure at its worst, relying, in the process, on a seductive mix of cutting-edge technology and good old common sense to find clues and answers to what frequently are complex questions.
Fans of the CSI shows who work in the maintenance arena no doubt recognize parallels to the asset reliability field: In our world, the maintainer is the FSI (or Failure Scene Investigator) armed with a preventive, predictive and diagnostic/forensic toolkit to examine and determine a failure’s root-cause culprit and put in place a strategy to avert similar failure occurrences in the future.
Each CSI episode showcases innovative approaches to troubleshooting through the use of diagnostic technology and attention to crime-scene details. Although the characters may take license with technological capabilities from time to time, they do successfully portray the real-life investigator’s mission to solve every crime and make it difficult for perpetrators to escape—and also deter future criminal activity.
The majority of maintenance departments I’ve worked with have excelled in a reactive approach to maintaining assets, spending most of their time reacting to failure, choosing to repair and replace and do whatever is needed to “keep the machines running at all costs.” Sound familiar?
Assuring maximum asset availability calls for a cold, calculated, proactive maintenance approach. That, in turn, demands an innovative view of failure. By changing our own mandate to one that resembles a CSI approach (i.e., “Recognize and solve every asset failure and, through understanding it, develop a strategy to deter that failure from causing any future unplanned loss of service”), we automatically thrust ourselves into a proactive role. This type of approach harmonizes and validates the use of preventive, predictive and diagnostic strategies alongside our repair capabilities, all driven wholly by the asset’s needs!
Any time a piece of equipment or component fails, it leaves behind an evidence trail that can be documented and followed to determine the root failure cause and, in the process, fuel development of a suitable failure prevention/prediction/management strategy. Even though CSI junkies know we must “protect the crime [failure] scene at all costs,” in our haste to “keep the equipment running at all costs,” we often destroy the failure scene—and either contaminate or throw out the evidence. Moreover, few maintenance departments photograph failure scenes and bag and tag damaged or failed parts for post-repair forensic analyses.
By nature of the job, every maintainer is an FSI, responsible for equipment reliability through better understanding of equipment failure. If we are to significantly reduce repair time and effort (the maintainability side of the equation) while increasing availability and reliability levels, we must work toward the systematic investigation of every equipment failure.
Incorporating a law-enforcement, CSI-type of approach, the following eight steps lay out an innovative procedure manual for better understanding and dealing with equipment failure.
- Secure the scene. Work with operations to perform a qualitative evaluation of a failure scene before commencing repairs and/or restarting the equipment.
- Photograph the scene. “A picture is worth a thousand words” could not be truer when it comes to a failure investigation. Photos allow the failure scene to be revisited well after the equipment is back up and running, and act as excellent training materials for preventing future failures. (Place a 6” rule against photographed items to help assess scale.)
- Perform on-scene forensics. The maintenance/reliability group can conduct many diagnostics at the failure scene (including infrared signatures, oil analysis signatures, etc.).
- Bag and tag all physical failure evidence. Once all local physical evidence of tampering and breakage has been photographed, tagged and bagged, the actual failed components can be dismantled and replaced. Any parts for repair must be photographed. Any parts requiring replacement must also be bagged and tagged. Use bubble wrap, heavy-duty freezer bags and heavy-duty cling wrap (for larger items) to protect components and evidence.
- Interview witnesses. Operators can describe any abnormal sound, smell or vibration emanating from the equipment prior to failure.
- Code the failure on the work order. Complete the work order with a report of the findings, making sure to include failure symptom codes on it.
- Perform necessary laboratory forensic analysis. Examine all past failure records and diagnostic readings, and send components out to a recognized lab for any necessary destructive testing, metallurgical analysis and/or oil analysis.
- Analyze findings. Write up an FMEA (failure modes and effects analysis) report with recommendations for preventing or predicting the failure, and update the PM program accordingly.
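For departments that track investigations electronically, the eight steps above could be captured as a simple checklist record. What follows is only a minimal sketch in Python, not a feature of any particular CMMS: the names `Step`, `FailureInvestigation`, `record`, `outstanding` and `ready_to_close`, along with the sample work order number, asset tag and symptom code, are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Step(Enum):
    """The eight FSI steps, in the order the article lists them."""
    SECURE_SCENE = 1
    PHOTOGRAPH_SCENE = 2
    ON_SCENE_FORENSICS = 3
    BAG_AND_TAG = 4
    INTERVIEW_WITNESSES = 5
    CODE_WORK_ORDER = 6
    LAB_ANALYSIS = 7
    ANALYZE_FINDINGS = 8


@dataclass
class FailureInvestigation:
    """Hypothetical record of one failure investigation (illustration only)."""
    work_order: str
    asset: str
    symptom_codes: List[str] = field(default_factory=list)
    evidence: Dict[Step, List[str]] = field(default_factory=dict)

    def record(self, step: Step, note: str) -> None:
        """Attach a note (photo reference, reading, statement) to a step."""
        self.evidence.setdefault(step, []).append(note)

    def outstanding(self) -> List[Step]:
        """Return the steps that have no recorded evidence yet, in order."""
        return [step for step in Step if step not in self.evidence]

    def ready_to_close(self) -> bool:
        """True once every step has evidence and a symptom code is logged."""
        return not self.outstanding() and bool(self.symptom_codes)


# Example usage with made-up values.
investigation = FailureInvestigation(work_order="WO-0001", asset="Pump P-101")
investigation.record(Step.SECURE_SCENE, "Scene reviewed with operations before repair")
investigation.record(Step.PHOTOGRAPH_SCENE, "Photos taken with 6-inch rule for scale")
investigation.symptom_codes.append("VIB-HIGH")  # hypothetical symptom code

print("Steps still outstanding:")
for step in investigation.outstanding():
    print(" -", step.name)
print("Ready to close:", investigation.ready_to_close())
```

The design choice here is deliberate: the work order cannot be treated as closed until every step has at least one piece of recorded evidence and a failure symptom code, which mirrors the discipline the list asks for.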
Adopting the FSI methodology listed above requires a disciplined “planned and scheduled” approach to be in place—and that’s a very important step toward maintenance excellence. MT
Ken Bannister is author of Lubrication for Industry and the Lubrication section of the 28th edition Machinery’s Handbook. He’s also a Contributing Editor for Lubrication Management & Technology. Email: firstname.lastname@example.org.