MEC&F Expert Engineers : ROOT CAUSE ANALYSIS ON THE CAUSE(S) AND CONTRIBUTING FACTORS OF EQUIPMENT AND MACHINERY FAILURE

Thursday, October 16, 2014

CAUSE(S) AND CONTRIBUTING FACTORS OF EQUIPMENT AND MACHINERY FAILURE
https://sites.google.com/site/metropolitanforensics/cause-s-and-contributing-factors-of-equipment-and-machinery-failure
Most standard commercial property insurance policies contain the following basic exclusions:
·         Explosion of steam boilers, steam engines, steam turbines, or vessels under steam pressure;
·         Artificially generated electric currents (arcing or short circuiting) in motors, generators, circuit breakers, electrical distribution boards, cables, and transformers;
·         Mechanical breakdown; and
·         Centrifugal force
Any loss (such as physical damage to the equipment, business interruption, spoilage, etc.) resulting from these causes of failure might not be covered by a property insurance policy.  As a result, most industrial, utility, commercial, institutional, processing and light manufacturing risks carry so-called Equipment Breakdown Insurance.  Many insureds carry insurance for boilers and air conditioning but neglect to insure other equipment risks such as transformers, generators, pumps, compressors, and so on.  Furthermore, many insureds have equipment insurance but fail to obtain insurance for business interruption, spoilage or extra expense.

Example of arcing damage to equipment not covered by standard property damage policies

A typical commercial property insurance policy will provide property damage coverage by including the peril of accidental breakdown that is:
1.      Sudden;
2.      Accidental;
3.      Manifests itself in physical damage to the equipment and necessitates repair or replacement of the equipment or part thereof


Water damage caused by worn water supply pipes

Important restrictions to the above definition of the covered peril are:
·         Wear and tear;
·         Cracking of certain parts of gas turbines;
·         Leakage at valves, seals or fittings;
·         Corrosion of the equipment components;
·         Depletion, deterioration or erosion of the equipment;
·         Failure of a safety device, such as a pressure or vacuum relief valve;
·         Breakdown of certain electronic components;
·         Combustion explosions;
·         Faulty or improper material, workmanship or design;
·         Pollution or contamination;
·         Gradual deterioration, latent defect or inherent vice

The above list of exclusions makes clear that Equipment Breakdown Insurance is essential for all properties that use equipment for heating, cooling, processing, etc.  Equipment Breakdown Insurance is a form of property damage insurance.  Its purpose is to insure against financial losses, such as property damage, business interruption, extra expense and spoilage (consequential damage), that result from defined accidents to specified kinds of mechanical, electrical and pressure equipment.
The situations we have investigated often include accidental arcing, followed by fire and damage to the equipment and the building.  In such situations, the Equipment Breakdown insurance will pay for the damage to the electric cable and the electric switch, while the property insurance will pay for the damage caused by the fire.
In general terms, the Equipment Breakdown Endorsement adds three (3) perils (mechanical, electrical and pressure equipment breakdown) to the property coverage or causes of loss forms by amending some coverage exclusions, such as the exclusions for mechanical, electrical and pressure systems breakdown loss.  As an example, the property coverage insurance typically provides protection against explosion of hot water boilers; the Equipment Breakdown Endorsement then adds coverage for explosion of steam boilers and breakdown of other types of pressure equipment that may be found in various occupancies.
The cause of damage investigation
During an investigation of a loss caused by equipment or machinery failure, the most important questions that always must be answered by an investigating expert are:
·         What is the cause (or causes) of the failure (loss), and how did it happen?
·         If there are multiple causes of failure, is there a direct (or proximate) cause, and what is the sequence of the failure events?
·         Are there any contributing factors that led to the failure (loss)?
Equipment failure claims can be very complicated because of the magnitude of the loss, the age of equipment, the lack of maintenance or repair records, any prior damages or electrical or mechanical failures, any product recalls or defects and so on.  These types of losses also present significant subrogation potential, i.e., trying to recover the loss from a responsible third party or parties.


Failed boiler tube

The Investigation of Equipment Failures – Proximate Cause
According to the International Risk Management Institute, Inc. (IRMI), in insurance terms, proximate cause is the cause having the most significant impact in bringing about the loss under a first-party property insurance policy, when two or more independent perils operate at the same time (i.e., concurrently) to produce a loss.  Courts employ a set of proximate cause rules to resolve causation disputes when a property policy states that it covers or excludes losses "caused by" a peril and there is more than one peril at work in a fact pattern.  Under common law, whether the policy provides coverage depends on which peril is chosen as the proximate cause. If the peril selected as the proximate cause is covered, courts consider the loss to have been caused by the covered peril and will hold that the loss is covered.  If the peril selected as the proximate cause is uncovered or excluded, courts consider the loss to have been caused by the uncovered or excluded peril and will hold that the loss is not covered.
As a principle of tort law, proximate cause refers to a doctrine by which a plaintiff must prove that the defendant's actions set in motion a relatively short chain of events that could have reasonably been anticipated to lead to the plaintiff's damages.  If the defendant's actions were "proximate" or close enough in the chain of causation to have foreseeably led to the plaintiff's damages, courts will impose liability.  Otherwise, if the defendant's actions set in motion a long, bizarre chain of events that could not have reasonably been foreseen to lead to the plaintiff's damages, courts will not impose liability. In tort law, multiple actions by one or more defendants that are a substantial factor in producing the loss can qualify as proximate causes.
In the engineering profession, proximate cause(s), also known as direct cause(s), are the event(s) that occurred, including any condition(s) that existed immediately before the undesired outcome, that directly resulted in its occurrence and that, if eliminated or modified, would have prevented the undesired outcome.  They can be equipment based (e.g., a defective seal) or human based (e.g., negligent or incorrect maintenance of equipment).  Also in engineering terms, a root cause is one of multiple factors (events, conditions or organizational factors) that contributed to or created the proximate cause and subsequent undesired outcome and that, if eliminated or modified, would have prevented the undesired outcome.  Typically, multiple root causes contribute to an undesired outcome.
The challenge in many failure investigations is to sort through an over-abundance of meaningless data and an absence of essential data to arrive at the most likely root cause(s) of a known failure from a seemingly endless list of possible failure mechanisms.  This is where a fault tree comes in handy.  We develop a fault tree at the outset of each failure investigation.  The fault tree starts with the most basic observation (e.g., a steel beam is bent), and then each of the potential causes for the observed failure is listed below the top level (e.g., the beam was bent when it arrived at the site, the beam was bent during erection, the applied forces are too large, etc.).  Then the potential causes for each of those causes are added to the next level in the tree.  Each box in the tree may suggest another underlying root cause(s) or an analysis, measurement, or record check that can confirm or eliminate that item as a potential root cause.  A thorough and systematic approach is the key to a successful failure investigation.
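The fault tree described above can be sketched in a few lines of Python. This is only a minimal illustration, not the tool we use in practice: the class name, field names, the checks attached to each box, and the "eliminated" finding are all hypothetical, and the bent-beam example is the one from the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FaultNode:
    """One box in the fault tree: an observation or a potential cause."""
    description: str
    check: str = ""                    # analysis, measurement, or record check (hypothetical field)
    eliminated: Optional[bool] = None  # None means the check has not been done yet
    causes: List["FaultNode"] = field(default_factory=list)

    def add(self, description: str, check: str = "") -> "FaultNode":
        """Attach a potential cause one level below this box and return it."""
        child = FaultNode(description, check)
        self.causes.append(child)
        return child


def open_candidates(node: FaultNode) -> List[str]:
    """Return the deepest boxes on each branch that have not been eliminated."""
    if node.eliminated:
        return []
    remaining = [c for c in node.causes if not c.eliminated]
    if not remaining:
        return [node.description]
    found: List[str] = []
    for child in remaining:
        found.extend(open_candidates(child))
    return found


# Top-level observation and first-level potential causes from the text.
top = FaultNode("Steel beam is bent")
arrived = top.add("Beam was bent when it arrived at the site",
                  check="Review delivery inspection records")
erection = top.add("Beam was bent during erection",
                   check="Interview the erection crew")
overload = top.add("Applied forces are too large",
                   check="Compare calculated loads with beam capacity")

# Hypothetical finding: delivery records show the beam arrived straight.
arrived.eliminated = True

print(open_candidates(top))
```

Each record check or measurement either eliminates a box or leaves it open, so the list of open candidates shrinks as the investigation proceeds.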


Pitting corrosion in a boiler tube

In analyzing an equipment failure case, it is important to gather data and evaluate the maintenance history of the equipment.  Lack of or improper maintenance is a leading cause of failure of equipment.  For example, lack of maintenance of hoses carrying hydraulic fluid can be problematic, as these rubber hoses become brittle and prone to cracking over time, creating the potential for the failed hose to discharge pressurized, flammable hydraulic fluid onto hot parts of the equipment’s motor or exhaust system, resulting in a fire.  Thus, one of the first steps in evaluating the recovery potential of an equipment claim is to determine whether the equipment was properly maintained.  A diligent insured will keep log books recording each instance of maintenance to the equipment.  The insurer should request such maintenance records from the insured at the outset of such a claim, as those records can substantially inform the subsequent investigation.
During our root cause analysis investigations, we utilize a variety of tools to assist us in the determination of the cause of failure.  Some of these tools include:
·         Spreadsheets and/or flow charts, illustrating the equipment in the process.  The charts could also include the associated individuals connected to each equipment component;
·         Fault Tree Analysis – a deductive reasoning method (from general to specific information) for determining the causes of a loss and the different mechanisms or contributing factors to the failure;
·         Multi-linear Events Sequencing – this tool identifies main actor(s), their duties and responsibilities as they relate to the loss or failed equipment;
·         Interviews;
·         Checklists;
·         Records and document reviews



Corroded pipe

Equipment Type and Example Failure Modes
The most common failure modes of the equipment we have inspected are provided below.
Boilers and fired pressure vessels
·         Local Corrosion leading to metal wall thinning;
·         Through-wall corrosion leading to wall thinning;
·         Excessive distortion;
·         Stress/Fatigue;
·         Stress Corrosion Cracking (SCC);
·         Fireside corrosion;
·         Corrosion erosion;
·         Excessive leakage;
·         Small crack;
·         Leaking through-wall crack;
·         Safety/relief Valve Failure;
·         Rupture/bursting/cracking due to overpressure;
·         Implosion;
·         Low water level;
·         Low Water Cutoff Failure;
·         Overheating




High pH gouging on boiler pipes

Unfired vessels (hot water tanks, air tanks, cookers, process vessels)
·         Rupture/bulging/cracking due to overpressure;
·         Local Corrosion leading to metal wall thinning;
·         Through-wall corrosion leading to wall thinning;
·         Safety/relief Valve Failure;
·         Vacuum collapse;
Refrigerating and air conditioning, vessels and piping
·         Rupture/cracking due to vibration;
·         Corrosion;
·         Support failure
Piping (steam, air, etc.)
·         Rupture/cracking due to vibration and/or stress corrosion cracking;
·         Corrosion;
·         Support failure
Electrical motors, generators and other rotating electrical equipment, switchboards, cables, bus ducts, circuit breakers
·         Electrical motor burnout due to power surge;
·         Burned bearings due to line surge;
·         Arcing;
·         Single Phasing;
·         Loose or corroded connections;
·         Excessive moisture or dirt accumulation;
·         Brittle insulation and breakdown;
·         Stress/Fatigue;
·         Ventilation problems

Examples of arc fault damage

Centrifugal compressors, pumps, fans, blowers
·         Electrical burnout;
·         Burned bearings due to misalignment;
·         Bearing Failure;
·         Piston Failure;
·         Impact;
·         Molten Material;
·         Lack or loss of lubrication;
·         Overspeed;
·         Operator Error;
·         Cracking;
·         Stress/Fatigue



Piston failure

Reciprocating compressors, pumps, internal combustion engines
·         Cylinder, shaft or rod damage, or valve breakage, due to liquid slugging;
·         Contaminated oil, seizing
·         Electrical burnout;
·         Burned bearings due to misalignment;
·         Bearing Failure;
·         Piston Failure;
·         Impact;
·         Molten Material;
·         Lack or loss of lubrication;
·         Overspeed;
·         Operator Error;
·         Cracking;
·         Stress/Fatigue

Turbines
·         Blading/shaft/jacket/frame damage due to shroud ring failure;
·         Imbalance;
·         Stress Corrosion Cracking
·         Overspeed



Severe misalignment can cause macropitting on helical pinion gears

Gears, gear sets
·         Broken teeth;
·         Burned bearings due to vibration;
·         Pitting; Spalling
·         Scoring
·         Misalignment;
·         Abrasive wear; Corrosive Wear;
·         Normal Wear
·         Metal fatigue;
·         Contaminated oil;
·         Overload


 
Miscellaneous machines (e.g., paper machines, hydraulic presses, extruders, production machines)
·         Breaking of moving parts/frame damage due to metal fatigue;
·         Thinning of parts under pressure
Transformers
·         Electrical burnout/winding failure due to line surge;
·         Excessive moisture and/or dirt;
·         Overload

Prevention of equipment breakdown focuses on inspection (much of which is required by law) and user training.  Based on our forensic investigations we have found the following:
·         Operator error is the biggest contributor to accidents;
·         Faulty or missing maintenance is a major contributor to equipment failures;
·         Faulty design, manufacturing or installation, and even improper equipment or improper repair, are other major causes of failure
Often, preventive maintenance and better corrosion protection significantly reduce the risk of failure.  When it comes to equipment, a pound of prevention is most certainly worth its weight in gold.


Subrogation Recovery Theories Involving Equipment Failure
Equipment failures are subject to subrogation recovery.  Subrogation may be pursued against the manufacturer(s) of the equipment for defective design, manufacturing defects or failure to provide proper instructions; the parts manufacturers for the same reasons; the seller of the equipment; the installation contractor for improper installation; and the service or maintenance contractor if a service contract was used to maintain the equipment.  Much electrical equipment is damaged by electrical problems, such as improper electrical installation, incorrect voltage applied to the motor, or power surges during storms or transformer blowouts.
The professionals at Metropolitan are regularly retained to investigate and analyze the failure of all types of mechanical equipment, from heavy industrial machinery to small and medium-sized commercial appliances, as well as commercial and residential systems.  Mechanical failures often involve damaged property, injury, loss of life, corporate down-time, and other collateral damage.  Once retained, our highly qualified engineers immediately assess the faulty or failed equipment in order to isolate the cause and determine why it is not working, or not working to the standard set by the manufacturer.  Many factors contributing to mechanical failures have been presented earlier, such as improper maintenance and assembly, excessive vibration, wear and tear, operator error, or flaws in the material, design, or electronics of the system or associated systems.  If the failure has rendered the equipment a complete loss, we can make process recommendations and assess subrogation potential.
It is critical that the insurance adjuster instruct the insured not to begin repairs until the failed equipment has been inspected, examined and documented to determine the mode of failure and cause of the loss.  A proper investigation into the cause of the loss is very important.  It should be conducted as soon as possible to make certain that all relevant evidence and information is identified, collected and preserved.



Imploded tank due to failure of the relief valves

Examples of Forensic Investigations:  Boiler Tube Failure
At times, the cause of a failure cannot be readily determined, making it difficult to determine the appropriate corrective action.  A detailed examination of the failure and associated operating data is usually helpful in identifying the mechanism of failure so that corrective action may be taken.

Proper investigative procedures are needed for accurate metallurgical analyses of boiler tubes.  Depending on the specific case, macroscopic examination combined with chemical analysis and microscopic analysis of the metal may be needed to assess the primary failure mechanism(s).  When a failed tube section is removed from a boiler, care must be taken to prevent contamination of deposits and damage to the failed zones.  Also, the tube should be properly labeled with its location and orientation.

The first step in the lab investigation is a thorough visual examination.  Both the fireside and the waterside surfaces should be inspected for failure or indications of imminent failure. Photographic documentation of the as-received condition of tubing can be used in the correlation and interpretation of data obtained during the investigation. Particular attention should be paid to color and texture of deposits, fracture surface location and morphology, and metal surface contour. A stereo microscope allows detailed examination under low-power magnification.

Corrosion of the flange bolts.

Dimensional analysis of a failed tube is important.  Calipers and point micrometers are valuable tools that allow quantitative assessment of failure characteristics such as bulging, wall thinning at a rupture lip, and corrosion damage. The extent of ductile expansion and/or oxide formation can provide clues toward determining the primary failure mechanism. External wall thinning from fireside erosion or corrosion mechanisms can result in tube ruptures which often mimic the appearance of overheating damage. In those cases, dimensional analysis of adjacent areas can help to determine whether or not significant external wall thinning occurred prior to failure.  A photograph of a tube cross section taken immediately adjacent to a failure site can assist in dimensional analysis and provide clear-cut documentation.
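As a rough illustration of the arithmetic behind these measurements, the sketch below expresses wall thinning and bulging as percentages of the nominal tube dimensions. The function names and the sample readings are hypothetical; the nominal wall thickness and outside diameter are assumed to be known from the tube specification.

```python
def wall_loss_percent(nominal_wall: float, measured_wall: float) -> float:
    """Wall thinning as a percentage of the nominal wall thickness."""
    if nominal_wall <= 0:
        raise ValueError("nominal wall thickness must be positive")
    return 100.0 * (nominal_wall - measured_wall) / nominal_wall


def bulge_percent(nominal_od: float, measured_od: float) -> float:
    """Diametral expansion (bulging) as a percentage of the nominal outside diameter."""
    if nominal_od <= 0:
        raise ValueError("nominal outside diameter must be positive")
    return 100.0 * (measured_od - nominal_od) / nominal_od


# Hypothetical micrometer and caliper readings, in inches.
nominal_wall, nominal_od = 0.200, 2.500
adjacent_wall = 0.150   # measured next to the failure site
rupture_od = 2.650      # measured across the bulge at the rupture

print(f"Wall loss adjacent to failure: {wall_loss_percent(nominal_wall, adjacent_wall):.1f}%")
print(f"Bulging at rupture: {bulge_percent(nominal_od, rupture_od):.1f}%")
```

A large wall loss in the area adjacent to the rupture would point toward external thinning before failure rather than overheating alone, which is the distinction the dimensional analysis is meant to draw.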








The extent, orientation, and frequency of tube surface cracking can be helpful in pinpointing a failure mechanism. While overheating damage typically causes longitudinal cracks, fatigue damage commonly results in cracks that run transverse to the tube axis. In particular, zones adjacent to welded supports should be examined closely for cracks. Nondestructive testing (e.g., magnetic particle or dye penetrant inspection) may be necessary to identify and assess the extent of cracking.

When proper water chemistry guidelines are maintained, the waterside surfaces of boiler tubes are coated with a thin protective layer of black magnetite.  Excessive waterside deposition can lead to higher-than-design metal temperatures and eventual tube failure. Quantitative analysis of the internal tube surface commonly involves determination of the deposit-weight density (DWD) value and deposit thickness.  Interpretation of these values can define the role of internal deposits in a failure mechanism. DWD values are also used to determine whether or not chemical cleaning of boiler tubing is required. In addition, the tube surface may be thoroughly cleaned by means of glass bead blasting during DWD testing. This facilitates accurate assessment of waterside or fireside corrosion damage (e.g., pitting, gouging) that may be hidden by deposits.
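The DWD calculation itself is simple. The sketch below assumes the common weight-loss approach, in which a tube section is weighed before and after the deposit is removed from a known surface area; the function names and the sample weights are hypothetical, and no cleaning threshold is implied.

```python
CM2_PER_FT2 = 929.0304  # square centimeters in one square foot


def deposit_weight_density(mass_before_g: float, mass_after_g: float,
                           cleaned_area_cm2: float) -> float:
    """Return the deposit-weight density in mg/cm^2.

    mass_before_g: tube section weight with the deposit in place
    mass_after_g: weight of the same section after the deposit is removed
    cleaned_area_cm2: internal surface area from which the deposit was removed
    """
    if cleaned_area_cm2 <= 0:
        raise ValueError("cleaned area must be positive")
    if mass_after_g > mass_before_g:
        raise ValueError("section cannot weigh more after cleaning")
    return (mass_before_g - mass_after_g) * 1000.0 / cleaned_area_cm2


def mg_per_cm2_to_g_per_ft2(dwd_mg_cm2: float) -> float:
    """Convert a DWD value from mg/cm^2 to g/ft^2."""
    return dwd_mg_cm2 * CM2_PER_FT2 / 1000.0


# Hypothetical tube section: 250.00 g before cleaning, 248.50 g after, 50 cm^2 cleaned.
dwd = deposit_weight_density(250.00, 248.50, 50.0)
print(f"DWD = {dwd:.1f} mg/cm^2 = {mg_per_cm2_to_g_per_ft2(dwd):.1f} g/ft^2")
```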






The presence of unusual deposition patterns on a waterside surface can be an indication that non-optimal circulation patterns exist in a boiler tube. For example, longitudinal tracking of deposits in a horizontal roof tube may indicate steam blanketing conditions.  Steam blanketing, which results when conditions permit stratified flow of steam and water in a given tube, can lead to accelerated corrosion damage (e.g., wall thinning and/or gouging) and tube failure.

When excessive internal deposits are present in a tube, accurate chemical analyses can be used to determine the source of the problem and the steps necessary for correction. Whenever possible, it is advisable to collect a "bulk" composition, by scraping and crimping the tube and collecting a cross section of the deposit for chemical analysis. Typically, a loss-on-ignition (LOI) value is also determined for the waterside deposit. The LOI value, which represents the weight loss obtained after the deposit is heated in a furnace, can be used to diagnose contamination of the waterside deposit by organic material.
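The LOI value described above is a simple weight-loss percentage. A minimal sketch follows; the function name and sample weights are hypothetical, and the deposit sample is assumed to have been dried before the initial weighing.

```python
def loss_on_ignition_percent(mass_before_g: float, mass_after_g: float) -> float:
    """Weight lost on heating in a furnace, as a percentage of the starting sample weight."""
    if mass_before_g <= 0:
        raise ValueError("starting sample weight must be positive")
    if mass_after_g > mass_before_g:
        raise ValueError("sample cannot weigh more after ignition")
    return 100.0 * (mass_before_g - mass_after_g) / mass_before_g


# Hypothetical deposit sample: 2.000 g before the furnace, 1.700 g after.
print(f"LOI = {loss_on_ignition_percent(2.000, 1.700):.1f}%")
```

A higher LOI means more of the deposit burned off, which is why the value can help diagnose organic contamination of the waterside deposit.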






In many cases, chemical analysis of a deposit from a specific area is desired.  Scanning electron microscope-energy dispersive spectroscopy (SEM-EDS) is a versatile technique that allows inorganic chemical analysis on a microscopic scale.  

For example, SEM-EDS can be useful in the following determinations:

·         differences in deposit composition between corroded and non-corroded areas on a tube surface

·         the extent to which under-deposit concentration of boiler salts on heat transfer surfaces is promoting corrosion damage

·         elemental differences between visually different tube surface deposits

Inorganic analyses through SEM-EDS can also be performed on ground and polished cross sections of a tube covered with thick layers of waterside deposit. This testing is called elemental mapping and is particularly valuable when the deposits are multilayered. Similar to the examination of rings on a tree, cross-sectional analysis of boiler deposits can identify periods when there have been upsets in water chemistry, and thereby provides data to help determine exactly how and when deposits formed. With elemental mapping, the spatial distribution of elements in a deposit cross section is represented by color-coded dot maps. Separate elements of interest can be represented by individual maps, or selected combinations of elements can be represented on composite maps.

A scanning electron microscope (SEM) can also be utilized to analyze the topography of surface deposits and/or morphology of fracture surfaces. Fractography is particularly helpful in classifying a failure mode. For example, microscopic features of a fracture surface can reveal whether the steel failed in a brittle or ductile manner, whether cracks propagated through grains or along grain boundaries, and whether or not fatigue (cyclic stress) was the primary cause of failure. In addition, SEM-EDS testing can be used to identify the involvement of a specific ion or compound in a failure mechanism, through a combination of fracture surface analysis and chemical analysis.






Most water-bearing tubes used in boiler construction are fabricated from low-carbon steel. However, steam-bearing (superheater and reheater) tubes are commonly fabricated from low-alloy steel containing differing levels of chromium and molybdenum. Chromium and molybdenum increase the oxidation and creep resistance of the steel. For accurate assessment of metal overheating, it is important to have a portion of the tube analyzed for alloy chemistry. Alloy analysis can also confirm that the tubing is within specifications. In isolated instances, initial installation of the wrong alloy type or tube repairs using the wrong grade of steel can occur. In these cases, chemical analysis of the steel can be used to determine the cause of premature failure.

At times, it is necessary to estimate the mechanical properties of boiler components. Most often, this involves hardness measurement, which can be used to estimate the tensile strength of the steel. This is particularly useful in documenting the deterioration of mechanical properties that occurs during metal overheating. Usually, a Rockwell hardness tester is used; however, it is sometimes advantageous to use a microhardness tester. For example, microhardness measurements can be used to obtain a hardness profile across a welded zone to assess the potential for brittle cracking in the heat-affected zone of a weld.
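The hardness-to-strength estimate can be sketched as follows. This is a rough illustration, not a laboratory procedure. It assumes a Brinell hardness number (a Rockwell reading would first have to be converted with a standard hardness conversion table, which is not implemented here) and uses the common rule of thumb for carbon and low-alloy steels that tensile strength in MPa is roughly 3.45 times the Brinell number. The function names and sample readings are hypothetical.

```python
MPA_PER_BRINELL = 3.45  # rule-of-thumb factor for carbon and low-alloy steels (approximate)


def approx_tensile_strength_mpa(brinell_hardness: float) -> float:
    """Estimate ultimate tensile strength (MPa) from a Brinell hardness number."""
    if brinell_hardness <= 0:
        raise ValueError("hardness must be positive")
    return MPA_PER_BRINELL * brinell_hardness


def strength_loss_percent(reference_hb: float, degraded_hb: float) -> float:
    """Estimated loss in tensile strength relative to a reference location on the tube."""
    reference = approx_tensile_strength_mpa(reference_hb)
    degraded = approx_tensile_strength_mpa(degraded_hb)
    return 100.0 * (reference - degraded) / reference


# Hypothetical readings: unheated side of the tube versus the overheated region.
unheated_hb, overheated_hb = 150.0, 120.0
print(f"Unheated side: about {approx_tensile_strength_mpa(unheated_hb):.0f} MPa")
print(f"Overheated region: about {approx_tensile_strength_mpa(overheated_hb):.0f} MPa")
print(f"Estimated strength loss: {strength_loss_percent(unheated_hb, overheated_hb):.0f}%")
```

Comparing a degraded region against a reference location on the same tube is one way to document the deterioration; the absolute strength figures are estimates only.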





Microstructural analysis of a metal component is probably the most important tool in conducting a failure analysis investigation. This testing, called metallography, is useful in determining the following:

·         whether a tube failed from short-term or long-term overheating damage

·         whether cracks initiated on a waterside or fireside surface

·         whether cracks were caused by creep damage, corrosion fatigue, or stress-corrosion cracking (SCC)

·         whether tube failure resulted from hydrogen damage or internal corrosion gouging

Proper sample orientation and preparation are critical aspects of microstructural analysis. The orientation of the sectioning is determined by the specific failure characteristics of the case. After careful selection, metal specimens are cut with a power hacksaw or an abrasive cut-off wheel and mounted in a mold with resin or plastic. After mounting, the samples are subjected to a series of grinding and polishing steps. The goal is to obtain a flat, scratch-free surface of metal in the zone of interest. After processing, a suitable etchant is applied to the polished metal surface to reveal microstructural constituents (grain boundaries, distribution and morphology of iron carbides, etc.).

Metallographic analysis of the mounted, polished, and etched sections of metal is performed with a reflective optical microscope (Figure 14-14). This is followed by a comparison of microstructures observed in various areas of a tube section, for example, the heated side versus the unheated side of a waterwall tube. Because the microstructure on the unheated side often reflects the as-manufactured condition of the steel, comparison with the microstructure in a failed region can provide valuable insight into the degree and extent of localized deterioration.







Metropolitan Engineering, Consulting & Forensics (MECF)
Providing Competent, Expert and Objective Investigative Engineering and Consulting Services
P.O. Box 520
Tenafly, NJ 07670-0520
Tel.: (973) 897-8162
Fax: (973) 810-0440
E-mail: metroforensics@gmail.com
Web pages: https://sites.google.com/site/metropolitanforensics/
https://sites.google.com/site/metropolitanenvironmental/
https://sites.google.com/site/metroforensics3/