Classification & Root Cause
Multi-class classification, spatial signatures, and equipment fingerprinting
ML for Root Cause Analysis
Beyond detecting defects, ML helps identify what caused them:
- Equipment fingerprinting: Each process chamber leaves subtle "signatures" on wafers. ML models can identify which specific chamber processed a wafer based on defect patterns or metrology signatures — essential for isolating problematic equipment.
- Correlation analysis: Linking defect occurrences to upstream process parameters. Random Forest feature importance or SHAP values reveal which equipment parameters most strongly predict defects.
- Temporal analysis: Tracking defect rate trends after PM events, recipe changes, or chemical lot changes to identify root causes.
- Spatial signature matching: Comparing wafer-level defect patterns against a library of known signatures. Each root cause (reticle defect, chuck contamination, edge ring wear) produces a characteristic spatial pattern.
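Spatial signature matching, the last item above, can be sketched with plain NumPy. This is a minimal illustration rather than a production method. It assumes each wafer map has already been binned into a fixed grid of defect counts. The library entries (`edge_ring`, `center_spot`) and the function name `match_signature` are hypothetical. The wafer is scored against each library signature with cosine similarity.

```python
import numpy as np

def match_signature(wafer_map, library):
    """Return (best_name, scores) using cosine similarity on flattened maps."""
    v = np.asarray(wafer_map, dtype=float).ravel()
    v_norm = np.linalg.norm(v)
    scores = {}
    for name, sig in library.items():
        s = np.asarray(sig, dtype=float).ravel()
        denom = v_norm * np.linalg.norm(s)
        # Guard against empty maps (all zeros) to avoid division by zero
        scores[name] = float(v @ s / denom) if denom > 0 else 0.0
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical 5x5 signature library (made-up patterns, for illustration only)
edge_ring = np.zeros((5, 5))
edge_ring[0, :] = edge_ring[-1, :] = 1
edge_ring[:, 0] = edge_ring[:, -1] = 1
center_spot = np.zeros((5, 5))
center_spot[2, 2] = 1
library = {"edge_ring": edge_ring, "center_spot": center_spot}

# Synthetic wafer: heavy edge defects plus one stray center defect
wafer = edge_ring * 3
wafer[2, 2] = 1
best, scores = match_signature(wafer, library)
print(best, scores)
```

Cosine similarity ignores overall defect density, so a lightly and a heavily affected wafer with the same pattern score alike. Real wafer maps would also need alignment (notch orientation, die grid) before comparison.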
Analogy: Forensic Investigation
Defect root cause analysis is like crime scene investigation. Each piece of evidence (defect location, type, timing, equipment history) narrows the suspect list. ML automates the detective work, correlating thousands of variables to find the culprit faster than any human could.
Commonality Analysis and SHAP-Based Diagnosis
The first question an FA engineer asks when yield drops is "what do all the bad wafers have in common?" — and the second is "which sensor moved?" Two ML-friendly techniques cover both.
1. Commonality analysis
For each value of each candidate factor (chamber, recipe, photoresist lot, operator shift), compare the defect rate among wafers that touched that value against the rate among wafers that didn't. Fisher's exact test or a χ² test then ranks the candidates by how strongly each is associated with defects:
import pandas as pd
from scipy.stats import fisher_exact

def commonality_rank(df: pd.DataFrame, defect_col="is_killer",
                     factor_cols=("etch_chamber", "litho_chamber", "resist_lot", "shift")):
    """Rank factor values by association strength with defect occurrence (Fisher p-value)."""
    rows = []
    total_bad = df[defect_col].sum()
    total = len(df)
    for f in factor_cols:
        for value, group in df.groupby(f):
            # 2x2 table: rows = touched this value / did not, columns = bad / good
            a = group[defect_col].sum()       # bad wafers that touched this value
            b = len(group) - a                # good wafers that touched this value
            c = total_bad - a                 # bad wafers that did not
            d = (total - len(group)) - c      # good wafers that did not
            # One-sided test: is the defect rate higher for this value?
            _, p = fisher_exact([[a, b], [c, d]], alternative="greater")
            rows.append({"factor": f, "value": value, "p_value": p,
                         "bad": int(a), "total": len(group)})
    # p-values are not corrected for multiple comparisons; use them as a ranking
    return pd.DataFrame(rows).sort_values("p_value").head(20)
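The text mentions χ² as an alternative to Fisher's exact test. Below is a minimal sketch of that variant for a single factor, using `scipy.stats.chi2_contingency` on a value-by-outcome table. The helper name `chi2_factor_test` and the toy data are assumptions made for illustration. The χ² test is an approximation that works best when expected cell counts are not small (a common rule of thumb is at least 5), so Fisher's test remains the safer choice for sparse groups.

```python
import pandas as pd
from scipy.stats import chi2_contingency

def chi2_factor_test(df, factor, defect_col="is_killer"):
    """Chi-square test of independence between one factor and the defect flag."""
    # Rows = factor values, columns = defect flag (0 / 1)
    table = pd.crosstab(df[factor], df[defect_col]).to_numpy()
    chi2, p, dof, expected = chi2_contingency(table)
    return {"factor": factor, "chi2": float(chi2), "p_value": float(p),
            "dof": int(dof), "min_expected": float(expected.min())}

# Toy data (made up): chamber B has a much higher defect rate than chamber A
df = pd.DataFrame({
    "etch_chamber": ["A"] * 100 + ["B"] * 100,
    "is_killer": [1] * 5 + [0] * 95 + [1] * 30 + [0] * 70,
})
result = chi2_factor_test(df, "etch_chamber")
print(result)
```

Unlike the per-value Fisher ranking, this tests the factor as a whole: a small p-value says the chambers differ, not which one is the outlier. The `min_expected` field is returned so the small-count rule of thumb can be checked before trusting the result.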
2. SHAP for parametric root cause
When the suspect is a sensor (not a categorical factor), train a gradient-boosted classifier that predicts pass/fail from upstream FDC summary stats and inspect SHAP values:
import numpy as np
import shap
from xgboost import XGBClassifier

# X_train / X_test: DataFrames of upstream FDC summary stats; y_train: pass/fail labels
# (all assumed to be prepared upstream)
model = XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.05)
model.fit(X_train, y_train)

# Per-wafer attribution: which features push prediction toward "fail"?
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# Top global drivers across the lot (mean |SHAP| per feature, largest first)
mean_abs_shap = np.abs(shap_values).mean(axis=0)
top10 = np.argsort(mean_abs_shap)[-10:]
for i in reversed(top10):
    print(f"{X_test.columns[i]:30s} |SHAP|={mean_abs_shap[i]:.4f}")
Layering the two
Commonality analysis narrows the search to a chamber or a lot; SHAP on that subset pinpoints which sensor signature is doing the damage. Used together, they can shorten a root-cause investigation from days to hours.
Key Concept: Correlation ≠ Cause
SHAP says which features the model relies on, not what's physically causal. Always validate a candidate root cause with a designed change (recipe split, chamber swap, or fresh PM) before declaring victory — otherwise you'll spend a quarter chasing a confounder.
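One way to check a designed change is sketched below, under stated assumptions: after a chamber swap or recipe split, compare killer-defect counts between the control arm and the changed arm. The function name `split_lot_check`, the wafer counts, and the 0.05 threshold are all hypothetical. A real split would also need randomized wafer assignment and enough wafers to detect the expected effect.

```python
from scipy.stats import fisher_exact

def split_lot_check(bad_control, n_control, bad_changed, n_changed, alpha=0.05):
    """One-sided Fisher test: is the defect rate higher in the control arm?"""
    table = [[bad_control, n_control - bad_control],
             [bad_changed, n_changed - bad_changed]]
    # alternative="greater": odds of a defect are higher in the control arm,
    # i.e. the change reduced defects
    _, p = fisher_exact(table, alternative="greater")
    return {"control_rate": bad_control / n_control,
            "changed_rate": bad_changed / n_changed,
            "p_value": float(p),
            "supported": bool(p < alpha)}

# Made-up split: 50 wafers per arm
result = split_lot_check(bad_control=12, n_control=50, bad_changed=2, n_changed=50)
print(result)
```

If `supported` comes back false, the candidate root cause is not confirmed by this split. That can mean either a confounder or too few wafers, so the sample size is worth checking before discarding the hypothesis.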
Knowledge Check
Question 1 of 2: What is equipment fingerprinting in defect analysis?