Yield Prediction & Optimization

Yield Prediction Models

Regression, ensemble methods, spatial models, and handling limited data

Modeling Approaches for Yield Prediction

Different modeling strategies serve different yield prediction needs:

  • Wafer-level regression: Predict overall wafer yield from process parameters using XGBoost, Random Forest, or neural networks. Best for identifying process factors that drive yield variation.
  • Die-level prediction: Predict pass/fail or bin for individual dies. This gives many more data points but requires die-level features (design, spatial position, nearby metrology). Typical models are logistic regression and gradient boosting.
  • Spatial models: Account for across-wafer variation patterns. Gaussian Process models capture spatial correlations. CNN-based models treat the wafer map as an image.
  • Virtual WAT: Predict wafer acceptance test (WAT) electrical parameters from inline metrology and fault detection and classification (FDC) data, which is faster than waiting for actual WAT measurements.
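
As an illustration of the spatial-model idea above, here is a minimal sketch of Gaussian Process interpolation across a wafer, written with NumPy only. The data is synthetic (a measured value that rises toward the wafer edge), and the kernel choice, length scale, noise level, and function names are assumptions made for this example rather than recommended settings.

```python
import numpy as np

def rbf_kernel(a, b, length_scale, variance=1.0):
    """Squared-exponential kernel between two sets of (x, y) site positions in mm."""
    # Pairwise squared Euclidean distances, shape (len(a), len(b)).
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)

def gp_posterior_mean(train_xy, train_y, test_xy, length_scale=30.0, noise=0.01):
    """Posterior mean of a GP with a constant mean equal to the training average."""
    offset = train_y.mean()
    k_train = rbf_kernel(train_xy, train_xy, length_scale)
    k_train += noise * np.eye(len(train_xy))  # observation noise on the diagonal
    k_cross = rbf_kernel(test_xy, train_xy, length_scale)
    weights = np.linalg.solve(k_train, train_y - offset)
    return offset + k_cross @ weights

# Synthetic wafer (assumption): 60 measurement sites spread uniformly over a
# 100 mm radius disc, with a value that grows quadratically toward the edge.
rng = np.random.default_rng(0)
n_sites = 60
radius = 100.0 * np.sqrt(rng.uniform(size=n_sites))
angle = rng.uniform(0.0, 2.0 * np.pi, size=n_sites)
train_xy = np.column_stack([radius * np.cos(angle), radius * np.sin(angle)])
train_y = (radius / 100.0) ** 2 + rng.normal(scale=0.05, size=n_sites)

# Predict at the wafer center and at a site near the edge.
test_xy = np.array([[0.0, 0.0], [95.0, 0.0]])
center_pred, edge_pred = gp_posterior_mean(train_xy, train_y, test_xy)
print(f"center: {center_pred:.2f}, edge: {edge_pred:.2f}")
```

The length scale controls how far spatial correlation extends: nearby sites pull a prediction strongly, distant sites barely at all. In practice the kernel hyperparameters would be fitted to the data rather than fixed by hand.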

Key Concept: Feature Importance for Yield

The most valuable output of yield models is often not the prediction itself but the feature importance ranking. Knowing which process parameters most strongly influence yield directs engineering attention to the highest-impact improvements. SHAP values go a step further and attribute each individual wafer's prediction to specific features, which gives interpretable per-wafer explanations.
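
A feature importance ranking can be sketched without any specialized library. The example below uses permutation importance (not SHAP) on a plain least-squares model: shuffle one feature column at a time and measure how much the prediction error grows. The feature names and the synthetic yield relationship are hypothetical and chosen only to make the ranking easy to check.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns [intercept, coefficients...]."""
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return coef[0] + X @ coef[1:]

def permutation_importance(coef, X, y, rng, n_repeats=20):
    """Mean increase in MSE when a single feature column is shuffled."""
    base_mse = np.mean((predict_linear(coef, X) - y) ** 2)
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            mse = np.mean((predict_linear(coef, X_perm) - y) ** 2)
            increases.append(mse - base_mse)
        importances[j] = np.mean(increases)
    return importances

# Hypothetical process parameters, standardized to zero mean and unit variance.
feature_names = ["etch_time", "chamber_pressure", "anneal_temp", "cmp_rate"]
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 4))
# Synthetic yield (%): driven mostly by anneal_temp, weakly by etch_time.
y = 90 + 3.0 * X[:, 2] - 1.0 * X[:, 0] + rng.normal(scale=0.5, size=300)

coef = fit_linear(X, y)
importances = permutation_importance(coef, X, y, rng)
ranking = [feature_names[j] for j in np.argsort(importances)[::-1]]
print(ranking)
```

For brevity the importance is computed on the training data; evaluating it on held-out wafers gives a more honest picture. Permutation importance yields one global ranking, whereas SHAP additionally breaks down each wafer's prediction.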

Handling Small Datasets

Semiconductor yield data has unique challenges:

  • Small n, large p: Hundreds of process parameters (features) but often only hundreds or thousands of wafers with yield data. Regularization is essential.
  • Non-stationary: Process conditions change over time (maintenance, recipe updates, material lot changes), so historical data may not represent current conditions.
  • Censored data: Wafers scrapped mid-process never get final yield data, so the training set contains only wafers that survived to final test.

Strategies for small datasets:

  • Strong regularization (Lasso, Ridge, ElasticNet) to prevent overfitting
  • Bayesian methods that incorporate prior knowledge
  • Transfer learning from similar products or process nodes
  • Physics-informed features that encode domain knowledge
  • Cross-validation with time-aware splits, so that no information from future wafers leaks into training
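
Two of the strategies above, regularization and time-aware cross-validation, can be combined in a short sketch. The example assumes the rows are already sorted by processing time and uses a closed-form ridge regression in NumPy. The data is synthetic, and the helper names, fold count, and alpha grid are illustrative assumptions.

```python
import numpy as np

def expanding_window_splits(n_samples, n_folds):
    """Yield (train_idx, test_idx) pairs in which every test block follows its training data."""
    fold_size = n_samples // (n_folds + 1)
    for k in range(1, n_folds + 1):
        train_end = k * fold_size
        test_end = train_end + fold_size if k < n_folds else n_samples
        yield np.arange(train_end), np.arange(train_end, test_end)

def ridge_fit(X, y, alpha):
    """Closed-form ridge regression on centered data; the intercept is not penalized."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    coef = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return coef, y_mean - x_mean @ coef

# Synthetic small-n, large-p data: 120 wafers, 200 parameters, rows in time order.
rng = np.random.default_rng(1)
n_wafers, n_params = 120, 200
X = rng.normal(size=(n_wafers, n_params))
true_coef = np.zeros(n_params)
true_coef[:5] = [2.0, -1.5, 1.0, 0.8, -0.5]  # only five parameters matter
y = 85 + X @ true_coef + rng.normal(scale=1.0, size=n_wafers)

cv_rmse = {}
for alpha in (0.1, 10.0, 100.0):
    errors = []
    for train_idx, test_idx in expanding_window_splits(n_wafers, n_folds=4):
        coef, intercept = ridge_fit(X[train_idx], y[train_idx], alpha)
        pred = X[test_idx] @ coef + intercept
        errors.append(np.sqrt(np.mean((pred - y[test_idx]) ** 2)))
    cv_rmse[alpha] = float(np.mean(errors))
    print(f"alpha={alpha:>6}: mean RMSE = {cv_rmse[alpha]:.2f}")
```

Because every training window ends before its test block begins, the error estimate reflects how the model would behave on wafers processed after it was trained. A shuffled k-fold split would mix future wafers into training and tend to give an optimistic estimate when the process drifts.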

Knowledge Check

What is often the most valuable output of yield prediction models?