Virtual Metrology & Process Control

Adaptive Recipe Optimization

Bayesian optimization, DoE + ML, and multi-objective tuning for semiconductor recipes

Bayesian Optimization for Recipes

Finding the optimal recipe for a semiconductor process is a classic expensive black-box optimization problem. Each experiment (running a wafer with a specific recipe) costs hundreds of dollars and takes hours. You need to find the best settings in as few experiments as possible.

Why Bayesian Optimization?

  • Sample efficient: Finds optima in 20–50 experiments vs. 200+ for grid search.
  • Handles noise: Metrology measurements are noisy; BO naturally accounts for uncertainty.
  • Balances exploration vs. exploitation: The acquisition function decides whether to try a new region (explore) or refine a promising area (exploit).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern
from scipy.optimize import minimize
from scipy.stats import norm

class RecipeOptimizer:
    """Bayesian optimization for semiconductor recipe tuning."""

    def __init__(self, param_bounds, target_value):
        self.bounds = param_bounds  # Dict: {param: (low, high)}
        self.target = target_value  # Desired metrology value (e.g. depth in nm)
        self.gp = GaussianProcessRegressor(
            kernel=Matern(nu=2.5),
            alpha=0.1,  # Accounts for metrology measurement noise
            n_restarts_optimizer=5
        )
        self.X_observed = []
        self.y_observed = []

    def suggest_next_recipe(self):
        """Suggest the next recipe to try using Expected Improvement."""
        if len(self.X_observed) < 5:
            # Initial space-filling phase (uniform random here; a Latin
            # Hypercube design would give better coverage)
            return self._random_sample()

        X = np.array(self.X_observed)
        y = np.array(self.y_observed)

        # Fit GP surrogate model
        self.gp.fit(X, y)

        # Optimize acquisition function
        # Best (smallest) deviation from target observed so far
        best_err = np.min(np.abs(y - self.target))

        def neg_expected_improvement(x):
            x = x.reshape(1, -1)
            mu, sigma = self.gp.predict(x, return_std=True)
            error = np.abs(mu[0] - self.target)
            improvement = best_err - error
            z = improvement / (sigma[0] + 1e-8)
            ei = improvement * norm.cdf(z) + sigma[0] * norm.pdf(z)
            return -ei

        bounds_list = list(self.bounds.values())
        best_x = None
        best_ei = float('inf')
        for _ in range(20):  # Multi-start
            x0 = self._random_sample()
            result = minimize(neg_expected_improvement, x0,
                            bounds=bounds_list, method='L-BFGS-B')
            if result.fun < best_ei:
                best_ei = result.fun
                best_x = result.x

        return best_x

    def record_result(self, recipe_params, metrology_result):
        """Record a completed experiment: recipe vector and measured value."""
        self.X_observed.append(recipe_params)
        self.y_observed.append(metrology_result)

    def _random_sample(self):
        return np.array([
            np.random.uniform(lo, hi)
            for lo, hi in self.bounds.values()
        ])

# Usage:
optimizer = RecipeOptimizer(
    param_bounds={
        'etch_time': (30, 60),      # seconds
        'rf_power': (200, 400),     # watts
        'pressure': (3, 8),         # mTorr
        'gas_ratio': (0.5, 2.0),    # Cl2/BCl3
    },
    target_value=50.0  # Target etch depth in nm
)
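In practice the optimizer runs in a closed loop: suggest a recipe, run the wafer, record the metrology result, repeat. The sketch below simulates that loop end to end. Everything process-related here is invented for illustration: simulate_etch is a made-up stand-in for a real wafer run plus metrology, and the EI step is a simplified candidate-search version rather than the L-BFGS multi-start used in the class above.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern
from scipy.stats import norm

rng = np.random.default_rng(0)
bounds = np.array([[30, 60], [200, 400], [3, 8], [0.5, 2.0]])
target = 50.0  # nm

def simulate_etch(x):
    """Toy stand-in for a wafer run + metrology (coefficients invented)."""
    etch_time, rf_power, pressure, gas_ratio = x
    rate = 0.02 * rf_power / (1 + 0.1 * pressure) * (0.8 + 0.2 * gas_ratio)
    return etch_time * rate / 8 + rng.normal(0, 0.3)

gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=0.1,
                              normalize_y=True, n_restarts_optimizer=3)
X, y = [], []
for run in range(25):
    if len(X) < 5:
        # Space-filling start
        x_next = rng.uniform(bounds[:, 0], bounds[:, 1])
    else:
        gp.fit(np.array(X), np.array(y))
        # Cheap acquisition maximization: score random candidates by EI
        cand = rng.uniform(bounds[:, 0], bounds[:, 1], size=(512, 4))
        mu, sigma = gp.predict(cand, return_std=True)
        best_err = np.min(np.abs(np.array(y) - target))
        imp = best_err - np.abs(mu - target)
        z = imp / (sigma + 1e-8)
        ei = imp * norm.cdf(z) + sigma * norm.pdf(z)
        x_next = cand[np.argmax(ei)]
    X.append(x_next)
    y.append(simulate_etch(x_next))  # "run the wafer", record the result

best = int(np.argmin(np.abs(np.array(y) - target)))
print(f"Best run: depth = {y[best]:.2f} nm (target {target} nm)")
```

Scoring a batch of random candidates is a common lightweight substitute for gradient-based acquisition optimization; it trades some precision for simplicity and robustness in low dimensions.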

Key Concept: Acquisition Functions

The acquisition function is the "brain" of Bayesian optimization. Expected Improvement (EI) is the most common: it balances trying high-potential regions (where the GP mean is good) with uncertain regions (where the GP variance is high). For recipe optimization, custom acquisition functions can also incorporate process constraints (e.g., "etch selectivity must be > 10:1").
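One common way to build such a constrained acquisition, sketched below on synthetic 1-D data: fit a second GP to the constraint quantity (here, selectivity) and multiply EI by the probability that the constraint holds. All data and coefficients are invented for illustration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern
from scipy.stats import norm

rng = np.random.default_rng(1)

# Toy 1-D data: x = coded rf_power, y = depth error (minimize),
# c = selectivity (constraint: must stay above 10)
X = rng.uniform(0, 1, size=(8, 1))
y = (X[:, 0] - 0.6) ** 2 + rng.normal(0, 0.01, 8)
c = 14 - 8 * X[:, 0] + rng.normal(0, 0.2, 8)

gp_y = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-4).fit(X, y)
gp_c = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=0.04).fit(X, c)

def constrained_ei(x_grid, best_y, threshold=10.0):
    mu, sigma = gp_y.predict(x_grid, return_std=True)
    imp = best_y - mu
    z = imp / (sigma + 1e-8)
    ei = imp * norm.cdf(z) + sigma * norm.pdf(z)
    # Probability that the selectivity constraint c(x) > threshold holds
    mu_c, sigma_c = gp_c.predict(x_grid, return_std=True)
    p_feasible = 1 - norm.cdf((threshold - mu_c) / (sigma_c + 1e-8))
    return ei * p_feasible

grid = np.linspace(0, 1, 201).reshape(-1, 1)
acq = constrained_ei(grid, best_y=np.min(y))
x_next = grid[np.argmax(acq), 0]
print(f"Next point to try: x = {x_next:.3f}")
```

The product form means a candidate with excellent expected improvement but a likely constraint violation scores low, steering experiments toward recipes that are both good and feasible.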

DoE + ML Hybrid Approaches

Traditional Design of Experiments (DoE) and ML-based optimization are not competitors — they're complementary. The best recipe development workflows combine both.

The Hybrid Workflow

  1. Phase 1 — Screening DoE: Run a fractional factorial or Plackett-Burman design to identify which of the 10+ recipe parameters actually matter. This reduces the search space from 10D to 3–5D.
  2. Phase 2 — Response Surface DoE: Run a central composite or Box-Behnken design on the important factors to build an initial response surface model.
  3. Phase 3 — ML Refinement: Use the DoE data to seed a Bayesian optimization loop. The GP surrogate model starts with a good prior from the DoE, requiring fewer additional experiments to converge.
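Phase 1's screening step can be sketched as a two-level fractional factorial. This 2^(5-2) design covers five factors in eight runs; "temperature" is an invented fifth factor for illustration, and the D = AB, E = AC generators are one common (resolution III) choice, not the only one.

```python
import numpy as np
import pandas as pd
from itertools import product

def fractional_factorial_2_5_2(factor_names):
    """2^(5-2) screening design: 5 factors in 8 runs, coded units (-1/+1).
    Columns D and E are aliased onto interactions via D = AB, E = AC."""
    base = np.array(list(product([-1, 1], repeat=3)))  # full 2^3 in A, B, C
    A, B, C = base[:, 0], base[:, 1], base[:, 2]
    design = np.column_stack([A, B, C, A * B, A * C])
    return pd.DataFrame(design, columns=factor_names)

screen = fractional_factorial_2_5_2(
    ['etch_time', 'rf_power', 'pressure', 'gas_ratio', 'temperature'])
print(f"{len(screen)} runs to screen {screen.shape[1]} factors")
# Coded -1/+1 levels map to real units the same way as in the CCD code
```

Fitting a main-effects model to these eight runs identifies which factors move the response, at the cost of aliasing main effects with two-factor interactions, which is acceptable for a first screening pass.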
import numpy as np
import pandas as pd
from itertools import product

def generate_ccd(factors):
    """Generate a circumscribed Central Composite Design (CCD)."""
    n = len(factors)
    alpha = np.sqrt(n)  # Spherical star-point distance; star points lie
                        # outside the factorial box, so widen limits if needed

    # Factorial points (2^n)
    factorial = np.array(list(product([-1, 1], repeat=n)))

    # Star points (2n)
    star = np.zeros((2 * n, n))
    for i in range(n):
        star[2*i, i] = -alpha
        star[2*i+1, i] = alpha

    # Center points (typically 3-5 replicates)
    center = np.zeros((5, n))

    design_coded = np.vstack([factorial, star, center])

    # Convert to real units
    design_real = pd.DataFrame()
    for i, (name, (low, high)) in enumerate(factors.items()):
        mid = (low + high) / 2
        half_range = (high - low) / 2
        design_real[name] = mid + design_coded[:, i] * half_range

    return design_real

# Screening factors for an etch recipe
factors = {
    'etch_time': (30, 60),
    'rf_power': (200, 400),
    'pressure': (3, 8),
}

ccd = generate_ccd(factors)
print(f"CCD requires {len(ccd)} experiments for {len(factors)} factors")
print(ccd.head(10))

# After running experiments, train ML model on DoE data
# Then switch to Bayesian optimization for fine-tuning
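The Phase 3 handoff can be sketched as follows: fit the GP surrogate on the completed DoE runs so the optimization loop starts warm instead of from scratch. The response values here come from a made-up quadratic model, purely to stand in for real metrology results.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(2)

# Pretend these are completed DoE runs over (etch_time, rf_power, pressure)
lo = np.array([30, 200, 3])
hi = np.array([60, 400, 8])
X_doe = rng.uniform(lo, hi, size=(20, 3))
# Invented quadratic response with measurement noise (illustrative only)
y_doe = (0.9 * X_doe[:, 0] + 0.05 * X_doe[:, 1]
         - 0.002 * (X_doe[:, 1] - 300) ** 2 + rng.normal(0, 0.5, 20))

# Scale inputs to [0, 1] so one GP length scale covers all factors,
# then seed the surrogate with the DoE data
X_scaled = (X_doe - lo) / (hi - lo)
gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=0.25,
                              normalize_y=True).fit(X_scaled, y_doe)

# The warm surrogate already predicts sensibly at an untried point
mu, sigma = gp.predict([[0.5, 0.5, 0.5]], return_std=True)
print(f"Prediction at center point: {mu[0]:.1f} ± {sigma[0]:.1f}")
```

From here, the Bayesian optimization loop proceeds as in the previous section, except its first suggestion is already informed by twenty structured experiments.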

Analogy: House Hunting

DoE is like surveying all neighborhoods in a city (broad coverage, structured). ML-based optimization is like deep-diving into the most promising neighborhoods (targeted, adaptive). The best strategy: survey first, then deep-dive where it matters.

Multi-Objective Recipe Optimization

Real recipe optimization is almost never single-objective. You're simultaneously trying to:

  • Hit target etch depth (minimize deviation from 50 nm)
  • Maximize etch selectivity (etch target material, not mask)
  • Minimize surface roughness
  • Minimize within-wafer non-uniformity
  • Maximize throughput (minimize etch time)

These objectives often conflict. The solution is a Pareto front: the set of recipes where no objective can be improved without worsening another.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

class MultiObjectiveRecipeOptimizer:
    """Multi-objective recipe optimization: one GP surrogate per objective,
    with Pareto-front tracking. (A full EHVI acquisition step would sit on
    top of these surrogates to choose the next recipe.)"""

    def __init__(self, param_bounds, objective_names, ref_point):
        self.bounds = param_bounds
        self.objectives = objective_names
        self.ref_point = ref_point  # Worst acceptable value per objective
        self.gps = {obj: GaussianProcessRegressor() for obj in objective_names}
        self.X_observed = []
        self.Y_observed = {obj: [] for obj in objective_names}

    def is_pareto_optimal(self, costs):
        """Find Pareto-optimal points (assuming minimization)."""
        is_efficient = np.ones(costs.shape[0], dtype=bool)
        for i, c in enumerate(costs):
            if is_efficient[i]:
                # Keep only points that beat c in at least one objective
                # (or tie it exactly); everything c dominates is dropped
                is_efficient[is_efficient] = np.any(
                    costs[is_efficient] < c, axis=1
                ) | np.all(costs[is_efficient] == c, axis=1)
                is_efficient[i] = True
        return is_efficient

    def get_pareto_front(self):
        """Return current Pareto-optimal recipes."""
        if len(self.X_observed) == 0:
            return [], []

        Y_matrix = np.column_stack([
            self.Y_observed[obj] for obj in self.objectives
        ])
        pareto_mask = self.is_pareto_optimal(Y_matrix)

        pareto_X = np.array(self.X_observed)[pareto_mask]
        pareto_Y = Y_matrix[pareto_mask]
        return pareto_X, pareto_Y

# Example: etch recipe with competing objectives
optimizer = MultiObjectiveRecipeOptimizer(
    param_bounds={'etch_time': (30, 60), 'rf_power': (200, 400)},
    objective_names=['depth_error', 'roughness', 'nonuniformity'],
    ref_point=[5.0, 2.0, 10.0]  # Worst acceptable values
)

# The engineer picks from the Pareto front based on which
# trade-off is most acceptable for their specific application
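To make "pick from the Pareto front" concrete, here is a standalone sketch: a brute-force Pareto filter plus the 2-D hypervolume that EHVI-style methods try to grow. The cost values are invented two-objective examples (lower is better for both).

```python
import numpy as np

def pareto_mask(costs):
    """Boolean mask of non-dominated rows (minimization), O(n^2)."""
    n = len(costs)
    mask = np.ones(n, dtype=bool)
    for i in range(n):
        for j in range(n):
            # Row j dominates row i: <= everywhere, < somewhere
            if i != j and np.all(costs[j] <= costs[i]) \
                      and np.any(costs[j] < costs[i]):
                mask[i] = False
                break
    return mask

def hypervolume_2d(front, ref):
    """Area dominated by a 2-D front w.r.t. a reference point
    (minimization); larger is better."""
    pts = front[np.argsort(front[:, 0])]  # sort by first objective
    hv, prev_y = 0.0, ref[1]
    for x, y in pts:
        hv += (ref[0] - x) * (prev_y - y)  # slab between successive points
        prev_y = y
    return hv

# Invented (depth_error, roughness) costs for five candidate recipes
costs = np.array([[0.5, 1.8], [1.0, 1.2], [2.0, 0.8],
                  [3.0, 1.5], [0.8, 1.9]])
mask = pareto_mask(costs)
front = costs[mask]
hv = hypervolume_2d(front, ref=np.array([5.0, 2.0]))
print(f"Pareto-optimal: {mask}, hypervolume = {hv:.2f}")
# → Pareto-optimal: [ True  True  True False False], hypervolume = 4.50
```

The hypervolume collapses the whole front into one scalar, which is what makes EHVI possible: each candidate recipe is scored by how much it is expected to expand this dominated area.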

Did You Know?

Leading equipment companies like Applied Materials and Lam Research now embed ML-based recipe optimization directly into their equipment software. Instead of process engineers manually running DoEs over weeks, the equipment autonomously explores the recipe space and converges on optimal settings — reducing recipe development time from weeks to days.

Knowledge Check

Why is Bayesian optimization well-suited for semiconductor recipe tuning?