Implementation
The training set is a critical part of the entire method; defects at this level will directly affect the test accuracy of the shmoo results analysis. At present, the training set contains about 650 shmoo diagrams in five categories: pass/good, hole, voltage wall, frequency wall, and marginal. All samples were taken from real shmoo testing projects and scaled to an 11x11 dot matrix. Because the shmoo plots are small, the proposed method was implemented in PyTorch 1.8.0 and trained and tested on a computer with an Intel Core i7-10810U central processing unit (CPU) and no graphics processing unit (GPU). The format of the shmoo samples in the training dataset is shown in Figure 4. Each training sample has two parts: the result information and the shmoo diagram. In this experiment, the X-axis represents the voltage, while the Y-axis represents the period.
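As a concrete illustration of this format (a minimal sketch, not the authors' actual data pipeline), an 11x11 shmoo diagram can be encoded as a single-channel pass/fail tensor, with columns following the voltage axis and rows following the period axis:

import torch

# Hypothetical character form of a scaled 11x11 shmoo: 'P' = pass, '.' = fail.
# Columns track the X-axis (voltage); rows track the Y-axis (period).
shmoo_rows = [
    "..PPPPPPPPP",
    "..PPPPPPPPP",
    "...PPPPPPPP",
    "...PPPPPPPP",
    "....PPPPPPP",
    "....PPPPPPP",
    ".....PPPPPP",
    ".....PPPPPP",
    "......PPPPP",
    "......PPPPP",
    ".......PPPP",
]

def shmoo_to_tensor(rows):
    """Convert an 11x11 character shmoo into a (1, 11, 11) float tensor."""
    grid = [[1.0 if ch == "P" else 0.0 for ch in row] for row in rows]
    return torch.tensor(grid).unsqueeze(0)  # add the channel dimension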
In addition to the pass/fail indicator, four additional indicators describe the detailed shmoo results (a label-encoding sketch follows this list):
1. The Vol-Wall result is used for the voltage-wall shmoo, where the count of passing points changes dramatically along the X-axis (voltage).
2. The Freq-Wall result is used for the frequency-wall shmoo, where the count of passing points changes dramatically along the Y-axis (period).
3. The Marginal indicator flags a shmoo plot where a fail point lies near, or at, the central position.
4. The Hole result indicates a hole defect, meaning there are failing test points surrounded by passing ones.
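To make the labeling concrete, here is a minimal sketch (the label names and their order are illustrative assumptions, not the authors' field names) that encodes a sample's result information as a multi-hot target vector suitable for multi-label training:

import torch

# Assumed label order; a single shmoo can carry several indicators at once.
LABELS = ["pass", "vol_wall", "freq_wall", "marginal", "hole"]

def encode_result(indicators):
    """Map a set of indicator names to a multi-hot target vector."""
    target = torch.zeros(len(LABELS))
    for name in indicators:
        target[LABELS.index(name)] = 1.0
    return target

# Example: a failing plot showing a voltage wall that is also marginal.
y = encode_result({"vol_wall", "marginal"})  # tensor([0., 1., 0., 1., 0.])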
Results
One hundred samples were randomly chosen from the 650-sample set as training inputs, and a second set of 100 samples was used as the test set. The loss function is MultiLabelSoftMarginLoss, which is commonly used in multi-label classification applications [6]. The learning rate is set to 0.0014. The gradient optimizer is Adam, with the weight decay set to 0.0004 to reduce possible training problems [7-9]. After 100 epochs, the test accuracy (pass/fail) was approximately 0.97 (blue dots in Figure 5), while the test accuracy (multi-label) was approximately 0.89 (red dots in Figure 5).
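A minimal training sketch under these settings follows. The loss, optimizer, learning rate, weight decay, and epoch count come from the text; the model, data, and the exact multi-label accuracy metric are placeholder assumptions:

import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Placeholder stand-ins for the real model and the 100-sample training set.
model = nn.Sequential(nn.Flatten(), nn.Linear(11 * 11, 5))
xs = torch.rand(100, 1, 11, 11).round()   # random pass/fail grids
ys = (torch.rand(100, 5) > 0.7).float()   # random multi-hot labels
train_loader = DataLoader(TensorDataset(xs, ys), batch_size=20, shuffle=True)

criterion = nn.MultiLabelSoftMarginLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.0014, weight_decay=0.0004)

for epoch in range(100):
    for x, y in train_loader:             # x: (N, 1, 11, 11), y: (N, 5)
        optimizer.zero_grad()
        logits = model(x)                 # raw scores; the loss applies sigmoid
        loss = criterion(logits, y)
        loss.backward()
        optimizer.step()

# One possible multi-label accuracy: threshold at 0.5 and require all five
# indicator predictions to match (the article does not specify its metric).
with torch.no_grad():
    preds = (torch.sigmoid(model(xs)) > 0.5).float()
    accuracy = (preds == ys).all(dim=1).float().mean()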
In pursuing high accuracy and low overfitting, we found that the stability of this network structure was strongly correlated with the convolutional layers (Figure 6a). If any of the middle convolutional layers is removed from the model (Figure 6b), accuracy drops at epoch 28 under the same training conditions: the rate decreases by approximately 3%, and there are obvious accuracy fluctuations during training. This shows that "depth" is the key to ensuring this neural network model meets expectations. The red dots in Figure 6 represent the accuracy of the test over the full label set, and the blue dots represent the accuracy of the second-class test that considers only pass/fail results.
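The authors' architecture appears only in Figure 6, so the sketch below is a hypothetical stand-in that makes the depth experiment concrete: the middle_layers argument controls how many middle convolutional blocks are stacked, and 3x3 convolutions with padding preserve the 11x11 grid so blocks can be added or removed freely.

import torch.nn as nn

class ShmooCNN(nn.Module):
    """Illustrative deep variant; not the exact network of Figure 6a."""
    def __init__(self, middle_layers=3):
        super().__init__()
        blocks = [nn.Conv2d(1, 16, 3, padding=1), nn.ReLU()]
        for _ in range(middle_layers):    # the layers the depth test removes
            blocks += [nn.Conv2d(16, 16, 3, padding=1), nn.ReLU()]
        self.features = nn.Sequential(*blocks)
        self.classifier = nn.Linear(16 * 11 * 11, 5)

    def forward(self, x):                 # x: (N, 1, 11, 11)
        return self.classifier(self.features(x).flatten(1))

In this sketch, a smaller middle_layers value plays the role of the shallower variant of Figure 6b.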
CNN visualization
Because deep learning is based on the backpropagation algorithm to calculate and update the parameters of each