Chip Scale Review, May/June 2023, p. 46

Implementation
The training set is a critical part of the entire method: defects at this stage directly affect the test accuracy of the shmoo results analysis. The current training set contains about 650 shmoo diagrams covering five categories: pass/good, hole, voltage wall, frequency wall and marginal. All samples are taken from real shmoo testing projects and scaled to an 11x11 dot matrix. Because the shmoo plot size is small, the proposed method was implemented in PyTorch 1.8.0 and was trained and tested on a computer with an Intel Core i7-10810U central processing unit (CPU) and no graphics processing unit (GPU). The format of the shmoo samples in the training dataset is shown in Figure 4. Each training sample has two parts: the result information and the shmoo diagram. In this experiment, the X-axis represents the voltage and the Y-axis represents the period.
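As an illustration of the data format, each scaled 11x11 diagram can be represented as a binary pass/fail grid before being fed to the network. The sketch below is an assumption, not the authors' code: the `*`/`.` symbols and the `parse_shmoo` helper are hypothetical, since the article does not specify the raw file format.

```python
# Hypothetical parser: turn 11 text rows of a shmoo plot into a 2-D grid
# of 1s (pass) and 0s (fail). The '*'/'.' encoding is an assumption; the
# article only states that plots are scaled to an 11x11 dot matrix.
def parse_shmoo(rows):
    grid = [[1 if ch == '*' else 0 for ch in row] for row in rows]
    assert len(grid) == 11 and all(len(r) == 11 for r in grid)
    return grid

# Example: an all-pass plot with a single "hole" failure near the center.
rows = ["***********"] * 5 + ["*****.*****"] + ["***********"] * 5
grid = parse_shmoo(rows)

# For PyTorch, the grid would be wrapped as a (1, 1, 11, 11) float tensor:
# x = torch.tensor(grid, dtype=torch.float32).unsqueeze(0).unsqueeze(0)
```

Because the grid is this small, a full image pipeline is unnecessary; the plot can go straight into a convolutional network as a single-channel tensor.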
In addition to the pass/fail indicator, four additional indicators describe the detailed shmoo result:

 1. Vol-Wall: used for a voltage-wall shmoo, where the count of pass points changes dramatically along the X-axis (voltage).
 2. Freq-Wall: used for a frequency-wall shmoo, where the count of pass points changes dramatically along the Y-axis (period).
 3. Marginal: indicates a shmoo plot where a fail point is near, or appears at, the central position.
 4. Hole: indicates a hole defect, i.e., failing test points surrounded by passing ones.
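These categories map naturally onto a multi-label target vector, which is the form a multi-label loss expects. The label ordering and the simple column-count heuristic for a voltage wall below are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical multi-label encoding for the five shmoo categories
# (the ordering is an assumption; the article does not specify one).
LABELS = ["pass", "vol_wall", "freq_wall", "marginal", "hole"]

def encode_labels(tags):
    """Build a 0/1 target vector from a set of category names."""
    return [1.0 if name in tags else 0.0 for name in LABELS]

def looks_like_voltage_wall(grid, jump=5):
    """Toy heuristic (assumption): a voltage wall shows up as a dramatic
    change in pass count between adjacent columns (X-axis = voltage)."""
    col_pass = [sum(row[c] for row in grid) for c in range(len(grid[0]))]
    return any(abs(a - b) >= jump for a, b in zip(col_pass, col_pass[1:]))

encode_labels({"hole"})  # -> [0.0, 0.0, 0.0, 0.0, 1.0]
```

The CNN learns these indicators from the diagrams directly; the heuristic is shown only to make the wall definition concrete.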

Results
  One hundred samples were chosen randomly from the 650-sample training set as training inputs, and a second set of 100 samples was used as the test set. The loss function is MultiLabelSoftMarginLoss, which is commonly used in multi-label classification applications [6]. The learning rate is set to 0.0014. The gradient optimizer is Adam, and the weight decay is set to 0.0004 to reduce possible training problems [7-9]. After 100 epochs, the test accuracy (pass/fail) was approximately 0.97 (blue dots in Figure 5), while the test accuracy (multi-label) was approximately 0.89 (red dots in Figure 5).
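For reference, MultiLabelSoftMarginLoss applies a sigmoid to each logit and averages the binary cross-entropy over the labels. The function below re-derives PyTorch's documented formula in plain Python for illustration; it is not the library code.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def multilabel_soft_margin(logits, targets):
    """Per-sample loss matching the documented formula of
    torch.nn.MultiLabelSoftMarginLoss: the mean over the C labels of
    -[y * log(sigmoid(x)) + (1 - y) * log(1 - sigmoid(x))]."""
    terms = [y * math.log(sigmoid(x)) + (1.0 - y) * math.log(1.0 - sigmoid(x))
             for x, y in zip(logits, targets)]
    return -sum(terms) / len(terms)

# Uninformative logits (all zeros) give log(2) per label, about 0.693,
# regardless of the targets, since sigmoid(0) = 0.5:
multilabel_soft_margin([0.0] * 5, [1.0, 0.0, 0.0, 0.0, 1.0])
```

In the actual training loop, the library loss would be instantiated as `torch.nn.MultiLabelSoftMarginLoss()` and paired with `torch.optim.Adam` using the stated learning rate (0.0014) and weight decay (0.0004).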
  In pursuit of high accuracy and low overfitting, we found that the stability of this network structure is strongly correlated with the convolutional layers (Figure 6a). If any of the middle convolutional layers is removed from the model (Figure 6b), accuracy drops at epoch 28 under the same training conditions: the rate decreases by approximately 3%, and there are obvious accuracy fluctuations during training. This shows that "depth" is the key to ensuring this neural network model meets expectations. In Figure 6, the red dots represent the accuracy of the full multi-label test, and the blue dots represent the accuracy of the two-class test that considers only pass/fail results.
CNN visualization
  Because deep learning relies on the back-propagation algorithm to calculate and update the parameters of each

