# T2R2 東京科学大学 リサーチリポジトリ Science Tokyo Research Repository

## Article / Book Information

| Title             | On-die parameter extraction from path-delay measurements |
|-------------------|-----------------------------------------------------------|
| Author            | Tomoyuki Takahashi, Takumi Uezono, Michihiro Shintani, Kazuya Masu, Takashi Sato |
| Journal/Book name | 2009 IEEE Asian Solid-State Circuits Conference, pp. 101 - 104 |
| Issue date        | 2009, 11 |
| Copyright         | (c)2009 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. |

### On-die parameter extraction from path-delay measurements

Tomoyuki Takahashi, Takumi Uezono, Michihiro Shintani\*, Kazuya Masu, and Takashi Sato<sup>†</sup>

Integrated Research Institute, Tokyo Institute of Technology

\*Semiconductor Technology Academic Research Center

<sup>†</sup>Department of Communications and Computer Engineering, Kyoto University

Abstract—Device-parameter estimation through path-delay measurement, which facilitates fast on-die performance prediction and diagnosis, is proposed. With the proposed technique, delays of a set of paths consisting of different logic cells are monitored. Based on the pre-characterized parameter to delay sensitivity, the process variation of a chip is estimated as an inverse problem. Discussion of desirable logic cell combination to form paths that maximize estimation accuracy is presented. Measurement of ring oscillator arrays composed of standard and customized logic cells resulted in consistent estimation of threshold voltages. Measurement accuracy is greatly enhanced by the proposed good logic cell combinations.

#### I. INTRODUCTION

With the progress of technology scaling, process variation is beginning to significantly affect product performance metrics such as maximum operating frequency and power consumption. Variation in the number of dopant atoms in the device-channel region and in the thickness of the gate oxide can no longer be ignored in designs using recent technologies. These variations result in threshold-voltage mismatch between transistors on a chip and in delay variation of logic gates.

Several methods have been proposed to improve yield and reliability. Post-fabrication tuning of the substrate bias has been proposed to compensate for threshold-voltage variation [1]. The adaptive test technique, which alters test paths chip by chip, has been proposed to evaluate path delays accurately [2], [3]. These techniques assume that accurate device parameters of each chip are readily available. Traditionally, device parameters are obtained through I-V curve measurement [4], which provides sufficiently reliable information for device modeling. However, it requires analog voltage and current measurements, which can only be made with special equipment outside the chip. For simple and fast characterization of the AC characteristics of digital CMOS circuits, ring oscillators are often used [5]. Ring oscillators can be easily embedded in a product chip to diagnose chip performance at wafer-test time or even after product shipment. A change in the measured oscillation frequency indicates process variation or a temperature change. However, the frequency of a ring oscillator tells us only how fast or slow the chip can operate. It has been impossible to determine the exact parameter variations from a ring-oscillator measurement alone.

In this paper, we propose a method for estimating the device parameters that have a dominant influence on delay fluctuation, such as threshold voltages or device-channel lengths. Path-delay measurements of different logic cells, combined with their pre-characterized delay sensitivities to individual process parameters, enable reverse calculation of process variations. Parameter extraction becomes possible by using logic cells whose delays have different sensitivities to the parameter variations. We first describe the parameter estimation framework and then discuss design guidelines, including the choice of logic-cell combination. We then propose a special cell that is suitable for parameter extraction. Test-chip measurement results for a 65-nm process are also presented.

#### II. PARAMETER ESTIMATION THROUGH PATH-DELAY MEASUREMENTS

#### A. Key concept

Path delay  $t_d$  is expressed using the following first order canonical form [6].

$$t_{d} = td_{typ} + \sum_{i=1}^{n} s_{i}\Delta p_{i} + N(0, \sigma_{rnd}^{2}) \tag{1}$$

Here, $td_{typ}$ is the nominal delay, $\Delta p_i$ is the global variation of the *i*-th parameter $p_i$, $s_i$ is the delay sensitivity to that parameter, and $N(0, \sigma_{rnd}^2)$ is the random component. This form is a common output of statistical static timing analysis (SSTA) tools.

The aim of our work is to estimate $\Delta p_i$ from measured path delays. By measuring two or more paths that have different sensitivities to the device parameters, we can estimate the parameters as follows.

$$\Delta P = S^{-1} \cdot \Delta T_D \tag{2}$$

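Here, $\Delta P$ is the vector of parameter variations, $S$ is the sensitivity matrix, and $\Delta T_D$ is the vector of deviations of the measured path delays from their typical values. In this equation, the random component in Eq. (1) is ignored. This is justified when either a path is composed of a large number of logic cells or a large number of paths are used for averaging. As a realistic example, when the nMOS and pMOS threshold voltages are the device parameters of interest, Eq. (2) becomes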

$$\begin{bmatrix} \Delta v_{tn} \\ \Delta v_{tp} \end{bmatrix} = \begin{bmatrix} S_{vtn1} & S_{vtp1} \\ S_{vtn2} & S_{vtp2} \end{bmatrix}^{-1} \begin{bmatrix} td_1 - td_{typ1} \\ td_2 - td_{typ2} \end{bmatrix}. \tag{3}$$

Here, $S_{vtn1}$ and $S_{vtp1}$ are the sensitivities of the first path to the nMOS and pMOS threshold voltages, respectively; $S_{vtn2}$ and $S_{vtp2}$ are defined similarly for the second path. As long as the paths are chosen so that the sensitivity matrix $S$ is non-singular, we can estimate $\Delta v_{tn}$ and $\Delta v_{tp}$ by measuring two path delays. The sensitivity matrix can be obtained by circuit simulation or from an SSTA report. The average delay of the path over all chips can be used as the typical delay $td_{typ}$.
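The inversion in Eq. (3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `estimate_vth_shifts`, the sensitivity values (taken to be in ps/mV), and the delays are hypothetical and are not taken from the measurements in this paper.

```python
def estimate_vth_shifts(S, td, td_typ):
    """Solve Eq. (3) for (dvtn, dvtp) with a 2x2 sensitivity matrix S.

    S      : [[S_vtn1, S_vtp1], [S_vtn2, S_vtp2]]  (hypothetical units: ps/mV)
    td     : measured delays of path 1 and path 2   (ps)
    td_typ : typical delays of path 1 and path 2    (ps)
    """
    (a, b), (c, d) = S
    det = a * d - b * c
    # Reject a (nearly) singular sensitivity matrix: the two paths would
    # carry the same information and the parameters could not be separated.
    if abs(det) <= 1e-12 * (abs(a * d) + abs(b * c)):
        raise ValueError("sensitivity matrix is (nearly) singular")
    r1 = td[0] - td_typ[0]
    r2 = td[1] - td_typ[1]
    # Explicit 2x2 inverse (Cramer's rule).
    dvtn = (d * r1 - b * r2) / det
    dvtp = (-c * r1 + a * r2) / det
    return dvtn, dvtp


# Hypothetical example: path 1 is mostly nMOS-sensitive, path 2 mostly pMOS-sensitive.
S = [[2.0, 0.5],
     [0.4, 1.8]]
td_typ = [1000.0, 1200.0]
td = [1017.5, 1195.0]  # synthetic delays generated with dvtn = +10 mV, dvtp = -5 mV
dvtn, dvtp = estimate_vth_shifts(S, td, td_typ)
print(dvtn, dvtp)
```

With these synthetic inputs the function recovers the shifts used to generate the delays, which is the "inverse problem" view of Eq. (2).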

This idea can be extended to a quadratic delay model to improve accuracy over a wider variation range.

$$td = td_{typ} + S'_{vtn} \cdot \Delta v_{tn}^2 + S_{vtn} \cdot \Delta v_{tn} + S'_{vtp} \cdot \Delta v_{tp}^2 + S_{vtp} \cdot \Delta v_{tp} + S_{vtnp} \cdot \Delta v_{tn} \cdot \Delta v_{tp} \tag{4}$$

The quadratic model improves estimation accuracy significantly. The resulting simultaneous equations do not have a simple closed-form solution, so solving them for the threshold voltages requires numerical calculation. In a similar manner, three parameters can be estimated, for example by adding the channel-length variation $\Delta L$.
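The paper does not state which numerical method is used to solve the two quadratic equations. As one possible approach, the sketch below applies a plain Newton iteration to two instances of Eq. (4), starting from the nominal point. The function names and all coefficient values are hypothetical (assumed units: ps/mV² for quadratic and cross terms, ps/mV for linear terms).

```python
def quad_delay_shift(c, dvtn, dvtp):
    """Delay shift td - td_typ from Eq. (4).

    c = (S'_vtn, S_vtn, S'_vtp, S_vtp, S_vtnp) for one path.
    """
    qn, sn, qp, sp, snp = c
    return qn * dvtn**2 + sn * dvtn + qp * dvtp**2 + sp * dvtp + snp * dvtn * dvtp


def solve_quadratic_model(c1, c2, dtd1, dtd2, tol=1e-9, max_iter=50):
    """Newton iteration for (dvtn, dvtp) given two measured delay shifts."""
    x, y = 0.0, 0.0  # start at the nominal point (no variation)
    for _ in range(max_iter):
        f1 = quad_delay_shift(c1, x, y) - dtd1
        f2 = quad_delay_shift(c2, x, y) - dtd2
        if max(abs(f1), abs(f2)) < tol:
            return x, y
        # Analytic Jacobian of Eq. (4) with respect to (dvtn, dvtp).
        j11 = 2 * c1[0] * x + c1[1] + c1[4] * y
        j12 = 2 * c1[2] * y + c1[3] + c1[4] * x
        j21 = 2 * c2[0] * x + c2[1] + c2[4] * y
        j22 = 2 * c2[2] * y + c2[3] + c2[4] * x
        det = j11 * j22 - j12 * j21
        if det == 0.0:
            raise ZeroDivisionError("singular Jacobian")
        # Newton update: (x, y) <- (x, y) - J^{-1} f
        x -= (j22 * f1 - j12 * f2) / det
        y -= (-j21 * f1 + j11 * f2) / det
    raise RuntimeError("Newton iteration did not converge")


# Hypothetical coefficients for two paths with different sensitivities.
c1 = (0.010, 2.0, 0.002, 0.5, 0.003)
c2 = (0.003, 0.4, 0.012, 1.8, 0.002)
# Synthetic "measurements" generated from dvtn = 20 mV, dvtp = -30 mV.
dtd1 = quad_delay_shift(c1, 20.0, -30.0)
dtd2 = quad_delay_shift(c2, 20.0, -30.0)
print(solve_quadratic_model(c1, c2, dtd1, dtd2))
```

Because a quadratic system can have more than one root, starting from the nominal point is a deliberate choice here: it steers the iteration toward the solution closest to zero variation.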

#### B. Choice of logic cell combination



Fig. 1. Response surface of 20-stage inverter delay.

To realize parameter estimation, we first need to calculate the delay sensitivity to each parameter. Figure 1 shows the response surface of delay as a function of the threshold-voltage variations $\Delta v_{tn}$ and $\Delta v_{tp}$ for a 20-stage inverter chain. Even for this limited range of parameter variation, the non-linearity of the surface makes a quadratic delay model necessary.

Now, we discuss good choices of logic-cell combination and draw guidelines for accurate parameter estimation. We consider five standard logic cells (inv, nand2, nand4, nor2, and nor3) and two special cells (nnmos and ppmos) as path components. The nnmos (ppmos) cell is newly designed to have a stronger sensitivity to the nMOS (pMOS) threshold voltage than the other logic cells. The detailed composition of these cells is described in Sec. III-B. Figure 2 shows the sensitivity vectors $\vec{s}$ of seven ring oscillators, each composed of one cell type. The sensitivity vector is defined as the first derivative of the delay-response surface at the origin. In other words, the components of $\vec{s}$ are the coefficients of $\Delta v_{tn}$ and $\Delta v_{tp}$ in Eq. (4), i.e., $S_{vtn}$ and $S_{vtp}$.

Each logic cell has a similar but distinct sensitivity to the nMOS and pMOS threshold voltages, so the choice of cells is important for attaining accuracy. In general, the estimation error is well described by the *condition number* of the sensitivity matrix.

$$cond = \|S\|_{\infty} \cdot \|S^{-1}\|_{\infty} \tag{5}$$

Here, $\|S\|_{\infty}$ is the infinity norm of the matrix $S$. When the angle formed by a pair of sensitivity vectors decreases, i.e., the vectors become nearly parallel, the condition number becomes large. The error is also affected by the maximum $\Delta td$, i.e., the length of the sensitivity vector. To make a normalized comparison, the number of stages of each path is adjusted so that all paths have the same maximum $\Delta td$. Figure 3 shows the estimation error of $\Delta v_{tn}$ and $\Delta v_{tp}$ for different cell combinations. The parameters are estimated by solving the quadratic delay equations numerically at 25 points ($\Delta v_{tn}$ and $\Delta v_{tp}$ are each varied from -40 mV to 40 mV in 20 mV steps). An error of $\pm 31.25$ ps is added to the delay to represent the timing error of a tester. The estimation error becomes small as the condition number becomes small.







Fig. 3. Estimation accuracy vs. condition number of sensitivity matrix.

Similarly, a larger maximum $\Delta td$ improves estimation accuracy for a constant delay-measurement error. We conclude that the cell combination should be chosen so that the condition number of the sensitivity matrix is small.
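The condition number of Eq. (5) is straightforward to evaluate for a 2x2 sensitivity matrix. The following sketch uses the infinity norm (maximum absolute row sum) as in the text; the two example matrices are hypothetical and only contrast a well-separated pair of sensitivity vectors with a nearly parallel pair.

```python
def inf_norm(M):
    """Infinity norm of a matrix: the maximum absolute row sum."""
    return max(sum(abs(v) for v in row) for row in M)


def cond_inf(S):
    """Condition number of Eq. (5) for a 2x2 matrix S."""
    (a, b), (c, d) = S
    det = a * d - b * c
    if det == 0.0:
        raise ValueError("sensitivity matrix is singular")
    S_inv = [[d / det, -b / det],
             [-c / det, a / det]]
    return inf_norm(S) * inf_norm(S_inv)


# Hypothetical sensitivity matrices (rows are the sensitivity vectors of two paths).
well_separated = [[2.0, 0.5],   # mostly nMOS-sensitive path
                  [0.4, 1.8]]   # mostly pMOS-sensitive path
nearly_parallel = [[2.0, 0.5],  # both paths mostly nMOS-sensitive
                   [1.9, 0.6]]

print(cond_inf(well_separated))
print(cond_inf(nearly_parallel))
```

For these assumed values the first pair gives a condition number of about 1.8 and the second about 39, so the same delay-measurement error would be amplified far more strongly with the nearly parallel pair.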

#### III. TEST CHIP MEASUREMENT

#### A. Circuit structure

As a proof of concept, a ring oscillator array (ROA) is designed. The implemented ROA consists of 512 ring oscillators with different logic-cell types and different numbers of stages. Each ring oscillator is composed of a single type of logic cell. Table I shows the cell types in the ROA, together with the number of ring oscillators of each type. Ring oscillators are implemented with four different numbers of stages: 21, 41, 81, and 161. The output of each ring oscillator is locally divided by $2^4$, $2^3$, $2^2$, and $2^1$, respectively. The output frequency therefore becomes almost the same regardless of the number of stages, which makes the supply-voltage drop almost equal.

TABLE I LOGIC CELLS USED FOR RING OSCILLATORS.

| cell name | number | cell name | number |
|-----------|--------|-----------|--------|
| nand2     | 16     | inverter  | 24     |
| nand4     | 8      | ppmos     | 8      |
| nor2      | 16     | nnmos     | 8      |
| nor3      | 8      | others    | 32     |
| mux2      | 8      |           |        |
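The effect of the local dividers described above can be checked with simple arithmetic. The sketch below assumes that the oscillation period is proportional to the number of stages (identical stage delay for all lengths), and it reads the entries of Table I as blocks of four ring oscillators each; both points are assumptions made for this illustration.

```python
# Stage counts and local divider ratios as stated in the text.
DIVIDER_BY_STAGES = {21: 2**4, 41: 2**3, 81: 2**2, 161: 2**1}

# Entries of Table I, read here as blocks per cell type (an assumption).
BLOCKS_PER_CELL = {
    "nand2": 16, "nand4": 8, "nor2": 16, "nor3": 8, "mux2": 8,
    "inverter": 24, "ppmos": 8, "nnmos": 8, "others": 32,
}


def relative_output_period(stages, divider):
    """Divided output period in units of (2 x stage delay), assuming period ~ stages."""
    return stages * divider


periods = {n: relative_output_period(n, d) for n, d in DIVIDER_BY_STAGES.items()}
spread = (max(periods.values()) - min(periods.values())) / min(periods.values())
total_oscillators = sum(BLOCKS_PER_CELL.values()) * len(DIVIDER_BY_STAGES)

print(periods)            # {21: 336, 41: 328, 81: 324, 161: 322}
print(round(spread, 3))   # 0.043
print(total_oscillators)  # 512
```

Under these assumptions the divided output periods differ by only about 4%, and 128 blocks of four lengths reproduce the stated total of 512 ring oscillators.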

A simplified block diagram of the ROA is illustrated in Fig. 4. A block of ring oscillators is selected using a one-hot shift register. Each block is composed of four ring oscillators with different numbers of stages and a local frequency divider. One of the ring oscillators is chosen by the combination of the shift register and the ring-length selection signal RLEN. The output of each ring oscillator is led through a multiplexer tree to the output pad. The multiplexer tree and output buffer use a dedicated power supply so that their supply noise does not disturb the ring oscillators.



Fig. 4. A structure of a ring oscillator array.

#### B. Special logic cells

In addition to the standard logic cells, special cells dedicated to parameter extraction are implemented. For accurate estimation of $\Delta v_{tn}$ and $\Delta v_{tp}$, it is desirable to use cells that have a higher sensitivity to the parameter of interest. Considering the simulation results in the previous section, we devise new cells that are particularly sensitive to either the nMOS or the pMOS threshold voltage. The schematics of the proposed cells are shown in Fig. 5. In a standard inverter cell, the nMOS (pMOS) transistor is responsible for discharging (charging) the load capacitance. In contrast, the proposed cells form an inverter from a pair of nMOS (pMOS) transistors. To restore the signal level, a pull-up pMOS (pull-down nMOS) is connected. In the proposed cells, both the charging and discharging times are mainly determined by nMOS (pMOS) transistors only, so the cells are more sensitive to $\Delta v_{tn}$ ($\Delta v_{tp}$). Using long-channel devices minimizes the effect of channel-length variation.

#### IV. EXPERIMENTAL RESULTS

The ROA is fabricated using a 65-nm, 12-metal-layer CMOS process. Figure 6 shows the chip micrograph. The area of the circuit block of the ROA is $7.2 \times 210\,\mu\text{m}^2$. The total area of the circuit is $736 \times 873.6\,\mu\text{m}^2$. Two ROA circuits are placed on a chip to verify the measurement results. The two ROAs are located next to each other, i.e., equivalent points of the two ROAs are separated by $25 \times 880\,\mu\text{m}$.

Fig. 5. The proposed custom cells (nnmos and ppmos) suitable for threshold voltage extraction.

Fig. 6. Chip micrograph of the two ROA.

The two ROAs on each of 38 chips from a single wafer are measured. The threshold voltages of the nMOS and pMOS transistors are estimated using the measured frequencies of two ring oscillators composed of different cell types. Figure 7 shows the frequency variation over the chips. The oscillation frequency fluctuates by more than 6% over the chips in a wafer, and the frequencies differ depending on the logic cell. The frequencies of different cell types follow a similar trend, but a few chips behave differently. For example, nand4 and nor3 have different sensitivities because they differ in the type of stacked transistors (nMOS in nand4, pMOS in nor3). Inv and nor3 are slow in chip-2, which suggests that the pMOS threshold voltage of that chip is higher. The threshold voltages of the other chips can be examined more systematically with the proposed framework.



Fig. 7. Measured frequencies of ring oscillators for different chips and logic-cell types.

Figure 8 shows the effect of the random variation component as a function of the number of logic stages. As expected, $\sigma/\mu$ is proportional to $1/\sqrt{n}$, where $n$ is the number of ring oscillator stages. The influence of the random variation can be ignored in ring oscillators of 81 and 161 stages. Thus, the results of the 161-stage ring oscillators are used in the following analysis.
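The $1/\sqrt{n}$ trend follows from summing independent per-stage delays. The sketch below compares the analytic ratio with a small Monte Carlo estimate, under the assumptions that stage delays are independent and Gaussian with unit mean; the 5% per-stage relative sigma is a hypothetical value, not a measured one.

```python
import math
import random


def relative_sigma(n_stages, stage_rel_sigma):
    """Analytic sigma/mu of the total delay of n independent, identical stages."""
    return stage_rel_sigma / math.sqrt(n_stages)


def simulate_relative_sigma(n_stages, stage_rel_sigma, trials=2000, seed=0):
    """Monte Carlo estimate of sigma/mu of the total delay (unit mean stage delay)."""
    rng = random.Random(seed)
    totals = [
        sum(rng.gauss(1.0, stage_rel_sigma) for _ in range(n_stages))
        for _ in range(trials)
    ]
    mean = sum(totals) / trials
    var = sum((t - mean) ** 2 for t in totals) / (trials - 1)
    return math.sqrt(var) / mean


STAGE_REL_SIGMA = 0.05  # hypothetical per-stage random variation (5%)
for n in (21, 41, 81, 161):
    analytic = relative_sigma(n, STAGE_REL_SIGMA)
    simulated = simulate_relative_sigma(n, STAGE_REL_SIGMA)
    print(n, round(analytic, 4), round(simulated, 4))
```

Going from 21 to 161 stages reduces the random component by a factor of about $\sqrt{161/21} \approx 2.8$, which is consistent with using the longest ring oscillators for extraction.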



Fig. 8. Vanishing random component with increasing number of ring oscillator stages.

Threshold voltages are extracted using two sets of cell combinations and compared to confirm the consistency of our estimation methodology. The cell combinations used here are (nand4, nor3) and (nnmos, ppmos). Correlations of the estimated results are depicted in Fig. 9. The threshold voltages extracted from the two cell pairs agree well: the correlation coefficients of the threshold voltages from the two cell sets are 0.728 and 0.767, respectively. This indicates that the proposed estimation framework works well and that both pairs are good combinations for parameter estimation.



Fig. 10. Difference of vth estimation between circuits A and B.

Figure 10 compares the difference in extracted threshold voltages between circuits A and B. If the threshold voltages calculated from the ring oscillators in the two circuits are equal, the difference becomes nearly zero, indicating that the extraction is consistent. The averaged parameter difference over 38 chips is shown for four cell combinations. The differences for the combinations (nand4, nor3) and (nnmos, ppmos) are less than 5 mV, showing a very good match. The combinations (nand4, nnmos) and (nand2, nand4) cause large errors because both cells in each pair are sensitive to $\Delta v_{tn}$, which results in a large condition number. Normally, the estimation difference between two identical circuits should be very small, within a few mV, even when considering possible error components such as the systematic trend of a parameter within a chip, delay measurement error, approximation error of the response surface function, and remaining random variation. The paths composed of good cell combinations apparently give a good approximation of the device parameters, as the difference between nearby circuits is very small. The consistency of the estimated results in the ROA circuits confirms that the proposed parameter extraction works well. A redundant number of paths could also be used to further improve estimation accuracy.
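The text does not specify how redundant paths would be combined. One standard option is a least-squares fit of the linear model of Eq. (2), sketched below through the normal equations for two parameters. The function name, the three-path sensitivity matrix, and the delay shifts are hypothetical.

```python
def least_squares_vth(S, dtd):
    """Least-squares estimate of (dvtn, dvtp) from m >= 2 paths.

    S   : list of [S_vtn, S_vtp] rows, one per path (hypothetical units: ps/mV)
    dtd : list of delay shifts td - td_typ, one per path (ps)
    Solves the normal equations (S^T S) dp = S^T dtd.
    """
    a11 = sum(r[0] * r[0] for r in S)
    a12 = sum(r[0] * r[1] for r in S)
    a22 = sum(r[1] * r[1] for r in S)
    b1 = sum(r[0] * t for r, t in zip(S, dtd))
    b2 = sum(r[1] * t for r, t in zip(S, dtd))
    det = a11 * a22 - a12 * a12
    # det is non-negative by the Cauchy-Schwarz inequality; it is (nearly)
    # zero when all sensitivity vectors are parallel.
    if det <= 1e-12 * a11 * a22:
        raise ValueError("sensitivity vectors are (nearly) parallel")
    dvtn = (a22 * b1 - a12 * b2) / det
    dvtp = (a11 * b2 - a12 * b1) / det
    return dvtn, dvtp


# Hypothetical example with three paths and noiseless synthetic delay shifts
# generated from dvtn = +10 mV and dvtp = -5 mV.
S = [[2.0, 0.5],
     [0.4, 1.8],
     [1.2, 1.1]]
dtd = [17.5, -5.0, 6.5]
print(least_squares_vth(S, dtd))
```

With noisy measurements the extra path averages part of the delay-measurement error, at the cost of one more ring oscillator to measure.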

#### V. CONCLUSION

A parameter-estimation method suitable for adaptive circuit techniques is proposed. Through comprehensive circuit simulations, good logic cells for forming delay-measurement paths that yield accurate estimation are discussed. It has been clarified that paths with large delay and cell combinations with a condition number as small as possible are important. Following this design guideline, a ring oscillator array composed of good path combinations for extracting the threshold voltages of nMOS and pMOS transistors is designed. Measurement results showed that the estimation is consistent within a chip to less than 5 mV, and that estimation accuracy is greatly enhanced by the newly proposed cells, which are particularly sensitive to either nMOS or pMOS threshold voltages.

#### ACKNOWLEDGEMENTS

This research was partially supported by STARC, Japanese Ministry of Economy, Trade and Industry sponsored "Next-Generation Circuit Architecture Technical Development" program, and VDEC.

#### REFERENCES

- [1] J. W. Tschanz, J. T. Kao, S. G. Narendra, R. Nair, D. A. Antoniadis, A. P. Chandrakasan, and V. De, "Adaptive body bias for reducing impacts of die-to-die and within-die parameter variations on microprocessor frequency and leakage," *IEEE JSSC*, vol. 37, no. 11, pp. 1396–1402, Nov. 2002.
- [2] N. Callegari, P. Bastani, L.-C. Wang, S. Chakravarty, and A. Tetelbaum, "Path selection for monitoring unexpected systematic timing effects," in *Proc. ASP-DAC*, 2009, pp. 781–786.
- [3] M. Shintani, T. Uezono, T. Takahashi, H. Ueyama, T. Sato, K. Hatayama, T. Aikyo, and K. Masu, "An adaptive test for parametric faults based on statistical timing information," in *Proc. ATS*, 2009.
- [4] S. Ohkawa, M. Aoki, and H. Masuda, "Analysis and characterization of device variations in an LSI chip using an integrated device matrix array," *IEEE Trans. Semicond. Manufact.*, vol. 17, no. 2, pp. 155–165, May 2005.
- [5] L. T. Pang and B. Nikolic, "Impact of layout on 90nm CMOS process parameter fluctuations," in *VLSI Circuits Tech. Dig.*, 2006, pp. 69–70.
- [6] C. Visweswariah, K. Ravindran, K. Kalafala, S. G. Walker, and S. Narayan, "First-order incremental block-based statistical timing analysis," in *Proc. DAC*, 2004, pp. 331–336.