

#### **A Dual Dielectric Approach for Performance Aware Gate Tunneling Reduction in Combinational Circuits**

#### V. Mukherjee, S. P. Mohanty, E. Kougianos University of North Texas, Denton, TX 76203. Email: smohanty@cs.unt.edu Homepage: http://www.cs.unt.edu/~smohanty/

10/4/2005 ICCD 2005





#### Outline of the Talk

- Introduction
- Why Dual Dielectric
- Related Works
- DKDT Assignment Algorithm
- Cell Characterization for High-K
- Conclusions





#### Why Low-Power ?

#### Major Motivation: Extending battery life





Source: Power Integrations Inc







#### **Power Dissipation Trend**





#### Why Dual-K and Dual-T?

Gate oxide tunneling current I<sub>gate</sub> [Kim2003, Chandrakasan2001] (k is an experimentally derived factor):

$I_{gate} \propto (V_{dd} / T_{gate})^2 \exp(-k T_{gate} / V_{dd})$

- Options for reduction of tunneling current:
  - Decreasing the supply voltage V<sub>dd</sub> (*will play its role*)
  - Increasing gate SiO<sub>2</sub> thickness T<sub>gate</sub> (opposed to the technology trend !!)
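As a rough illustration of the proportionality above, the following Python sketch compares the relative tunneling current for two gate thicknesses. The function name and the value of `k` are assumptions made for illustration (the talk only states that k is experimentally derived), so only the trend is meaningful, not the absolute numbers.

```python
import math

def relative_igate(vdd, tgate, k=10.0):
    """Unnormalized I_gate from the proportionality above.

    k is a placeholder value, not a fitted constant.
    """
    return (vdd / tgate) ** 2 * math.exp(-k * tgate / vdd)

# Supply in V, thickness in nm (values that appear elsewhere in the talk).
thin = relative_igate(vdd=0.7, tgate=1.4)
thick = relative_igate(vdd=0.7, tgate=1.7)
ratio = thick / thin  # below 1: the thicker dielectric tunnels less

# Lowering the supply voltage also lowers the relative current.
low_vdd = relative_igate(vdd=0.6, tgate=1.4)
```

Under this model both options listed above reduce I<sub>gate</sub>; the size of the reduction depends entirely on the assumed `k`.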





#### Why Dual-K and Dual-T?

Use of multiple dielectrics (denoted as $K_{gate}$) of multiple thicknesses (denoted as $T_{gate}$) will reduce the gate tunneling current significantly while maintaining the performance.





#### Why Dual-K and Dual-T? (Low K<sub>gate</sub> Vs High K<sub>gate</sub>)







#### Why Dual-K and Dual-T? (Low $T_{gate}$ Vs High $T_{gate}$)







#### Why Dual-K and Dual-T? (Example: Four Types of Inverter)

**Assumption**: all transistors of a logic gate have the same $K_{gate}$ and the same $T_{gate}$.





- Silicon Oxynitride (SiO<sub>x</sub>N<sub>y</sub>) (K=5.7 for SiON)
- Silicon Nitride (Si<sub>3</sub>N<sub>4</sub>) (K=7)
- Oxides of :
  - Aluminum (Al), Titanium (Ti), Zirconium (Zr), Hafnium (Hf), Lanthanum (La), Yttrium (Y), Praseodymium (Pr)
  - their mixed oxides with SiO<sub>2</sub> and Al<sub>2</sub>O<sub>3</sub>
- NOTE:  $I_{gate}$  is still dependent on  $T_{gate}$  irrespective of dielectric material.






#### Related Works: Gate Leakage Reduction

- Inukai et al. CICC2000: Boosted Gate MOS device using dual $T_{\rm ox}$ and dual $V_{\rm Th}$
- Rao et al. ESSCIRC2003: Sleep state assignment for MTCMOS circuits
- Lee et al. DAC2003 and TVLSI2004Feb: Pin reordering to minimize gate leakage during standby positions
- Sultania et al. DAC2004 and ICCD2004: Heuristic for dual T<sub>ox</sub> assignment
- Sirisantana et al. IEEEDTC2004Feb and ICCD2000: Use multiple channel lengths and multiple gate oxide thickness





#### Related Works : Summary

- Developed methods that use oxide of different thicknesses for tunneling reduction.
- Do not handle emerging dielectrics that will replace SiO<sub>2</sub> to reduce the tunneling current.
- Consider either the ON or the OFF state, but do not account for both.
- Degradation in performance due to dual thickness approach.





#### Key Contributions of Our Work

- A new approach called dual dielectric assignment for tunneling current reduction.
- Considers a dual-thickness approach for both dielectrics.
- Explores a combined approach called DKDT (Dual-K of Dual Thickness) and proposes an assignment algorithm.
- Accounts for both ON and OFF state gate tunneling.
- Logic cell characterization for average tunneling considering non-SiO<sub>2</sub> dielectrics.











#### **Optimization Problem Definition**

Given a weighted directed acyclic graph G(V,E), find the best possible assignment of dielectric and thickness such that the total tunneling current is minimized and the latency constraint (circuit performance) is satisfied.

**Optimization Problem:** 

$$\begin{array}{ll} \text{Minimize} & \sum_{v_i \in V} I_{gate}(v_i) \\ \text{such that} & \sum_{v_i \in V_P} D_i(v_i) \leq D_{CP} \end{array}$$






#### DKDT Assignment : Basis

- **Observation**: Tunneling current of logic gates increases and propagation delay decreases in the order  $K_2T_2$ ,  $K_2T_1$ ,  $K_1T_2$ , and  $K_1T_1$  (where,  $K_1 < K_2$  and  $T_1 < T_2$ ).
- Strategy: Assign a higher order K and T to a logic gate under consideration
  - To reduce tunneling current
  - Provided increase in path-delay does not violate the target delay





#### DKDT Assignment : Algorithm

- **Step 1:** Represent the network as a directed acyclic graph $G(V, E)$.
- **Step 2:** Initialize each vertex $v \in G(V, E)$ with the values of tunneling current and delay for the $K_1T_1$ assignment.
- **Step 3:** Find the set of all paths $P\{\Pi_{in}\}$ from every vertex in the set of primary inputs ($\Pi_{in}$) leading to the primary outputs ($\Pi_{out}$).
- **Step 4:** Compute the delay $D_P$ for each path $p \in P\{\Pi_{in}\}$.






#### DKDT Assignment : Algorithm

**Step 5:** Find the critical path delay  $D_{CP}$  for  $K_1T_1$  assignment.

**Step 6:** Mark the critical path(s) $P_{CP}$, where $P_{CP}$ is a subset of $P\{\Pi_{in}\}$.

**Step 7:** Assign target delay  $D_T = D_{CP}$ .

**Step 8:** Traverse each node in the network and attempt to assign K-T in the order  $K_2T_2$ ,  $K_2T_1$ ,  $K_1T_2$ , and  $K_1T_1$  to reduce tunneling while maintaining performance.
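Steps 1 to 7 can be sketched in Python as follows. This is a minimal illustration, not the authors' SIS-based implementation: the gate names, the delay values, and the helper `critical_path` are hypothetical, and instead of enumerating every path explicitly (Steps 3 and 4) it uses a memoized longest-path search, which yields the same critical path delay.

```python
def critical_path(succ, delay, primary_inputs):
    """Return (D_CP, path) for the longest-delay path from any primary input.

    succ:  dict mapping each vertex to the list of its fanout vertices
    delay: dict mapping each vertex to its gate delay
    """
    memo = {}

    def longest_from(v):
        if v not in memo:
            tails = [longest_from(w) for w in succ[v]]
            tail_delay, tail_path = max(
                tails, key=lambda t: t[0], default=(0.0, [])
            )
            memo[v] = (delay[v] + tail_delay, [v] + tail_path)
        return memo[v]

    return max((longest_from(v) for v in primary_inputs), key=lambda t: t[0])

# Step 1: a toy network as a DAG (hypothetical gates g1..g4).
succ = {"g1": ["g3"], "g2": ["g3"], "g3": ["g4"], "g4": []}
# Step 2: per-gate delays for the K1T1 assignment (made-up values).
delay = {"g1": 1.0, "g2": 2.0, "g3": 1.5, "g4": 0.5}

# Steps 3 to 6: critical path delay and the critical path itself.
d_cp, p_cp = critical_path(succ, delay, primary_inputs=["g1", "g2"])
# Step 7: the target delay equals the K1T1 critical path delay.
d_target = d_cp
```

For this toy network the critical path runs through `g2`, `g3`, and `g4`.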




#### DKDT Assignment : Step 8 Heuristic

```
FOR each vertex v \in G(V, E)
{
    Determine all paths P_v to which node v belongs;
    Assign K_2T_2 to v; carry out Local Fanout Optimization (LFO);
        determine timing closure and insert buffers (TCD/BI);
    Calculate new critical delay D_{CP};
    Calculate slack in delay as \Delta D = D_{T} - D_{CP};
    IF (\Delta D < 0) then
    {
        Assign K_2T_1 to v; LFO; TCD/BI; calculate D_{CP}; calculate \Delta D;
        IF (\Delta D < 0) then
        {
            Assign K_1T_2 to v; LFO; TCD/BI; calculate D_{CP}; calculate \Delta D;
            IF (\Delta D < 0) then
                Reassign K_1T_1 to v;
        } // end IF
    } // end IF
} // end FOR
```
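The heuristic above can be approximated by the following Python sketch. It is an illustration under stated assumptions rather than the authors' implementation: the per-option current and delay numbers in `OPTIONS` are invented, the toy network is hypothetical, and the LFO and TCD/BI steps (performed inside SIS in the talk) are omitted.

```python
# Hypothetical cell data: (name, relative tunneling current, delay).
# Listed from lowest current / highest delay to highest current / lowest delay.
OPTIONS = [
    ("K2T2", 1.0, 1.6),
    ("K2T1", 3.0, 1.3),
    ("K1T2", 6.0, 1.2),
    ("K1T1", 20.0, 1.0),
]

def critical_delay(succ, delay):
    """Longest path delay through the DAG for the given per-vertex delays."""
    memo = {}

    def longest_from(v):
        if v not in memo:
            tail = max((longest_from(w) for w in succ[v]), default=0.0)
            memo[v] = delay[v] + tail
        return memo[v]

    return max(longest_from(v) for v in succ)

def dkdt_assign(succ):
    """Greedy Step 8: per vertex, keep the first option with non-negative slack."""
    base_name, base_current, base_delay = OPTIONS[-1]
    choice = {v: base_name for v in succ}
    current = {v: base_current for v in succ}
    delay = {v: base_delay for v in succ}
    d_target = critical_delay(succ, delay)  # D_T = D_CP for K1T1
    for v in succ:
        for name, i_gate, d in OPTIONS:
            delay[v] = d
            slack = d_target - critical_delay(succ, delay)
            if slack >= 0:
                choice[v], current[v] = name, i_gate
                break
    return choice, sum(current.values()), d_target, critical_delay(succ, delay)

# Toy network: a critical path a -> c -> e and a shorter path b -> d.
succ = {"a": ["c"], "b": ["d"], "c": ["e"], "d": [], "e": []}
choice, total_current, d_target, d_final = dkdt_assign(succ)
```

In this toy case the gates on the critical path keep K<sub>1</sub>T<sub>1</sub>, while the gates on the shorter path absorb the available slack with higher-K or thicker options. The result depends on the order in which vertices are visited, which is one reason the heuristic is not guaranteed to be optimal.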





#### Logic Cell Characterization : Load

- The Berkeley Predictive Technology Model (BPTM) has been used.
- The first step in the characterization was the selection of an appropriate capacitive load ( $C_{Load} = 10 * C_{ggPMOS}$  used).
- The supply voltage is held at  $V_{DD} = 0.7V$ .
- We define the delay as the time difference between the 50% level of input and output.
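The 50% delay definition above can be illustrated with a small Python sketch that works on sampled waveforms. The function names and the piecewise-linear waveforms are hypothetical; in the talk the waveforms come from SPICE simulation with the BPTM models.

```python
def crossing_time(times, volts, level, rising):
    """First time a sampled waveform crosses `level`, by linear interpolation."""
    samples = list(zip(times, volts))
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        crossed = (v0 < level <= v1) if rising else (v0 > level >= v1)
        if crossed:
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    raise ValueError("waveform never crosses the requested level")

def propagation_delay(times, v_in, v_out, vdd, input_rising):
    """Time between the 50% points of input and output of an inverting cell."""
    half = 0.5 * vdd
    t_in = crossing_time(times, v_in, half, rising=input_rising)
    t_out = crossing_time(times, v_out, half, rising=not input_rising)
    return t_out - t_in

# Made-up waveforms (time in ps, voltage in V) for an inverter at V_DD = 0.7 V.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
v_in = [0.0, 0.7, 0.7, 0.7, 0.7]
v_out = [0.7, 0.7, 0.7, 0.0, 0.0]
t_pd = propagation_delay(times, v_in, v_out, vdd=0.7, input_rising=True)
```

Here the input crosses 0.35 V at 0.5 ps and the output at 2.5 ps, giving a delay of 2 ps for this synthetic example.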





#### Logic Cell Characterization : t<sub>r</sub>

- For worst-case scenarios in the development of the algorithm, we chose the maximum delay time, i.e., max(t<sub>pdr</sub>, t<sub>pdf</sub>).
- The effect of the switching pulse rise time t<sub>r</sub> on the delay characteristics was examined first.
- To eliminate an explicit dependence of the algorithm results on t<sub>r</sub>, we chose a value that is realistic yet does not affect the delay significantly.







#### **Delay Versus Rise Time**





 The effect of varying dielectric material was modeled by calculating an equivalent oxide thickness (T<sup>\*</sup><sub>ox</sub>) according to the formula:

$$T^*_{ox} = (K_{gate} / K_{ox}) T_{gate}$$

Here, K<sub>gate</sub> is the dielectric constant of the gate dielectric material other than SiO<sub>2</sub>, (of thickness T<sub>gate</sub>), while K<sub>ox</sub> is the dielectric constant of SiO<sub>2</sub>.
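As a worked example, the formula can be evaluated exactly as written on the slide. The function name is an assumption; the numeric inputs (K = 3.9 for SiO<sub>2</sub>, K = 5.7 for SiON, and a 1.4 nm thickness) are values that appear elsewhere in the talk.

```python
K_OX = 3.9  # dielectric constant of SiO2

def equivalent_oxide_thickness(k_gate, t_gate, k_ox=K_OX):
    """T*_ox = (K_gate / K_ox) * T_gate, following the slide's formula."""
    return (k_gate / k_ox) * t_gate

# SiON (K = 5.7) at a physical thickness of 1.4 nm.
t_star_sion = equivalent_oxide_thickness(k_gate=5.7, t_gate=1.4)

# For SiO2 itself the formula returns the physical thickness unchanged.
t_star_sio2 = equivalent_oxide_thickness(k_gate=3.9, t_gate=1.4)
```

With these inputs the SiON case gives roughly 2.05 nm.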





- The effect of varying oxide thickness T<sub>ox</sub> was incorporated by varying TOXE in the SPICE model.
- The length of the device is changed proportionately to minimize the impact of higher dielectric thickness on device performance:

  $L^* = (T^*_{ox} / T_{ox}) L$

- Length and width of the transistors are chosen to maintain a (W:L) ratio of (4:1) for NMOS and (8:1) for PMOS.
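The geometry scaling can be sketched as below. The helper name and the input values are hypothetical, and the sketch assumes the widths are derived from the scaled length L*, which the slides do not state explicitly.

```python
def scaled_geometry(length, t_ox, t_star_ox):
    """Scale L by T*_ox / T_ox, then size W for 4:1 (NMOS) and 8:1 (PMOS).

    Assumption: the W:L ratios are applied to the scaled length.
    """
    l_star = (t_star_ox / t_ox) * length
    return {"L": l_star, "W_nmos": 4 * l_star, "W_pmos": 8 * l_star}

# Illustrative numbers in nm: a 45 nm length, thickness going from 1.4 to 1.7.
geom = scaled_geometry(length=45.0, t_ox=1.4, t_star_ox=1.7)
```

For these inputs the scaled length is about 54.6 nm, with the NMOS and PMOS widths four and eight times that value.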






#### Logic Cell Characterization : I<sub>gate</sub>



**BSIM4 Model**

Calculated by evaluating both the source and drain components.

For a MOS transistor, $I_{gate} = |I_{gs}| + |I_{gd}| + |I_{gcs}| + |I_{gcd}| + |I_{gb}|$.

The values of the individual components depend on the state, ON or OFF.
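A minimal Python sketch of this sum is shown below. The component values are invented, and the equal weighting of ON and OFF states in `average_gate_current` is an assumption about how an average could be formed, not a detail given in the talk.

```python
def gate_current(i_gs, i_gd, i_gcs, i_gcd, i_gb):
    """Sum of the component magnitudes for one transistor in one state."""
    return abs(i_gs) + abs(i_gd) + abs(i_gcs) + abs(i_gcd) + abs(i_gb)

def average_gate_current(on_components, off_components):
    """Assumption: the ON and OFF states are weighted equally."""
    return 0.5 * (gate_current(*on_components) + gate_current(*off_components))

# Hypothetical component values in nA, ordered (gs, gd, gcs, gcd, gb).
on_state = (12.0, -3.0, 20.0, 18.0, -0.5)
off_state = (-1.0, 9.0, 0.0, 0.0, -0.2)

i_on = gate_current(*on_state)
i_off = gate_current(*off_state)
i_avg = average_gate_current(on_state, off_state)
```

Taking magnitudes means components that flow in opposite directions add up instead of cancelling.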





#### Logic Cell Characterization : An Inverter

- Low Input : Input supply feeds the tunneling current.
- High Input : Gate supply feeds the tunneling current.












#### II Characterization: 45nm Tech Cells





#### Experimental Results: Setup

- DKDT algorithm integrated with SIS, and tested on the ISCAS'85 benchmarks.
- Used  $K_1 = 3.9$  (for SiO<sub>2</sub>),  $K_2 = 5.7$  (for SiON),  $T_1 = 1.4$ nm, and  $T_2 = 1.7$ nm for our experiments.
- $T_1$ is chosen as the default value from the BSIM4.4.0 model card, and the value of $T_2$ is chosen intuitively.





#### **Experimental Results : Table**

| Benchmark Circuit | Gates | Critical Delay (ps) | Current for K<sub>1</sub>T<sub>1</sub> (nA) | Current for DKDT (nA) | % Reduction |
|--------------------|-------|------------------------|---------------------------------------------------|--------------------------|----------------|
| C432               | 160   | 3.848                  | 3949.45                                           | 253.26                   | 93.58          |
| C499               | 202   | 2.054                  | 5708.55                                           | 590.45                   | 89.66          |
| C880               | 383   | 6.162                  | 6537.02                                           | 337.84                   | 94.83          |
| C1355              | 546   | 2.054                  | 5708.55                                           | 274.644                  | 95.19          |
| C1908              | 880   | 6.675                  | 9714.74                                           | 287.72                   | 97.04          |
| C2670              | 1193  | 24.64                  | 17863.33                                          | 1560.67                  | 91.27          |
| C3540              | 1669  | 18.23                  | 34637.15                                          | 2215.74                  | 93.60          |
| C5315              | 2406  | 23.10                  | 28156.87                                          | 1098.80                  | 96.10          |
| C6288              | 2406  | 24.89                  | 28474.64                                          | 372.56                   | 98.69          |
| C7552              | 3512  | 26.44                  | 33899.46                                          | 625.84                   | 98.15          |
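The percentage column follows from the two current columns. The short Python check below recomputes it for three rows of the table; the function name is an assumption, and the recomputed values can differ from the printed ones in the last digit because of rounding.

```python
def percent_reduction(baseline_na, dkdt_na):
    """Reduction of DKDT current relative to the K1T1 baseline, in percent."""
    return 100.0 * (baseline_na - dkdt_na) / baseline_na

# (K1T1 current in nA, DKDT current in nA, % reduction as printed in the table)
rows = {
    "C432": (3949.45, 253.26, 93.58),
    "C499": (5708.55, 590.45, 89.66),
    "C880": (6537.02, 337.84, 94.83),
}

computed = {
    name: percent_reduction(baseline, dkdt)
    for name, (baseline, dkdt, _) in rows.items()
}
```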





#### Experimental Results: Tunneling Current and % Reduction

[Figure: tunneling current and % reduction for the benchmark circuits.]


**NOTE**: DKDT incurs no time penalty.






#### **Conclusions and Future Works**

- New approach for tunneling current reduction accounting for both ON and OFF states.
- The algorithm performs such an assignment for circuits in a reasonable amount of time.
- Experiments show significant reductions in tunneling current without performance penalty.





#### **Conclusions and Future Works**

- Modeling for other high-K dielectrics is in progress.
- Development of optimal assignment algorithm can be considered.
- Tradeoff of tunneling, area and performance needs to be explored.
- DKDT based design may need more masks for the lithographic process during fabrication.





#### Thank You

