Unified P4 (Power-Performance-Process-Parasitic)
Fast Optimization of a Nano-CMOS VCO

Dhruva Ghai, Saraju P. Mohanty and Elias Kougianos
VLSI Design and CAD Laboratory (VDCL)
University of North Texas
Email: {dvg0010, saraju.mohanty, eliask}@unt.edu

Acknowledgment: This work is partially supported by NSF award number 0702361.
Outline of the Talk

- Introduction
- Novel Contributions
- Related Prior Research
- Proposed Novel RFIC P4 Optimal Design Flow
- Design of 90nm CMOS VCO
- Process Variation Analysis of VCO
- P4 Optimization of VCO
- Conclusions and Future Research
Introduction

- Radio Frequency Integrated Circuits (RFICs) are becoming performance-oriented. The battle to deliver maximum performance has taken center stage. (Performance)

- Power-aware design is required to maximize some performance metric, subject to a power budget. (Power)

- Increasing process variations cause potential yield loss. The impact of process variations on the performance of a design is much higher in today’s nanometer technologies. (Process)

- In high-frequency circuits, exact performance prediction is challenging because of the many parasitic effects. Predicting these effects is crucial for accurate performance estimation. (Parasitic)
Novel Contributions

- A P4 (Power-Performance-Parasitic-Process) optimal design flow for RF circuits is proposed.
- A nano-CMOS current-starved VCO is subjected to the design flow.
- P4 optimization of the VCO is carried out using a dual-oxide process technique.
- A dual-oxide physical design of the VCO is presented for 90nm CMOS technology.
- A novel process variation analysis technique called Design of Experiments-Monte Carlo (DOE-MC) approach is proposed, offering up to 6.25X computational time savings over traditional Monte-Carlo (TMC).
Related Prior Research

<table>
<thead>
<tr>
<th>Reference</th>
<th>Technology</th>
<th>Performance</th>
<th>Power</th>
</tr>
</thead>
<tbody>
<tr>
<td>Tiebout et al.</td>
<td>250nm</td>
<td>1.8GHz</td>
<td>20mW</td>
</tr>
<tr>
<td>Dehghani et al.</td>
<td>250nm</td>
<td>2.5GHz</td>
<td>2.6mW</td>
</tr>
<tr>
<td>Long et al.</td>
<td>180nm</td>
<td>2.4GHz</td>
<td>1.8mW</td>
</tr>
<tr>
<td>Kwok et al.</td>
<td>180nm</td>
<td>1.4GHz</td>
<td>1.46mW</td>
</tr>
<tr>
<td><strong>This Work</strong></td>
<td>90nm dual-oxide</td>
<td><strong>2.3GHz</strong></td>
<td><strong>158μW</strong></td>
</tr>
</tbody>
</table>
Proposed Novel RFIC P4 Optimal Design Flow

- The logical design is done to meet the required specifications.
- The initial physical design is subjected to DRC/LVS/RLCK extraction.
- Worst case process variation analysis of the physical design is done with respect to performance (center frequency).
- Intelligent dual-oxide assignment ($Tox_{pth}$, $Tox_{nth}$) is applied to the power-hungry transistors of the VCO using a thick oxide model file.
- Parasitic netlist is parameterized for parameter set $D$ (widths of transistors and $Tox_{pth}$, $Tox_{nth}$). The parameterized parasitic netlist is subjected to an optimization loop to meet the specifications (performance, power) in a worst case process environment.
- Parameter values for which the specifications are met are obtained, and a final physical design of the VCO is created using these parameter values.
Design of 90nm CMOS VCO

- A current-starved VCO is designed using a generic 90nm process. Target oscillation frequency \( f_0 \geq 2 \text{GHz} \).

\[
f_0 = \frac{I_D}{N \cdot C_{tot} \cdot V_{DD}}, \quad C_{tot} = \frac{5}{2} C_{ox} (W_p L_p + W_n L_n), \quad C_{ox} = \frac{\varepsilon_{r_{ox}} \varepsilon_0}{T_{ox}}
\]

- \( V_{DD} \): supply voltage, \( I_D \): current flowing through the inverter, \( N \): number of inverters (odd), \( C_{tot} \): total capacitance of each inverter stage, \( C_{ox} \): gate oxide capacitance per unit area, \( \varepsilon_{r_{ox}} \): relative permittivity of the gate oxide, \( \varepsilon_0 \): permittivity of free space, \( T_{ox} \): gate oxide thickness, \( \{W_p, L_p\} \): inverter PMOS width (500nm) and length (100nm), \( \{W_n, L_n\} \): inverter NMOS width (250nm) and length (100nm), \( \{W_{pcs}, L_{pcs}\} \): current starved PMOS width (5μm) and length (100nm), \( \{W_{ncs}, L_{ncs}\} \): current starved NMOS width (500nm) and length (100nm).
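
The relations above can be evaluated numerically. The following is a minimal sketch, not the authors' sizing procedure: it assumes an SiO2 relative permittivity of 3.9 (not stated on the slide), uses the nominal NMOS oxide thickness of 2.33nm for a single \( C_{ox} \), and takes \( N = 13 \) and \( V_{DD} = 1.2 \)V from elsewhere in the talk. The function names are illustrative only.

```python
EPS0 = 8.854e-12   # permittivity of free space, F/m
EPS_R_OX = 3.9     # ASSUMPTION: relative permittivity of SiO2, not given in the slides


def c_ox(t_ox):
    """Gate oxide capacitance per unit area (F/m^2) for oxide thickness t_ox (m)."""
    return EPS_R_OX * EPS0 / t_ox


def c_tot(wp, lp, wn, ln, t_ox):
    """Total capacitance of one inverter stage: (5/2) * C_ox * (Wp*Lp + Wn*Ln)."""
    return 2.5 * c_ox(t_ox) * (wp * lp + wn * ln)


def f0(i_d, n, ctot, vdd):
    """Oscillation frequency f0 = I_D / (N * C_tot * V_DD)."""
    return i_d / (n * ctot * vdd)


def required_current(f_target, n, ctot, vdd):
    """Invert the f0 relation to get the inverter current needed for f_target."""
    return f_target * n * ctot * vdd


# Inverter sizes from the slide; T_ox = 2.33 nm is the nominal NMOS value,
# used here for both devices as a simplification.
ctot = c_tot(wp=500e-9, lp=100e-9, wn=250e-9, ln=100e-9, t_ox=2.33e-9)
i_d = required_current(2e9, n=13, ctot=ctot, vdd=1.2)
print(f"C_tot = {ctot * 1e15:.2f} fF, I_D for 2 GHz = {i_d * 1e6:.1f} uA")
```

This first-order estimate ignores the extracted parasitics, which is why the physical design frequency reported later falls below the logical design target.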
Process Variation Analysis of VCO: TMC

- Process variation analysis has been carried out on the initial physical design with parasitics extracted (RLCK).

- Variation in 5 parameters:
  - VDD: Supply voltage,
  - Vtn: NMOS threshold voltage,
  - Vtp: PMOS threshold voltage,
  - Toxn: NMOS gate oxide thickness,
  - Toxp: PMOS gate oxide thickness.

- Process parameters are assumed to have a Gaussian distribution with the mean (μ) equal to the nominal value in the process design kit and the standard deviation (σ) equal to 10% of the mean. TMC with 1000 runs gives an oscillation frequency (f₀) with a Gaussian distribution with μ = 1.54 GHz, σ = 103.5 MHz.
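
The TMC procedure can be sketched as follows. The circuit simulator is replaced by a hypothetical linear surrogate `toy_f0`: the sensitivity signs follow the worst-case trend reported in this talk (lower VDD and higher Vt or Tox slow the VCO), but the magnitudes in `SENS` are invented for illustration, so the output will not reproduce the μ and σ quoted above.

```python
import random
import statistics

# Nominal values from the slides (Tox in metres).
NOMINAL = {"VDD": 1.2, "Vtn": 0.1692662, "Vtp": -0.1359511,
           "Toxn": 2.33e-9, "Toxp": 2.48e-9}

# ASSUMPTION: made-up relative sensitivities of f0 to each parameter.
SENS = {"VDD": 1.0, "Vtn": -0.3, "Vtp": -0.3, "Toxn": -0.2, "Toxp": -0.2}

F0_NOMINAL = 1.56e9  # initial physical design centre frequency, Hz


def toy_f0(params):
    """Hypothetical stand-in for a SPICE run: linear in relative parameter shifts."""
    rel = sum(SENS[k] * (params[k] - NOMINAL[k]) / NOMINAL[k] for k in NOMINAL)
    return F0_NOMINAL * (1.0 + rel)


def tmc(n_runs=1000, rel_sigma=0.10, seed=0):
    """Traditional Monte Carlo: draw every parameter from N(mu, (0.1*|mu|)^2)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_runs):
        draw = {k: rng.gauss(mu, rel_sigma * abs(mu)) for k, mu in NOMINAL.items()}
        samples.append(toy_f0(draw))
    return statistics.mean(samples), statistics.stdev(samples)


mu_f0, sigma_f0 = tmc()
print(f"mu(f0) = {mu_f0 / 1e9:.3f} GHz, sigma(f0) = {sigma_f0 / 1e6:.1f} MHz")
```

In a real flow, `toy_f0` would be replaced by a simulation of the extracted netlist, which is what makes 1000 runs expensive.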
Process Variation Analysis of VCO: DOE-MC

- The DOE-MC methodology offers the advantage of faster computation over TMC.

- A two level full factorial design is run for the 5 process parameters, where:
  - Level 1: \( \mu - 2 \times \sigma \),
  - Level 2: \( \mu + 2 \times \sigma \).

- A full factorial run requires \( 2^5 = 32 \) trials. Five MC replicate runs are performed for every trial, and \( \mu(f_0) \) and \( \sigma(f_0) \) are recorded. This yields 32 values of \( \mu(f_0) \) and \( \sigma(f_0) \).

- The final \( \mu(f_0) \) and \( \sigma(f_0) \) are recorded as the average of the 32 trials. Considering 5 replicates per trial, we get a total of \( 32 \times 5 = 160 \) runs (compared to 1000 TMC runs).
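
A minimal sketch of the DOE-MC bookkeeping described above. The slides do not say how the replicates within a trial are drawn; this sketch assumes each replicate samples every parameter from a Gaussian centred on the trial's level with the same σ. The `simulate` callback and `toy_simulate` are hypothetical stand-ins for a circuit simulation.

```python
import itertools
import random
import statistics

NOMINAL = {"VDD": 1.2, "Vtn": 0.1692662, "Vtp": -0.1359511,
           "Toxn": 2.33e-9, "Toxp": 2.48e-9}


def doe_mc(simulate, nominal, replicates=5, rel_sigma=0.10, seed=0):
    """Two-level full factorial (mu - 2*sigma, mu + 2*sigma) with MC replicates."""
    rng = random.Random(seed)
    names = list(nominal)
    sigma = {k: rel_sigma * abs(nominal[k]) for k in names}
    trial_means, trial_sigmas = [], []
    # 2^5 = 32 trials: every combination of low/high level for the 5 parameters.
    for signs in itertools.product((-2, 2), repeat=len(names)):
        corner = {k: nominal[k] + s * sigma[k] for k, s in zip(names, signs)}
        # ASSUMPTION: replicates are Gaussian draws around the trial's corner.
        runs = [simulate({k: rng.gauss(corner[k], sigma[k]) for k in names})
                for _ in range(replicates)]
        trial_means.append(statistics.mean(runs))
        trial_sigmas.append(statistics.stdev(runs))
    total_runs = len(trial_means) * replicates
    # Final mu and sigma are the averages over all trials.
    return statistics.mean(trial_means), statistics.mean(trial_sigmas), total_runs


def toy_simulate(params):
    """Hypothetical surrogate: f0 scales linearly with VDD only."""
    return 1.56e9 * params["VDD"] / 1.2


mu_f0, sigma_f0, total_runs = doe_mc(toy_simulate, NOMINAL)
print(f"{total_runs} runs: mu(f0) = {mu_f0 / 1e9:.3f} GHz, "
      f"sigma(f0) = {sigma_f0 / 1e6:.1f} MHz")
```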
Process Variation Analysis of VCO: DOE-MC

- The DOE-MC technique is less accurate than traditional MC, but saves on computing time.
- Results for 10 and 20 MC replicates per trial, along with the percentage error in $\mu$ and $\sigma$, are also presented.
- The worst case process for $f_0$ is identified as the corner where VDD is reduced by 10% and all the other process parameters (Vtn, Vtp, Toxn, Toxp) are increased by 10%.

<table>
<thead>
<tr>
<th>MC runs per trial</th>
<th>Total runs</th>
<th>% error ($\mu$)</th>
<th>% error ($\sigma$)</th>
<th>Time saving over TMC</th>
</tr>
</thead>
<tbody>
<tr>
<td>5</td>
<td>160</td>
<td>7.47</td>
<td>25.1</td>
<td>6.25X</td>
</tr>
<tr>
<td>10</td>
<td>320</td>
<td>6.78</td>
<td>14.7</td>
<td>3.125X</td>
</tr>
<tr>
<td>20</td>
<td>640</td>
<td>5.78</td>
<td>10.3</td>
<td>1.5625X</td>
</tr>
</tbody>
</table>
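
The run counts and time savings can be checked with a few lines, assuming every simulation run costs the same so that the saving is simply the ratio of TMC runs to DOE-MC runs. The helper name `doe_mc_cost` is illustrative.

```python
TMC_RUNS = 1000   # traditional Monte Carlo runs used as the baseline
TRIALS = 2 ** 5   # two-level full factorial over 5 process parameters


def doe_mc_cost(replicates):
    """Return (total DOE-MC runs, speed-up over TMC) for a replicate count."""
    total = TRIALS * replicates
    return total, TMC_RUNS / total


for replicates in (5, 10, 20):
    total, saving = doe_mc_cost(replicates)
    print(f"{replicates:2d} replicates: {total} runs, {saving}X saving")
```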
P4 Optimization of VCO

- Logical design center frequency $f_0 = 2$ GHz.
- Initial physical design center frequency $f_{0p} = 1.56$ GHz (25% degradation).
- Initial physical design center frequency in a worst case process variation environment $f_{0p-p} = 1.13$ GHz (43.5% degradation).
- Initial average power consumption (including leakage) $P_{VCO} = 212\mu W$.

<table>
<thead>
<tr>
<th>Parameter</th>
<th>Initial Physical Design</th>
<th>Initial Physical Design + Process Variation</th>
<th>Final Physical Design + Process Variation</th>
</tr>
</thead>
<tbody>
<tr>
<td>$f_0$</td>
<td>1.56GHz</td>
<td>1.13GHz</td>
<td>1.98GHz</td>
</tr>
<tr>
<td>discrepancy</td>
<td>25%</td>
<td>43.5%</td>
<td>1%</td>
</tr>
<tr>
<td>$V_{DD}$</td>
<td>1.2V (nominal)</td>
<td>1.08V (-10%)</td>
<td>1.08V</td>
</tr>
<tr>
<td>$V_{tn}$</td>
<td>0.1692662V (nominal)</td>
<td>0.186193V (+10%)</td>
<td>0.186193V</td>
</tr>
<tr>
<td>$V_{tp}$</td>
<td>-0.1359511V (nominal)</td>
<td>-0.149546V (+10%)</td>
<td>-0.149546V</td>
</tr>
<tr>
<td>$T_{oxn}$</td>
<td>2.33nm (nominal)</td>
<td>2.563nm (+10%)</td>
<td>2.563nm</td>
</tr>
<tr>
<td>$T_{oxp}$</td>
<td>2.48nm (nominal)</td>
<td>2.728nm (+10%)</td>
<td>2.728nm</td>
</tr>
</tbody>
</table>
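
The worst-case corner values in the table follow directly from the nominal values: VDD is lowered by 10% and the remaining parameters are raised by 10% in magnitude (so the negative $V_{tp}$ becomes more negative). A short sketch, with illustrative names:

```python
# Nominal values from the table (oxide thickness in nm).
NOMINAL = {"VDD": 1.2, "Vtn": 0.1692662, "Vtp": -0.1359511,
           "Toxn_nm": 2.33, "Toxp_nm": 2.48}


def worst_case_corner(nominal, shift=0.10):
    """Lower VDD by `shift`; raise every other parameter by `shift`."""
    corner = {}
    for name, value in nominal.items():
        factor = (1.0 - shift) if name == "VDD" else (1.0 + shift)
        corner[name] = value * factor
    return corner


corner = worst_case_corner(NOMINAL)
for name, value in corner.items():
    print(f"{name}: {value:.6g}")
```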
Intelligent Dual-Oxide Assignment

- Average power consumed by all the transistors is measured.
- Input stage transistors (shown by solid circles) consume 48% of the total average power of the VCO circuit, making them the most suitable candidates for thicker oxide assignment (Toxpth, Toxnth).
- The buffer stage transistors (shown by dashed circles) consume 11.5% of the total average power. They may also be assigned thicker oxide for further power minimization.
- Input stage transistors follow the thick-oxide model file; the other transistors follow the baseline model file.
Conjugate Gradient Optimization

- **Input:** Parasitic parameterized netlist, Baseline model file, Thick oxide model file, Objective set $F = [f_0, P_{VCO}]$, Stopping criteria $S$, design variable set $D = [W_n, W_p, W_{ncs}, W_{pcs}, Toxpth, Toxnth]$, Lower design constraint $C_{low}$, Upper design constraint $C_{up}$.

- **Output:** Optimized objective set $F_{opt}$, optimal design variable set $D_{opt}$ for stopping criterion $S = \pm \beta$, where $1\% \leq \beta \leq 5\%$.

Run an initial simulation to obtain feasible values of the design variables for the given specifications.

**while** ($C_{low} < D < C_{up}$) **do**

- Using conjugate gradient, generate a new set of design variables $D' = D \pm \delta D$.

- Compute the objective set $F = [f_0, P_{VCO}]$.

- **if** ($F$ is within $S = \pm \beta$ of the specifications) **then**

  - **return** $D_{opt} = D'$.

- **end if**

**end while**

- Using $D_{opt}$, construct the final physical design and simulate.

- Record $F_{opt}$.
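
The loop above can be sketched in code under several assumptions. The slides do not say which conjugate gradient variant is used; this sketch picks Polak-Ribière with restarts, estimates gradients by finite differences, and keeps $D$ inside $[C_{low}, C_{up}]$ by simple clipping. The names `cg_minimize`, `spec_met`, and `toy_cost` are hypothetical, and `toy_cost` is a quadratic bowl standing in for a scalarised simulation of $F = [f_0, P_{VCO}]$.

```python
def numerical_gradient(cost, d, h=1e-6):
    """Central-difference estimate of the gradient of cost at d."""
    grad = []
    for i in range(len(d)):
        up, down = list(d), list(d)
        up[i] += h
        down[i] -= h
        grad.append((cost(up) - cost(down)) / (2 * h))
    return grad


def clip(d, c_low, c_up):
    """Project design variables back into the box [c_low, c_up]."""
    return [min(max(v, lo), hi) for v, lo, hi in zip(d, c_low, c_up)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cg_minimize(cost, d0, c_low, c_up, spec_met=None, grad_tol=1e-4, max_iter=1000):
    """Nonlinear conjugate gradient (Polak-Ribiere+) with backtracking line search."""
    d = clip(d0, c_low, c_up)
    g = numerical_gradient(cost, d)
    p = [-gi for gi in g]
    for _ in range(max_iter):
        # Stop when the caller's specification check passes or the gradient vanishes.
        if (spec_met is not None and spec_met(d)) or dot(g, g) ** 0.5 < grad_tol:
            break
        if dot(g, p) >= 0:          # not a descent direction: restart
            p = [-gi for gi in g]
        slope, c0, step = dot(g, p), cost(d), 1.0
        while step > 1e-12:         # Armijo backtracking
            trial = clip([di + step * pi for di, pi in zip(d, p)], c_low, c_up)
            if cost(trial) <= c0 + 1e-4 * step * slope:
                break
            step *= 0.5
        else:
            break                   # no acceptable step found
        d = trial
        g_new = numerical_gradient(cost, d)
        beta = max(0.0, dot(g_new, [a - b for a, b in zip(g_new, g)]) / dot(g, g))
        p = [-gn + beta * pi for gn, pi in zip(g_new, p)]
        g = g_new
    return d


def toy_cost(d):
    """Hypothetical cost with its minimum at (1, -2); not a circuit model."""
    return (d[0] - 1.0) ** 2 + 10.0 * (d[1] + 2.0) ** 2


d_opt = cg_minimize(toy_cost, [0.0, 0.0], c_low=[-5.0, -5.0], c_up=[5.0, 5.0])
print([round(v, 4) for v in d_opt])
```

In the actual flow each cost evaluation is a simulation of the parameterized parasitic netlist in the worst-case process corner, and `spec_met` would test whether $f_0$ is within $\pm\beta$ of the target.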
Conjugate Gradient Optimization

- Target center frequency $f_0 = 2$GHz.
- Final physical design center frequency $f_{0p} = 2.3$GHz.
- Final physical design center frequency in a worst case process variation environment $f_{0p-p} = 1.98$GHz (1% discrepancy).
- Final average power consumption (including leakage) ($P_{VCO}$) = 158μW (25% reduction).

<table>
<thead>
<tr>
<th>D</th>
<th>$C_{low}$</th>
<th>$C_{up}$</th>
<th>$D_{opt}$</th>
</tr>
</thead>
<tbody>
<tr>
<td>Wn</td>
<td>200nm</td>
<td>500nm</td>
<td>210nm</td>
</tr>
<tr>
<td>Wp</td>
<td>400nm</td>
<td>1μm</td>
<td>415nm</td>
</tr>
<tr>
<td>Wncs</td>
<td>1μm</td>
<td>10μm</td>
<td>8.5μm</td>
</tr>
<tr>
<td>Wpcs</td>
<td>5μm</td>
<td>10μm</td>
<td>5μm</td>
</tr>
<tr>
<td>Toxpth</td>
<td>2.48nm</td>
<td>5nm</td>
<td>5nm</td>
</tr>
<tr>
<td>Toxnth</td>
<td>2.33nm</td>
<td>5nm</td>
<td>3.54nm</td>
</tr>
</tbody>
</table>
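
As a quick sanity check of the reported results, the snippet below verifies that every optimized value lies within its constraints and recomputes the headline percentages from the numbers on this slide. The data structure and helper names are illustrative.

```python
# (C_low, C_up, D_opt) from the table; widths and oxide thicknesses in nm.
RESULTS = {
    "Wn": (200, 500, 210),
    "Wp": (400, 1000, 415),
    "Wncs": (1000, 10000, 8500),
    "Wpcs": (5000, 10000, 5000),
    "Toxpth": (2.48, 5.0, 5.0),
    "Toxnth": (2.33, 5.0, 3.54),
}


def within_bounds(results):
    """True if every optimized value satisfies C_low <= D_opt <= C_up."""
    return all(lo <= opt <= hi for lo, hi, opt in results.values())


def percent_drop(before, after):
    """Relative decrease from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


# Variables that ended up on a constraint boundary.
at_bound = [name for name, (lo, hi, opt) in RESULTS.items() if opt in (lo, hi)]

power_reduction = percent_drop(212.0, 158.0)   # initial vs final P_VCO, in uW
discrepancy = percent_drop(2.0, 1.98)          # target vs worst-case f0, in GHz

print(within_bounds(RESULTS), at_bound)
print(f"power reduction = {power_reduction:.1f}%, discrepancy = {discrepancy:.1f}%")
```

Two variables finish on a bound (`Wpcs` at its lower limit and `Toxpth` at its upper limit), which suggests the constraints, rather than the optimizer, set those values.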
P4 Optimal Dual-Oxide Logical Design

Figure: VCO schematic showing the input stage, N = 13 stages, and the buffer stage.
P4 Optimal Dual-Oxide Physical Design
Conclusions and Future Research

- Design of a P4 (Power-Performance-Parasitic-Process) optimal nano-CMOS VCO is proposed. The presented design flow may be used for optimization of nanoscale circuits in general.
- The center frequency is treated as the target specification. The degradation of the center frequency due to parasitic and process variation effects is reduced from 43.5% to 1%, along with a 25% power reduction using the dual-oxide technique.
- The end product of the proposed design flow is a P4 optimal dual-oxide VCO physical design.
- For future research, we plan to consider thermal effects in the VCO design as well.
- Alternative optimization techniques such as simulated annealing and genetic algorithms are also being explored for a fair comparison of the P4 design flow with other approaches.
Thank you!