## Multi-rate Polyphase DSP and LMS Calibration Schemes for Oversampled ADCs

Subhanshu Gupta, Yi Tang, Jeyanandh Paramesh & David J. Allstot

#### **Journal of Signal Processing Systems**

for Signal, Image, and Video Technology (formerly the Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology)

ISSN 1939-8018 Volume 69 Number 3

J Sign Process Syst (2012) 69:329-338 DOI 10.1007/s11265-012-0677-3



Received: 3 November 2011 / Revised: 28 April 2012 / Accepted: 30 April 2012 / Published online: 7 June 2012 © Springer Science+Business Media, LLC 2012

**Abstract** A scaling-friendly approach for the low-power calibration of oversampled analog-to-digital (A/D) systems is presented. A 22-dB amplifier relaxes the design constraints of the analog front-end (AFE). The integrator non-idealities in the AFE of the sigma-delta ($\Sigma\Delta$) ADC are calibrated using a multi-rate polyphase least-mean-squares (LMS) algorithm. The proposed half-rate ($f_s/2$) and quarter-rate ($f_s/4$) LMS calibration schemes reduce computational complexity and achieve more than 2.5× savings in digital power consumption for low-OSR (over-sampling ratio) $\Delta\Sigma$ ADCs, which require higher adaptive filter orders and sampling frequencies. The proposed scheme may find further applications in serial-link I/O and sub-band echo cancellation architectures.

**Keywords** Multi-rate · Adaptive LMS · Decimation · Polyphase decimation · Noble identities · Sigma-delta · Delta-sigma · Oversampled ADC · Fixed-point multi-rate implementation · Low-gain opamp · Calibration accuracy

S. Gupta · D. J. Allstot Department of Electrical Engineering, Univ. of Washington, Seattle, WA 98195, USA

S. Gupta (⊠) RFIC Engineering, Maxlinear Inc., 16275 Laguna Canyon Rd, Suite 160, Irvine, CA 92612, USA e-mail: subhanshu.gupta@gmail.com

Y. Tang Qualcomm Inc., San Diego, CA 92121, USA

J. Paramesh Department of Electrical and Computer Engineering, Carnegie Mellon Univ., Pittsburgh, PA 15213, USA

#### 1 Introduction

Traditionally, communication receivers, wireless/wireline transceivers, TV tuners, etc., have enjoyed relaxed requirements on their analog-to-digital converters (ADCs), with most of the gain and channel filtering performed in the analog domain before digitization. Recently, however, the growth in wide-bandwidth applications that demand reduced power and higher receiver efficiency has forced most of these operations into the digital domain. These applications require a resolution of 8–12 bits over bandwidths of 1–40 MHz. Concurrently, such architectures must also minimize power consumption.

Nyquist-rate pipeline and over-sampled $\Sigma\Delta$ converters have typically been used for these applications. The former can realize 10–12 bits of resolution [1] but require digital background calibration or error correction, resulting in increased complexity. $\Sigma\Delta$ converters are an attractive solution and can achieve higher resolutions of 12–16 bits with a low over-sampling ratio (OSR), typically on the order of 8–16× [2–4]. For a fixed signal bandwidth, a low OSR means a lower sampling frequency, on the order of a few hundred MHz, thereby reducing the design complexity of the anti-aliasing and decimation filters that follow. In essence, therefore, a low-OSR design paradigm shifts the modulator design challenge from maximizing speed to minimizing power consumption.

CMOS scaling provides ever faster and more energy-efficient digital gates at the cost of analog circuits with greater process, voltage, and temperature (PVT) variations, less gain per stage, lower linearity, reduced supply voltages, etc. As a consequence, the pure analog design approach of the past that relies on well-matched passive and active components (e.g., op-amps with high gain, linearity, and output swings, etc.) has given way to a new mixed-signal design paradigm: digitally-assisted analog circuits [5–7]. This approach enables high performance using DSP and adaptive calibration techniques to estimate and correct errors associated with the use of low-performance analog circuits. Therefore, it is necessary to choose an analog front-end architecture that consumes low power and provides sufficient linearity for use with linear (e.g., least-mean-square (LMS)) calibration techniques. An equally important goal, which has received much less attention, is the development of low-power back-end DSP/calibration architectures. Overall energy efficiency is also critical as the calibration needs to be done every few packets in a typical communications application. Noble identities and polyphase filtering techniques enable decimation prior to calibration in an oversampled $\Sigma\Delta$ ADC. Consequently, the overall dynamic power dissipation is reduced significantly with only a modest increase in the complexity of the digital circuitry [8]. These techniques are validated in this work by the demonstration of a complete $\Sigma\Delta$ ADC front-end and digital back-end.

Section 2 describes the ADC front-end implemented as cascaded  $\Sigma\Delta$  ADC stages. It uses op-amps with only 22-dB gain to save power and relax design constraints in sub-nm low-voltage CMOS technologies. Section 3 briefly reviews the system-level trade-offs associated with the use of these low-gain op-amps. Section 4 describes the sign-data LMS algorithm and its hardware implementation. Section 5 presents the proposed calibration schemes for the noise-cancelling digital back-ends, and compares the full-, half- and quarter-rate LMS architectures. Measurement results are given in Section 6 followed by conclusions in Section 7.

#### 2 Cascade $\Sigma\Delta$ ADC

In contrast to the stability concerns characteristic of high-order (L>2) single-stage architectures, a two-stage cascade modulator with second-order loops in both stages provides fourth-order noise shaping with inherent stability [2–4]. A block diagram representation of a 2–2 cascade  $\Sigma\Delta$  ADC is shown in Fig. 1(a) and a simplified form in terms of the signal and noise transfer functions is shown in Fig. 1(b). Figure 1(c) shows the hardware switched-capacitor implementation of the ADC front-end. Consider the cascade modulator of Fig. 1(b) in which the quantization noise of the first stage,  $Q_1(z)$ , is extracted and re-quantized by the second stage. The outputs of the two stages,  $Y_1(z)$  and  $Y_2(z)$ , are input to the noise-cancellation logic (NCL) block and combined.

It is easily shown that [9]:

$$\begin{aligned} Y_1(z) &= STF1_A(z)X(z) + NTF1_A(z)Q_1(z) \\ Y_2(z) &= STF2_A(z)Q_1(z) + NTF2_A(z)Q_2(z) \end{aligned} \tag{1}$$



Figure 1 (a) System architecture of the ADC; (b) Simplified representation of the cascade scheme; (c) Discrete-time implementation using switched-capacitor circuits.

where  $STF1_A(z)$  and  $STF2_A(z)$  are analog signal transfer functions, and  $NTF1_A(z)$  and  $NTF2_A(z)$  are analog noise transfer functions of the first and second stages, respectively. It follows that,

$$\begin{aligned} Y(z) = {} & STF1_A(z)STF2_D(z)X(z) \\ & + \left[ NTF1_A(z)STF2_D(z) - NTF1_D(z)STF2_A(z) \right] Q_1(z) \\ & - NTF1_D(z)NTF2_A(z)Q_2(z) \end{aligned} \tag{2}$$



Figure 2 Finite-gain non-inverting switched-capacitor integrator and non-overlapping two-phase clocking scheme.

If the noise-cancellation filters are designed such that,

$$STF2_D(z) = STF2_A(z), \qquad NTF1_D(z) = NTF1_A(z)$$

then the bracketed $Q_1(z)$ term in (2) vanishes and,

$$Y(z) = STF1_A(z)STF2_D(z)X(z) - NTF1_D(z)NTF2_A(z)Q_2(z)$$

The quantization noise of the first stage,  $Q_1(z)$ , is ideally cancelled leaving only the quantization noise of the second stage,  $Q_2(z)$ ; note, however, that  $Q_2(z)$  is high-order noise shaped by both stages. According to (2), complete cancellation of  $Q_1(z)$  mandates matching analog signal and noise transfer functions that change with process, voltage, and temperature (*PVT*) variations to their digital filter equivalents in the NCL that do not. Also note from (2) that any mismatches between the analog and digital transfer functions cause leakage to the output of a fraction of  $Q_1(z)$ , typically noise-shaped with a lower order.
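The leakage term in (2) can be illustrated with a small polynomial calculation. The sketch below is only an illustration under assumed transfer functions that are not taken from the paper: an ideal second-order $NTF1 = (1 - z^{-1})^2$, a leaky analog version $(1 - p z^{-1})^2$ with a hypothetical pole $p = 0.9$, and $STF2_A = STF2_D = z^{-2}$. Polynomials are stored as coefficient lists in powers of $z^{-1}$.

```python
def poly_mul(a, b):
    """Multiply two polynomials in z^-1 given as coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_sub(a, b):
    """Return a - b, zero-padding the shorter polynomial."""
    n = max(len(a), len(b))
    a = a + [0.0] * (n - len(a))
    b = b + [0.0] * (n - len(b))
    return [x - y for x, y in zip(a, b)]


def q1_leakage(ntf1_a, stf2_d, ntf1_d, stf2_a):
    """Coefficient of Q1(z) in Eq. (2): NTF1_A*STF2_D - NTF1_D*STF2_A."""
    return poly_sub(poly_mul(ntf1_a, stf2_d), poly_mul(ntf1_d, stf2_a))


# Hypothetical transfer functions (assumptions for illustration only).
p = 0.9                                          # assumed leaky-integrator pole
stf2 = [0.0, 0.0, 1.0]                           # z^-2
ntf1_ideal = poly_mul([1.0, -1.0], [1.0, -1.0])  # (1 - z^-1)^2
ntf1_leaky = poly_mul([1.0, -p], [1.0, -p])      # (1 - p z^-1)^2

# Digital filter matched to the leaky analog NTF: Q1 cancels completely.
matched = q1_leakage(ntf1_leaky, stf2, ntf1_leaky, stf2)
# Digital filter left at its ideal value: a fraction of Q1 leaks through.
mismatched = q1_leakage(ntf1_leaky, stf2, ntf1_ideal, stf2)

print("matched leakage   :", matched)
print("mismatched leakage:", mismatched)
```

In the mismatched case the residual polynomial no longer has a zero at $z = 1$, which is the lower-order shaping of the leaked $Q_1(z)$ described above.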

#### 3 Low-OSR $\Sigma\Delta$ ADCs with Low-Gain Op-Amps

The integrators in $\Sigma\Delta$ ADCs usually employ op-amps with DC gains of 50–75 dB. In deep-submicron CMOS technology, however, this high gain is difficult to achieve and comes at the cost of high power consumption and design complexity. Generally, high op-amp gain is required to reduce integrator pole/gain errors, suppress op-amp non-linearities, reduce the ADC dead zone, and minimize the input-referred noise of downstream stages.

Shown in Fig. 2 is a non-inverting switched-capacitor integrator with parasitic capacitance  $C_p$  and two-phase non-overlapping clocks  $\Phi_1$  and  $\Phi_2$ . With infinite DC gain, its transfer function assuming the output is sampled on  $\Phi_2$  is:

$$\frac{V_{out}(z)}{V_{in}(z)} = \frac{C_s}{C_F} \cdot \frac{z^{-1}}{1 - z^{-1}}$$

whereas with finite gain, A, the transfer function now includes gain  $(\varepsilon_g)$  and pole  $(\varepsilon_p)$  errors,

$$\frac{V_{out}(z)}{V_{in}(z)} = \left(\frac{C_s}{C_F}\right) \frac{\varepsilon_g \cdot z^{-1}}{1 - \varepsilon_p \cdot z^{-1}}$$

defined as:

$$\varepsilon_g = \frac{1}{\frac{C_s}{A \cdot C_F} + \left(1 + \frac{1}{A}\right) + \frac{C_p}{A \cdot C_F}}$$

and 
$$\varepsilon_p = \frac{\left(1 + \frac{1}{A}\right) + \frac{C_p}{A \cdot C_F}}{\frac{C_s}{A \cdot C_F} + \left(1 + \frac{1}{A}\right) + \frac{C_p}{A \cdot C_F}}$$

With a non-zero pole error, the integrator is leaky; i.e., its DC gain is no longer infinite. Hence, in  $\Sigma\Delta$  ADCs implemented with leaky integrators, noise shaping is degraded because the zeros of the noise transfer function (NTF) deviate from their ideal positions.
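As a rough numerical illustration of the two error terms, the sketch below evaluates $\varepsilon_g$ and $\varepsilon_p$ from the expressions above. The capacitor ratios $C_s/C_F = 0.5$ and $C_p/C_F = 0.1$ are hypothetical values chosen only for the example; the 22-dB gain is the op-amp gain quoted in this work.

```python
def integrator_errors(gain_db, cs_over_cf, cp_over_cf):
    """Return (eps_g, eps_p) for a finite-gain switched-capacitor integrator.

    gain_db    : op-amp DC gain in dB
    cs_over_cf : sampling-to-feedback capacitor ratio Cs/CF
    cp_over_cf : parasitic-to-feedback capacitor ratio Cp/CF
    """
    a = 10 ** (gain_db / 20)                      # gain in V/V
    common = (1 + 1 / a) + cp_over_cf / a         # numerator of eps_p
    denom = cs_over_cf / a + common               # shared denominator
    return 1 / denom, common / denom


# Hypothetical capacitor ratios (assumptions, not from the paper).
eps_g_low, eps_p_low = integrator_errors(22.0, 0.5, 0.1)
eps_g_high, eps_p_high = integrator_errors(75.0, 0.5, 0.1)

print(f"A = 22 dB: eps_g = {eps_g_low:.4f}, eps_p = {eps_p_low:.4f}")
print(f"A = 75 dB: eps_g = {eps_g_high:.4f}, eps_p = {eps_p_high:.4f}")
```

Both terms approach 1 as the gain grows, which recovers the ideal integrator; at low gain the pole moves inside the unit circle and the integrator becomes leaky.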

For example, the output power spectral density of an ideal fourth-order modulator is shaped with $NTF_{ideal} = (1 - z^{-1})^4$ (four zeros at z=1), whereas its practical counterpart is shaped with $NTF_{real} = (1 - 0.8z^{-1})^4$ (four zeros at z=0.8 due to low op-amp gain), as shown in Fig. 3. Clearly, the quantization noise of the non-ideal modulator is much higher at low (in-band) frequencies. At higher out-of-band frequencies, however, the quantization noise spectra exhibit identical fourth-order (80 dB/decade) noise shaping. For a high OSR=36.25, which corresponds to a bandwidth of [0, 2 MHz], the ideal fourth-order modulator has a signal-to-quantization-noise ratio (SQNR) of 120 dB whereas its low-gain counterpart has SQNR=96 dB, four bits less than ideal. In contrast, for a low OSR=7.25, which corresponds to a [0, 10 MHz] bandwidth, quantization noise at high frequencies dominates so that the SQNR values differ by less than 1 dB. Therefore, the SQNR of a low-OSR design is relatively insensitive to the integrator gain/pole errors that degrade quantization noise filtering at low frequencies. This key characteristic enables the use of simple low-gain, low-power, wideband op-amps for OSR=4–8 designs. Indeed, behavioral simulations including non-linear integrator models and other non-idealities and measurements from a 0.13 $\mu$m CMOS prototype confirm that A>10 is adequate for a fourth-order modulator with OSR=7.25.

Figure 3 Ideal and non-ideal ($A_{opamp}=10 \text{ V/V}$) fourth-order noise transfer functions (modeled using Matlab/Simulink).
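The OSR dependence can be explored with a simple numerical integration of the two noise transfer functions over the signal band. The sketch below is a simplified model, assuming white quantization noise shaped only by $(1 - z^{-1})^4$ or $(1 - 0.8z^{-1})^4$; it illustrates the trend only and does not attempt to reproduce the SQNR figures quoted above, which come from the authors' behavioral simulations.

```python
import cmath
import math


def inband_noise_power(zero, order, osr, steps=4000):
    """Integrate |(1 - zero*e^{-jw})^order|^2 over [0, pi/OSR] (midpoint rule)."""
    w_max = math.pi / osr
    dw = w_max / steps
    total = 0.0
    for k in range(steps):
        w = (k + 0.5) * dw
        total += abs(1 - zero * cmath.exp(-1j * w)) ** (2 * order) * dw
    return total


def sqnr_penalty_db(zero, order, osr):
    """In-band noise increase (dB) of the leaky NTF relative to the ideal NTF."""
    ideal = inband_noise_power(1.0, order, osr)
    leaky = inband_noise_power(zero, order, osr)
    return 10 * math.log10(leaky / ideal)


penalty_high_osr = sqnr_penalty_db(0.8, 4, 36.25)
penalty_low_osr = sqnr_penalty_db(0.8, 4, 7.25)
print(f"OSR = 36.25: penalty = {penalty_high_osr:.1f} dB")
print(f"OSR =  7.25: penalty = {penalty_low_osr:.1f} dB")
```

In this simplified model the penalty at the low OSR is far smaller than at the high OSR, consistent with the argument that low-OSR designs tolerate low-gain op-amps.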

A feed-forward low-distortion topology is selected with a 4-bit quantizer to limit the integrator output swings of the first (second) integrator, $V_1$ ($V_2$), to about ±1 LSB (±0.5 LSB) [10]. A key advantage of this approach is that because $n^{th}$-harmonic distortion is proportional to $(V_{swing}/A)^n$, a small integrator output swing reduces the gain required for nonlinearity suppression. Therefore, a multi-bit feed-forward modulator is able to accommodate op-amps in the integrators with lower DC gain than in the feedback counterpart. Another key advantage of the feed-forward topology is that the quantization noise, $Q_N$, is available directly at the output of the second integrator:

$$V_{1b} = z^{-2}Q_N$$

This feature is important in high-order cascade modulators. For example, the quantization noise of a second-order feed-forward first stage is easily digitized by feeding $V_{1b}$ directly to the second modulator stage as in Fig. 1(a). In contrast, additional DACs are required to extract the quantization noise of the first-stage modulator in the conventional feedback architecture.

#### 4 Digital Noise Calibration Filter

With a non-zero pole error, the integrator is leaky; i.e., its DC gain is no longer infinite. Hence, in a  $\Sigma\Delta$  ADC implemented with leaky integrators (Fig. 4(a)), the noise shaping (and thus the overall *SNR*) is degraded after noise cancellation because the zeros of the analog noise transfer functions deviate from their ideal locations.



Figure 4 (a) Conventional noise cancellation; (b) sign-data LMS scheme; (c) noise-cancellation with polyphase decimation-by-2 and half-rate LMS, and (d) with polyphase decimation-by-4 and quarter-rate LMS.

Digital calibration is used to minimize the leakage of  $Q_1(z)$  to the output so as to maximize the *SNR*. To ensure sufficient cancellation of  $Q_1(z)$ , the coefficients of the NCL filters are adaptively calibrated to match those of the corresponding *on-chip* analog transfer functions (Eqns. (1) and (2)) using a novel combination of polyphase decimation in a sign-data LMS algorithm. The sign-data LMS algorithm is a simplified version of the standard LMS algorithm which makes it easier to implement on digital signal processing (DSP) devices, ASIC and FPGA boards.

During calibration, the ADC input is grounded ($V_{in}=0$) and a linear feedback shift register (LFSR) generates a zero-mean pseudo-random sequence $T_i(n)$ (Fig. 4(b)). $T_i(n)$ is added to the output of the first-stage quantizer, where it is effectively filtered by $NTF1_A$. At the same time, the same pseudo-random sequence, $T_i(n)$, is input to an 8-tap finite impulse response (FIR) filter, $NTF1_D$, whose coefficients are adaptively adjusted using the sign-data LMS algorithm [11]:

$$W(n+1) = W(n) + 2 \cdot \mu \cdot e(n) \cdot \operatorname{sgn}(T_i(n))$$

where $\mu$ is the adaptation step size and e(n), the error signal, is the difference between the output of the first modulator stage, $Y_1(n)$, and the output of the $NTF1_D$ FIR filter, V(n). Note that when $T_i(n)$ is zero the update does not involve a multiplication. With $T_i(n)$ non-zero, only one multiplication is required. Also, when the step size, $\mu$, equals a power of 2, only a binary shift is needed rather than a multiplication. These features make the hardware implementation of sign-data LMS simpler than that of the standard LMS algorithm, although at the cost of slower convergence and a larger steady-state error.
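The update rule above can be sketched in a few lines. The example below is a minimal floating-point illustration, not the fixed-point hardware of this work: the "analog" response `target`, the uniform noise standing in for the uncorrelated $Q_1$, the ±1 training sequence drawn from Python's `random` module (in place of the LFSR), and the step size $2^{-10}$ are all assumptions chosen for the demonstration.

```python
import random


def sign(x):
    """Return -1, 0, or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def fir(h, x):
    """Causal FIR filtering of sequence x with taps h (zero initial state)."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def sign_data_lms(training, desired, num_taps, shift):
    """Adapt FIR weights with W(n+1) = W(n) + 2*mu*e(n)*sgn(T(n)), mu = 2**-shift."""
    mu = 2.0 ** -shift
    weights = [0.0] * num_taps
    delay_line = [0.0] * num_taps
    for t, d in zip(training, desired):
        delay_line = [t] + delay_line[:-1]          # newest sample first
        v = sum(w * x for w, x in zip(weights, delay_line))
        e = d - v                                    # error e(n) = Y1(n) - V(n)
        weights = [w + 2 * mu * e * sign(x)
                   for w, x in zip(weights, delay_line)]
    return weights


random.seed(1)
# Hypothetical 8-tap response to be identified (assumption for illustration).
target = [1.0, -1.8, 0.81, 0.0, 0.0, 0.0, 0.0, 0.0]
training = [random.choice((-1.0, 1.0)) for _ in range(20000)]
# Uniform noise stands in for quantization noise uncorrelated with T_i(n).
desired = [y + random.uniform(-0.5, 0.5) for y in fir(target, training)]

weights = sign_data_lms(training, desired, num_taps=8, shift=10)
print([round(w, 3) for w in weights])
```

Because the training sequence is uncorrelated with the added noise, the weights settle near `target`, with a residual jitter that shrinks as the step size is reduced.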

The coefficients, W(n), of the digital FIR filter, $NTF1_D$, are continuously adjusted until they converge to the optimum vector, $W_o(n)$, at which e(n) is minimized in the least-mean-square sense; i.e.,

$$\frac{\partial (E[e^2(n)])}{\partial W_o(n)} = 0 \Rightarrow W_o(z) = \frac{\Phi_{Y1Ti}(z)}{\Phi_{TiTi}(z)}$$

Under the condition that  $T_i$  and  $Q_1$  are uncorrelated stationary signals and  $E[T_i]=0$ ,

$$\begin{aligned} W_o(z) &= \frac{\Phi_{Y1Ti}(z)}{\Phi_{TiTi}(z)} \\ &= \frac{NTF1_A(z)\Phi_{TiTi}(z)}{\Phi_{TiTi}(z)} \\ &= NTF1_A(z) \end{aligned}$$

Ideally, the FIR filter $NTF1_D$ will be perfectly matched to $NTF1_A$ after calibration. However, the correlation-based nature of the LMS algorithm limits the calibration accuracy. A large number of iterations is used in conjunction with a small calibration step size to improve the accuracy at the expense of calibration time and circuit complexity. The calibration signal $T_i(n)$ (Fig. 4(b)) is designed to have discrete levels of $\pm 1$ LSB or $\pm 2$ LSB. A 32-tap linear-feedback shift register is used to generate the two signal levels with equal probability to ensure a zero average value; i.e., $E[T_i]=0$. The noise-cancellation filters (NCF) are implemented as FIR filters because they are inherently stable; however, they do exhibit truncation errors compared to their analog IIR counterparts. Because the LMS error, $E[e^{2}(n)]$, has a single global minimum for FIR filters, the filter coefficients W(n) will always converge to the optimum value $W_o(n)$; however, more taps are needed to reduce the truncation errors in the FIR-based NCF. Extensive simulations show that 8-tap FIR filters are sufficient to achieve the required dynamic range in the ADC.
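A two-level, zero-mean calibration sequence of this kind can be sketched with a software LFSR. The example below is an illustration only: the paper does not state its feedback polynomial, so the sketch assumes a 32-bit Galois LFSR with taps 32, 22, 2, 1 (mask `0x80200003`, a commonly tabulated maximal-length choice), and the seed and function names are hypothetical.

```python
def lfsr32_bits(seed, count, mask=0x80200003):
    """Yield `count` bits from a 32-bit Galois LFSR (assumed polynomial)."""
    state = seed & 0xFFFFFFFF
    if state == 0:
        raise ValueError("seed must be non-zero")
    for _ in range(count):
        bit = state & 1          # output bit is the LSB
        state >>= 1
        if bit:
            state ^= mask        # apply feedback taps
        yield bit


def calibration_sequence(seed, count, amplitude_lsb=1):
    """Map LFSR bits to the two levels +amplitude_lsb and -amplitude_lsb."""
    return [amplitude_lsb if b else -amplitude_lsb
            for b in lfsr32_bits(seed, count)]


seq = calibration_sequence(0xACE1ACE1, 50000, amplitude_lsb=2)
mean = sum(seq) / len(seq)
print("levels:", sorted(set(seq)), "mean:", round(mean, 4))
```

Over a long run the two levels occur almost equally often, so the sample mean stays close to zero, which is the $E[T_i]=0$ condition used in the derivation above.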

In multi-channel receiver systems, offline calibration can be done easily during channel switching. Each channel switching operation takes a few milliseconds to accomplish. Operating the LMS calibration circuits at the full sampling rate of 150 MHz can significantly increase the power consumption. Custom ASIC designs of the constituent adders/multipliers (not implemented here) can save energy [12]. However, an even simpler and more energy-efficient solution is to operate the NCF and LMS DSP sections at decimated frequencies. Even when the LMS calibration filter is not operating, running the NCF at a decimated rate helps achieve better power efficiency than the conventional decimation filter implementation [13].

#### 4.1 Conventional Design of Decimation Filters for Oversampled Systems

For efficient decimation filtering, a three-stage approach is used. A cascaded sinc ($\operatorname{sinc}^{K}(\pi f/f_{D})$) filter, clocked at $f_{s}$, is used first to reduce the sampling rate from $f_{s}$ to the intermediate clock rate $f_{D}$. The sinc filter is followed by a sharp lowpass filter clocked at the reduced rate $f_{D}$. This second stage is implemented with a cascade of half-band FIR filters and is followed by a third stage that compensates for the droop introduced by the sinc<sup>K</sup> filter (shown in Fig. 4(a)).

#### 4.2 Sinc Filter Order

The determining factors in finding the order K of the sinc<sup>K</sup> filter for an  $L^{\text{th}}$ -order  $\Delta \Sigma$  modulator are:

- i. The filter should cut-off at a faster rate near  $f_{\rm B}$  than the NTF of the  $\Delta\Sigma$  modulator rises there;
- ii. Its gain response should be flatter near  $f_s$ /OSR and its harmonics than the NTF is near DC.

Condition (i) ensures that very little out-of-band noise is left unsuppressed around $f=f_{\rm B}$ after decimation, while condition (ii) guarantees that the folding of the noise from frequency bands around $f_s/OSR$, $2f_s/OSR$, etc., after decimation adds little to the in-band noise. Both conditions require K>L; usually, K=L+1 is adequate. This condition is revisited in the next section for half-rate and multi-rate LMS schemes. In this case, L=4, and thus a sinc<sup>5</sup> filter is adequate for both the signal and calibration modes.

Figure 5 LMS Polyphase decomposition using the Noble Identities.

Figure 6 (a) Design flow for adaptive LMS polyphase decimation filter; (b) Unit LMS cell highlighting the critical paths (*red*, *blue*). Register retiming helps to break the critical path without affecting accuracy.
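The effect of the sinc order on folded noise can be explored numerically. The sketch below is a simplified check under stated assumptions that are not taken from the paper: an ideal $(1 - z^{-1})^L$ NTF, a hypothetical first-stage decimation ratio of 4, an OSR of 8, and only the first alias band (centered on $2\pi/D$) is counted. It reports the ratio of noise that folds into baseband to the noise already in band.

```python
import math


def sinc_mag(w, d):
    """Magnitude of one length-d moving-average (sinc) stage at w rad/sample."""
    s = math.sin(w / 2)
    if abs(s) < 1e-12:
        return 1.0
    return abs(math.sin(w * d / 2) / (d * s))


def ntf_mag2(w, order):
    """|(1 - e^{-jw})^order|^2 for an ideal noise transfer function."""
    return (2 * math.sin(w / 2)) ** (2 * order)


def band_power(w_lo, w_hi, order, k, d, steps=2000):
    """Shaped noise power after a sinc^k filter, integrated over [w_lo, w_hi]."""
    dw = (w_hi - w_lo) / steps
    total = 0.0
    for i in range(steps):
        w = w_lo + (i + 0.5) * dw
        total += ntf_mag2(w, order) * sinc_mag(w, d) ** (2 * k) * dw
    return total


def folded_to_inband_db(order, k, d, osr):
    """Noise folded from the first alias band relative to in-band noise (dB)."""
    w_b = math.pi / osr
    inband = band_power(0.0, w_b, order, k, d)
    center = 2 * math.pi / d
    folded = band_power(center - w_b, center + w_b, order, k, d)
    return 10 * math.log10(folded / inband)


L_ORDER, DECIM, OSR = 4, 4, 8      # DECIM is a hypothetical choice
ratios = {k: folded_to_inband_db(L_ORDER, k, DECIM, OSR) for k in (3, 4, 5)}
for k, r in ratios.items():
    print(f"K = {k}: folded/in-band = {r:.1f} dB")
```

Raising K lowers the folded-noise ratio in this model, which is the motivation for choosing K larger than L.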

#### 4.3 Noble Identities and Polyphase Decomposition

In the most straightforward implementation of the decimation filter shown on the left in Fig. 6, the filter computes an output sample at each value of *n*, but then only one of every M output points is retained. To obtain a more efficient implementation, polyphase decomposition of the filter (Fig. 5) is done [13, 14]. If the original implementation requires a filter with N multiplications and (N−1) additions per unit time, each of the M polyphase sub-filters has N/M taps and runs at 1/M of the input rate, so it requires only (1/M)(N/M) multiplications and (1/M)(N/M − 1) additions per unit time. The entire system then requires N/M multiplications and (N/M − 1) + (M − 1) additions per unit time. Hence, significant computation, and therefore energy, savings can be achieved for some values of M and N. This technique is extended to the reduced-rate least-mean-square calibration schemes presented next for noise-cancellation filters combined with decimation filters.
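The equivalence between the direct and polyphase forms can be checked with a short sketch. The code below is a generic illustration of polyphase decimation, not the hardware structure of this work; the tap values and input samples are arbitrary integers so that the two outputs can be compared exactly.

```python
def fir_then_decimate(h, x, m):
    """Direct form: filter at the input rate, then keep every m-th output."""
    y = []
    for n in range(0, len(x), m):
        y.append(sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0))
    return y


def polyphase_decimate(h, x, m):
    """Polyphase form: m sub-filters, each working at 1/m of the input rate."""
    # Sub-filter p holds taps h[p], h[p + m], h[p + 2m], ...
    branches = [h[p::m] for p in range(m)]
    num_out = (len(x) + m - 1) // m
    y = []
    for out_idx in range(num_out):
        acc = 0
        for p, taps in enumerate(branches):
            for j, tap in enumerate(taps):
                idx = (out_idx - j) * m - p     # input sample x[(n - j)m - p]
                if 0 <= idx < len(x):
                    acc += tap * x[idx]
        y.append(acc)
    return y


h = [1, 2, 3, 4, 5, 6, 7, 8]        # arbitrary 8-tap filter (N = 8)
x = list(range(1, 41))              # arbitrary input samples

for m in (2, 4):
    same = fir_then_decimate(h, x, m) == polyphase_decimate(h, x, m)
    print(f"M = {m}: outputs match = {same}, "
          f"multiplies per input sample: direct = {len(h)}, "
          f"polyphase = {len(h) // m}")
```

Each sub-filter only ever touches samples that survive the down-sampler, which is why the multiplier count per input sample drops from N to N/M.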

#### 5 Proposed Half- and Quarter-Rate LMS Schemes

The Noble identities and polyphase decimation methods enable the realization of the NCF and the LMS blocks at decimated frequencies. Mathe [15] previously disclosed decimation of the NCF (but not LMS) filters for a cascade band-pass  $\Sigma\Delta$  ADC. Herein, his work is extended to a lowpass  $\Sigma\Delta$  ADC; LMS adaptation with a reduced sampling rate is introduced that provides the same calibration accuracy using less power. To our knowledge, this is the first demonstration of decimation before both the LMS calibration and DSP filtering blocks. Steps to realize the half- and quarter-rate LMS schemes are:

- a) An ideal coefficient set for the NCF is selected and convolved with a *sinc* function (from the first stage of the decimation filter) to obtain an initial set. The order of the *sinc* filter, K, is set to K=L+1.
- b) Polyphase decomposition is used to obtain the coefficients after the down-sampler for the set $E(z)=[E_1(z), E_2(z) \dots E_M(z)]$ as shown in Fig. 6 for M=2. For the quarter-rate LMS, a similar procedure is adopted with M=4.
- c) The ADC input is set to X(z)=0, the 32-bit LFSR input is activated, and the ADC outputs $Y_1(z)$ and $Y_2(z)$ are used to calibrate the filters $E_{STF}(z)$ and $E_{NTF}(z)$ as in Fig. 4(c) and (d). Care should be taken to match the delays before the error is computed by the LMS block. Also, to minimize aliasing during calibration, the order of the cascaded integrator-comb (CIC) filter is set to K=L+2. This LMS block is OFF during normal operation and does not consume power.
- d) Steps (a)-(c) are iterated, varying the step size and coefficient widths, until the required dynamic range is reached. To facilitate tuning the filter parameters (word length, step size, etc.), the *Simulink®* HDL coder is used with an *Altera Stratix II* FPGA. A flowchart representation of the process is shown in Fig. 6(a). Register retiming is required to relax the hardware critical-path constraints shown in Fig. 6(b).
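Steps (a) and (b) above can be sketched as a coefficient calculation. The example below is an illustration under assumptions that are not taken from the paper: the ideal NCF is taken to be $(1 - z^{-1})^2$, each sinc stage is modeled as an unnormalized moving sum whose length equals the decimation factor M, and the helper names are hypothetical.

```python
def convolve(a, b):
    """Full discrete convolution of two coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def sinc_kernel(length, order):
    """Impulse response of `order` cascaded moving-sum stages of `length` taps."""
    kernel = [1]
    for _ in range(order):
        kernel = convolve(kernel, [1] * length)
    return kernel


def polyphase_split(coeffs, m):
    """Split coefficients into m polyphase components E_1 ... E_m."""
    return [coeffs[p::m] for p in range(m)]


L_ORDER = 4                 # modulator order
K = L_ORDER + 1             # sinc order from step (a)
M = 2                       # half-rate scheme

ncf_ideal = [1, -2, 1]      # assumed ideal NCF: (1 - z^-1)^2
initial = convolve(ncf_ideal, sinc_kernel(M, K))     # step (a)
components = polyphase_split(initial, M)             # step (b)

print("initial set :", initial)
print("polyphase E :", components)
```

With these particular assumptions the initial set happens to have eight coefficients, which split into two four-tap components for the half-rate scheme; a quarter-rate split would use `M = 4` in the same way.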

#### 6 Measurement Results

The ADC is implemented in 0.13 $\mu$m CMOS with a core area of 2.3 mm×0.75 mm for the analog front-end (Fig. 7). The digital back-end is synthesized in a 65 nm CMOS process with a core area of 0.12 mm<sup>2</sup> (0.21 mm<sup>2</sup>) for the conventional LMS (polyphase 4×) scheme.

Figure 8(a) shows the measured ADC SNDR with no calibration. After calibration, the ADC achieves a measured peak SNR=67 dB, SNDR=66 dB, and SFDR=75 dB for $f_{in}=1$ MHz. All three schemes (conventional LMS and polyphase 2×/4×) achieve 11-bit accuracy, as shown in Fig. 8(b), (c), and (d), respectively.

Table 1 shows the area and power summary of the three implementations from the synthesis reports. Note that the implementations of Fig. 4(c) and (d) reduce the power consumption by 30 % and 55 % compared to the conventional solution of Fig. 4(b), at the cost of 40 % and 75 % more hardware, respectively, and a longer calibration time. Table 2 compares calibration accuracy and time for the conventional LMS and polyphase 4× schemes. Note that the calibration time for the polyphase 4× scheme is higher than for the conventional LMS scheme. The offline LMS calibration can be done easily during the channel-switching interval. Table 3 provides a comparison to other works that employ LMS calibration. The proposed scheme has one of the lowest analog front-end power consumptions. Significant digital power savings are realized in both the calibration and normal operation modes, and are expected to improve further with scaling to more advanced CMOS technology nodes.

Figure 7 Chip micrograph.

Figure 8 (a) Output spectrum (no calibration); (b) Output spectrum with conventional LMS scheme; word length=20 bits and step size=$2^{-13}$; (c) Output spectrum for half-rate LMS; word length=20 bits and step size=$2^{-15}$; (d) Output spectrum for quarter-rate LMS; word length=24 bits and step size=$2^{-17}$.

Table 1 Measured results summary of the (a) analog front-end and (b) digital back-end.

| (a) Analog front-end                             | Value       |
|--------------------------------------------------|-------------|
| Sampling Frequency (MHz)                         | 150         |
| Signal Bandwidth (MHz)                           | 9.4         |
| Oversampling Ratio                               | 8           |
| Peak SNR/SNDR/SFDR (dB)                          | 67/66/75    |
| Input Range (Differential) (V<sub>p-p</sub>)     | 2.4         |
| $+V_{\rm ref}/V_{cm}/-V_{\rm ref}$ (V)           | 1.2/0.6/0.0 |
| Analog Power Consumption (mW)                    | 31.5        |
| Power Supply Voltage (V)                         | 1.25        |
| Analog Front-End Chip Area (mm<sup>2</sup>)      | 1.73        |
| CMOS Process                                     | 0.13 µm     |

| (b) Type of DSP Back-end (Overall 8× Decimation)  | Area<sup>a</sup> (# Gates) | Power<sup>b</sup> Dissipation (mW) |
|---------------------------------------------------|----------------------------|------------------------------------|
| Conv. filter (@ 150 MHz) + 8× Decimation          | 40127                      | 60                                 |
| Polyphase Filter by 2× (@ 75 MHz) + 4× Decimation | 55728                      | 38                                 |
| Polyphase Filter by 4× (@ 37.5 MHz) + 2× Decimation | 70532                    | 22                                 |

<sup>a</sup> Area in terms of NAND gate equivalents

<sup>b</sup> Synthesized power estimate from Encounter<sup>®</sup> in Calibration Mode
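The longer calibration time of the reduced-rate schemes follows from simple arithmetic. The sketch below assumes one LMS update per clock cycle of the LMS block and uses the 150 MHz sampling rate and the $2^{19}$ iteration count reported in this work; the function name is hypothetical.

```python
def calibration_time_ms(num_steps, lms_clock_hz):
    """Time for num_steps LMS updates, assuming one update per LMS clock."""
    return 1e3 * num_steps / lms_clock_hz


F_S = 150e6                       # full sampling rate (Hz)
NUM_STEPS = 2 ** 19               # LMS iterations

times = {
    "full-rate": calibration_time_ms(NUM_STEPS, F_S),
    "half-rate": calibration_time_ms(NUM_STEPS, F_S / 2),
    "quarter-rate": calibration_time_ms(NUM_STEPS, F_S / 4),
}
for label, t in times.items():
    print(f"{label:>12}: {t:.1f} ms")
```

For a fixed number of iterations, the quarter-rate scheme therefore takes four times as long as the full-rate scheme, which still fits within a channel-switching interval of a few milliseconds only if the iteration count is chosen accordingly.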

### 7 Conclusions

Sub-nm low-voltage CMOS amplifier designs require simple and low-power digital intensive scaling-friendly approaches to take advantage of faster process nodes. This

| Table 2 | 2 Ca | alibration | accuracy | against | time |
|---------|------|------------|----------|---------|------|
|---------|------|------------|----------|---------|------|

| Type of DSP Back-end<br>Overall 8X Decimation | No. of Steps/<br>Cal Time<br>(mSec) | Step size<br>(µ) (2 <sup>-µ</sup> ) | SNDR<br>(dB) | SFDR<br>(dB) |
|-----------------------------------------------|-------------------------------------|-------------------------------------|--------------|--------------|
| Conv. filter+8×                               | 2^19/3.5                            | 13                                  | 66           | 75           |
| Decimation                                    | 2^19/3.5                            | 10                                  | 66           | 71           |
|                                               | 2^17/0.9                            | 13                                  | 38           | 31           |
| Polyphase Filter by 4×+                       | 2^19/14                             | 12                                  | 63           | 65           |
| 2× Decimation                                 | 2^19/14                             | 14                                  | 66           | 72           |
|                                               | 2^19/3.5                            | 17                                  | 66           | 75           |
|                                               | 2^15/0.43                           | 14                                  | 58           | 64           |
|                                               | 2^14/0.43                           | 14                                  | 53           | 52           |
|                                               |                                     |                                     |              |              |

Table 3 Comparison with recent works.

|                                             | This work (No Polyphase)  | This work (Polyphase 4×)  | Bosi [4]       | Breems [2]           | Shu [3]      |
|---------------------------------------------|---------------------------|---------------------------|----------------|----------------------|--------------|
| Architecture                                | 2-2 DT                    | 2-2 DT                    | 2-2 CT         | 2-2 CT               | 2-1-1 CT     |
| DSR                                         | ∞                         | ∞                         | 4              | 8                    | 10           |
| SNR/SNDR/SFDR (dB)                          | 67/66/75                  | 67/66/75                  | 75/-/88        | 67/63/67             | 64/62/68     |
| First Integrator Gain (dB)/Current (mA)     | 22/2                      | 22/2                      | -/10           | I                    | 54/-         |
| First Integrator Topology                   | Diff-pair                 | Diff-pair                 | Folded Cascode | Gm-C                 | Telescopic   |
| Sampling Freq (MHz)                         | 150                       | 150                       | 80             | 160                  | 360          |
| Calibration Filter Order                    | 8-tap FIR (sign-data LMS) | 8-tap FIR (sign-data LMS) | 6-tap FIR (LMS) | 6-tap FIR (Var. Est) | IIR (LMS)    |
| Power (analog) (mW)                         | 31.5                      | 31.5                      | 185            | 104                  | 183          |
| Power (digital) (mW)                        | 60                        | 22                        | 55             | 14.4                 | 47           |
| Area (analog) (mm<sup>2</sup>)              | 1.73                      | 1.73                      | 2.5            | 1.7                  | 0.68         |
| Area (digital) (mm<sup>2</sup>)             | 0.12                      | 0.21                      | 1.5            | 0.09                 | 0.59         |
| Tech (Analog Front-End/Digital<br>Back-End) | CMOS 0.13 µm/65 nm        | CMOS 0.13 µm/65 nm        | CMOS 0.18 µm   | CMOS 0.18 µm         | CMOS 0.18 µm |

work presents a low-gain (only 22 dB) differential-pair-based amplifier that substantially eases the design effort and lowers power consumption. Quarter-rate and half-rate polyphase LMS and DSP schemes digitally calibrate the integrator gain and pole errors at $2.5 \times$ lower power than conventional LMS schemes. This approach achieves high performance by using DSP and adaptive calibration techniques to estimate and correct the errors introduced by low-performance analog circuits. The proposed LMS schemes are also applicable to serial-link I/O and sub-band echo cancellation architectures.

#### References

1. Peach, C. T., Moon, U.-K., & Allstot, D. J. (2010). An 11.1 mW 42 MS/s 10b ADC with two-step settling in 0.18 μm CMOS. *IEEE Journal of Solid-State Circuits*, 45, 391–400.
2. Rutten, R., Breems, L. J., & Wetzker, G. (2006). Digital calibration of a continuous-time cascaded ΣΔ modulator based on variance derivative estimation. *Proceedings of the IEEE European Solid-State Circuits Conference*, pp. 199–202.
3. Shu, Y.-S., Kamiishi, J., Tomioka, K., Hamashita, K., & Song, B.-S. (2010). LMS-based noise leakage calibration of cascaded continuous-time modulators. *IEEE Journal of Solid-State Circuits*, 45, 368–379.
4. Bosi, A., Panigada, A., Cesura, G., & Castello, R. (2005). An 80 MHz 4X oversampled cascaded ΣΔ-pipeline ADC with 75 dB DR and 87 dB SFDR. *IEEE International Solid-State Circuits Conference*, pp. 174–175.
5. Murmann, B. (2006). Digitally assisted analog circuits. *IEEE Micro*, 26, 38–47.
6. Cauwenberghs, G., & Temes, G. C. (2000). Adaptive digital correction of analog errors in MASH ADCs. I. Off-line and blind on-line calibration. *IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing*, 47, 621–628.
7. Kiss, P., Silva, J., Wiesbauer, A., Sun, T., Moon, U.-K., Stonick, J. T., et al. (2000). Adaptive digital correction of analog errors in MASH ADCs. II. Correction using test-signal injection. *IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing*, 47, 629–638.
8. Gupta, S., Tang, Y., Cheng, K.-W., Paramesh, J., & Allstot, D. J. (2011). Multi-rate polyphase DSP and LMS calibration schemes for oversampled data conversion systems. *IEEE Acoustics, Speech and Signal Processing Conference*, pp. 1585–1588.
9. Schreier, R., & Temes, G. C. (2004). Understanding delta sigma data converters. IEEE Wiley Press.
10. Tang, Y., Gupta, S., Paramesh, J., & Allstot, D. (2007). A digital-summing feedforward ΣΔ modulator and its application to a cascade ADC. *IEEE International Symposium on Circuits and Systems*, May, pp. 485–488.
11. Diniz, P. R. (2002). Adaptive filtering: Algorithms and practical implementation (2nd ed.). Boston, MA: Kluwer Academic. Ch. 3.
12. Staszewski, R. B. (2000). A 550-MSample/s 8-Tap FIR digital filter for magnetic recording read channels. *IEEE Journal of Solid-State Circuits*, 35, 1205–1210.
13. Ahmed, N. Y., Ashour, M. A., & Nassar, A. M. (2009). Power efficient polyphase comb decimation filters for ΣΔ modulators in multi-rate digital receivers. *European Conference on Circuit Theory and Design*, pp. 719–722.
14. Oppenheim, A. V., & Schafer, R. W. (1989). Discrete-time signal processing. Englewood Cliffs, N.J.: Prentice Hall.
15. Mathe, L. K.-A. (2001). Noise cancellation circuit in a quadrature downconverter. U.S. Patent #6243430, June 5, 2001.



**Subhanshu Gupta** received the B.S. degree from the National Institute of Technology, Trichy, India in 2002, and the M.S. and Ph.D. degrees from the University of Washington in 2006 and 2010, respectively.

He interned with National Semiconductor (now Texas Instruments), Santa Clara, CA, from 2005 to 2006, where he worked on high-resolution sigma-delta ADCs. Since 2011, he has been with MaxLinear Inc., Irvine, CA, working on wideband analog-to-digital converters for satellite/cable TV applications. He received the Analog Devices Outstanding Student Designer Award in 2008 and an IEEE RFIC Symposium Best Student Paper Award in 2011. His current research interests include architectures for direct-RF wideband sampling ADCs, techniques for mitigating blockers/interferers in radio receivers, and applications of digital signal processing algorithms to CMOS receiver design for low power and area efficiency.



**Yi Tang** received the B.S. degree in electrical engineering from the University of Electronic Science & Technology of China (Chengdu, China), in 1996, the M.E. degree from the University of Utah in 2002, and the Ph.D. degree from the University of Washington in 2007.

She is currently with the Qualcomm research center in San Diego. Her interests are low-power analog and mixed-signal CMOS integrated circuit and system design.




**Jeyanandh Paramesh** received the B.Tech. degree from IIT Madras, the M.S. degree from Oregon State University, and the Ph.D. degree from the University of Washington, Seattle, all in Electrical Engineering. He is currently Assistant Professor of Electrical and Computer Engineering at Carnegie Mellon University. He has held product development positions with Analog Devices, where he designed high-performance data converters, and Motorola, where he designed analog and RF integrated circuits for cellular transceivers. From 2002 to 2004, he was with the Communications Circuit Lab, Intel, where he developed multi-antenna receivers, high-efficiency power amplifiers, and high-speed data converters for high data-rate wireless transceivers. His research interests include the design of RF and mixed-signal integrated circuits and systems for a wide variety of applications.



**David J. Allstot** received the B.S., M.S., and Ph.D. degrees from the Univ. of Portland, Oregon State Univ., and the Univ. of California, Berkeley.

He has held several industrial and academic positions and has been the Boeing-Egtvedt Chair Professor of Engineering at the Univ. of Washington since 1999. He was Chair of the Dept. of Electrical Engineering from 2004 to 2007. He is currently a Visiting Professor of Electrical Engineering at Stanford University.

Dr. Allstot has advised approximately 100 M.S. and Ph.D. graduates, published more than 300 papers, and received several awards for outstanding teaching and graduate advising including the 1980 IEEE W.R.G. Baker Award, 1995 and 2010 IEEE Circuits and Systems Society (CASS) Darlington Award, 1998 IEEE International Solid-State Circuits Conference (ISSCC) Beatrice Winner Award, 1999 IEEE CASS Golden Jubilee Medal, 2004 IEEE CASS Charles A. Desoer Technical Achievement Award, 2005 Semiconductor Research Corp. Aristotle Award, 2008 Semiconductor Industries Assoc. University Research Award, and 2011 IEEE CASS Mac Van Valkenburg Award. His service includes: 1990-93 Assoc. Editor and 1993-95 Editor of IEEE TCAS II. 1990-93 Member of Technical Program Committee of the IEEE CICC Conference, 1992-95 Member, Board of Governors of IEEE CASS, 1994-2004, Member, Technical Program Committee, IEEE ISSCC, 1995-97, 2001, 2003-04, Member, Executive Committee of IEEE ISSCC, 1996-2000 Short Course Chair of IEEE ISSCC, 2000-2001 Distinguished Lecturer, IEEE CASS, 2001 and 2008 Co-General Chair of IEEE ISCAS, 2006-2007 Distinguished Lecturer, IEEE Solid-State Circuits Society and 2009 President of IEEE CASS.