
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—I: REGULAR PAPERS, VOL. 54, NO. 8, AUGUST 2007

Reconfigurable Digital Front-End Hardware for Wireless Base-Station Transmitters: Analysis, Design and FPGA Implementation

Navid Lashkarian, Senior Member, IEEE, Ed Hemphill, Helen Tarn, Hemang Parekh, and Chris Dick, Member, IEEE

Abstract—A versatile digital front-end architecture is designed and implemented on field-programmable gate array (FPGA) technology. The architecture includes the digital up-conversion and peak-to-average power ratio (PAPR) reduction blocks that are applicable to down-link data paths in multi-band wireless base stations such as WCDMA or WiMAX systems. Transmitter linearity requirements are addressed, and a tradeoff analysis for the design and optimization of the PAPR reduction algorithm within the context of the error vector magnitude and adjacent channel leakage ratio quality metrics is presented. Statistical characteristics of the clipping noise are analyzed and a novel method for clipping the multi-band signal under the phase-invariant constraint is proposed. Our study also includes mapping of the signal processing algorithms onto a Xilinx Virtex-4™ FPGA device and addresses the resource utilization and efficient hardware implementation of the above signal processing blocks. Performance assessments and hardware validation of the proposed architecture are also addressed.

Index Terms—Adjacent channel leakage ratio (ACLR), crest factor reduction, error vector magnitude (EVM), field-programmable gate array (FPGA), multi-band transmitters, peak-to-average power ratio (PAPR) reduction, reconfigurable transmitter.

I. INTRODUCTION

A profusion and variety of communication systems, which carry massive amounts of data between terminals and end users of many kinds, exist today. Driven by the demand for global compliance, original equipment manufacturers are expected to provide convergent solutions that accommodate various standards within a single embodiment. Primitive solutions may seek to support this necessity by the simple expedient of “stacked” structures, i.e., separate transceivers for different standards. Such systems, however, achieve the desired convergence at the expense of the least expendable resources: hardware silicon real estate and product turnover time. This necessity represents a major bottleneck in attempting to achieve higher levels of integration in broadband communication systems. For

Manuscript received January 29, 2006; revised September 9, 2007. This work was supported in part by Xilinx Inc. and Arraycomm LLC. This paper was recommended by Associate Editor M. Stan. N. Lashkarian is with Arraycomm LLC, San Jose, CA 95131 USA (e-mail: [email protected]). E. J. Hemphill, H. Tarn, H. Parekh, and C. Dick are with the DSP Division, Xilinx Inc., San Jose, CA 95124, USA (e-mail: [email protected]; helen. [email protected]; [email protected]; [email protected]). Digital Object Identifier 10.1109/TCSI.2007.902608

instance, commercial wireless chip manufacturers often offer a multi-chip solution to cover the multi-channel digital up-conversion (DUC) (Analog Devices AD6623 or Texas Instruments GC5316) and crest factor reduction (PMC-Sierra PM7819 or Texas Instruments GC1115 processors) processing requirements for the digital front-end of wireless base stations (BSs). These multi-chip solutions often result in higher integration overhead that translates into higher capital expenses. In contrast, reconfigurable architectures provide flexible and integrated system-on-chip solutions that accommodate smooth migration from archaic to innovative designs, allowing hardware resources to be reused across multiple generations of the standards. Moreover, using this topology, network providers have the ability to configure the digital front-end based on demand and integrate all the transmit/receive functionalities into a unified and custom-built hardware platform. As opposed to application-specific integrated circuit or application-specific standard product (ASSP) solutions with fixed support for the carriers per chip-set, users can configure field-programmable gate array (FPGA) platforms to accommodate an arbitrary number of carriers based on demand, reducing the cost per carrier. These requirements have created a surge in the development of radio architectures and reconfigurable platforms that support multiple standards for the digital front-end of wireless BSs [13]. The transmit signal in multi-band transmitters is composed of base-band signals occupying nonoverlapping frequency bands. Often in practice, the base-band signals are generated by adding independent random variables, so that they tend to have a Gaussian distribution in the limit. For instance, the test model-1 for WCDMA transmitters consists of 64 Hadamard sequences with known timing offset and power [9].
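The Gaussian-like behavior of such a composite signal can be illustrated numerically. The following is a minimal sketch and not the 3GPP test model itself: it assumes 64 equal-power code channels, each carrying random QPSK symbols spread by one row of a 128-chip Sylvester-type Hadamard matrix, and it ignores the timing offsets, power levels, and scrambling of the real test model. The helper names `hadamard`, `composite_chips`, and `papr_db` are illustrative.

```python
import numpy as np

def hadamard(n):
    """Sylvester-type Hadamard matrix of size n x n (n must be a power of two)."""
    h = np.array([[1]])
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])
    return h

def composite_chips(num_codes=64, spreading=128, num_symbols=200, seed=0):
    """Sum of `num_codes` spread QPSK channels, returned as a chip-rate sequence."""
    rng = np.random.default_rng(seed)
    codes = hadamard(spreading)[:num_codes]            # shape (num_codes, spreading)
    qpsk = (rng.choice([-1, 1], (num_symbols, num_codes))
            + 1j * rng.choice([-1, 1], (num_symbols, num_codes)))
    # Each symbol period contributes `spreading` chips of the summed channels.
    return (qpsk @ codes).ravel()

def papr_db(x):
    """Peak-to-average power ratio of a complex sequence, in dB."""
    power = np.abs(x) ** 2
    return 10 * np.log10(power.max() / power.mean())

chips = composite_chips()
power = np.abs(chips) ** 2
# For a circular complex Gaussian signal, E|x|^4 / (E|x|^2)^2 equals 2.
fourth_moment_ratio = np.mean(power ** 2) / np.mean(power) ** 2
print(f"PAPR = {papr_db(chips):.2f} dB, fourth-moment ratio = {fourth_moment_ratio:.3f}")
```

A fourth-moment ratio close to 2 indicates that the summed chips are already close to a circular complex Gaussian signal, which is why the peak power sits far above the average power.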
The peak-to-average power ratio (PAPR) of this composite sequence can reach values as high as 12 dB. From a practical perspective, the PAPR of the signal prior to the digital-to-analog converter (DAC) should lie within the range of 5 to 8 dB to assure reasonable power consumption in the RF and analog components [1]. Previous works on the design of digital front-ends for WCDMA systems mostly neglect the high PAPR of the composite signal, and little effort has been made to analyze and integrate the PAPR reduction block into the transmitter data-path [10], [11]. To overcome the high PAPR difficulty in these scenarios, standards (such as 3GPP) usually allow limited distortions of the transmit signal [9]. At the expense of adding some distortion to the transmit signal, the PAPR of the signal can be reduced to an

1549-8328/$25.00 © 2007 IEEE

LASHKARIAN et al.: RECONFIGURABLE DIGITAL FRONT-END BS TRANSMITTERS


Fig. 1. Digital front-end architecture for 3-carrier WCDMA transmitter.

admissible range. However, the distorted transmit signal should conform with the spectral emission mask (SEM) of the transmitter [2]. Adjacent channel leakage ratio (ACLR) is a relative measure of spectral leakage into the adjacent channels. More specifically, it reflects the relative power of the signal within the band of interest to that of the transmitter-induced noise in the neighboring band. On the other hand, the error vector magnitude (EVM), i.e., the ratio of the power of the transmitter-induced noise to the power of the undistorted transmit signal, has to be limited by an upper bound.

The problem of PAPR reduction has long been a topic of research in digital communications [3], [7]. Specific to WCDMA systems, Vaananen et al. have proposed PAPR reduction algorithms using unused sub-channels [5]. In a similar work, the authors propose a peak windowing method for controlling the PAPR [7] that has its roots in prior work published in [6]. Other methods for PAPR reduction include adjusting the timing offset for signals on adjacent carriers, although the PAPR reduction using this approach may not be significant (… dB). None of these methods provides a mathematical basis for analysis of the PAPR reduction problem. Moreover, generalization of the above methods to multi-band transmitters has not been addressed in the literature. Thus, there is a need to formulate and analyze the PAPR reduction problem within the context of multi-band transmitters in a way that is fully applicable to multi-standard digital front-end architectures and is independent of the transmitter modulation format. This is one of the main objectives of this paper.

The rest of this paper is organized as follows. In Section II, we state the PAPR reduction problem and discuss the constraint sets that satisfy the transmitter SEM requirements.
In Section III, we address the clipping strategies and provide a mathematical framework for analyzing the transmitter noise characteristics within the context of ACLR and EVM performance metrics. A novel approach for noise projection based on least-square metric is also proposed. Section IV addresses the mapping of the proposed methods into the Xilinx high-end Virtex-4 FPGA device. System validation and hardware verification for the proposed system is described in Section V.

II. PROBLEM STATEMENT

Fig. 1 depicts the block diagram of a typical digital front-end architecture for a multi-band transmitter. The base-band transmit signal consists of $M$ independent complex Gaussian signals $x_i(n)$, $i = 1, \ldots, M$. The constituent base-band signals are modulated to the corresponding frequencies $\omega_i$ and combined together to form the superimposed signal

$$x(n) = \sum_{i=1}^{M} x_i(n)\, e^{j \omega_i n} \qquad (1)$$

The objective of the multi-band CFR block is to control the PAPR of this combined signal through adding a clipping noise. The clipping noise should conform to the following constraints.

C1: Bounded PAPR metric.
C2: Bounded in-band energy constraint.
C3: Phase invariant: the clipping noise preserves the phase information for each constituent signal.
C4: Bounded out-of-band spectral leakage.

Constraints C2 and C4 are expressed in terms of the Fourier transforms of the clipping noise and of the superimposed signal, evaluated over the in-band and out-of-band frequency ranges, respectively. In what follows, we address the above constraints and provide a solution that lies in the intersection of the above constraint sets.
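To make constraints C1, C2, and C4 concrete, the following sketch builds a three-carrier signal, applies a simple magnitude limiter as a stand-in source of clipping noise, and measures the three quantities with an FFT. It is an illustration under stated assumptions, not the procedure of this paper: the sample rate of 46.08 MHz, the carrier offsets of -5, 0, and +5 MHz, the 3.84-MHz bandwidth, the threshold of twice the RMS level, and all function and variable names are chosen here for the example.

```python
import numpy as np

def band_limited_gaussian(n, fs, fc, bw, rng):
    """Complex Gaussian signal confined to [fc - bw/2, fc + bw/2], built in the frequency domain."""
    freqs = np.fft.fftfreq(n, d=1.0 / fs)
    in_band = np.abs(freqs - fc) <= bw / 2
    spectrum = np.zeros(n, dtype=complex)
    k = int(in_band.sum())
    spectrum[in_band] = rng.standard_normal(k) + 1j * rng.standard_normal(k)
    return np.fft.ifft(spectrum), in_band

def papr_db(x):
    power = np.abs(x) ** 2
    return 10 * np.log10(power.max() / power.mean())

rng = np.random.default_rng(0)
n, fs, bw = 1 << 14, 46.08e6, 3.84e6          # assumed sizes and rates
centers = (-5e6, 0.0, 5e6)                    # assumed carrier offsets
parts = [band_limited_gaussian(n, fs, fc, bw, rng) for fc in centers]
x = sum(p[0] for p in parts)                  # superimposed multi-band signal
in_band = np.any([p[1] for p in parts], axis=0)

# Stand-in clipping noise: limit the magnitude, keep the phase of each sample.
threshold = 2.0 * np.sqrt(np.mean(np.abs(x) ** 2))
mag = np.abs(x)
y = np.where(mag > threshold, threshold * x / np.maximum(mag, 1e-300), x)
e = y - x                                     # clipping noise added to x

E, X = np.fft.fft(e), np.fft.fft(x)
c1_papr_db = papr_db(y)                                                    # C1
c2_in_band = np.sum(np.abs(E[in_band]) ** 2) / np.sum(np.abs(X[in_band]) ** 2)   # C2
c4_out_of_band = np.sum(np.abs(E[~in_band]) ** 2) / np.sum(np.abs(X) ** 2)       # C4
print(f"PAPR in/out: {papr_db(x):.2f} / {c1_papr_db:.2f} dB")
print(f"in-band noise ratio: {c2_in_band:.2e}, out-of-band leakage: {c4_out_of_band:.2e}")
```

Without any noise shaping, part of the clipping noise falls outside the carrier bands, which is the leakage that constraint C4 bounds.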

III. ANALYSIS

The clipping noise variance for Cartesian clipping was previously studied in [7]. In what follows, we extend this analysis to polar clipping and derive the joint and marginal pdf of the clipped signal in these scenarios.
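As a numerical companion to this analysis, the sketch below estimates the normalized clipping-noise variance of both clipping methods by Monte Carlo simulation. Several choices are assumptions made for the example rather than definitions taken from this paper: the input is a unit-power circular complex Gaussian signal, the back-off is taken as the ratio of peak output power to input power, and the two methods are compared at equal peak magnitude, so the per-component Cartesian limit is the polar limit divided by the square root of two. The function names are illustrative.

```python
import numpy as np

def cartesian_clip(x, limit):
    """Limit the real and imaginary parts separately to [-limit, limit]."""
    return np.clip(x.real, -limit, limit) + 1j * np.clip(x.imag, -limit, limit)

def polar_clip(x, limit):
    """Limit the magnitude to `limit` while keeping the phase of every sample."""
    mag = np.abs(x)
    scale = np.minimum(1.0, limit / np.maximum(mag, np.finfo(float).tiny))
    return x * scale

def normalized_noise_variance(x, y):
    """Clipping-noise power relative to the power of the unclipped signal."""
    return np.mean(np.abs(y - x) ** 2) / np.mean(np.abs(x) ** 2)

rng = np.random.default_rng(1)
n = 200_000
x = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)  # unit power

results = {}
for backoff_db in (4.0, 6.0, 8.0):
    peak = 10 ** (backoff_db / 20)            # peak magnitude relative to RMS of 1
    nv_polar = normalized_noise_variance(x, polar_clip(x, peak))
    nv_cart = normalized_noise_variance(x, cartesian_clip(x, peak / np.sqrt(2)))
    results[backoff_db] = (nv_cart, nv_polar)
    print(f"{backoff_db:.0f} dB: Cartesian {10 * np.log10(nv_cart):.1f} dB, "
          f"polar {10 * np.log10(nv_polar):.1f} dB")
```

Under this equal-peak comparison the Cartesian clipper always produces at least as much noise as the polar clipper, because its square clipping region lies inside the circle of the polar clipper.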


A. Constraints C1 and C2

A common method for controlling the PAPR is to pass the signal through a nonlinear function with a saturated-amplifier characteristic. There are two natural choices for clipping the signal, namely Cartesian and polar clipping. Although both of these methods limit the absolute value of the signal to a certain PAPR, their impact on the overall EVM is substantially different. To better address this issue, we quantify the statistical characteristics of the clipping noise for these methods as described below.

1) Cartesian Clipping: In this case, the mapping between the original and clipped signal is defined as …, wherein … is the saturated amplifier characteristic defined as

if otherwise with the variance of the clipped signal being defined as

(2)

The real component of the error signal in Cartesian clipping is computed according to if if otherwise

Also note that the input back-off in the Cartesian case is defined as ….

2) Polar Clipping: As opposed to Cartesian clipping, wherein the quantization is done individually on the real and imaginary components, polar clipping quantizes the magnitude of the resultant complex signal. Given a complex Gaussian signal, its magnitude is Rayleigh distributed. The quantization operation limits this magnitude to the clipping threshold. Thus, the clipped magnitude has a truncated Rayleigh distribution as

(3)

Note that the phase of the clipped signal is independent of the magnitude, thus …, in which … is a uniform random variable with a span of $2\pi$. Thus, the joint pdf for the clipped signal is computed according to

The real and imaginary components of the signal are independent and identically distributed with . Similarly, the variance of the clipped signal can be obtained from

(4) The pdf of the clipped signal can be deduced from the pdf of the unclipped signal with a minor modification. Since the clipped signal does not take values beyond the positive and negative thresholds, the associated tail probabilities are replaced with two delta functions located at these thresholds, as follows

(5)

Note that the above equation incorporates the tail probability of the Gaussian distribution. The normalized noise variance for the Cartesian clipping can be formulated as

(6)

if (7) otherwise Inherited from the circular symmetric property of we have . Thus, . The marginal probability distribution function of the real and imaginary components of the clipped signal is of the form (refer to Appendix)

(8) Clearly, the random variables and are dependent (but uncorrelated) in this case. Using the above expressions we can also obtain the normalized variance of the clipping noise as

(9) We note that the input back-off in this case is defined as …. Fig. 2 depicts the normalized noise variance for both polar and Cartesian clipping as a function of the input back-off. As shown in the figure, for a given input back-off, the normalized noise variance of the


Fig. 2. Comparison between normalized noise variance in Cartesian and polar clipping.

Cartesian clipping is higher compared to polar clipping. Higher noise power in turn translates into higher EVM in the transmitter, making Cartesian clipping a less favorable choice for systems in which EVM is of major concern. For this reason, throughout the rest of this paper, we focus our analysis on polar clipping, assuming that generalization of the polar clipping results to Cartesian clipping is straightforward. Having added the clipping noise to the signal, the PAPR of the combined signal becomes

(10)

We note that the PAPR of the clipped signal is a monotonically increasing function of the clipping threshold. As we will show in the subsequent sections, shaping the noise spectrum would cause the function to have local minima with respect to the threshold, destroying this monotonic behavior. Fig. 3 depicts the marginal pdf of the clipped signal as a function of the back-off threshold for polar clipping.

B. Constraint C3

As described above, the clipping noise would limit the peak-to-average ratio of the combined signal. However, since the error component is not necessarily phase aligned with the constituent signals, the multi-band components are subject to phase rotation. To overcome this shortcoming, the error signal needs to be projected into the signal subspace. More specifically, given the error and the constituent signal components, we intend to find the projection of the error onto the signal subspace such that

(11)

where … and …. The least-square solution to the above problem is obtained as [4]

(12)

In practice, inverting this matrix is a formidable task and one can use the following approximation to compute the matrix inversion as

Replacing (13) into (12), we obtain

wherein … is the angle between the composite error and the $i$th constituent signal.
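The idea of a phase-preserving projection can be sketched for a single sample instant. The code below does not reproduce (11) or (12), nor the approximation used in this paper: it assumes that each carrier may only be scaled by a real gain, so that its phase is unchanged, and it solves for the minimum-norm set of gains with a generic least-squares routine. The function name `project_error` and the sample values are hypothetical.

```python
import numpy as np

def project_error(e, s):
    """Split the composite error `e` into per-carrier parts collinear with each s[i].

    e : complex composite clipping error at one sample instant
    s : complex samples of the M constituent (already modulated) carriers
    Returns alpha[i] * s[i] with real gains alpha chosen by minimum-norm least squares.
    """
    s = np.asarray(s, dtype=complex)
    basis = np.vstack([s.real, s.imag])                 # 2 x M real matrix
    target = np.array([np.real(e), np.imag(e)])
    alpha, *_ = np.linalg.lstsq(basis, target, rcond=None)
    return alpha * s

# Hypothetical sample values for three carriers and one composite error.
carriers = np.array([1.0 + 1.0j, -0.5 + 2.0j, 0.3 - 0.7j])
error = -0.2 - 0.1j
components = project_error(error, carriers)
corrected = carriers + components

print("per-carrier error components:", np.round(components, 4))
print("phase change per carrier (rad):",
      np.round(np.angle(corrected) - np.angle(carriers), 12))
```

Because every component is a real multiple of its own carrier, adding it changes only the magnitude of that carrier as long as the gain stays above minus one, while the components still sum to the composite error when at least two carriers are linearly independent.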

C. Constraint C4

In order to comply with constraint C4, the projected error signals have to be filtered with the corresponding


Fig. 3. Marginal pdf of the polar-clipped signal.

bandpass filters tuned to the center frequency of the $i$th sub-band as

(13)

The noise-shaping filters should have the following characteristics:
• on-the-fly tunability;
• single-sideband (SSB) characteristic to allow spectral allocation for both positive and negative frequencies;
• sufficient out-of-band attenuation to achieve higher ACLR.

The variance of the clipping noise is thus obtained from …. The PAPR of the signal is controlled by adjusting the clipping threshold in the clipping function. The error signal is then projected into the signal subspace to obtain the closest approximation to the error signal that preserves the phase information of each constituent signal. The projected error components are passed through the noise-shaping filters to comply with the spectral emission mask of the transmitter and finally added to the superimposed signal. To obtain an SSB bandpass filter response, the pass-band error signals are frequency down-converted to base-band using complex synthesizers, filtered by an efficient interpolated finite-impulse response (IFIR) structure [14], and finally up-converted to the corresponding center frequency of each sub-carrier.

An upper bound for the PAPR of the final signal is computed as (14), shown at the bottom of the page, wherein … is the input PAPR and … is the aggregate response of the noise-shaping filters. Several aspects of both the interpretation and implications of the above bound are worth developing in more detail. First, it is important to note that the second term in the numerator corresponds to the destructive effect of the noise-shaping filter. Using the argument given in [8], the successive peaks of the signal have a temporal separation larger than …, in which … is the bandwidth of the signal. Having this in mind, for a noise-shaping filter with a sampling frequency of …, the maximum error corresponds to the sum of the absolute value of … times the maximum error magnitude. As the bandwidth of the signal increases, it causes the sum-absolute term to grow, resulting in higher output PAPR. Second, as noted in the above equation, the denominator consists of two terms; the first term is the variance of the clipped signal

(14)


Fig. 4. Impact of noise-shaping filter on the output PAPR.

and the second term is the excess noise variance due to noise shaping. Fig. 4 depicts the output PAPR as a function of … and … for an FIR filter with … MHz and … MHz.

D. Design Methodology

In this section we summarize the design procedure for the CFR block in multi-band transmitters. System designers are often given the following specifications for designing the CFR block: the output PAPR and the transmitter's EVM and ACLR parameters. The design methodology consists of the following steps:

• Step 1) For a given output PAPR, obtain the initial back-off level for the CFR input from (10). Note that this back-off does not account for the excess PAPR due to the noise shaping. Since the excess PAPR is a function of the noise-shaping filter characteristics, which are not known at this stage, the … is reduced by a margin (0.5 to 1 dB) to account for the excess PAPR due to the filtering.
• Step 2) From Fig. 2, compute the normalized noise variance for a given input back-off level. This value indicates the relative energy of the quantization noise with respect to the signal. Based on this value, estimate the out-of-band filter attenuation in dB.
• Step 3) Given the filter attenuation (in dB) and the transition band, select the appropriate order for the noise-shaping filter and design the FIR noise-shaping filter.
• Step 4) For the given filter specification, characterize the PAPR based on (14).
• Step 5) Re-adjust the input back-off to assure the required output PAPR and reiterate through steps 2 to 5 if necessary.

System designers are frequently interested in exploiting noise-shaping filters with high stop-band attenuation characteristics for improving the ACLR metric. Sharp transition characteristics in turn result in long impulse responses that may cause peak regrowth due to destructive accumulation of error samples in the filter output. This effect is mainly due to the second term in the numerator of (14). Thus, for aggressive ACLR specifications, the PAPR specification may not be met using a single stage, and successive stages of clipping followed by noise shaping may be required to achieve the system specifications.

IV. FPGA IMPLEMENTATION

The digital front-end architecture described in the previous section is implemented on a Xilinx Virtex-4 XC4VSX25 FPGA device, targeted for a 3-carrier WCDMA BS transmitter with the following specifications: … dB, … dB, … %, a digital IF frequency of 46.08 MHz, and arbitrary carrier configuration (on/off) with a maximum carrier separation of 10 MHz. In order to obtain a highly compact design, an FPGA clock rate of … MHz is assumed across the design. As we will describe shortly, the FPGA clock is an integer multiple of the least common multiple of all intermediate sample rates in the DUC chain. The design is implemented using Xilinx System Generator 7.1 and synthesized by the Synplify Pro synthesis tool. For place and route, Xilinx ISE Foundation 7.1 is used with the maximum effort level, and clock constraints are carefully tuned to give the fastest clock frequency. The design is heavily pipelined to maximize the throughput. The latency of the design is around 26.66 WCDMA chips (6.95 μs). Xilinx Virtex-4 devices


TABLE I RESOURCE UTILIZATION TABLE FOR 3-CARRIER WCDMA DIGITAL FRONT-END ARCHITECTURE

have embedded memory elements and versatile DSP architecture blocks, also known as block RAMs and XtremeDSP Slices. XtremeDSP Slices have been custom designed in silicon to achieve 500-MHz performance, supporting over 40 dynamically controlled operating modes including multiplier, multiplier-accumulator, multiplier-adder/subtractor, three-input adder, barrel shifter, and wide bus multiplexers. Table I summarizes the per-block resource utilization for the above-mentioned design. The XC4VSX25 FPGA device is characterized by 10240 Slices, 128 XtremeDSP Slices, and 128 18-Kbit block RAMs for a total of 2304 Kbits of RAM. Based on the resources available on the XC4VSX25 FPGA, the device utilization percentage is also reported in this table. As illustrated in the table, XtremeDSP blocks have the highest utilization factor among the existing resources on the device, and the majority of the resources are utilized by the PAPR reduction block.

The input samples are 16-bit quantized WCDMA signals with a sample rate of 3.84 Mcps. The interpolation chain consists of three cascaded interpolation stages, interpolating by factors of two, two, and three, respectively. Note that the FPGA clock is an integer multiple of the intermediate digital IF sample rates in the design. The first stage of the interpolation chain consists of a root-raised cosine pulse-shaping filter with a roll-off of 0.22, as described in [9]. The FIR filters have linear phase and are therefore symmetric relative to their center. The interpolation filters are implemented using a poly-phase structure for higher efficiency. The poly-phase interpolation (by $L$) structure reduces the multiplier requirement by a factor of $L$. The decomposition of the filter into sub-bands in certain cases (such as interpolation-by-three) destroys the real symmetric (antisymmetric) property of the filter, adding more overhead to the adder blocks. In-phase and quadrature-phase samples are time interleaved for hardware resource sharing.
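The saving offered by the poly-phase structure can be checked with a short sketch. It compares a direct interpolator (zero stuffing followed by a full-rate FIR filter) with its poly-phase form, in which each of the $L$ sub-filters runs at the input rate, and it chains the factors 2, 2, and 3 to go from 3.84 to 46.08 MHz. The windowed-sinc filters below are placeholders chosen for the example and are not the filters of this design; the function names are illustrative.

```python
import numpy as np

def interpolate_direct(x, h, factor):
    """Zero-stuff by `factor`, then filter at the high rate."""
    up = np.zeros(len(x) * factor, dtype=np.result_type(x, h))
    up[::factor] = x
    return np.convolve(up, h)[: len(up)]

def interpolate_polyphase(x, h, factor):
    """Same result, but every sub-filter h[p::factor] runs at the low input rate."""
    y = np.zeros(len(x) * factor, dtype=np.result_type(x, h))
    for phase in range(factor):
        y[phase::factor] = np.convolve(x, h[phase::factor])[: len(x)]
    return y

def lowpass(factor, half_len=4):
    """Placeholder interpolation filter: Hamming-windowed sinc with cutoff pi/factor."""
    n = np.arange(-half_len * factor, half_len * factor + 1)
    return np.sinc(n / factor) * np.hamming(len(n))

rng = np.random.default_rng(2)
signal = rng.standard_normal(256) + 1j * rng.standard_normal(256)
rate = 3.84e6
matches = []
for factor in (2, 2, 3):                      # interpolation chain described in the text
    h = lowpass(factor)
    direct = interpolate_direct(signal, h, factor)
    signal = interpolate_polyphase(signal, h, factor)
    matches.append(np.allclose(direct, signal))
    rate *= factor

print(f"output rate = {rate / 1e6:.2f} MHz, poly-phase equals direct form: {all(matches)}")
```

In the direct form every output sample needs a full-length filter evaluation in which most inputs are zero; the poly-phase form skips those zero products, which is where the factor-of-$L$ reduction in multipliers comes from.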
Due to the high computational complexity associated with the noise-shaping filters, we adopt the semi-parallel FIR structure [12] to implement the filters. The DSP48 slice arithmetic units are designed to be chained together easily and efficiently due to the

TABLE II PASS-BAND, STOP-BAND, ORDER, OUT-OF-BAND ATTENUATION, IN-BAND RIPPLE AND INTERPOLATION FACTOR CHARACTERISTICS OF THE IFIR NOISE SHAPING FILTERS

dedicated routing between slices. The characteristics of the IFIR noise-shaping filter are tabulated in Table II. The model filter, with order 18, incorporates the interpolation factor of …, while the mask filter is just an FIR filter with order 11. As a result, the equivalent cascaded filter has a length of 126 taps, resulting in a highly efficient realization of the noise-shaping filter. Fig. 5 provides a detailed view of the semi-parallel FIR filter realization of the mask IFIR filter. Two pipeline registers are used on the B input of the DSP48 cells to compensate for the register on the output of the coefficient memory. An extra DSP48 is required at the end to perform the accumulation of the partial results, thus creating the final results. A new result is created every six cycles, so the accumulation must be reset to the first partial value of the next result. This reset is achieved by changing the OPMODE value of the DSP48 slice from binary 0010010 to binary 0010000. At the same time, the capture register is enabled and the final result is stored at the output. Two SRL16Es are used as data buffers. Each SRL16E holds the 6 samples needed for the intermediate results. They are written to once every 6 cycles, and the shifting characteristic of the SRL16E is exploited to pass the old samples along the storage buffer. The extra register on the output of each data buffer is required to match up the data buffer pipeline with the extra delay caused by the adder chain. The coefficients are divided into two groups of six and stored in the appropriate memory bank as shown in Fig. 5. Distributed RAM blocks are used for the coefficient memories. The coefficient width is 18 bits. The control logic provides memory addressing and clock enable sequencing for the semi-parallel structure. The counter creates the


Fig. 5. Semi-parallel realization of the mask filter.

Fig. 6. Signal flow diagram for DDS.

Fig. 7. Signal flow diagram for the clipping block.

fundamental zero through five counts. Each successive delay is used to address both the coefficient memory and the data buffer of their respective multiply-add elements. A relational operator is used to determine when the count-limited counter resets its count. The signal remains High for one clock every six cycles, to represent the input and output data rates. The clock enable (CE) signal is delayed by a single register just like the coefficient address, and each delayed version of the signal is tied to the respective section of the filter. The model filter has a similar structure to the mask filter, with one extra DSP48 block to support the extra 6-tap coefficient bank. A direct digital synthesis (DDS) block is used to up-convert the base-band signals to the digital IF frequency. To assure sufficiently high spurious-free dynamic range (SFDR) at the DDS

output (… dB), a dither mechanism is used that improves the SFDR figure by 12 dB. The input word to the phase accumulator of the DDS controls the frequency of the modulating digital IF carrier. The phase value is generated by using the modulo overflowing property of an $N$-bit phase accumulator. The frequency resolution of the DDS is obtained from $\Delta f = f_s / 2^N$. Thus, for the sample rate of $f_s = 46.08$ MHz and $N = 32$, the frequency raster for the DDS is around 0.0107 Hz. Fig. 6 depicts the data flow diagram of the DDS block. The polar clipping depicted in Fig. 7 constitutes the main element in implementing the PAPR reduction block. The … and … functions are used to project the samples onto the unit-norm vector along the … and … components. In doing so, an inverse square root function is tabulated in a look-up table format. The


Fig. 8. Measurement and validation setup.

input to this table is the sample norm quantized to 13-bit values. The look-up table has $2^{13}$ entries, each with a width of 16 bits. Thus, to store the entries of this table in the block RAMs, a total of … BRAMs are used. A comparator block compares the signal power against the clipping threshold and selects the appropriate outgoing signal at the output of the MUX. While (14) results in an upper bound for the PAPR, the gap in the upper bound can often be large. As a result, for the purpose of this design, we resort to statistical simulation to compute the PAPR of the clipped signal. The clipping threshold for the above WCDMA design is set to 4.1 dB.

V. EXPERIMENTAL RESULTS

A test-bench platform is built to assess the statistical characteristics of the digital front-end hardware. Fig. 8 illustrates the setup used for characterization of the device under test (DUT). The Agilent vector signal generator (ESG-4438C) is configured to produce base-band W-CDMA data that is compliant with 3GPP standard requirements. The composite code domain power for TM1 is illustrated in Fig. 9. In order to avoid any

frequency drift, the three ESG instruments are all locked to the same reference frequency (10 MHz). The same frequency provides the reference input to the pulse/pattern generator that generates the clock to the DUT at the rate of 267.48 MHz. The N5102 module provides the W-CDMA base-band signals at the rate of 3.84 MHz to the DUT input terminals. The Xilinx AFX-FF1148 prototype board is used to validate the DUT. A host computer with a JTAG interface is used to download the bitmap into the FPGA. The DUT output provides up-converted and combined digital IF samples at a rate of 46.08 Msps. These samples are captured by the Agilent 16900 logic analyzer and subsequently sent to the host computer for data analysis. The Agilent VSA software demodulates the received signal and provides channel quality metrics such as the composite EVM and the composite code domain error for each individual channel. Fig. 9 depicts the composite EVM table summary for the channel centered at 5 MHz. As shown in the figure, the EVM is around 10.88%, well within the 11% EVM budget. This figure corresponds to a composite peak code domain error (PCDE) of … dB. According to the 3GPP standard, the EVM budget at


Fig. 9. Composite CDE and EVM for 3 GPP Test Model 1.

Fig. 10. ACLR and ACP Results for 3GPP Test Model 1.

the RF output of the BS transmitter is 17.5%. Thus, there is a considerable margin for signal degradation in the analog/RF chain. Fig. 10 illustrates the power spectral density, ACLR, and complementary cumulative distribution function (CCDF) for the digital IF signal. As shown in the figure, the ACLR1 and ACLR2 metrics are 61.1 and 62.8 dB, respectively. Also, as shown in the figure, the CCDF has a cutoff at 6.0 dB, providing a 5-dB improvement over the nonclipped signal at 0.01% clipping probability. The experiment is further expanded to include both TM1 and TM3 cases with arbitrary carrier configuration. Table III summarizes the test results for this scenario. The 3-bit carrier configuration indicates the sub-channel configuration, with the active sub-channel being represented with binary digit one and dis-

abled channels with binary digit zero. The number of DPCH channels is set to 64 and 32 for TM1 and TM3, respectively.

VI. CONCLUSION

This paper has developed a viable solution for the digital front-end of multi-band transmitters on a reconfigurable platform. A new method for PAPR reduction is proposed, and a mathematical framework for the design and analysis of PAPR reduction algorithms for multi-band transmitters is provided. A tradeoff analysis for optimization of transmitter characteristics within the context of the EVM and ACLR quality metrics is discussed, and a design methodology is developed accordingly. Reconfigurable platforms, as described in this paper, offer a potential solution for low-cost, highly integrated implementation


TABLE III PERFORMANCE OF THE DIGITAL FRONT-END HARDWARE FOR W-CDMA TM1 AND TM3 SIGNALS AND FOR VARIOUS CARRIER CONFIGURATIONS

of multi-band transmitters. Our study also addresses the technical issues in realization of the PAPR reduction algorithm on Xilinx Virtex-4 FPGA platforms and includes the resource utilization and speed/area tradeoff in implementing the proposed methods.

APPENDIX

In order to compute the marginal pdf of the clipped signal, we integrate the pdf given in (7) with respect to one of the variables as

(15)

The first integral on the right-hand side of (15) can be written as

(16)

Using the change of variable …, the above integrals become

(17)

The second integral in (15) becomes

(18)

Substituting (17) and (18) into (15) results in (8).

ACKNOWLEDGMENT

The authors would like to thank Dr. Marvin Simon from Jet Propulsion Laboratory for his insightful comments.

REFERENCES

[1] H. Pretl et al., “Linearity considerations of W-CDMA front-ends for UMTS,” in Dig. IEEE Int. Microw. Symp., Jun. 2000, vol. 1, pp. 433–436.
[2] O. Vaananen et al., “Performance of EVM and PCDE quality metrics in WCDMA downlink,” Electron. Lett., vol. 38, no. 22, pp. 1386–1387, Oct. 2002.
[3] N. Lashkarian, H. Tarn, and C. Dick, “Peak-to-average power ratio reduction in multi-band transmitters; analysis, design and FPGA implementation,” in Proc. IEEE Global Commun. Conf., Nov. 28–Dec. 2, 2005, vol. 4, pp. 2169–2173.
[4] S. Kay, Fundamentals of Statistical Signal Processing. Englewood Cliffs, NJ: Prentice-Hall, 1993, vol. I.
[5] O. Vaananen et al., “Reducing the crest factor of a CDMA downlink signal by adding unused channelization codes,” IEEE Commun. Lett., vol. 6, no. 10, pp. 443–445, Oct. 2002.
[6] M. Pauli and P. Kuchenbecker, “On the reduction of the out-of-band radiation of OFDM-signals,” in Proc. IEEE Int. Conf. Commun., Jun. 1998, vol. 3, pp. 1304–1308.
[7] O. Vaananen, J. Vankka, and K. Halonen, “Effect of baseband clipping in wideband CDMA system,” in Proc. IEEE Int. Symp. Spread Spectrum Tech. Appl., Sep. 2002, vol. 2, pp. 445–449.
[8] H. Ochiai and H. Imai, “On the distribution of the peak-to-average ratio in OFDM signals,” IEEE Trans. Commun., vol. 49, no. 2, pp. 282–289, Feb. 2001.
[9] 3rd Generation Partnership Project; Technical Specification Group Radio Access Network; Base Station conformance testing, 3GPP TS 25.141, Jun. 2007 [Online]. Available: www.3gpp.org.
[10] M. Kosunen, J. Vankka, M. Waltari, and K. A. I. Halonen, “A multicarrier QAM modulator for WCDMA base-station with on-chip D/A converter,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 13, no. 2, pp. 181–190, Feb. 2005.
[11] J. Vankka, M. Kosunen, M. Sanchis, and K. A. I. Halonen, “A multicarrier QAM modulator,” IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 47, no. 1, pp. 1–10, Jan. 2000.
[12] N. Battson, High-Performance DSP Using Virtex-4 FPGAs. San Jose, CA: Xilinx Xcell Publications, 2005 [Online]. Available: http://www.xilinx.com/publications/books/dsp/index.htm
[13] S. C. Chan, K. M. Tsui, K. S. Yeung, and T. I. Yuk, “Design and complexity optimization of a new digital IF for software radio receivers with prescribed output accuracy,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 54, no. 2, pp. 351–366, Feb. 2007.
[14] P. P. Vaidyanathan, Multirate Systems and Filter Banks. Englewood Cliffs, NJ: Prentice-Hall, 1993.

Navid Lashkarian (S’98–M’00–SM’06) received the B.Sc. and M.Sc. degrees from the University of Tehran, Iran, and the Ph.D. degree from Oregon State University, Portland, all in electrical engineering, in 1990, 1992, and 1999, respectively. From 1992 to 1995, he was with the Faculty of Electrical and Computer Engineering, the University of Tehran. He has made fundamental contributions to the design and development of broadband wireless/wireline communication systems, such as asymmetric digital subscriber line (ADSL) modems and 3GPP WCDMA base-station and handset devices, at Motorola Inc., Centillium Communications, National Semiconductor, and Xilinx Inc. Currently, he is the Principal System Architect with Arraycomm LLC, San Jose, CA, where he leads the design and development of adaptive multi-antenna array processing technology for next-generation broadband WiMAX systems. His research

LASHKARIAN et al.: RECONFIGURABLE DIGITAL FRONT-END BS TRANSMITTERS

interests include statistical estimation and detection theory with applications in wireless communications and finance theory. He holds five U.S. and International patents and has published several publications in the areas of ADSL equalization, orthogonal frequency-division multiplexing synchronization and robust signal-to-noise ratio estimation in wireless fading channels. Dr. Lashkarian has served as the member of the Technical Program Committee of the IEEE Global Telecommunications Conference (GLOBECOM) and International Conference on Communications (ICC) conferences since 2000.

Edwin J. Hemphill received the B.S.E.E. degree from the University of Utah, Salt Lake City, in 1990, and the M.S.E.E. degree from the University of California, San Diego, in 1994. From 1990 to 1994, he worked for the Naval Ocean Systems Center, San Diego, CA, doing research in signal processing algorithms for undersea surveillance. From 1994 to 2000, he was with L-3 Communications, Salt Lake City, UT, working on the design, implementation, and testing of high data rate modems for satellite communications. He is currently with Xilinx Inc., San Jose, CA, where he has been involved in the development of high-performance turbo decoders and reference designs for digital radio.

Helen Tarn received the B.S. degree from National Tsing Hua University, Hsinchu, Taiwan, R.O.C., in 1998, and the M.S. degree from the University of Michigan, Ann Arbor, in 2000, both in electrical engineering. In 2001, she joined Xilinx Inc., San Jose, CA, where she is currently a Staff Systems Design Engineer in the DSP Division. During her tenure at Xilinx, she has worked on developing and implementing fast algorithms for various signal processing and digital communication applications on reconfigurable platforms. Her current focus is on high-performance digital front-end design for broadband wireless systems. She has authored and coauthored multiple technical papers and has six U.S. patents pending.

Hemang Parekh received the Bachelor’s degree in electronics from the Maharaja Sayajirao University, Baroda, India, in 1998, and the M.S. degree in computer engineering from the University of Kansas, Lawrence, in 2000. Since then, he has worked for Xilinx Inc., San Jose, CA, as a Signal Processing Engineer providing intellectual property solutions in wired and wireless communications using field-programmable gate arrays.

Chris Dick (M’04) received the Bachelor’s and Ph.D. degrees in computer science and electronic engineering from La Trobe University, Melbourne, Australia, in 1986 and 1996, respectively. He is the DSP Chief Scientist at Xilinx Inc., CA. He has worked with signal processing technology for two decades, and his work has spanned the commercial, military, and academic sectors. Prior to joining Xilinx in 1997, he was a professor at La Trobe University, Melbourne, Australia, for 13 years and managed a DSP consultancy called Signal Processing Solutions. He has been an invited speaker at many international signal processing symposiums and workshops and has authored more than 80 journal and conference publications, including many papers in the fields of parallel computing, inverse synthetic aperture radar (ISAR), field-programmable gate array (FPGA) implementation of wireless communication system PHYs, and the use of FPGAs for custom computing. His work and research interests are in the areas of fast algorithms for signal processing, digital communication, software-defined radios, VLSI architectures for DSP, adaptive signal processing, synchronization, hardware architectures for real-time signal processing, and the use of FPGAs for custom computing machines and real-time signal processing.