This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TCOMM.2017.2756882, IEEE Transactions on Communications

Hybridly-Connected Structure for Hybrid Beamforming in mmWave Massive MIMO Systems

Didi Zhang, Yafeng Wang, Senior Member, IEEE, Xuehua Li, and Wei Xiang, Senior Member, IEEE

Abstract—In this work, we propose a hybridly-connected structure for hybrid beamforming (HBF) in millimeter-wave (mmWave) massive MIMO systems, where the antenna arrays at the transmitter and receiver consist of multiple sub-arrays, each of which connects to multiple radio frequency (RF) chains, and each RF chain connects to all the antennas of its sub-array. In this structure, through successive interference cancellation (SIC), we decompose the precoding matrix optimization problem into multiple precoding sub-matrix optimization problems. Then, near-optimal hybrid digital and analog precoders are designed by factorizing the precoding sub-matrix for each sub-array. Furthermore, we compare the performance of the proposed hybridly-connected structure with the existing fully- and partially-connected structures in terms of spectral efficiency, the required number of phase shifters, and energy efficiency. Finally, simulation results demonstrate that the spectral efficiency of the hybridly-connected structure is better than that of the partially-connected structure and approaches that of the fully-connected structure as the number of RF chains increases. Moreover, the proposed algorithm for the hybridly-connected structure is capable of achieving higher energy efficiency than existing algorithms for the fully- and partially-connected structures.

Index Terms—MIMO, hybrid precoding, millimeter wave communications, spectral efficiency, energy efficiency.

I. INTRODUCTION

In order to meet the dramatic improvements in capacity required of the upcoming fifth generation (5G) system, new wireless techniques have been widely investigated, such as physical layer techniques [1], [2], network densification [3], [4], and so on. Nevertheless, the problem of spectrum scarcity in current cellular systems becomes more and more severe. As such, the allocation of new spectral resources for commercial wireless systems is of paramount importance [5], [6]. Millimeter-wave (mmWave) communications around and above 30 GHz can achieve multigigabit data rates for indoor communications [7]. More recently, attention has been paid to the use of mmWave communications for backhaul

Manuscript received February 15, 2017; revised June 30, 2017 and September 7, 2017; accepted September 15, 2017. This work was supported in part by the Ministry of Education and China Mobile Joint Scientific Research Fund under Grant No. MCM20150101, in part by the National Natural Sciences Foundation of China under Grant 61628102. The associate editor coordinating the review of this paper and approving it for publication was V. Raghavan. (Corresponding author: Yafeng Wang and Wei Xiang) Didi Zhang and Yafeng Wang are with the School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China (e-mail: [email protected]; [email protected]). Xuehua Li is with the School of Information and Communication Engineering, Beijing Information Science and Technology University, Beijing 100192, China (e-mail: [email protected]). Wei Xiang is with the College of Science and Engineering, James Cook University, Cairns, QLD 4878, Australia (e-mail: [email protected]).

networking between cells and mobile access within a cell [8], [9]. Meanwhile, advances in electronic components and larger unlicensed spectra motivate the wireless industry to consider mmWave as a prime candidate for outdoor cellular communications in 5G systems [10]-[12].

However, compared with current cellular systems, mmWave communications have much higher carrier frequencies. It is well known that the higher the carrier frequency, the higher the propagation path loss experienced in wireless communications [13]. Fortunately, the shorter wavelength of mmWave signals makes it possible to place a very large number of antennas in a much smaller region. Large antenna arrays are able to provide highly directional beamforming gains via precoding, which helps overcome the propagation path loss and increase the link reliability. Moreover, larger antenna arrays can transmit multiple streams via spatial multiplexing, which helps improve spectral efficiency. However, fully digital beamforming (DBF) requires as many radio frequency (RF) chains as antennas, so with a growing number of transmit antennas it becomes unrealistic in terms of cost, complexity, thermal overshoot, and implementation within a small form factor at the UE side. As such, hybrid beamforming (HBF) is considered a promising solution to reduce the number of required RF chains [14], [15].

To date, researchers' attention has focused on two hybrid beamforming structures, namely, the fully-connected structure [16]-[22] and the partially-connected structure [23]-[28], as shown in Fig. 1(a) and Fig. 1(b), respectively. In [17], the orthogonal matching pursuit (OMP) algorithm is proposed to design the analog precoder by choosing each column of the analog precoding matrix from the candidate array response vectors. Therefore, the design of the OMP-based hybrid analog precoder can be viewed as a spatially sparse precoding problem, which implies that increasing the spatial resolution helps improve the spectral efficiency of the system [20]. It is evident that the computational complexity is proportional to the spatial resolution. Hence, research has shifted to reducing the computational complexity of the OMP method [19], [22]. In consideration of the hardware implementation complexity of the fully-connected structure, the partially-connected structure is proposed in [23]. In this structure, the analog precoding matrix is block diagonal, and each block corresponds to the precoding of a sub-array, resulting in independent precoding for each sub-array. Leveraging this property, a hybrid digital and analog precoding structure based on successive interference cancellation (SIC) is proposed in [24], [25]. Most precoder designs in the above algorithms are based on singular vectors. Given that the singular-vector-based precoder structure is sensitive

0090-6778 (c) 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.


to small changes in path length [29], [30], the directional beamforming method is proposed in [30]-[33]. However, these algorithms incur some performance loss compared with the algorithms based on singular vectors.

More recently, attention has shifted to the energy efficiency of different HBF structures in mmWave MIMO systems [25], [28]. In [25] and [28], the spectral efficiency and energy efficiency are compared between different HBF structures. Those comparison results show that the partially-connected structure outperforms the fully-connected structure in energy efficiency when the number of RF chains satisfies certain conditions, and underperforms in spectral efficiency. These results motivate us to design hybrid digital and analog precoders that strike a tradeoff between spectral efficiency and energy efficiency. To this end, we propose the hybridly-connected structure, where the design degrees of freedom are much more flexible in the analog domain. Moreover, near-optimal precoder designs are presented in Section IV. Our major contributions are summarized as follows:

• We propose a hybridly-connected structure for mmWave massive MIMO systems, which features a lower hardware complexity compared with the fully-connected structure. On the other hand, the design degrees of freedom in the analog domain are more flexible in comparison with the partially-connected structure;
• For the proposed hybridly-connected structure, we propose a matrix factorization based near-optimal design for the hybrid digital and analog precoders. Meanwhile, the proposed algorithm is also applicable to the fully- and partially-connected structures, since those two structures are special cases of their hybridly-connected counterpart;
• For the proposed hybridly-connected structure, we also study the effects of various parameters, such as the number of sub-arrays and the number of RF chains, on the performance of our design. Simulation results demonstrate the superiority of the proposed design.

It is worth noting that, to the best of the authors' knowledge, our work is the first to consider the design of HBF using the hybridly-connected structure in mmWave MIMO systems.

The remainder of the paper is organized as follows. The system model of the mmWave massive MIMO system is presented in Section II. The hybrid precoding matrix in the hybridly-connected structure is elaborated in Section III. The design of the constrained hybrid digital and analog precoders is presented in Section IV. Simulation results on the spectral efficiency and energy efficiency of various HBF structures and the sensitivity of the proposed algorithm to channel estimation errors are presented in Section V. Finally, concluding remarks are drawn in Section VI.

Notations: Upper-case bold and lower-case bold letters represent matrices and vectors, respectively. $(\cdot)^T$, $(\cdot)^H$, $(\cdot)^\dagger$, and $(\cdot)^{-1}$ refer to the transpose, conjugate transpose, pseudo-inverse, and inverse of a matrix, respectively. $|\cdot|$ denotes the determinant of a matrix, and $\|\cdot\|_F$ indicates the Frobenius norm of a matrix.

II. SYSTEM MODEL

In this section, three structures for hybrid precoding in mmWave MIMO systems are illustrated in Fig. 1. The fully-connected structure is shown in Fig. 1(a), where the transmitter is equipped with $N_R$ RF chains and $N_t$ antennas. The partially-connected structure is presented in Fig. 1(b), where the transmitter is equipped with $N_R$ RF chains and $N_R M$ antennas, and each sub-array is connected to one RF chain. The hybridly-connected structure is illustrated in Fig. 1(c), where the transmitter is equipped with $D$ sub-arrays, $SD$ RF chains and $DN$ antennas, and each sub-array is connected to $S$ RF chains. Moreover, the number of RF chains at the transmitter satisfies $N_s \le SD \le N_t$. In order to assess the performance of these hybrid precoding structures, it is assumed that all the transmitters have the same number of antennas (i.e., $N_t = N_R M = DN$), and transmit $N_s$ independent data streams to the receiver with $N_r$ receive antennas. To date, research efforts have focused on the fully- and partially-connected structures, whereas the hybridly-connected structure has not been studied in the literature. This paper aims to investigate the hybridly-connected structure.

In the hybridly-connected structure, the digital precoder $\mathbf{F}_B$ first assigns $N_s$ independent data streams to different RF chains, then the analog precoder $\mathbf{F}_R$ maps the processed signals to the corresponding sub-array antennas. Moreover, $\mathbf{F}_B$ is an $SD \times N_s$ matrix, while $\mathbf{F}_R$ is of dimension $N_t \times SD$. Therefore, the received signal vector $\mathbf{y} = [y_1, y_2, \cdots, y_{N_r}]^T$ at the receiver can be written as

$$\mathbf{y} = \sqrt{\rho}\,\mathbf{H}\mathbf{F}_R\mathbf{F}_B\mathbf{s} + \mathbf{n} = \sqrt{\rho}\,\mathbf{H}\mathbf{F}\mathbf{s} + \mathbf{n}, \tag{1}$$

where $\rho$ indicates the average received power, $\mathbf{H} \in \mathbb{C}^{N_r \times N_t}$ stands for the channel matrix, $\mathbf{s}$ is the $N_s \times 1$ data stream vector such that $\mathbb{E}[\mathbf{s}\mathbf{s}^H] = \frac{1}{N_s}\mathbf{I}_{N_s}$, $\mathbf{n}$ is the noise vector that follows a complex Gaussian distribution, i.e., $\mathbf{n} \sim \mathcal{CN}(0, \sigma^2)$, and the hybrid precoders $\mathbf{F}_R\mathbf{F}_B$ should satisfy $\|\mathbf{F}_R\mathbf{F}_B\|_F^2 \le N_s$ to meet the total transmit power constraint.
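As a rough illustration of the signal model in (1), the following Python sketch builds a block-diagonal analog precoder and a digital precoder with the dimensions stated above and scales them to meet the power constraint with equality. The random phases, the $1/\sqrt{N}$ scaling of the analog entries, the placeholder i.i.d. Gaussian channel, and all numeric values are assumptions made only for this example; the function name `hybrid_precoder` is hypothetical.

```python
import numpy as np

def hybrid_precoder(D, S, N, Ns, rng):
    """Random hybridly-connected pair (F_R, F_B) with ||F_R F_B||_F^2 = Ns."""
    Nt = D * N
    F_R = np.zeros((Nt, S * D), dtype=complex)
    for i in range(D):
        # Sub-array i: its N antennas connect only to its own S RF chains.
        phases = rng.uniform(0.0, 2.0 * np.pi, size=(N, S))
        F_R[i * N:(i + 1) * N, i * S:(i + 1) * S] = np.exp(1j * phases) / np.sqrt(N)
    F_B = rng.standard_normal((S * D, Ns)) + 1j * rng.standard_normal((S * D, Ns))
    # Scale the digital precoder so the power constraint holds with equality.
    F_B *= np.sqrt(Ns) / np.linalg.norm(F_R @ F_B, "fro")
    return F_R, F_B

rng = np.random.default_rng(0)
D, S, N, Ns, Nr, rho, sigma2 = 4, 2, 16, 8, 16, 1.0, 0.1
F_R, F_B = hybrid_precoder(D, S, N, Ns, rng)

# Placeholder channel (assumption): i.i.d. complex Gaussian entries.
H = (rng.standard_normal((Nr, D * N)) + 1j * rng.standard_normal((Nr, D * N))) / np.sqrt(2)
# Data vector with E[s s^H] = I / Ns and noise with variance sigma2 per entry.
s = (rng.standard_normal(Ns) + 1j * rng.standard_normal(Ns)) / np.sqrt(2 * Ns)
n = np.sqrt(sigma2 / 2) * (rng.standard_normal(Nr) + 1j * rng.standard_normal(Nr))

y = np.sqrt(rho) * H @ F_R @ F_B @ s + n  # received signal, eq. (1)
```

In this sketch the number of non-zero entries of `F_R` is $DNS$, which reflects that each RF chain reaches only the $N$ antennas of its own sub-array.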
In this paper, we assume that the channel state information (CSI) is perfectly known at both the transmitter and receiver. In practical systems, CSI at the receiver can be obtained via channel estimation [34]-[36], and is shared with the transmitter in a timely manner via an effective feedback strategy [37]-[39]. Considering that mmWave channels have a sparse scattering structure [40]-[42], in this paper we adopt a narrow-band channel model, i.e., the extended Saleh-Valenzuela model [43]. The channel matrix $\mathbf{H}$ can be expressed as

$$\mathbf{H} = \gamma \sum_{l=1}^{L} \alpha_l\, \mathbf{a}_r(\phi_{r,l})\, \mathbf{a}_t(\phi_{t,l})^H, \tag{2}$$

where $\gamma = \sqrt{\frac{N_t N_r}{L}}$ is a normalization factor that satisfies $\mathbb{E}\left[\|\mathbf{H}\|_F^2\right] = N_t N_r$, $L$ represents the number of scattering paths, and $\alpha_l$ stands for the complex gain of the $l$-th path. We assume that $\alpha_l$ follows the complex Gaussian distribution $\mathcal{CN}(0, 1)$. Finally, $\mathbf{a}_t(\phi_{t,l})$ and $\mathbf{a}_r(\phi_{r,l})$ are the antenna array response vectors at the transmitter and receiver, where $\phi_{r,l}$ and $\phi_{t,l}$ are the $l$-th path's azimuth angle of arrival (AoA) and angle of departure (AoD), respectively. For the uniform linear


Fig. 1: Three HBF structures in mmWave MIMO systems: (a) Fully-connected structure, where each RF chain is connected to all BS antennas; (b) Partially-connected structure, where each sub-array is connected to only a single RF chain; and (c) Hybridly-connected structure, where each sub-array is connected to multiple RF chains, and each RF chain is connected to all the antennas corresponding to the sub-array in question.

array (ULA), the elevation AoA and AoD are $\frac{\pi}{2}$, which can be ignored. Assuming that the antennas of the ULA are deployed along the y-axis at the transmitter and receiver, the array steering vectors $\mathbf{a}_t(\phi_{t,l})$ and $\mathbf{a}_r(\phi_{r,l})$ are given as [44]

$$\mathbf{a}_t(\phi_{t,l}) = \frac{1}{\sqrt{N_t}}\left[1, e^{jkd_t\sin(\phi_{t,l})}, \ldots, e^{jkd_t(N_t-1)\sin(\phi_{t,l})}\right]^T, \tag{3}$$

$$\mathbf{a}_r(\phi_{r,l}) = \frac{1}{\sqrt{N_r}}\left[1, e^{jkd_r\sin(\phi_{r,l})}, \ldots, e^{jkd_r(N_r-1)\sin(\phi_{r,l})}\right]^T, \tag{4}$$

where $k = \frac{2\pi}{\lambda}$, $\lambda$ is the signal wavelength, and $d_t$ and $d_r$ indicate the spacing of two adjacent ULA elements at the transmitter and receiver, respectively.
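The channel model in (2)-(4) can be sketched numerically as follows. This is a minimal sketch, assuming half-wavelength element spacing and azimuth angles drawn uniformly from $[0, 2\pi)$; neither choice is specified in the text above, and the function names `ula_response` and `sv_channel` are hypothetical.

```python
import numpy as np

def ula_response(n_ant, phi, d_over_lambda=0.5):
    """ULA steering vector of eqs. (3)-(4); note k*d = 2*pi*(d/lambda)."""
    idx = np.arange(n_ant)
    return np.exp(1j * 2.0 * np.pi * d_over_lambda * idx * np.sin(phi)) / np.sqrt(n_ant)

def sv_channel(Nr, Nt, L, rng):
    """Narrow-band channel of eq. (2) with L scattering paths."""
    # Path gains alpha_l ~ CN(0, 1).
    alpha = (rng.standard_normal(L) + 1j * rng.standard_normal(L)) / np.sqrt(2)
    # Assumption: AoAs and AoDs are uniform on [0, 2*pi).
    phi_r = rng.uniform(0.0, 2.0 * np.pi, L)
    phi_t = rng.uniform(0.0, 2.0 * np.pi, L)
    H = np.zeros((Nr, Nt), dtype=complex)
    for l in range(L):
        a_r = ula_response(Nr, phi_r[l])
        a_t = ula_response(Nt, phi_t[l])
        H += alpha[l] * np.outer(a_r, a_t.conj())  # a_r a_t^H
    return np.sqrt(Nt * Nr / L) * H  # normalization factor gamma

rng = np.random.default_rng(0)
H = sv_channel(8, 16, 3, rng)
```

Because the steering vectors have unit norm and the path gains have unit variance, averaging $\|\mathbf{H}\|_F^2$ over many draws of `sv_channel` should come out close to $N_t N_r$, in line with the normalization stated after (2).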

III. HYBRID PRECODING MATRIX FOR THE HYBRIDLY-CONNECTED STRUCTURE

This section elaborates on the structure of the hybrid precoding matrix $\mathbf{F}$ in the hybridly-connected structure as shown in Fig. 1(c). It is known that the size of the hybrid precoding matrix $\mathbf{F}$ depends on the number of transmit data streams $N_s$ and the number of antennas $N_t$ at the transmitter. When both $N_s$ and $N_t$ are known a priori, the dimension of the hybrid precoding matrix $\mathbf{F}$ is fixed, i.e., $\mathbf{F} \in \mathbb{C}^{N_t \times N_s}$. For the hybridly-connected structure, the analog precoding matrix $\mathbf{F}_R$ is a block diagonal matrix, which means that the analog precoding is independent for each sub-array. Moreover, the digital precoding matrix $\mathbf{F}_B$ is an $SD \times N_s$ matrix, whose numbers of rows and columns correspond to the numbers of RF chains and streams, respectively. Therefore, the mapping relationship between the streams and the $i$th sub-array depends on the $i$th $S \times N_s$ sub-matrix of the digital precoding matrix. It can be inferred from $\mathbf{F} = \mathbf{F}_R\mathbf{F}_B$ that the precoding for the $i$th sub-array depends on the $i$th $N \times N_s$ sub-matrix of the hybrid precoding matrix $\mathbf{F}$. Therefore, $\mathbf{F}$ can be expressed as

$$\mathbf{F} = \begin{bmatrix} \mathbf{f}_{11} & \mathbf{f}_{12} & \cdots & \mathbf{f}_{1N_s} \\ \mathbf{f}_{21} & \mathbf{f}_{22} & \cdots & \mathbf{f}_{2N_s} \\ \vdots & \vdots & \cdots & \vdots \\ \mathbf{f}_{D1} & \mathbf{f}_{D2} & \cdots & \mathbf{f}_{DN_s} \end{bmatrix}, \tag{5}$$

where $\mathbf{f}_{ij} \in \mathbb{C}^{N \times 1}$ indicates the precoding for the $j$th stream transmitted by the $i$th sub-array. If $\mathbf{f}_{ij} = \mathbf{0}$, the $j$th stream is not transmitted by the $i$th sub-array. Thus, the precoding for the $i$th sub-array can be expressed as $\mathbf{F}_{i,\mathrm{sub}} = [\mathbf{f}_{i1}, \mathbf{f}_{i2}, \cdots, \mathbf{f}_{iN_s}]$, and the precoding for the $j$th stream can be represented by $\mathbf{F}_{(:,j)} = \left[\mathbf{f}_{1j}^T, \mathbf{f}_{2j}^T, \cdots, \mathbf{f}_{Dj}^T\right]^T$.

Based on the above discussion, the structure of the hybrid precoding matrix $\mathbf{F}$ in the hybridly-connected structure depends on three factors, namely the number of RF chains of each sub-array, the number of sub-arrays, and the mapping method of the streams in such a structure. In order to simplify the subsequent expressions, we put forward two assumptions on the allocation of the streams that simplify the structure of $\mathbf{F}$:

Assumption 1: The number of data streams transmitted by each sub-array is equal to the number of RF chains connected to each sub-array; and

Assumption 2: The data streams are allocated to adjacent sub-arrays on a priority basis, which means each data stream is transmitted by adjacent sub-arrays.

It is well known that a digital precoder can adjust the number of data streams transmitted by each sub-array according to the actual requirements of the system. Denote by $N_{si}$ the number of data streams transmitted by the $i$th sub-array. Next we use two specific cases to elaborate on the rationality of Assumption 1. Case 1: $S = N_s$: In this case, the actually required number of RF chains for each sub-array is equal to the number of data streams allocated by the digital precoder. This is because when the number of data streams transmitted by each sub-array is constant, increasing the number of RF chains of each


sub-array has little effect on spectral efficiency, but will reduce the energy efficiency of the system. Case 2: S Ns . In this case, the hybrid digital precoder $\mathbf{F}_B$ is an irregular matrix. Then, since $\mathbf{F} = \mathbf{F}_R\mathbf{F}_B$, the column structure of $\mathbf{F}$ is similar to that of $\mathbf{F}_B$, which can also be divided into two cases. When $N_s = DS$, $\mathbf{F}$ is a block diagonal matrix. When $DS > N_s$, $\mathbf{F}$ is an irregular matrix. To further illustrate the structure of the hybrid precoder $\mathbf{F}$ in the hybridly-connected structure, we assume $N_s = 8$ and give a specific example as follows.

1) Case 1: $N_s = DS$: When $N_s = DS = 8$, we have $D \in \{1, 2, 4, 8\}$. Note that when $D = 1$ and $S = 8$, the hybridly-connected structure is the same as the fully-connected structure. Moreover, when $D = 8$ and $S = 1$, the hybridly-connected structure is identical to the partially-connected structure. That is, the fully- and partially-connected structures are special cases of the hybridly-connected structure. Therefore, we focus on the situations of $D = 2$ and $D = 4$. As discussed above, we know that when $D = 2$ and $D = 4$, the hybrid digital and analog precoders $\mathbf{F}_B$ and $\mathbf{F}_R$ are both block diagonal matrices. For example, if $D = 4$, then we have $S = 2$. That is, each sub-array transmits two independent data streams. The hybrid digital precoding matrix can be written as

$$\mathbf{F}_B = \begin{bmatrix} \mathbf{f}_{B,11} & \mathbf{f}_{B,12} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{f}_{B,23} & \mathbf{f}_{B,24} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{f}_{B,35} & \mathbf{f}_{B,36} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{f}_{B,47} & \mathbf{f}_{B,48} \end{bmatrix}, \tag{6}$$

where $\mathbf{f}_{B,ij} \in \mathbb{C}^{2 \times 1}$. Since $\mathbf{F} = \mathbf{F}_R\mathbf{F}_B$, we know that $\mathbf{F}$ is also a block diagonal matrix as shown in Fig. 2(a), where $\mathbf{f}_{ij} = \mathbf{F}_{R,i}\mathbf{f}_{B,ij}$ and $\mathbf{f}_{ij} \in \mathbb{C}^{N \times 1}$.
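The block pattern described for $D = 4$, $S = 2$ can be checked numerically. The sketch below is only an illustration under assumed values: the sub-array size $N = 4$ and the random non-zero entries are arbitrary, and the variable `support` (which sub-array transmits which stream) is a name introduced here.

```python
import numpy as np

D, S, N, Ns = 4, 2, 4, 8
rng = np.random.default_rng(1)

# Block-diagonal analog precoder: sub-array i uses only its own S RF chains.
F_R = np.zeros((D * N, D * S), dtype=complex)
for i in range(D):
    phases = rng.uniform(0.0, 2.0 * np.pi, (N, S))
    F_R[i * N:(i + 1) * N, i * S:(i + 1) * S] = np.exp(1j * phases)

# Digital precoder with the zero pattern of eq. (6):
# the RF chains of sub-array i carry streams 2i-1 and 2i only.
F_B = np.zeros((D * S, Ns), dtype=complex)
for i in range(D):
    block = rng.standard_normal((S, S)) + 1j * rng.standard_normal((S, S))
    F_B[i * S:(i + 1) * S, i * S:(i + 1) * S] = block

F = F_R @ F_B

# support[i, j] is True when sub-array i transmits stream j (f_ij != 0).
support = np.array([
    [np.any(np.abs(F[i * N:(i + 1) * N, j]) > 1e-12) for j in range(Ns)]
    for i in range(D)
])
expected = np.kron(np.eye(D), np.ones((1, S))).astype(bool)
print(np.array_equal(support, expected))
```

With these inputs the support of $\mathbf{F}$ matches the block-diagonal pattern, since each block $\mathbf{f}_{ij} = \mathbf{F}_{R,i}\mathbf{f}_{B,ij}$ vanishes exactly where $\mathbf{f}_{B,ij}$ does.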

2) Case 2: $N_s < DS$: For this case, we assume $D = 4$ and $S \in \{3, 4, 5, 6, 7, 8\}$. In this situation, it is known from the above discussions that the analog precoder $\mathbf{F}_R$ is a block diagonal matrix, i.e., $\mathbf{F}_R = \mathrm{diag}(\mathbf{F}_{R,1}, \mathbf{F}_{R,2}, \mathbf{F}_{R,3}, \mathbf{F}_{R,4})$, and the digital precoder $\mathbf{F}_B$ is an irregular matrix. It can be inferred from $\mathbf{F} = \mathbf{F}_R\mathbf{F}_B$ that $\mathbf{F}$ is an irregular matrix. For example, if $S = 6$, meaning each sub-array transmits six independent data streams, one example of the digital precoding matrix $\mathbf{F}_B$ is given as

$$\mathbf{F}_B = \begin{bmatrix} \mathbf{f}_{B,11} & \mathbf{f}_{B,12} & \mathbf{f}_{B,13} & \mathbf{f}_{B,14} & \mathbf{f}_{B,15} & \mathbf{f}_{B,16} & \mathbf{0} & \mathbf{0} \\ \mathbf{f}_{B,21} & \mathbf{f}_{B,22} & \mathbf{f}_{B,23} & \mathbf{f}_{B,24} & \mathbf{f}_{B,25} & \mathbf{f}_{B,26} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{f}_{B,33} & \mathbf{f}_{B,34} & \mathbf{f}_{B,35} & \mathbf{f}_{B,36} & \mathbf{f}_{B,37} & \mathbf{f}_{B,38} \\ \mathbf{0} & \mathbf{0} & \mathbf{f}_{B,43} & \mathbf{f}_{B,44} & \mathbf{f}_{B,45} & \mathbf{f}_{B,46} & \mathbf{f}_{B,47} & \mathbf{f}_{B,48} \end{bmatrix}, \tag{7}$$

where $\mathbf{f}_{B,ij} \in \mathbb{C}^{6 \times 1}$. According to $\mathbf{F} = \mathbf{F}_R\mathbf{F}_B$, the structure of $\mathbf{F}$ is shown in Fig. 2(c). Moreover, when $S = 4$ and $S = 8$, the structure of $\mathbf{F}$ is shown in Fig. 2(b) and Fig. 2(d), respectively.

IV. DIGITAL AND ANALOG PRECODERS DESIGN FOR HYBRIDLY-CONNECTED STRUCTURE

In this section, we describe the design process of the digital and analog precoding matrices $\mathbf{F}_B$ and $\mathbf{F}_R$ for the hybridly-connected structure as shown in Fig. 1(c). Firstly, according to the structure of the hybrid precoding matrix, we obtain the unconstrained hybrid precoding matrix using SIC. Then, we design the hybrid digital and analog precoding matrices for each sub-array according to the corresponding factorization of the hybrid precoding sub-matrix.

A. SIC-based Unconstrained Hybrid Precoding Matrix Design

In this subsection, we design the unconstrained hybrid precoding matrix $\mathbf{F}$ by maximizing the total achievable rate of the mmWave MIMO system. As discussed in Section II, the total achievable rate of the hybridly-connected mmWave MIMO system as shown in Fig. 1(c) can be expressed as

$$C = \log_2\left(\left|\mathbf{I}_{N_r} + \frac{\rho}{N_s\sigma^2}\mathbf{H}\mathbf{F}\mathbf{F}^H\mathbf{H}^H\right|\right), \tag{8}$$

where the hybrid precoding matrix $\mathbf{F}$ is attainable through solving

$$\mathbf{F}^{\mathrm{opt}} = \arg\max_{\mathbf{F}} \log_2\left(\left|\mathbf{I}_{N_r} + \frac{\rho}{N_s\sigma^2}\mathbf{H}\mathbf{F}\mathbf{F}^H\mathbf{H}^H\right|\right). \tag{9}$$

As discussed in Section III, $\mathbf{F}$ is a block diagonal matrix if $N_s = DS$, while $\mathbf{F}$ is an irregular matrix if $N_s < DS$. Meanwhile, the inequality $\|\mathbf{F}\|_F^2 \le N_s$ should be satisfied to meet the total transmit power constraint. Unfortunately, these non-convex constraints make the optimization problem (9) computationally intractable. Since each column of $\mathbf{F}$ represents the precoding vector of one stream, we consider decomposing the optimization problem (9) into multiple sub-rate optimization problems. Moreover, considering that different streams are transmitted by the same multiple sub-arrays, we decompose $\mathbf{F}$ into multiple sub-matrices. For example, when $D = 4$ and $S = 6$, the structure of $\mathbf{F}$ is shown in Fig. 2(c). As can be observed from the figure, the first and second streams


Fig. 2: Structure and partition of the hybrid precoding matrix F for the hybridly-connected structure, where D = 4 and S = 2, 4, 6, 8.

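are only transmitted by the first and second sub-arrays, the third to sixth streams are transmitted by all the sub-arrays, and the seventh and eighth streams are only transmitted by the third and fourth sub-arrays. Therefore, matrix $\mathbf{F}$ can be partitioned into three sub-matrices as shown in Fig. 2(c). Moreover, the partition results of the hybrid precoding matrix for $S \in \{2, 4, 8\}$ are presented in Fig. 2(a), (b) and (d).

According to the above discussions, we divide the hybrid precoding matrix into $K$ sub-matrices $\mathbf{F} = [\mathbf{F}_1, \mathbf{F}_2, \ldots, \mathbf{F}_K]$ with $\mathbf{F}_k \in \mathbb{C}^{N_t \times s_k}$, where $s_k$ is the number of streams precoded by the $k$th sub-matrix. Then $\mathbf{F}$ can be further expressed as $\mathbf{F} = \left[\hat{\mathbf{F}}_{K-1}\ \mathbf{F}_K\right]$, where $\hat{\mathbf{F}}_{K-1}$ denotes the first $K-1$ sub-matrices of $\mathbf{F}$. Furthermore, the total achievable rate $C$ in (8) can be rewritten as

$$
\begin{aligned}
C &= \log_2\left(\left|\mathbf{I}_{N_r} + \frac{\rho}{N_s\sigma^2}\mathbf{H}\mathbf{F}\mathbf{F}^H\mathbf{H}^H\right|\right) \\
&= \log_2\left(\left|\mathbf{I}_{N_r} + \frac{\rho}{N_s\sigma^2}\mathbf{H}\left[\hat{\mathbf{F}}_{K-1}\ \mathbf{F}_K\right]\left[\hat{\mathbf{F}}_{K-1}\ \mathbf{F}_K\right]^H\mathbf{H}^H\right|\right) \\
&= \log_2\left(\left|\mathbf{I}_{N_r} + \frac{\rho}{N_s\sigma^2}\left(\mathbf{H}\hat{\mathbf{F}}_{K-1}\hat{\mathbf{F}}_{K-1}^H\mathbf{H}^H + \mathbf{H}\mathbf{F}_K\mathbf{F}_K^H\mathbf{H}^H\right)\right|\right) \\
&\overset{(a)}{=} \log_2\left(\left|\mathbf{R}_{K-1}\right|\right) + \log_2\left(\left|\mathbf{I}_{N_r} + \frac{\rho}{N_s\sigma^2}\mathbf{R}_{K-1}^{-1}\mathbf{H}\mathbf{F}_K\mathbf{F}_K^H\mathbf{H}^H\right|\right) \\
&\overset{(b)}{=} \sum_{k=1}^{K}\log_2\left(\left|\mathbf{I}_{N_r} + \frac{\rho}{N_s\sigma^2}\mathbf{R}_{k-1}^{-1}\mathbf{H}\mathbf{F}_k\mathbf{F}_k^H\mathbf{H}^H\right|\right) \\
&\overset{(c)}{=} \sum_{k=1}^{K}\log_2\left(\left|\mathbf{I}_{s_k} + \frac{\rho}{N_s\sigma^2}\mathbf{F}_k^H\mathbf{H}^H\mathbf{R}_{k-1}^{-1}\mathbf{H}\mathbf{F}_k\right|\right),
\end{aligned}
\tag{10}
$$

where $\mathbf{R}_0 = \mathbf{I}_{N_r}$ and $\mathbf{R}_{k-1} = \mathbf{I}_{N_r} + \frac{\rho}{N_s\sigma^2}\mathbf{H}\hat{\mathbf{F}}_{k-1}\hat{\mathbf{F}}_{k-1}^H\mathbf{H}^H$. Step (a) follows from the fact that $|\mathbf{X}\mathbf{Y}| = |\mathbf{X}||\mathbf{Y}|$ with $\mathbf{X} = \mathbf{R}_{K-1}$ and $\mathbf{Y} = \mathbf{I}_{N_r} + \frac{\rho}{N_s\sigma^2}\mathbf{R}_{K-1}^{-1}\mathbf{H}\mathbf{F}_K\mathbf{F}_K^H\mathbf{H}^H$. Note that $\log_2(|\mathbf{R}_{K-1}|)$ has the same form as (8), so it can be decomposed in the same way. Therefore, step (b) is the result of $K-1$ iterations. Step (c) follows from the fact that $|\mathbf{I} + \mathbf{X}\mathbf{Y}| = |\mathbf{I} + \mathbf{Y}\mathbf{X}|$ with $\mathbf{X} = \mathbf{R}_{k-1}^{-1}\mathbf{H}\mathbf{F}_k$ and $\mathbf{Y} = \mathbf{F}_k^H\mathbf{H}^H$.

As can be observed from (10), the total achievable rate is the sum of the sub-rates of all the streams, which implies that the precoding matrix optimization problem (9) can be decomposed into a series of precoding sub-matrix optimization problems. Motivated by the idea of SIC, we can first optimize the achievable sub-rate of the first $s_1$ streams and update matrix $\mathbf{R}_1$. This means that the phase shifters connected to the RF chains corresponding to the first $s_1$ streams are computed, and the effect of the first $s_1$ streams on the other streams can be nullified by $\mathbf{R}_1$. Then, similar operations can be performed to optimize the achievable sub-rate of the next $s_2$ streams and update matrix $\mathbf{R}_2$. This process is repeated until the last $s_K$ streams are considered. It follows from (10) that the precoding sub-matrix $\mathbf{F}_k$ can be obtained by solving

$$\mathbf{F}_k^{\mathrm{opt}} = \arg\max_{\mathbf{F}_k} \log_2\left(\left|\mathbf{I}_{s_k} + \frac{\rho}{N_s\sigma^2}\mathbf{F}_k^H\mathbf{Q}_{k-1}\mathbf{F}_k\right|\right), \tag{11}$$

where $\mathbf{Q}_{k-1} = \mathbf{H}^H\mathbf{R}_{k-1}^{-1}\mathbf{H}$ is an $N_t \times N_t$ Hermitian matrix. We assume that the $s_k$ streams precoded by $\mathbf{F}_k$ are transmitted by the $n_k$th to $m_k$th sub-arrays. Then, the hybrid precoding sub-matrix $\mathbf{F}_k$ can be written as $\mathbf{F}_k = \left[\mathbf{0}\ \ \tilde{\mathbf{F}}_k^T\ \ \mathbf{0}\right]^T$, where $\tilde{\mathbf{F}}_k \in \mathbb{C}^{(m_k - n_k + 1)N \times s_k}$. Therefore, the optimization problem of (11) can be transformed into

$$\tilde{\mathbf{F}}_k^{\mathrm{opt}} = \arg\max_{\tilde{\mathbf{F}}_k} \log_2\left(\left|\mathbf{I}_{s_k} + \frac{\rho}{N_s\sigma^2}\tilde{\mathbf{F}}_k^H\tilde{\mathbf{Q}}_{k-1}\tilde{\mathbf{F}}_k\right|\right), \tag{12}$$

where $\tilde{\mathbf{Q}}_{k-1}$ is an $(m_k - n_k + 1)N \times (m_k - n_k + 1)N$ Hermitian matrix formed as a sub-matrix of $\mathbf{Q}_{k-1}$ by taking the $((n_k - 1)N + 1)$th to the $(m_k N)$th rows and columns of $\mathbf{Q}_{k-1}$. Let us define the singular value decomposition (SVD) of matrix $\tilde{\mathbf{Q}}_{k-1}$ as $\tilde{\mathbf{Q}}_{k-1} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{U}^H$, where $\mathbf{U}$ is a unitary matrix and $\boldsymbol{\Sigma}$ is a diagonal matrix of the singular values arranged in decreasing order. Then, matrices $\boldsymbol{\Sigma}$ and $\mathbf{U}$ can be partitioned as

$$\boldsymbol{\Sigma} = \begin{bmatrix} \boldsymbol{\Sigma}_1 & \mathbf{0} \\ \mathbf{0} & \boldsymbol{\Sigma}_2 \end{bmatrix}, \quad \mathbf{U} = [\mathbf{U}_1\ \mathbf{U}_2], \tag{13}$$

where $\boldsymbol{\Sigma}_1$ is an $s_k \times s_k$ diagonal matrix, and $\mathbf{U}_1$ is an $(m_k - n_k + 1)N \times s_k$ matrix. Therefore, the optimal unconstrained precoding matrix of (12) can be expressed as

$$\tilde{\mathbf{F}}_k^{\mathrm{opt}} = \mathbf{U}_1. \tag{14}$$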
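The SIC procedure in (10)-(14) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: for clarity it recomputes $\mathbf{Q}_{k-1} = \mathbf{H}^H\mathbf{R}_{k-1}^{-1}\mathbf{H}$ directly in every iteration, it uses an eigendecomposition in place of the SVD (the two coincide for a Hermitian positive semidefinite matrix), and the channel is a placeholder with i.i.d. Gaussian entries. The function name `sic_unconstrained` and the `groups` argument, a list of hypothetical tuples $(n_k, m_k, s_k)$, are introduced here for illustration.

```python
import numpy as np

def sic_unconstrained(H, N, groups, rho, sigma2):
    """groups: list of (n_k, m_k, s_k) with 1-based sub-array range and stream count."""
    Nr, Nt = H.shape
    Ns = sum(s for _, _, s in groups)
    c = rho / (Ns * sigma2)
    F = np.zeros((Nt, Ns), dtype=complex)
    R = np.eye(Nr, dtype=complex)                   # R_0
    sub_rates, col = [], 0
    for n_k, m_k, s_k in groups:
        rows = slice((n_k - 1) * N, m_k * N)
        Q = H.conj().T @ np.linalg.solve(R, H)      # Q_{k-1} = H^H R_{k-1}^{-1} H
        Q_sub = Q[rows, rows]
        Q_sub = (Q_sub + Q_sub.conj().T) / 2        # remove round-off asymmetry
        w, U = np.linalg.eigh(Q_sub)                # eigenvalues in ascending order
        U1 = U[:, ::-1][:, :s_k]                    # s_k dominant eigenvectors, eq. (14)
        F[rows, col:col + s_k] = U1
        HFk = H @ F[:, col:col + s_k]
        # Sub-rate of the k-th group, i.e., the k-th term of step (c) in (10).
        M = np.eye(s_k) + c * HFk.conj().T @ np.linalg.solve(R, HFk)
        sub_rates.append(np.linalg.slogdet(M)[1] / np.log(2))
        R = R + c * HFk @ HFk.conj().T              # R_k = R_{k-1} + c H F_k F_k^H H^H
        col += s_k
    return F, sub_rates

rng = np.random.default_rng(2)
D, N, Nr, rho, sigma2 = 4, 4, 8, 1.0, 0.1
H = (rng.standard_normal((Nr, D * N)) + 1j * rng.standard_normal((Nr, D * N))) / np.sqrt(2)
# Partition described for D = 4, S = 6: streams 1-2 on sub-arrays 1-2,
# streams 3-6 on all sub-arrays, streams 7-8 on sub-arrays 3-4.
groups = [(1, 2, 2), (1, 4, 4), (3, 4, 2)]
F, sub_rates = sic_unconstrained(H, N, groups, rho, sigma2)

Ns = F.shape[1]
G = np.eye(Nr) + rho / (Ns * sigma2) * H @ F @ F.conj().T @ H.conj().T
total = np.linalg.slogdet(G)[1] / np.log(2)         # total rate, eq. (8)
print(abs(total - sum(sub_rates)))                   # decomposition (10): difference near zero
```

The last line compares the total rate in (8) with the sum of the sub-rates, which is the identity stated by (10); it holds for any $\mathbf{F}$, so the check validates the decomposition rather than the optimality of the precoder.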

It is known from (14) that the method described above for deriving the $k$th precoding sub-matrix can be reused to optimize the $(k+1)$th precoding sub-matrix. Then, we can obtain $\mathbf{F}^{\mathrm{opt}}$ through $K$ iterations. As mentioned above, we need to obtain the sub-matrix $\tilde{\mathbf{Q}}_k$ by updating $\mathbf{Q}_k = \mathbf{H}^H\mathbf{R}_k^{-1}\mathbf{H}$ in each iteration, which is very


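complicated. This is because this process involves the multiplication and inversion of large-scale matrices. Next, we focus on simplifying the calculation of $\tilde{\mathbf{Q}}_k$. Firstly, $\mathbf{R}_k$ can be further expressed as

$$
\begin{aligned}
\mathbf{R}_k &= \mathbf{I}_{N_r} + \frac{\rho}{N_s\sigma^2}\mathbf{H}\hat{\mathbf{F}}_k\hat{\mathbf{F}}_k^H\mathbf{H}^H \\
&= \mathbf{I}_{N_r} + \frac{\rho}{N_s\sigma^2}\mathbf{H}\left[\hat{\mathbf{F}}_{k-1}\ \mathbf{F}_k\right]\left[\hat{\mathbf{F}}_{k-1}\ \mathbf{F}_k\right]^H\mathbf{H}^H \\
&= \mathbf{I}_{N_r} + \frac{\rho}{N_s\sigma^2}\left(\mathbf{H}\hat{\mathbf{F}}_{k-1}\hat{\mathbf{F}}_{k-1}^H\mathbf{H}^H + \mathbf{H}\mathbf{F}_k\mathbf{F}_k^H\mathbf{H}^H\right) \\
&= \mathbf{R}_{k-1} + \frac{\rho}{N_s\sigma^2}\mathbf{H}\mathbf{F}_k\mathbf{F}_k^H\mathbf{H}^H.
\end{aligned}
\tag{15}
$$

Since $\mathbf{R}_{k-1}$ is a positive definite, and hence invertible, matrix, it follows from the Sherman-Morrison formula (Eq. 2.1.4) [45],

$$\left(\mathbf{Z} + \mathbf{X}\mathbf{Y}^T\right)^{-1} = \mathbf{Z}^{-1} - \mathbf{Z}^{-1}\mathbf{X}\left(\mathbf{I} + \mathbf{Y}^T\mathbf{Z}^{-1}\mathbf{X}\right)^{-1}\mathbf{Y}^T\mathbf{Z}^{-1}, \tag{16}$$

that $\mathbf{R}_k^{-1}$ can be written as

$$
\begin{aligned}
\mathbf{R}_k^{-1} &= \left(\mathbf{R}_{k-1} + \frac{\rho}{N_s\sigma^2}\mathbf{H}\mathbf{F}_k\mathbf{F}_k^H\mathbf{H}^H\right)^{-1} \\
&= \mathbf{R}_{k-1}^{-1} - \frac{\rho}{N_s\sigma^2}\mathbf{R}_{k-1}^{-1}\mathbf{H}\mathbf{F}_k\left(\mathbf{I} + \frac{\rho}{N_s\sigma^2}\mathbf{F}_k^H\mathbf{H}^H\mathbf{R}_{k-1}^{-1}\mathbf{H}\mathbf{F}_k\right)^{-1}\mathbf{F}_k^H\mathbf{H}^H\mathbf{R}_{k-1}^{-1}.
\end{aligned}
\tag{17}
$$

Therefore, $\mathbf{Q}_k$ can be expressed as

$$
\begin{aligned}
\mathbf{Q}_k &= \mathbf{H}^H\mathbf{R}_k^{-1}\mathbf{H} \\
&= \mathbf{Q}_{k-1} - \frac{\rho}{N_s\sigma^2}\mathbf{Q}_{k-1}\mathbf{F}_k\left(\mathbf{I} + \frac{\rho}{N_s\sigma^2}\mathbf{F}_k^H\mathbf{Q}_{k-1}\mathbf{F}_k\right)^{-1}\mathbf{F}_k^H\mathbf{Q}_{k-1}.
\end{aligned}
\tag{18}
$$

Since $\tilde{\mathbf{Q}}_k$ is a sub-matrix of $\mathbf{Q}_k$, $\tilde{\mathbf{Q}}_k$ can be written as

$$
\begin{aligned}
\tilde{\mathbf{Q}}_k &= \tilde{\mathbf{Q}}_{k-1} - \frac{\rho}{N_s\sigma^2}\tilde{\mathbf{Q}}_{k-1}\tilde{\mathbf{F}}_k\left(\mathbf{I} + \frac{\rho}{N_s\sigma^2}\tilde{\mathbf{F}}_k^H\tilde{\mathbf{Q}}_{k-1}\tilde{\mathbf{F}}_k\right)^{-1}\tilde{\mathbf{F}}_k^H\tilde{\mathbf{Q}}_{k-1} \\
&\overset{(a)}{=} \tilde{\mathbf{Q}}_{k-1} - \frac{\rho}{N_s\sigma^2}\mathbf{U}_1\boldsymbol{\Sigma}_1\left(\mathbf{I} + \frac{\rho}{N_s\sigma^2}\boldsymbol{\Sigma}_1\right)^{-1}\boldsymbol{\Sigma}_1\mathbf{U}_1^H,
\end{aligned}
\tag{19}
$$

where (a) is true based on the fact that $\tilde{\mathbf{Q}}_{k-1}\tilde{\mathbf{F}}_k = \mathbf{U}_1\boldsymbol{\Sigma}_1$. It is observed from this result that the complexity stems mainly from $\mathbf{U}_1\mathbf{U}_1^H$, which significantly simplifies the matrix multiplication and avoids inverting a large-scale matrix. Finally, the pseudo-code of the iterative process to obtain all the precoding sub-matrices is given in Algorithm 1.

Algorithm 1 SIC-based Optimal Hybrid Precoding Algorithm
Require: $\mathbf{H}$, $N$, and $N_s$
Initialization: $\mathbf{R}_0 = \mathbf{I}_{N_r}$, $\mathbf{Q}_0 = \mathbf{H}^H\mathbf{H}$, $\tilde{\mathbf{Q}}_0 = \mathbf{Q}_0((n_1 - 1)N + 1 : m_1 N,\ (n_1 - 1)N + 1 : m_1 N)$
for $1 \le k \le K$ do
    SVD: $\tilde{\mathbf{Q}}_{k-1} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{U}^H$, $\mathbf{U} = [\mathbf{U}_1\ \mathbf{U}_2]$, $\mathbf{U}_1 \in \mathbb{C}^{(m_k - n_k + 1)N \times s_k}$
    $\tilde{\mathbf{F}}_k^{\mathrm{opt}} = \mathbf{U}_1$
    $\mathbf{F}_k((n_k - 1)N + 1 : m_k N,\ 1 : s_k) = \tilde{\mathbf{F}}_k^{\mathrm{opt}}$
    $\tilde{\mathbf{Q}}_k = \tilde{\mathbf{Q}}_{k-1} - \frac{\rho}{N_s\sigma^2}\mathbf{U}_1\boldsymbol{\Sigma}_1\left(\mathbf{I} + \frac{\rho}{N_s\sigma^2}\boldsymbol{\Sigma}_1\right)^{-1}\boldsymbol{\Sigma}_1\mathbf{U}_1^H$
end for
Return: $\mathbf{F}$

B. Constrained Precoders Design via Matrix Factorization

In practice, considering the equivalent isotropically radiated power (EIRP) constraints, the antenna arrays at the transmitter and receiver are often controlled by a common power amplifier (PA). Therefore, the analog precoding matrix needs to meet the constant modulus constraint. However, according to $\mathbf{F}$ as discussed in Section IV-A, we cannot obtain the analog precoder $\mathbf{F}_R$ that satisfies the constant modulus constraints. Thus, it is assumed that the constrained precoding matrix $\mathbf{G}$ can be written as $\mathbf{G} = \mathbf{F}_R\mathbf{F}_B$, which is similar to $\mathbf{F}$ and can also be partitioned into $K$ sub-matrices as discussed in Section IV-A, i.e., $\mathbf{G} = [\mathbf{G}_1, \mathbf{G}_2, \ldots, \mathbf{G}_K]$. Moreover, $\mathbf{G}_k$ can be written as $\mathbf{G}_k = \left[\mathbf{0}\ \ \tilde{\mathbf{G}}_k^T\ \ \mathbf{0}\right]^T$, where $\tilde{\mathbf{G}}_k \in \mathbb{C}^{(m_k - n_k + 1)N \times s_k}$ is sufficiently close to $\tilde{\mathbf{F}}_k$, which can be obtained by solving

$$\tilde{\mathbf{G}}_k^{\mathrm{opt}} = \arg\max_{\tilde{\mathbf{G}}_k} \log_2\left(\left|\mathbf{I}_{s_k} + \frac{\rho}{N_s\sigma^2}\tilde{\mathbf{G}}_k^H\tilde{\mathbf{Q}}_{k-1}\tilde{\mathbf{G}}_k\right|\right). \tag{20}$$

Similar to the optimization problem of (12), we know that the optimal unconstrained precoding matrix of (20) is $\tilde{\mathbf{F}}_k$. Then, the optimization problem of (20) is equivalent to [17]

$$\tilde{\mathbf{G}}_k^{\mathrm{opt}} = \arg\min_{\tilde{\mathbf{G}}_k} \left\|\tilde{\mathbf{F}}_k - \tilde{\mathbf{G}}_k\right\|_F^2. \tag{21}$$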
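The low-complexity update in (18) and (19) can be verified against the direct computation $\mathbf{Q}_k = \mathbf{H}^H\mathbf{R}_k^{-1}\mathbf{H}$. The check below is a sketch under simplifying assumptions: the sub-matrix is taken to span all sub-arrays (so $\tilde{\mathbf{Q}}_{k-1} = \mathbf{Q}_{k-1}$), the channel is a placeholder with i.i.d. Gaussian entries, and `c` stands for $\rho/(N_s\sigma^2)$ with an arbitrary value.

```python
import numpy as np

rng = np.random.default_rng(3)
Nr, Nt, s_k, c = 6, 12, 2, 0.7   # c plays the role of rho / (Ns * sigma^2)
H = (rng.standard_normal((Nr, Nt)) + 1j * rng.standard_normal((Nr, Nt))) / np.sqrt(2)

R_prev = np.eye(Nr, dtype=complex)
Q_prev = H.conj().T @ np.linalg.solve(R_prev, H)

# F_k = U_1: the s_k dominant eigenvectors of Q_prev, as in (14).
w, U = np.linalg.eigh(Q_prev)
U1 = U[:, ::-1][:, :s_k]
S1 = np.diag(w[::-1][:s_k])
Fk = U1

# Direct route: rebuild R_k as in (15) and solve with it.
R_k = R_prev + c * H @ Fk @ Fk.conj().T @ H.conj().T
Q_direct = H.conj().T @ np.linalg.solve(R_k, H)

# Eq. (18): only an s_k x s_k matrix is inverted.
inner = np.linalg.inv(np.eye(s_k) + c * Fk.conj().T @ Q_prev @ Fk)
Q_18 = Q_prev - c * Q_prev @ Fk @ inner @ Fk.conj().T @ Q_prev

# Eq. (19): the same update written with U_1 and Sigma_1 only.
Q_19 = Q_prev - c * U1 @ S1 @ np.linalg.inv(np.eye(s_k) + c * S1) @ S1 @ U1.conj().T

print(np.allclose(Q_direct, Q_18), np.allclose(Q_18, Q_19))
```

In (19) the only inverse is that of a diagonal $s_k \times s_k$ matrix, which is the source of the complexity saving noted in the text.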

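Note that this result can be applied to sub-matrix $\tilde{\mathbf{G}}_{k+1}$. Therefore, $\mathbf{G}$ can be obtained by solving the following problem:

$$\mathbf{G}^{\mathrm{opt}} = \arg\min_{\mathbf{G}} \|\mathbf{F} - \mathbf{G}\|_F^2, \quad \text{s.t.}\ \|\mathbf{G}\|_F^2 \le N_s,\ \mathbf{F}_R \in \mathcal{F}, \tag{22}$$

where $\mathcal{F}$ is the set of $DN \times DS$ block diagonal matrices with constant modulus entries. As aforementioned, we then decompose the total precoding matrix optimization problem into $D$ precoding sub-matrix optimization problems, one for each sub-array. Thus, the precoding matrix optimization problem for the $i$th sub-array can be expressed as

$$\mathbf{G}_{i,\mathrm{sub}}^{\mathrm{opt}} = \arg\min_{\mathbf{G}_{i,\mathrm{sub}}} \left\|\mathbf{F}_{i,\mathrm{sub}} - \mathbf{G}_{i,\mathrm{sub}}\right\|_F^2, \tag{23}$$

where $\mathbf{G}_{i,\mathrm{sub}}$ refers to the constrained precoding sub-matrix of the $i$th sub-array. Moreover, the structure of $\mathbf{G}_{i,\mathrm{sub}}$ is similar to that of $\mathbf{F}_{i,\mathrm{sub}}$, i.e., $\mathbf{G}_{i,\mathrm{sub}} = \left[\mathbf{0}, \bar{\mathbf{G}}_i, \mathbf{0}\right]$. Therefore, the precoding sub-matrix optimization problem (23) can be further expressed as

$$\bar{\mathbf{G}}_i^{\mathrm{opt}} = \arg\min_{\bar{\mathbf{G}}_i} \left\|\bar{\mathbf{F}}_i - \bar{\mathbf{G}}_i\right\|_F^2. \tag{24}$$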

Considering that different streams may be transmitted by the same group of sub-arrays, as discussed in Section IV-A, the non-zero precoding matrix of the $i$th sub-array $\bar{\mathbf{F}}_i$ can be further partitioned into $\tau_i$ sub-matrices. The number of streams


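precoded by the $j$th sub-matrix is denoted as $\beta_j$, where $1 \le j \le \tau_i$, and $\beta_1 + \beta_2 + \dots + \beta_{\tau_i} = S$. The number of sub-arrays that transmit the streams corresponding to the $j$th sub-matrix is denoted by $\gamma_j$. For example, when $D = 4$ and $S = 6$, the non-zero precoding matrix for the 1st sub-array can be partitioned into $\tau_1 = 2$ sub-matrices as shown in Fig. 2(c), and $\beta_1 = 2$, $\gamma_1 = 2$, $\beta_2 = 4$, $\gamma_2 = 4$. For brevity, we assume both $\bar{\mathbf{G}}_i$ and $\bar{\mathbf{F}}_i$ can be partitioned into $\tau_i = 2$ sub-matrices, and the following process can be easily extended to more sub-matrices. Moreover, according to matrix factorization, $\bar{\mathbf{G}}_i$ and $\bar{\mathbf{F}}_i$ can be expressed as

$$\begin{cases} \bar{\mathbf{F}}_i = \mathbf{P}\mathbf{B} = \left[ \bar{\mathbf{F}}_{i1}, \bar{\mathbf{F}}_{i2} \right] \\ \bar{\mathbf{G}}_i = \mathbf{M}\mathbf{W} = \left[ \bar{\mathbf{G}}_{i1}, \bar{\mathbf{G}}_{i2} \right], \end{cases} \tag{25}$$

where $\mathbf{B} = (\bar{\mathbf{F}}_i^H \bar{\mathbf{F}}_i)^{\frac{1}{2}}$, $\mathbf{W} = (\bar{\mathbf{G}}_i^H \bar{\mathbf{G}}_i)^{\frac{1}{2}}$, and $\mathbf{P}$ and $\mathbf{M}$ are $N \times S$ matrices, both consisting of orthonormal column vectors, i.e., $\mathbf{P}^H \mathbf{P} = \mathbf{I}_S$ and $\mathbf{M}^H \mathbf{M} = \mathbf{I}_S$. Defining $\mathbf{B} = [\mathbf{B}_1, \mathbf{B}_2]$ and $\mathbf{W} = [\mathbf{W}_1, \mathbf{W}_2]$, we have

$$\begin{cases} \bar{\mathbf{F}}_{i1} = \mathbf{P}\mathbf{B}_1, \quad \bar{\mathbf{F}}_{i2} = \mathbf{P}\mathbf{B}_2 \\ \bar{\mathbf{G}}_{i1} = \mathbf{M}\mathbf{W}_1, \quad \bar{\mathbf{G}}_{i2} = \mathbf{M}\mathbf{W}_2. \end{cases} \tag{26}$$

Note that for massive MIMO systems, the optimal precoding matrix $\mathbf{F}$ satisfies $\mathbf{F}^H \mathbf{F} \approx \mathbf{I}_{N_s}$, and each column of $\mathbf{F}$ is a unit vector. Thus, $\mathbf{B}$ and $\mathbf{W}$ can be expressed as

$$\begin{aligned} \mathbf{B} &= \left( \begin{bmatrix} \bar{\mathbf{F}}_{i1}^H \bar{\mathbf{F}}_{i1} & \bar{\mathbf{F}}_{i1}^H \bar{\mathbf{F}}_{i2} \\ \bar{\mathbf{F}}_{i2}^H \bar{\mathbf{F}}_{i1} & \bar{\mathbf{F}}_{i2}^H \bar{\mathbf{F}}_{i2} \end{bmatrix} \right)^{\frac{1}{2}} \approx \begin{bmatrix} \frac{1}{\sqrt{\gamma_1}} \mathbf{I}_{\beta_1} & \mathbf{0} \\ \mathbf{0} & \frac{1}{\sqrt{\gamma_2}} \mathbf{I}_{\beta_2} \end{bmatrix}, \\ \mathbf{W} &= \left( \begin{bmatrix} \bar{\mathbf{G}}_{i1}^H \bar{\mathbf{G}}_{i1} & \bar{\mathbf{G}}_{i1}^H \bar{\mathbf{G}}_{i2} \\ \bar{\mathbf{G}}_{i2}^H \bar{\mathbf{G}}_{i1} & \bar{\mathbf{G}}_{i2}^H \bar{\mathbf{G}}_{i2} \end{bmatrix} \right)^{\frac{1}{2}} \approx \begin{bmatrix} \frac{1}{\sqrt{\gamma_1}} \mathbf{I}_{\beta_1} & \mathbf{0} \\ \mathbf{0} & \frac{1}{\sqrt{\gamma_2}} \mathbf{I}_{\beta_2} \end{bmatrix}. \end{aligned} \tag{27}$$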

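Moreover, the objective function $\left\| \bar{\mathbf{F}}_i - \bar{\mathbf{G}}_i \right\|_F^2$ in (24) can be transformed to

$$\begin{aligned} \left\| \bar{\mathbf{F}}_i - \bar{\mathbf{G}}_i \right\|_F^2 &= \mathrm{tr}\left\{ \left( \bar{\mathbf{F}}_i - \bar{\mathbf{G}}_i \right)^H \left( \bar{\mathbf{F}}_i - \bar{\mathbf{G}}_i \right) \right\} \\ &= \mathrm{tr}\left\{ \bar{\mathbf{F}}_i^H \bar{\mathbf{F}}_i - \bar{\mathbf{F}}_i^H \bar{\mathbf{G}}_i - \bar{\mathbf{G}}_i^H \bar{\mathbf{F}}_i + \bar{\mathbf{G}}_i^H \bar{\mathbf{G}}_i \right\} \\ &= \mathrm{tr}\left( \bar{\mathbf{F}}_i^H \bar{\mathbf{F}}_i \right) - 2\mathrm{Re}\left\{ \mathrm{tr}\left( \bar{\mathbf{F}}_i^H \bar{\mathbf{G}}_i \right) \right\} + \mathrm{tr}\left( \bar{\mathbf{G}}_i^H \bar{\mathbf{G}}_i \right) \\ &= \mathrm{tr}\left( \mathbf{B}^2 \right) + \mathrm{tr}\left( \mathbf{W}^2 \right) - 2\mathrm{Re}\left\{ \mathrm{tr}\left( \begin{bmatrix} \bar{\mathbf{F}}_{i1}^H \bar{\mathbf{G}}_{i1} & \bar{\mathbf{F}}_{i1}^H \bar{\mathbf{G}}_{i2} \\ \bar{\mathbf{F}}_{i2}^H \bar{\mathbf{G}}_{i1} & \bar{\mathbf{F}}_{i2}^H \bar{\mathbf{G}}_{i2} \end{bmatrix} \right) \right\} \\ &\approx 2\left( \frac{\beta_1}{\gamma_1} + \frac{\beta_2}{\gamma_2} \right) - 2\mathrm{Re}\left\{ \mathrm{tr}\left( \bar{\mathbf{F}}_{i1}^H \bar{\mathbf{G}}_{i1} \right) + \mathrm{tr}\left( \bar{\mathbf{F}}_{i2}^H \bar{\mathbf{G}}_{i2} \right) \right\} \\ &= 2\left( \frac{\beta_1}{\gamma_1} + \frac{\beta_2}{\gamma_2} \right) - 2\mathrm{Re}\left\{ \mathrm{tr}\left( \mathbf{B}_1^H \mathbf{P}^H \mathbf{M} \mathbf{W}_1 \right) + \mathrm{tr}\left( \mathbf{B}_2^H \mathbf{P}^H \mathbf{M} \mathbf{W}_2 \right) \right\}. \end{aligned} \tag{28}$$

It is observed that (28) can be minimized by maximizing $\mathrm{tr}(\mathbf{B}_1^H \mathbf{P}^H \mathbf{M} \mathbf{W}_1)$ and $\mathrm{tr}(\mathbf{B}_2^H \mathbf{P}^H \mathbf{M} \mathbf{W}_2)$. Define $\mathbf{P} = [\mathbf{P}_1 \; \mathbf{P}_2]$ ($\mathbf{M} = [\mathbf{M}_1 \; \mathbf{M}_2]$), where $\mathbf{P}_1 \in \mathbb{C}^{N \times \beta_1}$ ($\mathbf{M}_1 \in \mathbb{C}^{N \times \beta_1}$) and $\mathbf{P}_2 \in \mathbb{C}^{N \times \beta_2}$ ($\mathbf{M}_2 \in \mathbb{C}^{N \times \beta_2}$). Note that $\mathbf{B}_1 \approx \mathbf{W}_1 \approx \left[ \frac{1}{\sqrt{\gamma_1}} \mathbf{I}_{\beta_1}, \mathbf{0} \right]^T$ and $\mathbf{B}_2 \approx \mathbf{W}_2 \approx \left[ \mathbf{0}, \frac{1}{\sqrt{\gamma_2}} \mathbf{I}_{\beta_2} \right]^T$. Therefore, (28) can be further rewritten as

$$\left\| \bar{\mathbf{F}}_i - \bar{\mathbf{G}}_i \right\|_F^2 \approx 2\left( \frac{\beta_1}{\gamma_1} + \frac{\beta_2}{\gamma_2} \right) - 2\mathrm{Re}\left\{ \frac{1}{\gamma_1} \mathrm{tr}\left( \mathbf{P}_1^H \mathbf{M}_1 \right) + \frac{1}{\gamma_2} \mathrm{tr}\left( \mathbf{P}_2^H \mathbf{M}_2 \right) \right\}. \tag{29}$$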
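The exact (non-approximate) steps of the expansion above can be checked numerically. The following is a minimal sketch, assuming random complex matrices in place of the actual precoders; the function names and the sizes `N`, `beta1`, `beta2` are illustrative choices, not taken from the paper. It verifies the trace expansion of the squared Frobenius distance and the fact that the cross term only involves the diagonal blocks.

```python
import numpy as np

def frob_gap_direct(F, G):
    """Squared Frobenius distance ||F - G||_F^2 computed directly."""
    return np.linalg.norm(F - G, "fro") ** 2

def frob_gap_expanded(F, G):
    """Same quantity via tr(F^H F) - 2 Re{tr(F^H G)} + tr(G^H G)."""
    tr_ff = np.trace(F.conj().T @ F).real
    tr_gg = np.trace(G.conj().T @ G).real
    cross = np.trace(F.conj().T @ G)
    return tr_ff + tr_gg - 2.0 * cross.real

rng = np.random.default_rng(0)
N, beta1, beta2 = 16, 2, 4          # illustrative sizes (assumption)
S = beta1 + beta2
F = rng.standard_normal((N, S)) + 1j * rng.standard_normal((N, S))
G = rng.standard_normal((N, S)) + 1j * rng.standard_normal((N, S))

direct = frob_gap_direct(F, G)
expanded = frob_gap_expanded(F, G)

# The trace of the block matrix only picks up the two diagonal blocks.
cross_full = np.trace(F.conj().T @ G)
cross_blocks = (np.trace(F[:, :beta1].conj().T @ G[:, :beta1])
                + np.trace(F[:, beta1:].conj().T @ G[:, beta1:]))
```

Because the two norm terms are fixed once the factors B and W are approximated, only the cross term depends on the choice of the constrained precoder, which is why the minimization reduces to maximizing the real part of a trace.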

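Algorithm 2 Hybrid Digital and Analog Precoders Design
Require: $\mathbf{F}$, $D$, and $N$
for $1 \le i \le D$ do
    $\mathbf{B} = (\bar{\mathbf{F}}_i^H \bar{\mathbf{F}}_i)^{\frac{1}{2}}$, $\mathbf{P} = \bar{\mathbf{F}}_i \mathbf{B}^{-1}$
    $\mathbf{F}_{R,i} = \frac{1}{\sqrt{N}} e^{j\angle(\mathbf{P})}$
    $\bar{\mathbf{F}}_{B,i} = \mathbf{F}_{R,i}^{\dagger} \bar{\mathbf{F}}_i$
    $\bar{\mathbf{F}}_{B,i} = \frac{\| \bar{\mathbf{F}}_i \|_F}{\| \mathbf{F}_{R,i} \bar{\mathbf{F}}_{B,i} \|_F} \bar{\mathbf{F}}_{B,i}$
end for
Return: $\mathbf{F}_R$ and $\mathbf{F}_B$

It is evident that (29) can be minimized by maximizing $\mathrm{Re}\{\mathrm{tr}(\mathbf{P}_1^H \mathbf{M}_1)\}$ and $\mathrm{Re}\{\mathrm{tr}(\mathbf{P}_2^H \mathbf{M}_2)\}$, which is equivalent to maximizing $\mathrm{Re}\{\mathrm{tr}(\mathbf{P}^H \mathbf{M})\}$. Note that $\mathbf{P}^H \mathbf{P} = \mathbf{I}_S$ and $\mathbf{M}^H \mathbf{M} = \mathbf{I}_S$. Moreover, each element of $\mathbf{M}$ has the same amplitude of $1/\sqrt{N}$. Therefore, the optimal analog precoding matrix $\mathbf{F}_{R,i}$ for the $i$th sub-array can be expressed as

$$\mathbf{F}_{R,i} = \frac{1}{\sqrt{N}} e^{j\angle(\mathbf{P})}. \tag{30}$$

When $\mathbf{F}_{R,i}$ is fixed, the digital precoding sub-matrix for the $i$th sub-array $\bar{\mathbf{F}}_{B,i}$ can be written as

$$\bar{\mathbf{F}}_{B,i} = \mathbf{F}_{R,i}^{\dagger} \bar{\mathbf{F}}_i. \tag{31}$$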

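As can be seen from (30) and (31), $\mathbf{F}_{R,i} \approx \mathbf{P}$ and $\bar{\mathbf{F}}_{B,i} \approx \mathbf{B}$ when $N \to \infty$. Since $\gamma_1 \ll N$ and $\gamma_2 \ll N$, we can conclude that the digital precoder for the $i$th sub-array is approximately an identity matrix, i.e., $\bar{\mathbf{F}}_{B,i} \approx \mathbf{I}_S$, when the number of antennas goes to infinity. To satisfy the power constraint, we normalize $\bar{\mathbf{F}}_{B,i}$ by a factor of $\frac{\| \bar{\mathbf{F}}_i \|_F}{\| \mathbf{F}_{R,i} \bar{\mathbf{F}}_{B,i} \|_F}$ to yield

$$\bar{\mathbf{F}}_{B,i} = \frac{\left\| \bar{\mathbf{F}}_i \right\|_F}{\left\| \mathbf{F}_{R,i} \bar{\mathbf{F}}_{B,i} \right\|_F} \bar{\mathbf{F}}_{B,i}. \tag{32}$$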
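The per-sub-array steps in (30) to (32) can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the authors' implementation: the helper names `herm_sqrt` and `hybrid_precoder_subarray` are hypothetical, the input is a random full-column-rank matrix standing in for the unconstrained sub-matrix, and the matrix square root is computed through an eigendecomposition as one possible choice.

```python
import numpy as np

def herm_sqrt(A):
    """Principal square root of a Hermitian positive semi-definite matrix."""
    w, V = np.linalg.eigh(A)
    w = np.clip(w, 0.0, None)            # guard against tiny negative eigenvalues
    return (V * np.sqrt(w)) @ V.conj().T  # V diag(sqrt(w)) V^H

def hybrid_precoder_subarray(F_bar):
    """Factor one N x S unconstrained sub-matrix into analog and digital parts."""
    N, _ = F_bar.shape
    B = herm_sqrt(F_bar.conj().T @ F_bar)
    P = F_bar @ np.linalg.inv(B)                  # F_bar = P B with P^H P = I
    F_R = np.exp(1j * np.angle(P)) / np.sqrt(N)   # phase extraction, as in (30)
    F_B = np.linalg.pinv(F_R) @ F_bar             # least-squares digital part, as in (31)
    scale = np.linalg.norm(F_bar, "fro") / np.linalg.norm(F_R @ F_B, "fro")
    return F_R, scale * F_B                       # power normalization, as in (32)

rng = np.random.default_rng(0)
N, S = 64, 4                                      # illustrative sizes (assumption)
F_bar = (rng.standard_normal((N, S)) + 1j * rng.standard_normal((N, S))) / np.sqrt(2 * N)
F_R, F_B = hybrid_precoder_subarray(F_bar)
rel_err = np.linalg.norm(F_bar - F_R @ F_B, "fro") / np.linalg.norm(F_bar, "fro")
```

Here `rel_err` gives a rough indication of how closely the constrained product approximates the unconstrained sub-matrix for a given array size; looping this function over the D sub-arrays and stacking the analog parts block-diagonally would correspond to the loop in Algorithm 2.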

As described above, we decompose the overall precoding matrix optimization problem (22) into $D$ precoding sub-matrix optimization problems, which is summarized in the pseudo-code shown in Algorithm 2.

C. Energy Efficiency of mmWave Massive MIMO Systems

It is well known that the energy efficiency of a communications system is determined by the spectral efficiency as well as the total power consumption. For mmWave MIMO systems, the energy efficiency can be written as

$$\eta = \frac{C}{P_t + N_R P_R + N_{PS} P_{PS} + N_t P_{PA}}, \tag{33}$$

where $P_t$ is the transmit power, $N_{PS}$ is the required number of phase shifters of the hybrid precoding structure, and $P_R$, $P_{PS}$, and $P_{PA}$ indicate the power consumed by each RF chain, phase shifter, and power amplifier (PA), respectively.

In terms of spectral efficiency, the hybridly-connected and fully-connected structures are superior to their partially-connected counterpart. This is because the former structures have more degrees of freedom when the number of RF chains is fixed. It should be emphasized that the number of degrees of freedom in the RF domain for the hybridly-connected structure increases with the number of RF chains connected with each sub-array. When $S = N_s$, the number of degrees of freedom for the hybridly-connected structure is comparable with that of the fully-connected one. This means both structures are able to achieve the same spectral efficiency. However, when considering the energy consumption, which structure has better energy efficiency depends on the required numbers of RF chains, phase shifters, and PAs. In order to compare the required numbers of RF chains, phase shifters, and PAs of the three hybrid precoding structures, we assume the number of transmitted streams $N_s$ and transmit antennas $N_t$ are fixed. Then, the required numbers of RF chains, phase shifters, PAs, and the number of sub-arrays for all the three hybrid precoding structures are compared in Table I.

TABLE I: Hardware Requirements for Different HBF Structures

| Structure | RF chains | Phase shifters | Sub-arrays | PAs |
|-----------|-----------|----------------|------------|-----|
| Fully     | $N_R$     | $N_R N_t$      | 1          | $N_t$ |
| Partially | $N_R$     | $N_t$          | $N_R$      | $N_t$ |
| Hybridly  | $SD$      | $S N_t$        | $D$        | $N_t$ |

D. Sensitivity to Channel Estimation Errors

Considering that obtaining perfect CSI in practice is nearly impossible, we evaluate the performance of the proposed algorithm with imperfect CSI. The channel matrix with imperfect CSI can be modeled as [47], [48]

$$\tilde{\mathbf{H}} = \xi \mathbf{H} + \sqrt{1 - \xi^2}\, \mathbf{E}, \tag{34}$$

where $0 \le \xi \le 1$ indicates the accuracy of channel estimation, and $\mathbf{E}$ represents the error matrix whose entries follow a complex Gaussian distribution, i.e., $\mathcal{CN}(0, 1)$. The rank of $\tilde{\mathbf{H}}$ is denoted as $h = \min(N_t, N_r)$. Considering that the mmWave channel is sparse, we have $\mathrm{rank}(\mathbf{H}) = L < \min(N_t, N_r)$. According to the theory of principal component analysis [49], [50], $\tilde{\mathbf{H}}$ can be approximated by a low-rank matrix. For example, the vector form of the SVD of $\tilde{\mathbf{H}}$ is $\tilde{\mathbf{H}} = \sum_{i=1}^{h} \sigma_i \mathbf{u}_i \mathbf{v}_i^H$, where $\sigma_i$ is the $i$th largest singular value, and $\mathbf{u}_i$ and $\mathbf{v}_i$ are the left and right singular vectors corresponding to $\sigma_i$, respectively. Then $\tilde{\mathbf{H}}$ can be approximated as $\mathbf{H}_{\tilde{r}} = \sum_{i=1}^{\tilde{r}} \sigma_i \mathbf{u}_i \mathbf{v}_i^H$, where $\tilde{r} < h$ is the rank of $\mathbf{H}_{\tilde{r}}$. Note that $\mathbf{H}_{\tilde{r}}$ and $\tilde{\mathbf{H}}$ have the same principal components, as can be seen from the left and right singular vectors and the singular values. Meanwhile, $\xi\mathbf{H}$ and $\mathbf{H}$ have the same principal components, and $\xi\mathbf{H}$ is close to the rank-$L$ approximation $\mathbf{H}_L$ of $\tilde{\mathbf{H}}$.

In fact, the accuracy of the approximation between $\xi\mathbf{H}$ and $\mathbf{H}_L$ depends on two factors, i.e., the number of antennas at the transmitter and receiver and the value of $\xi$. When $\xi$ is a constant, increasing the number of antennas helps enhance the principal components, thus increasing the robustness to noise. When the number of antennas is constant, increasing the value of $\xi$ helps reduce the influence of noise on the principal components of the channel matrix. According to the above analysis, it is concluded that the larger the number of antennas at the transmitter and receiver, the more robust the system is to CSI inaccuracy.
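The low-rank argument above can be explored numerically. The sketch below is illustrative only: the half-wavelength ULA steering vector, the unit-variance path gains, and the normalization by the square root of $N_t N_r / L$ are assumptions about the sparse channel (the paper's own model is its equation (2), which is not reproduced here), and all function names are hypothetical. It builds the imperfect channel of (34), truncates its SVD to rank $L$, and records how far the truncation is from $\xi\mathbf{H}$ for several array sizes.

```python
import numpy as np

def ula_steering(n, angle):
    """Half-wavelength ULA steering vector (assumed array model)."""
    return np.exp(1j * np.pi * np.arange(n) * np.sin(angle)) / np.sqrt(n)

def sparse_channel(nt, nr, L, rng):
    """Rank-at-most-L channel built from L random paths (assumed model)."""
    H = np.zeros((nr, nt), dtype=complex)
    for _ in range(L):
        alpha = (rng.standard_normal() + 1j * rng.standard_normal()) / np.sqrt(2)
        aod = rng.uniform(-np.pi / 3, np.pi / 3)
        aoa = rng.uniform(-np.pi, np.pi)
        H += alpha * np.outer(ula_steering(nr, aoa), ula_steering(nt, aod).conj())
    return np.sqrt(nt * nr / L) * H

def imperfect_csi(H, xi, rng):
    """Channel with estimation error: xi * H + sqrt(1 - xi^2) * E, as in (34)."""
    E = (rng.standard_normal(H.shape) + 1j * rng.standard_normal(H.shape)) / np.sqrt(2)
    return xi * H + np.sqrt(1.0 - xi ** 2) * E

def truncated_svd(A, r):
    """Best rank-r approximation of A in the Frobenius norm."""
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vh[:r, :]

rng = np.random.default_rng(1)
L, xi = 10, 0.7
rel_gap = {}
for nt, nr in [(64, 16), (128, 32), (256, 64)]:
    H = sparse_channel(nt, nr, L, rng)
    H_tilde = imperfect_csi(H, xi, rng)
    H_L = truncated_svd(H_tilde, L)
    # Relative distance between xi*H and the rank-L approximation of H_tilde.
    rel_gap[(nt, nr)] = np.linalg.norm(xi * H - H_L) / np.linalg.norm(xi * H)
```

Under these assumptions one would expect the entries of `rel_gap` to shrink as the arrays grow, in line with the argument above, although the exact values depend on the random draw and on the assumed normalization.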

V. SIMULATION RESULTS

This section presents simulation results on spectral efficiency and energy efficiency to demonstrate the performance of our proposed hybridly-connected structure. Meanwhile, we compare the performance of our proposed algorithm using a fixed stream distribution with some recently proposed algorithms for the fully- and partially-connected structures. In our simulations, we consider the narrow-band channel model described in (2) with $L = 10$ scattering paths [30]. Meanwhile, the AoD and AoA are assumed to follow uniform distributions within $\left[ -\frac{\pi}{3}, \frac{\pi}{3} \right]$ and $[-\pi, \pi]$, respectively. Both the transmitter and receiver are equipped with ULAs, and $d_t = d_r = \frac{\lambda}{2}$. Finally, the signal-to-noise ratio (SNR) is given as $\mathrm{SNR} = \frac{\rho}{\sigma^2}$, and the spectral efficiency and energy efficiency plotted are averaged over 1000 channel realizations.

A. Spectral Efficiency

Firstly, according to the relationship between the number of streams $N_s$ and the equipped number of RF chains $SD$ in the hybridly-connected structure as discussed in Section III, two cases will be discussed, i.e., $N_s = SD$ and $N_s < SD$.

1) Case 1: $N_s = SD$: In this case, we assume $N_s = SD = 8$. Fig. 3 compares the achievable rates of the three HBF structures, where $N_t \times N_r = 128 \times 32$, and $D \in \{2, 4\}$. As can be observed from the comparison in Fig. 3, the achievable rate of the hybridly-connected structure falls between those of the fully- and partially-connected structures. Meanwhile, the achievable rate of the proposed structure with $D = 2$ outperforms that with $D = 4$. From this, we know that the fewer the sub-arrays, the higher the achievable rate, when the total number of RF chains is equal. Furthermore, the performance of our proposed constrained hybrid precoding algorithm is close to that of its unconstrained counterpart for the hybridly-connected structure. Fig. 4 also shows the comparison of the achievable rates of the three structures, where $N_t \times N_r = 128 \times 32$, $N_s = DS = 4$, and $D = 2$. Similar observations can be made from Fig. 4 as those from Fig. 3.

2) Case 2: $N_s < SD$: In this case, we assume the number of sub-arrays is fixed, and the number of RF chains connected by each sub-array varies. The achievable rates with various numbers of RF chains are shown in Fig. 5, where $N_t \times N_r = 128 \times 32$, $N_s = 8 \le DS$, $D = 4$, and $S \in \{2, 4, 8\}$. It is observed that the achievable rate of optimal hybrid precoding in this case increases with the number of RF chains connected by each sub-array, and the performance of the proposed constrained precoding algorithm approaches that of the optimal unconstrained precoding. Meanwhile, it is noted that the achievable rate of unconstrained precoding for the hybridly-connected structure is equal to that of the fully-connected structure when the number of RF chains connected by each sub-array is equal to the number of streams. Fig. 6 depicts the achievable rates when the number of streams $N_s = 4 \le DS$, where $N_t \times N_r = 128 \times 32$, $D = 2$, and $S \in \{2, 4\}$. Similar conclusions can be made from Fig. 6 as those in Fig. 5.

As discussed previously, we know that the fully- and partially-connected structures are special cases of the hybridly-connected structure. Next, we evaluate the achievable


rate of the proposed algorithm for these special cases. Fig. 7 compares the achievable rates of the proposed algorithm, the OMP algorithm [17], the switch-structure HBF algorithm [26], and the directional beamforming method [30], [31] for the mmWave MIMO system, where $N_t \times N_r = 128 \times 32$, $N_s = S = N_R$, and $D = 1$. Since the performances of the OMP and directional beamforming algorithms are affected by the number of paths of the channel, simulation results with various path numbers are presented in Fig. 7. As can be observed from Fig. 7(a), the proposed algorithm outperforms the other algorithms when $N_s = 2$ and $L = 5$. It can be inferred from Fig. 7(b) that the proposed algorithm outperforms the switch-structure HBF algorithm and the directional beamforming algorithm, and is slightly inferior to the OMP algorithm when $N_s = 4$ and $L = 5$. Similar conclusions can be drawn from Fig. 7(c) as those from Fig. 7(a) when $N_s = 4$ and $L = 8$. It is worth noting that the performance of the proposed algorithm is extremely close to the optimal unconstrained precoding in all simulation configurations, and is much more robust than the other comparative algorithms. This implies the proposed algorithm not only works effectively in the hybridly-connected structure, but is also applicable to the fully-connected structure.

[Plot: achievable rate (bps/Hz) versus SNR (dB)]
Fig. 3: Comparison of the achievable rates of the three HBF structures for an $N_t \times N_r = 128 \times 32$ mmWave MIMO system with $N_s = SD = N_R = 8$, $D = 2\,(4)$, and $S = 4\,(2)$.

[Plot: achievable rate (bps/Hz) versus SNR (dB)]
Fig. 4: Comparison of the achievable rates of the three HBF structures for an $N_t \times N_r = 128 \times 32$ mmWave MIMO system with $N_s = SD = N_R = 4$, $D = 2$, and $S = 2$.

[Plot: achievable rate (bps/Hz) versus SNR (dB)]
Fig. 5: Achievable rates with various numbers of RF chains equipped at each sub-array for an $N_t \times N_r = 128 \times 32$ hybridly-connected structure mmWave MIMO system with $N_s = N_R = 8 \le SD$, $D = 4$, and $S = 2, 4, 8$.

[Plot: achievable rate (bps/Hz) versus SNR (dB)]
Fig. 6: Achievable rates with various numbers of RF chains equipped at each sub-array for an $N_t \times N_r = 128 \times 32$ hybridly-connected structure mmWave MIMO system with $N_s = N_R = 4 \le SD$, $D = 2$, and $S = 2, 4$.

Fig. 8 compares the achievable rates of the proposed algorithm with those of the SIC-based HBF algorithm [25] and the switch-structure HBF algorithm [26] in the mmWave MIMO system, where $N_t \times N_r = 128 \times 32$, $N_s = DS = N_R = 4$, and $D = 4$. As can be observed from Fig. 8, the proposed algorithm far outperforms the switch-structure HBF algorithm, and also surpasses the SIC-based HBF algorithm. Moreover, the proposed algorithm is within a small gap from the performance of the optimal unconstrained precoding, which implies the proposed algorithm works in both the hybridly- and partially-connected structures. Fig. 9 compares the achievable rates of different precoding algorithms against the number of RF chains, where $N_t \times N_r = 288 \times 32$, $N_s = 8$, $D = 4$, and SNR = 0 dB. As can be observed


from the figure, the achievable rate of the proposed algorithm for the hybridly-connected structure increases with the number of RF chains. When $DS = 32$, the achievable rate of the proposed algorithm is very close to that of the fully-digital precoding algorithm. Moreover, we observe that the performance of the OMP algorithm can approach that of the fully-digital precoding algorithm, when the number of RF chains increases. Meanwhile, it is noted that the performance of the SIC-based precoding algorithm almost does not vary with the number of RF chains.

[Plots: achievable rate (bps/Hz) versus SNR (dB); panels (a) $L = 5$, $N_s = 2$; (b) $L = 5$, $N_s = 4$; (c) $L = 8$, $N_s = 4$]
Fig. 7: Comparison of the achievable rates of the comparative algorithms for an $N_t \times N_r = 128 \times 32$ mmWave MIMO system with $D = 1$, and $L = 5$ and $L = 8$.

[Plot: achievable rate (bps/Hz) versus SNR (dB)]
Fig. 8: Comparison of the achievable rates of the comparative algorithms for an $N_t \times N_r = 128 \times 32$ mmWave MIMO system with $N_s = SD = N_R = 4$, $D = 4$.

[Plot: achievable rate (bps/Hz) versus number of RF chains]
Fig. 9: Achievable rates of the three structures versus the number of RF chains at the transmitter for an $N_t \times N_r = 288 \times 32$ mmWave MIMO system, where $N_s = 8$, $D = 4$, and SNR = 0 dB.

B. Energy Efficiency

In this part, we compare the energy efficiency of the three HBF structures according to (33), which is shown in Fig. 10. The simulation parameters are the same as in Fig. 9, and $P_t = 10$ W, $P_{PS} = 10$ mW, $P_{PA} = 100$ mW, and $P_R = 100$ mW [46]. As can be seen from Fig. 10, the energy efficiency of the hybridly-connected structure outperforms those of the fully- and partially-connected structures in all cases. It is worth pointing out that the hybridly-connected structure is still more energy efficient even when its number of RF chains is larger than the minimum number of RF chains in the fully- and partially-connected structures. For example, for the hybridly-connected structure with $SD = 32$, the energy efficiency is 0.71 bps/Hz/W, whilst for the fully- and partially-connected structures with $N_R = 8$, the energy efficiencies are 0.69 bps/Hz/W and 0.65 bps/Hz/W, respectively. Moreover, the energy efficiency of the hybridly-connected structure increases first and then decreases slightly with the increase in the number of RF chains. When $SD = 16$, the energy efficiency is greatest, which provides a reference for the design of the actual system.

For the fully-connected structure, with the increase in the number of RF chains, the spectral efficiency grows more slowly as shown in Fig. 9. When $N_R \ge 26$, the spectral efficiency nearly stops changing with the increase of the number of RF chains. Therefore, the energy efficiency rapidly reduces with the increase of the number of RF chains as shown in Fig. 10. For the partially-connected structure, the spectral efficiency almost does not improve with the increase of the number of RF chains as can be seen from Fig. 9. Meanwhile, the required number of phase shifters is the same, which is equal to the number of antennas $N_t$ at the transmitter as shown in Table I. Therefore, the energy efficiency slowly decreases with the increase of the number of RF chains.

[Plot: energy efficiency (bps/Hz/W) versus number of RF chains]
Fig. 10: Energy efficiency of the three structures versus the number of RF chains at the transmitter for an $N_t \times N_r = 288 \times 32$ mmWave MIMO system, where $N_s = 8$, $D = 4$, and SNR = 0 dB.

C. Sensitivity

To demonstrate the sensitivity of the proposed precoder to the channel estimation error, the achievable rates of the proposed algorithm with various levels of channel estimation errors are presented. Fig. 11 plots the achievable rates of the proposed algorithm with various numbers of antennas at the transmitter and receiver, where $N_s = SD = 4$, $D = 2$, $S = 2$, and perfect CSI and imperfect CSI with different values of $\xi$ are considered. As can be observed from Fig. 11, for the same channel accuracy, the larger the number of antennas at the transmitter and receiver, the smaller the achievable rate loss, and the more robust the system is to channel estimation errors. This observation is consistent with the conclusion based upon our theoretical analysis.

[Plot: achievable rate (bps/Hz) versus SNR (dB); curves for perfect CSI, $\xi = 0.7$, and $\xi = 0.5$ with $N_t \times N_r = 64 \times 16$, $128 \times 32$, and $256 \times 64$]
Fig. 11: Achievable rate comparison of the hybridly-connected mmWave MIMO system with various numbers of antennas at the transmitter and receiver and various channel estimation errors, where $DS = N_s = 4$ and $D = 2$.

VI. CONCLUSIONS

In this paper, we proposed a hybridly-connected structure alongside a hybrid digital and analog precoders design in mmWave MIMO systems. In accordance with the structure of the optimal hybrid precoding matrix, the maximum achievable rate optimization problem was decomposed into a series of sub-rate optimization problems for each stream, since each column of the hybrid precoding matrix corresponds to the precoding vector of one stream. Meanwhile, the optimal hybrid precoding sub-matrix can be achieved according to the SVD of the corresponding channel matrix, and the mutual contribution of the sub-matrices can be eliminated using SIC. Furthermore, according to the factorization of the corresponding optimal precoding sub-matrix of each sub-array, the near-optimal hybrid digital and analog precoders were designed to minimize the Euclidean distance with the optimal precoding sub-matrix. Finally, simulation results were presented to show that the hybridly-connected structure falls in between the partially-connected structure and the fully-connected structure in the sense of spectral efficiency. With an increasing number of RF chains, the spectral efficiency can approach that of the fully-connected structure. In terms of energy efficiency, the hybridly-connected structure was shown to outperform its partially-connected and fully-connected counterparts. Meanwhile, the proposed algorithm is insensitive to channel estimation errors.

REFERENCES

[1] J. Hoydis, S. ten Brink, and M. Debbah, “Massive MIMO in the UL/DL of cellular networks: how many antennas do we need?,” IEEE J. Sel. Areas Commun., vol. 31, no. 2, pp. 160-171, Feb. 2013.
[2] E. G. Larsson, F. Tufvesson, O. Edfors, and T. L. Marzetta, “Massive MIMO for next generation wireless systems,” IEEE Commun. Mag., vol. 52, no. 2, pp. 186-195, Feb. 2014.
[3] E. Björnson, L. Sanguinetti, and M. Kountouris, “Deploying dense networks for maximal energy efficiency: small cells meet massive MIMO,” IEEE J. Sel. Areas Commun., Apr. 2015.
[4] C. Li, J. Zhang, and K. B. Letaief, “Throughput and energy efficiency analysis of small cell networks with multi-antenna base stations,” IEEE Trans. Wireless Commun., vol. 13, no. 5, pp. 2505-2517, May 2014.


[5] A. L. Swindlehurst, E. Ayanoglu, P. Heydari, and F. Capolino, “Millimeter-wave massive MIMO: the next wireless revolution?,” IEEE Commun. Mag., vol. 52, no. 9, pp. 56-62, Sep. 2014.
[6] W. Roh et al., “Millimeter-wave beamforming as an enabling technology for 5G cellular communications: theoretical feasibility and prototype results,” IEEE Commun. Mag., vol. 52, no. 2, pp. 106-113, Feb. 2014.
[7] H. Sawada, S. Takahashi, and S. Kato, “Disconnection probability improvement by using artificial multi reflectors for millimeter-wave indoor wireless communications,” IEEE Trans. Antennas Propag., vol. 61, no. 4, pp. 1868-1875, Apr. 2013.
[8] S. Hur, T. Kim, D. J. Love, J. V. Krogmeier, T. A. Thomas, and A. Ghosh, “Millimeter wave beamforming for wireless backhaul and access in small cell networks,” IEEE Trans. Commun., vol. 61, no. 10, pp. 4391-4403, Oct. 2013.
[9] D. Li, W. Saad, and C. S. Hong, “Decentralized renewable energy pricing and allocation for millimeter wave cellular backhaul,” IEEE J. Sel. Areas Commun., vol. 34, no. 5, pp. 1140-1159, May 2016.
[10] B. Panzner et al., “Deployment and implementation strategies for massive MIMO in 5G,” in Proc. IEEE Globecom Workshops (GC Wkshps), Austin, TX, Dec. 2014, pp. 346-351.
[11] Y. Kim et al., “Feasibility of mobile cellular communications at millimeter wave frequency,” IEEE J. Sel. Topics Signal Process., vol. 10, no. 3, pp. 589-599, Apr. 2016.
[12] P. Wang, Y. Li, L. Song, and B. Vucetic, “Multi-gigabit millimeter wave wireless communications for 5G: from fixed access to cellular networks,” IEEE Commun. Mag., vol. 53, no. 1, pp. 168-178, Jan. 2015.
[13] T. S. Rappaport, E. Ben-Dor, J. N. Murdock, and Y. Qiao, “38 GHz and 60 GHz angle-dependent propagation for cellular & peer-to-peer wireless communications,” in Proc. IEEE Int. Conf. Commun. (ICC), Ottawa, Canada, Jun. 2012, pp. 4568-4573.
[14] S. Han, C.-L. I, Z. Xu, and C. Rowell, “Large-scale antenna systems with hybrid analog and digital beamforming for millimeter wave 5G,” IEEE Commun. Mag., vol. 53, no. 1, pp. 186-194, Jan. 2015.
[15] S. Sun, T. S. Rappaport, R. W. Heath, A. Nix, and S. Rangan, “MIMO for millimeter-wave wireless communications: beamforming, spatial multiplexing, or both?,” IEEE Commun. Mag., vol. 52, no. 12, pp. 110-121, Dec. 2014.
[16] T. E. Bogale, L. B. Le, A. Haghighat, and L. Vandendorpe, “On the number of RF chains and phase shifters, and scheduling design with hybrid analog-digital beamforming,” IEEE Trans. Wireless Commun., vol. 15, no. 5, pp. 3311-3326, May 2016.
[17] O. E. Ayach, S. Rajagopal, S. Abu-Surra, Z. Pi, and R. W. Heath, “Spatially sparse precoding in millimeter wave MIMO systems,” IEEE Trans. Wireless Commun., vol. 13, no. 3, pp. 1499-1513, Mar. 2014.
[18] F. Sohrabi and W. Yu, “Hybrid digital and analog beamforming design for large-scale antenna arrays,” IEEE J. Sel. Topics Signal Process., vol. 10, no. 3, pp. 501-513, Apr. 2016.
[19] C. Rusu, R. Méndez-Rial, N. González-Prelcic, and R. W. Heath, “Low complexity hybrid sparse precoding and combining in millimeter wave MIMO systems,” in Proc. IEEE Int. Conf. Commun. (ICC), London, UK, Jun. 2015, pp. 1340-1345.
[20] W. Ni and X. Dong, “Hybrid block diagonalization for massive multiuser MIMO systems,” IEEE Trans. Commun., vol. 64, no. 1, pp. 201-211, Jan. 2016.
[21] R. Zi, X. Ge, J. Thompson, C. X. Wang, H. Wang, and T. Han, “Energy efficiency optimization of 5G radio frequency chain systems,” IEEE J. Sel. Areas Commun., vol. 34, no. 4, pp. 758-771, Apr. 2016.
[22] Y. Y. Lee, C. H. Wang, and Y. H. Huang, “A hybrid RF/baseband precoding processor based on parallel-index-selection matrix-inversion-bypass simultaneous orthogonal matching pursuit for millimeter wave MIMO systems,” IEEE Trans. Signal Process., vol. 63, no. 2, pp. 305-317, Jan. 2015.
[23] O. El Ayach, R. W. Heath, S. Rajagopal, and Z. Pi, “Multimode precoding in millimeter wave MIMO transmitters with multiple antenna sub-arrays,” in Proc. IEEE Global Telecommun. Conf. (GLOBECOM), Atlanta, GA, Dec. 2013, pp. 3476-3480.
[24] L. Dai, X. Gao, J. Quan, S. Han, and C. L. I, “Near-optimal hybrid analog and digital precoding for downlink mmWave massive MIMO systems,” in Proc. IEEE Int. Conf. Commun. (ICC), London, Jun. 2015, pp. 1334-1339.
[25] X. Gao, L. Dai, S. Han, C. L. I, and R. W. Heath, “Energy-efficient hybrid analog and digital precoding for mmWave MIMO systems with large antenna arrays,” IEEE J. Sel. Areas Commun., vol. 34, no. 4, pp. 998-1009, Apr. 2016.
[26] R. Méndez-Rial, C. Rusu, N. González-Prelcic, A. Alkhateeb, and R. W. Heath, “Hybrid MIMO architectures for millimeter wave communications: phase shifters or switches?,” IEEE Access, vol. 4, pp. 247-267, 2016.
[27] J. A. Zhang, X. Huang, V. Dyadyuk, and Y. J. Guo, “Massive hybrid antenna array for millimeter-wave cellular communications,” IEEE Wireless Commun., vol. 22, no. 1, pp. 79-87, Feb. 2015.
[28] X. Yu, J. C. Shen, J. Zhang, and K. B. Letaief, “Alternating minimization algorithms for hybrid precoding in millimeter wave MIMO systems,” IEEE J. Sel. Topics Signal Process., vol. 10, no. 3, pp. 485-500, Apr. 2016.
[29] V. Raghavan, R. W. Heath, and A. M. Sayeed, “Systematic codebook designs for quantized beamforming in correlated MIMO channels,” IEEE J. Sel. Areas Commun., vol. 25, no. 7, pp. 1298-1310, Sep. 2007.
[30] V. Raghavan, J. Cezanne, S. Subramanian, A. Sampath, and O. Koymen, “Beamforming tradeoffs for initial UE discovery in millimeter-wave MIMO systems,” IEEE J. Sel. Topics Signal Process., vol. 10, no. 3, pp. 543-559, Apr. 2016.
[31] V. Raghavan, S. Subramanian, J. Cezanne, A. Sampath, O. Hizir Koymen, and J. Li, “Single-user vs. multi-user precoding for millimeter wave MIMO systems,” IEEE J. Sel. Areas Commun., vol. PP, no. 99, pp. 1-1.
[32] V. Raghavan, S. Subramanian, J. Cezanne, A. Sampath, O. Koymen, and J. Li, “Directional hybrid precoding in millimeter-wave MIMO systems,” in Proc. IEEE Global Commun. Conf. (GLOBECOM), Washington, DC, Dec. 2016, pp. 1-7.
[33] V. Raghavan, S. Subramanian, J. Cezanne, and A. Sampath, “Directional beamforming for millimeter-wave MIMO systems,” in Proc. IEEE Global Commun. Conf. (GLOBECOM), San Diego, CA, Dec. 2015, pp. 1-7.
[34] A. Alkhateeb, O. El Ayach, G. Leus, and R. W. Heath, “Channel estimation and hybrid precoding for millimeter wave cellular systems,” IEEE J. Sel. Topics Signal Process., vol. 8, no. 5, pp. 831-846, Oct. 2014.
[35] D. Neumann, M. Joham, and W. Utschick, “Channel estimation in massive MIMO systems,” Mar. 2015. [Online]. Available: http://arxiv.org/abs/1503.08691
[36] J. Lee, G. T. Gil, and Y. H. Lee, “Channel estimation via orthogonal matching pursuit for hybrid MIMO systems in millimeter wave communications,” IEEE Trans. Commun., vol. 64, no. 6, pp. 2370-2386, Jun. 2016.
[37] K. K. Mukkavilli, A. Sabharwal, E. Erkip, and B. Aazhang, “On beamforming with finite rate feedback in multiple-antenna systems,” IEEE Trans. Inf. Theory, vol. 49, no. 10, pp. 2562-2579, Oct. 2003.
[38] G. Caire, N. Jindal, M. Kobayashi, and N. Ravindran, “Multiuser MIMO achievable rates with downlink training and channel state feedback,” IEEE Trans. Inf. Theory, vol. 56, no. 6, pp. 2845-2866, Jun. 2010.
[39] V. Raghavan, J. J. Choi, and D. J. Love, “Design guidelines for limited feedback in the spatially correlated broadcast channel,” IEEE Trans. Commun., vol. 63, no. 7, pp. 2524-2540, Jul. 2015.
[40] T. S. Rappaport, F. Gutierrez, E. Ben-Dor, J. N. Murdock, Y. Qiao, and J. I. Tamir, “Broadband millimeter-wave propagation measurements and models using adaptive-beam antennas for outdoor urban cellular communications,” IEEE Trans. Antennas Propag., vol. 61, no. 4, pp. 1850-1859, Apr. 2013.
[41] V. Raghavan and A. M. Sayeed, “Multi-antenna capacity of sparse multipath channels,” IEEE Trans. Inf. Theory, 2008. [Online]. Available: dune.ece.wisc.edu/pdfs/sp mimo cap.pdf
[42] V. Raghavan and A. M. Sayeed, “Sublinear capacity scaling laws for sparse MIMO channels,” IEEE Trans. Inf. Theory, vol. 57, no. 1, pp. 345-364, Jan. 2011.
[43] A. A. M. Saleh and R. Valenzuela, “A statistical model for indoor multipath propagation,” IEEE J. Sel. Areas Commun., vol. 5, no. 2, pp. 128-137, Feb. 1987.
[44] C. A. Balanis, Antenna Theory: Analysis and Design. Wiley, 2012.
[45] G. H. Golub and C. F. Van Loan, Matrix Computations. Baltimore, USA: JHU Press, 2013.
[46] R. W. Heath, N. González-Prelcic, S. Rangan, W. Roh, and A. M. Sayeed, “An overview of signal processing techniques for millimeter wave MIMO systems,” IEEE J. Sel. Topics Signal Process., vol. 10, no. 3, pp. 436-453, Apr. 2016.
[47] N. Jindal, “MIMO broadcast channels with finite-rate feedback,” IEEE Trans. Inf. Theory, vol. 52, no. 11, pp. 5045-5060, Nov. 2006.
[48] F. Rusek et al., “Scaling up MIMO: opportunities and challenges with very large arrays,” IEEE Signal Process. Mag., vol. 30, no. 1, pp. 40-60, Jan. 2013.
[49] C. Eckart and G. Young, “The approximation of one matrix by another of lower rank,” Psychometrika, vol. 1, pp. 211-218, 1936.
[50] I. T. Jolliffe, Principal Component Analysis. New York: Springer-Verlag, 1986.

Didi Zhang received the B.E. degree in electronic and information engineering from North College of Beijing University of Chemical Technology, Langfang, China, and the M.S. degree in information and communication engineering from North China University of Technology, Beijing, China, in 2012 and 2015, respectively. He is currently pursuing the Ph.D. degree in information and communication engineering at Beijing University of Posts and Telecommunications, Beijing, China. His research interests include massive MIMO, mmWave communications, wireless communications, and signal processing.

Yafeng Wang (S’00-M’03-SM’09) received his Ph.D., M.Eng., and B.Sc. degrees from Beijing University of Posts and Telecommunications, University of Electronic Science and Technology of China, and Baoji University of Arts and Science in 2003, 2000, and 1997, respectively. He is currently a professor of electronic engineering in the School of Information and Telecommunications at Beijing University of Posts and Telecommunications. He leads the Broadband Mobile Communication Engineering Lab, which is one of the Zhongguancun Science Park Open Labs. In 2008, he was a visiting scholar in the Faculty of Engineering and Surveying at the University of Southern Queensland, Australia. He has published over 100 peer-reviewed journal and conference papers. His research mainly focuses on wireless communications and information theory.

Wei Xiang (S’00-M’04-SM’10) received the B.Eng. and M.Eng. degrees, both in electronic engineering, from the University of Electronic Science and Technology of China, Chengdu, China, in 1997 and 2000, respectively, and the Ph.D. degree in telecommunications engineering from the University of South Australia, Adelaide, Australia, in 2004. He is currently Founding Professor and Head of Discipline of Internet of Things Engineering in the College of Science and Engineering at James Cook University, Cairns, Australia. Between 2004 and 2015, he was with the School of Mechanical and Electrical Engineering, University of Southern Queensland, Toowoomba, Australia. He is an elected Fellow of the IET and Engineers Australia. He received the TNQ Innovation Award in November 2016, and was a finalist for the 2016 Pearcey Queensland Award. He was a co-recipient of three Best Paper Awards at 2015 WCSP, 2011 IEEE WCNC, and 2009 ICWMC. He has been awarded several prestigious fellowship titles. He was named a Queensland International Fellow (2010-2011) by the Queensland Government of Australia, an Endeavour Research Fellow (2012-2013) by the Commonwealth Government of Australia, a Smart Futures Fellow (2012-2015) by the Queensland Government of Australia, and a JSPS Invitational Fellow jointly by the Australian Academy of Science and the Japan Society for the Promotion of Science (2014-2015). He is the Vice Chair of the IEEE Northern Australia Section. He was an Editor for IEEE Communications Letters (2015-2017), and is an Associate Editor for Springer's Telecommunication Systems. He has published over 200 peer-reviewed journal and conference papers. He has served at a large number of international conferences in the capacity of General Co-Chair, TPC Co-Chair, Symposium Chair, etc. His research interests are in the broad areas of communications and information theory, particularly the Internet of Things, and coding and signal processing for multimedia communications systems.

Xuehua Li received her Ph.D. degree in telecommunications engineering from the Beijing University of Posts and Telecommunications, Beijing, China, in 2008. She is now Professor and Deputy Dean of the School of Information and Communication Engineering at Beijing Information Science and Technology University, Beijing, China. She is a senior member of the Beijing Internet of Things Institute. Her research interests are in the broad areas of communications and information theory, particularly the Internet of Things, and coding for multimedia communications systems.
