A Correlation-Breaking Interleaving of Polar Codes

IEEE TRANSACTIONS ON COMMUNICATIONS



arXiv:1702.05202v1 [cs.IT] 17 Feb 2017

Ya Meng, Student Member, IEEE, Liping Li, Member, IEEE, Chuan Zhang, Member, IEEE

Abstract—It is known that the bit errors of polar codes with successive cancellation (SC) decoding are coupled. However, existing concatenation schemes of polar codes with other error correction codes rarely take this coupling effect into consideration. To achieve a better BER performance, a concatenation scheme, which divides all Nl bits in an LDPC block into Nl polar blocks to completely de-correlate the possible coupled errors, is first proposed. This interleaving is called the blind interleaving (BI) and can keep the simple SC polar decoding while achieving a better BER performance than the state-of-the-art (SOA) concatenation of polar codes with LDPC codes. For a better balance of performance and complexity, a novel interleaving scheme, named the correlation-breaking interleaving (CBI), is proposed by breaking the correlation of the errors among the correlated bits of polar codes. This CBI scheme 1) achieves a BER performance comparable to the BI scheme with a smaller memory size and a shorter turnaround time; and 2) enjoys performance robustness with reduced block lengths. Numerical results show that CBI with small block lengths achieves almost the same performance at BER = 10^-4 as CBI with block lengths 8 times larger.

Index Terms—Polar codes, SC decoding, BP decoding, interleaving, code concatenation

I. INTRODUCTION

Polar codes, which were proposed in [1], provably achieve the capacity of symmetric binary-input discrete memoryless channels (B-DMCs) with a low encoding and decoding complexity. The encoding and decoding process (with successive cancellation, SC) can be implemented with a complexity of O(N log N). The idea of polar codes is to transmit information bits on the noiseless channels while fixing the bits transmitted on the completely noisy channels. The fixed bits are made known to both the transmitter and the receiver. Later, the systematic version of polar codes was proposed in [2]. The construction of polar codes is studied in [3–6] and the hardware implementation is presented in [7–9]. The asymptotic behavior of polar codes is analyzed in [10], where the polarization rate is characterized in both the block length and the code rate.

[Footnote: This work was supported in part by the National Natural Science Foundation of China through grant 61501002, in part by the Natural Science Project of the Ministry of Education of Anhui through grant KJ2015A102, in part by the Talents Recruitment Program of Anhui University, and in part by the Key Laboratory Project of the Key Laboratory of Intelligent Computing and Signal Processing of the Ministry of Education of China, Anhui University. This paper was presented in part at the IEEE Vehicular Technology Conference Fall, Montreal, 2016. (Corresponding author: Liping Li.) Ya Meng and Liping Li are with the Key Laboratory of Intelligent Computing and Signal Processing of the Ministry of Education of China, Anhui University. Chuan Zhang is with the National Mobile Communications Research Laboratory, Southeast University, Nanjing, China (e-mail: [email protected]; liping [email protected]; [email protected]).]

The finite scaling of polar

codes is presented in [11], in which the relationship between the error probability, the code rate, and the block length is analyzed. However, it has been shown that the performance of polar codes in the finite-length domain is not satisfactory [12–15]. To improve the polar code performance in the finite-length domain, various decoding processes [12, 13, 16, 17] and concatenation schemes [14, 18, 19] were proposed. The decoding processes in these works have higher complexity than the original SC decoding of [1]. On the other hand, systematic polar codes [2] achieve a better BER performance than non-systematic polar codes while maintaining almost the same decoding complexity. In non-systematic encoding, the codeword x is obtained by x = uG, where u is the source vector and G is the generator matrix. The basic idea of systematic polar codes is to use part of the codeword x to transmit the information bits instead of directly using the source bits u to transmit them. In [2], it is shown that systematic polar codes achieve a better BER performance than non-systematic polar codes. Theoretically, however, this better BER performance is not expected from the indirect decoding process: first decode û (û is the estimate of u from the normal SC decoding), then re-encode x̂ as x̂ = ûG. One would expect any errors in û to be amplified in this re-encoding process. Bit errors occurring early in the SC decoding affect subsequent bits, causing the problem of error propagation. The better BER performance of systematic polar codes can thus be intuitively attributed to error decoupling. From the two-step decoding of systematic polar codes, this decoupling must be accomplished through the re-encoding x̂ = ûG. From x̂ = ûG and the fact that the number of errors in x̂ is smaller than that in û, we can conclude that the coupling of the errors in û is controlled by the columns of G. We provide a proposition characterizing this coupling pattern in this paper.
In this paper, two schemes are proposed to de-correlate the coupled errors. A concatenation scheme, which divides all Nl bits in an LDPC block into Nl polar blocks to completely de-correlate the possible coupled errors, is first introduced. This blind interleaving (BI) keeps the simple SC polar decoding while achieving a better BER performance than the state-of-the-art (SOA) concatenation of polar codes with LDPC codes. To better balance performance and complexity, a novel interleaving scheme, named the correlation-breaking interleaving (CBI), is introduced to improve the performance of polar codes with finite block lengths while still maintaining the low complexity of the SC decoding. This CBI scheme is based on the correlation pattern proven in this paper. LDPC and BCH codes are used as the outer codes and polar codes as the inner codes. Note that the concatenation of polar


codes with LDPC codes is studied in [14] and [18], where no interleaving is used and BP (belief-propagation) decoding is applied for polar codes. For ease of description, let us denote polar codes applying SC decoding as POLAR-SC and polar codes applying BP decoding as POLAR-BP. Let us also denote the direct concatenation system with an LDPC code as the outer code and a polar code as the inner code as LDPC+POLAR. If a CBI interleaving is used between the outer and the inner code, we denote such a system as LDPC+CBI+POLAR. Similarly, a blind interleaving system is denoted as LDPC+BI+POLAR. The simulation results verify that the LDPC+CBI+POLAR-SC scheme achieves a better BER performance than the direct concatenation scheme LDPC+POLAR-BP. With the coupled errors decoupled, this CBI scheme applied to shorter LDPC and polar codes achieves an excellent BER performance, which shows the promising potential of this scheme in power-limited applications. To further explore the advantage of the CBI scheme, a short BCH code is used in place of the short LDPC code, which again shows a surprisingly good performance.

Note that portions of this work are investigated in [20], where the theorems of the error patterns are not proven. In [20], details of the CBI algorithm and some of the key parameters involved are omitted because of the space limit. In this paper, we provide the proofs of the theorems, implementation details, and examples of the CBI scheme. The contributions of this paper can be summarized as follows: 1) theoretically, we prove that the errors from the SC decoding are coupled, and the coupling pattern is found; 2) a BI scheme and a universal CBI scheme (based on the coupling pattern) are proposed; 3) a complete implementation of the CBI is provided in algorithms, with details and examples to illustrate the key parameters.

Following the notations in [1], we use v_1^N in this paper to represent a row vector with elements (v_1, v_2, ..., v_N).
We also use v to represent the same vector for notational convenience. For a vector v_1^N, the vector v_i^j is a subvector (v_i, ..., v_j) with 1 ≤ i, j ≤ N. If there is a set A ⊆ {1, 2, ..., N}, then v_A denotes a subvector with elements {v_i, i ∈ A}.

The rest of the paper is organized as follows. Section II introduces the fundamentals of non-systematic and systematic polar codes, as well as the coupling pattern of polar codes with the SC decoding. Section III proposes the new interleaving scheme with detailed algorithms. Section IV presents the simulation results. Concluding remarks are provided at the end.

II. NON-SYSTEMATIC AND SYSTEMATIC POLAR CODES

In the first part of this section, the relevant theory on non-systematic and systematic polar codes is presented. In the second part of this section, the correlation among the SC decoding errors is introduced, which is the basis of the proposed interleaving scheme in Section III.

A. Preliminaries of Non-Systematic Polar Codes

The generator matrix for polar codes is G_N = B F^⊗n, where B is a bit-reversal matrix, F = [1 0; 1 1], n = log2 N, and F^⊗n is


Fig. 1. An encoding circuit of the non-systematic polar codes with N = 8. Signals flow from the left to the right. Each edge carries a signal of 0 or 1.

the nth Kronecker power of the matrix F over the binary field F2. In this paper, we consider an encoding matrix G = F^⊗n without the permutation matrix B. The generator matrix is the basis for devising the new interleaving scheme in this paper. Mathematically, the encoding is a process to obtain the codeword x through x = uG for a given source vector u. The source vector u consists of the information bits and the frozen bits, denoted by u_A and u_Ā, respectively. Here the set A includes the indices of the information bits and Ā is the complementary set. Each bit goes through a bit channel. In [1], there are detailed definitions of the bit channels and the corresponding Bhattacharyya parameter of each bit channel. For BEC channels, the set A can be constructed by selecting the indices of the bit channels with the smallest Bhattacharyya parameters. For all other channels, construction methods can be found in [3–6]. Both sets A and Ā are subsets of {1, 2, ..., N} for polar codes with a block length N = 2^n.

An encoding diagram is shown in Fig. 1. If the nodes in Fig. 1 are viewed as memory elements, the encoding process is to calculate the corresponding binary values to fill all the memory elements from the left to the right. This view is helpful when it comes to systematic polar codes in the following section.

B. Construction of Systematic Polar Codes

For systematic polar codes, we also focus on a generator matrix without the permutation matrix B, namely G = F^⊗n. The source bits u can be split as u = (u_A, u_Ā). The first part u_A consists of user data that are free to change in each round of transmission, while the second part u_Ā consists of data that are frozen at the beginning of each session and made known to the decoder. The codeword can then be expressed as

    x = u_A G_A + u_Ā G_Ā    (1)

where G_A is the sub-matrix of G with rows specified by the set A.
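To make the encoding x = uG with G = F^⊗n concrete, the following is a minimal Python sketch. The N = 8 information set A below is a hand-picked toy choice for illustration only, not one derived from the Bhattacharyya parameters:

```python
import numpy as np

def polar_generator_matrix(n: int) -> np.ndarray:
    """G = F^{(x)n}, the n-th Kronecker power of F = [[1,0],[1,1]] over GF(2)."""
    F = np.array([[1, 0], [1, 1]], dtype=np.uint8)
    G = np.array([[1]], dtype=np.uint8)
    for _ in range(n):
        G = np.kron(G, F)
    return G

def polar_encode(u: np.ndarray, G: np.ndarray) -> np.ndarray:
    """Non-systematic encoding x = uG over GF(2)."""
    return u.dot(G) % 2

n, N = 3, 8
A = [4, 6, 7, 8]                      # 1-indexed information set (toy choice)
G = polar_generator_matrix(n)
u = np.zeros(N, dtype=np.uint8)       # frozen bits u_Abar = 0
u[[a - 1 for a in A]] = [1, 0, 1, 1]  # information bits u_A
x = polar_encode(u, G)
print(x)                              # the codeword transmitted over the channel
```

The same routine scales to any N = 2^n; only the construction of A changes with the underlying channel.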
The systematic polar code is constructed by specifying a set of indices of the codeword x as the indices to convey the information bits. Denote this set as B and the complementary set as B̄. The codeword x is thus split as (x_B, x_B̄). With some manipulations, we have


    x_B = u_A G_AB + u_Ā G_ĀB
    x_B̄ = u_A G_AB̄ + u_Ā G_ĀB̄    (2)

The matrix G_AB is a sub-matrix of the generator matrix with elements {G_i,j}, i ∈ A, j ∈ B. Given a non-systematic encoder (A, u_Ā), there is a systematic encoder (B, u_Ā) which performs the mapping x_B ↦ x = (x_B, x_B̄). To realize this systematic mapping, x_B̄ needs to be computed for any given information bits x_B. To this end, we see from (2) that x_B̄ can be computed if u_A is known. The vector u_A can be obtained as

    u_A = (x_B − u_Ā G_ĀB)(G_AB)^−1    (3)

From (3), it is seen that x_B ↦ u_A is one-to-one if x_B has the same number of elements as u_A and if G_AB is invertible. In [2], it is shown that B = A satisfies all these conditions in order to establish the one-to-one mapping x_B ↦ u_A. In the rest of the paper, the systematic encoding of polar codes adopts this selection of B: B = A. Therefore we can rewrite (2) as

    x_A = u_A G_AA + u_Ā G_ĀA
    x_Ā = u_A G_AĀ + u_Ā G_ĀĀ    (4)

Let us go back to the diagram in Fig. 1. For systematic polar codes, the information bits are now conveyed on the right-hand side in x_A. To calculate x_Ā, u_A on the left-hand side needs to be calculated first. Once u_A is obtained, systematic encoding can be performed in the same way as non-systematic encoding: performing binary additions from the left to the right. Therefore, compared with non-systematic encoding, systematic encoding has an additional round of binary additions from the right to the left. A detailed analysis of systematic encoding can be found in [21, 22].

C. Correlated Bits

In [2, 23], it is shown that the re-encoding process x̂ = ûG after decoding û does not amplify the number of errors in û. Instead, there are fewer errors in x̂ than in û. This clearly shows that the coupled errors in û are decoupled (or cancelled) in the re-encoding process. In this section, we first restate a corollary from [23] and then provide a proposition to show the coupling pattern of the errors in û. This coupling pattern is used in Section III to design the interleaving scheme.
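The two-step systematic mapping of Section II-B, i.e. recovering u_A via (3) and then re-encoding as in (4), can be sketched as follows. This is an illustrative sketch under assumptions: the toy information set A is hypothetical, the frozen bits are set to zero, and the GF(2) solver is generic Gaussian elimination rather than an optimized encoder:

```python
import numpy as np

def kron_power(n):
    """G = F^{(x)n} with F = [[1,0],[1,1]] over GF(2)."""
    F = np.array([[1, 0], [1, 1]], dtype=np.uint8)
    G = np.array([[1]], dtype=np.uint8)
    for _ in range(n):
        G = np.kron(G, F)
    return G

def gf2_solve_right(M, b):
    """Solve y M = b over GF(2) (M square and invertible)."""
    n = M.shape[0]
    # Eliminate on the augmented transposed system M^T y^T = b^T.
    aug = np.concatenate([M.T % 2, b.reshape(-1, 1) % 2], axis=1).astype(np.uint8)
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r, c])
        aug[[c, p]] = aug[[p, c]]
        for r in range(n):
            if r != c and aug[r, c]:
                aug[r] ^= aug[c]
    return aug[:, -1]

# Hypothetical toy code: N = 8, B = A (1-indexed), frozen bits zero.
n, N = 3, 8
A = np.array([4, 6, 7, 8]) - 1                # 0-indexed
G = kron_power(n)
G_AA = G[np.ix_(A, A)]

x_A = np.array([1, 0, 1, 1], dtype=np.uint8)  # information bits placed in x_A
u = np.zeros(N, dtype=np.uint8)
u[A] = gf2_solve_right(G_AA, x_A)             # step 1: recover u_A as in (3)
x = u.dot(G) % 2                              # step 2: re-encode as in (4)

assert (x[A] == x_A).all()                    # codeword is systematic in positions A
```

The final assertion checks the defining property of the systematic code: the codeword carries the information bits unchanged in the positions A.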
Corollary 1. The matrix G_ĀA = 0.

The proof of this corollary can be found in [23]. The following proposition shows the pattern of the coupling of the errors in û from the SC decoding process.

Proposition 1. Let the indices of the non-zero entries of column i ∈ A of G be A_i. Then, the errors of û_Ai are dependent (or coupled).

Proof: Assume the errors in û_Ai are independent. For non-systematic polar codes, we define a set A_t ⊂ A containing the indices of the information bits in error in an error event. In the same way, we define a set A_sys,t ⊂ A containing the corresponding indices of the information bits in error for systematic polar codes. Let v be an error indicator vector: a

TABLE I
Coupling Effect for N = 16, R = 0.5 in a BEC Channel with an Erasure Probability of 0.2

    Column of G | Coupling coefficient
    10          | 76%
    11          | 74%
    13          | 74%

N-element vector with 1s in the positions specified by the error event A_t and 0s elsewhere. Let the error probability be Pr(v_m = 1) = p_m (p_m ∈ [0, 1/2]). Correspondingly, we set a vector q with 1s in the positions specified by A_sys,t and 0s elsewhere. From the systematic encoding process, we have q = vG and, correspondingly, q_A = vG(:, A), where G(:, A) denotes the submatrix of G composed of the columns specified by A. Since the frozen bits are always correctly determined, v_Ā = 0_1^{N−K}. This leads to q_A = v_A G_AA. In this way, we convert the errors of non-systematic and systematic polar codes to the weights of the vectors v and q. Denote the Hamming weight of the vector v as w_H(v). Specifically, the element q_i (i ∈ A) is one if v_Ai has an odd number of ones. With the independence assumption on the errors in û_Ai, the probability that the ith information bit x̂_i is in error is

    p̃_i = 1/2 − (1/2) ∏_{m=1}^{K_i} (1 − 2 p_m)    (5)

where K_i = |A_i|. In (5), we can order the probabilities {p_m} (0 ≤ p_m ≤ 0.5) in ascending order. Applying the Monotone Convergence Theorem for real numbers [24], we have

    lim_{K_i→∞} p̃_i = lim_{K_i→∞} [1/2 − (1/2) ∏_{m=1}^{K_i} (1 − 2 p_m)] = 1/2    (6)

Thus, the mean weight of q is w_H(q) = K/2 ≥ w_H(v), meaning the average number of errors of the systematic polar codes is larger than the average number of errors of the non-systematic polar codes. This contradicts the existing results that systematic polar codes outperform non-systematic polar codes. Thus, we can conclude that the errors of û_Ai are dependent.

From Proposition 1, an error pattern among the errors in û can be deduced. We call the bits û_Ai the correlated estimated bits. This says that, statistically, the errors of the bits û_Ai are coupled. To show this coupling, we give an example with N = 16 and R = 0.5 in a BEC channel with an erasure probability of 0.2. The indices selected for the information bits in this case are {8, 10, 11, 12, 13, 14, 15, 16} (indexed from 1 to 16). The coupling effect (similar to the correlation coefficient) of the bits indicated by the non-zero positions of columns 10, 11, and 13 is recorded in simulations and is shown in Table I. From Table I, we can see that if there are errors in û_A10, then 76% of the time these errors happen simultaneously, resulting in a coupling coefficient of 0.76 for the errors in û_A10. The coupling coefficients for û_A11 and û_A13 are both 0.74 in Table I.

To the authors' knowledge, there has been no attempt yet to utilize this coupling pattern to improve the performance of polar codes. In the next section of this paper, we propose a novel interleaving scheme to break the coupling of errors to improve



Fig. 2. A blind interleaving (BI) scheme. The block length of the LDPC code is Nl = 155, and the code rate is 64/155. The block length of the polar code is N = 256, and the code rate is R = 1/4.

the BER performance of polar codes while still maintaining the low complexity of the SC decoding.

III. THE PROPOSED INTERLEAVING SCHEMES

In this section we consider an interleaving scheme between an LDPC code (the outer code) and a polar code (the inner code). From Proposition 1, we know the exact correlated information bits of polar codes. The task of the interleaving scheme is thus to make sure that the correlated bits of the inner polar codes come from different LDPC blocks. In this way, the de-interleaved LDPC blocks have independent errors. We call this interleaving scheme the correlation-breaking interleaving (CBI). Before explaining the novel interleaving scheme, a blind interleaving (BI) is introduced, which scatters all bits in one LDPC block into different polar code blocks. The scheme guarantees that errors in each LDPC block are independent.

A. The Blind Interleaving Scheme

In this section, the scheme of scattering all bits in an LDPC block into different polar code blocks is introduced. Suppose one LDPC code block has Kl information bits and the block length is Nl. These Nl bits are divided into Nl polar code blocks, which guarantees that the errors in each LDPC block are independent, as they come from different polar code blocks. We give an example in Fig. 2 where Kl = 64 and Nl = 155. The polar code in this example has N = 256, K = 64, and a rate R = 1/4. In Fig. 2, bit i of all the K = 64 LDPC code blocks forms the input vector to the ith polar code encoder. At the receiver side, to collect the decoded bits of the first LDPC block, Nl polar blocks are needed. For the remaining Nl − 1 LDPC blocks, the decoder does not need to wait, since all polar bits of one circulation are obtained already. For this BI scheme, the decoder needs to store the real likelihood ratio (LR) values of a block of [155, 64]. The turnaround time of this scheme is Nl × N × Ts, where Ts is the symbol duration.

B. The Correlation-Breaking Interleaving (CBI) Scheme

The BI scheme in Section III-A occupies a big memory and has the longest possible turnaround time. From Section II-C, we know that it is not necessary to scatter all bits in an

LDPC block into different polar blocks, since not all bits in a polar block are correlated. Only the bits in {A_i}, i = 1, ..., K, are correlated. The interleaving scheme in this section makes the bits {A_i} of one polar block come from different LDPC blocks. In other words, the interleaving scheme scatters the information bits {A_i} of each polar block into different LDPC blocks.

The difficulty in designing a CBI scheme is that the sets {A_i} are different for different block lengths and data rates. They are also different for the different underlying channels for which polar codes are designed. A CBI scheme depends on at least three parameters: the block length N, the data rate R, and the underlying channel W. Let us denote a CBI scheme as CBI(N, R, W) to show this dependence. A CBI(N, R, W) optimized for one set of (N, R, W) is not necessarily optimized for another set (N′, R′, W′). It may not even work for the set (N′, R′, W′) if N′R′ ≠ NR. In the following, we provide a CBI scheme which works for any set of (N, R, W), but is not necessarily optimal for one specific set of (N, R, W).

As A_i contains the indices of the non-zero entries of column i ∈ A, we first extract the K = |A| columns of G and denote them as the submatrix G(:, A). Divide the submatrix G(:, A) into [G_AA; G_ĀA]. Since the submatrix G_ĀA = 0 from Corollary 1, we only need to analyze the submatrix G_AA. If a CBI needs to look at each individual set A_i, then a general CBI is beyond reach. However, we can simplify this problem by dividing the information bits into only two groups: the correlated bits A_c and the uncorrelated bits Ā_c. The following proposition can be used to find the sets A_c and Ā_c.

Proposition 2. For the submatrix G_AA, let the row indices (relative to the submatrix G_AA) with Hamming weight greater than one be denoted as the set A_cs. The corresponding set of A_cs with respect to the matrix G is the set A_c.
Proof: For 1 ≤ i ≤ K, let the weight of the ith row of G_AA be w_i. We need to prove:

    i ∈ Ā_cs, if w_i = 1
    i ∈ A_cs, if w_i > 1    (7)

where Ā_cs is the complementary set of A_cs. Now, for w_i = 1, the ith information bit appears only in the ith column of G_AA. This means that the ith information bit does not affect other information bits, nor do other information bits affect the ith information bit. For w_j = k (k > 1), we define the set Γ_j containing the positions of the ones in the jth row of G_AA. Then, the jth information bit appears not only in the jth column but also in other columns of G_AA. Divide Γ_j into two parts, Γ_jl and Γ_jg, where Γ_jl contains the indices less than j and Γ_jg contains the indices greater than j. In the decoding process, the jth information bit is affected by the previously decoded bits in Γ_jl. In the meantime, the decoded jth bit affects the bits in Γ_jg, since all indices in Γ_jg are greater than j. Therefore, the information bits in Γ_j are in the set A_cs.

We give an example of how to use Proposition 2 to find the sets A_c and Ā_c. Let the block length be N = 16, the code rate R = 0.5, and the underlying channel the BEC channel with an erasure probability of 0.2. The set A is the same as in the example in Section II-C. With Proposition 2, we can easily find

SUBMITTED PAPER

5

that A_cs = {4, 6, 7, 8} for the submatrix G_AA. Relative to the matrix G16, this set is A_c = {12, 14, 15, 16}. The uncorrelated set is thus Ā_c = {8, 10, 11, 13}.

With the sets A_c and Ā_c obtained for any (N, R, W), we can devise a CBI scheme. Fig. 3 shows a general CBI scheme. Let K_c = |A_c| and K_uc = |Ā_c|. To describe the algorithms, the following parameters are needed and defined: n_d = ⌊Nl/K⌋, n_u = ⌈Nl/K⌉, k_l = Nl % K (modulo operation), K_n = K_c + 1, d_m = k_l − K_uc, d_n = k_l − K_n,

    n_p = n_u × K_n,         if k_l ≥ K_n
    n_p = n_d × K_n + k_l,   otherwise        (8)

and

    ε(d_n) = 1, if d_n ≥ 0
    ε(d_n) = 0, otherwise                     (9)
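Proposition 2 makes the sets A_c and Ā_c mechanical to compute. The following sketch reproduces the N = 16 example above (the helper names are ours, not from the paper):

```python
import numpy as np

def kron_power(n):
    """G = F^{(x)n} with F = [[1,0],[1,1]] over GF(2)."""
    F = np.array([[1, 0], [1, 1]], dtype=np.uint8)
    G = np.array([[1]], dtype=np.uint8)
    for _ in range(n):
        G = np.kron(G, F)
    return G

def split_correlated(G, A):
    """Proposition 2: rows of G_AA with Hamming weight > 1 give the
    correlated set Ac; weight-1 rows give the uncorrelated set.
    A is 1-indexed, as in the paper."""
    idx = np.array(A) - 1
    G_AA = G[np.ix_(idx, idx)]
    w = G_AA.sum(axis=1)
    Ac  = [A[i] for i in range(len(A)) if w[i] > 1]
    Auc = [A[i] for i in range(len(A)) if w[i] == 1]
    return Ac, Auc

# The paper's example: N = 16, R = 0.5, BEC(0.2), with information set
# A = {8, 10, 11, 12, 13, 14, 15, 16}.
G16 = kron_power(4)
A = [8, 10, 11, 12, 13, 14, 15, 16]
Ac, Auc = split_correlated(G16, A)
print(Ac)   # [12, 14, 15, 16]
print(Auc)  # [8, 10, 11, 13]
```

The output matches the sets A_c and Ā_c stated in the text.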

The parameter n_p is the total number of polar blocks needed to transmit K_n LDPC blocks. We have two n_u cycles in Algorithm 1 to collect the uncorrelated and correlated bits, respectively, for the polar blocks. For both cycles, the first n_d cycles run every K_n polar blocks. In the meantime, we define two new sets, A_cc and A_cp, which control the collection of the correlated information bits for each polar block. The sets A_cc and A_cp contain the indices of the bits before and after the set Ā_c, respectively. Two examples are given in Table II and Table III to explain these parameters.

The polar code is a (32,16) code and the LDPC code is a (21,8) code in the example shown in Table II. The correlated set A_c = {16, 24, 26, 27, 28, 29, 30, 31, 32}. Therefore K_c = |A_c| = 9 and K_n = K_c + 1 = 10. In this case, k_l = 5 < K_n. In Table II, the top row contains the indices of the LDPC blocks and the first column contains the indices of the polar blocks. In this example, to transmit K_n = 10 LDPC blocks, n_p = 15 polar blocks are needed. From Table II, we can see that for polar block 1, 7 bits are taken from LDPC block 1, and the other 9 bits are taken from LDPC blocks 2 to 10. The 7 bits from LDPC block 1 are placed at the uncorrelated positions Ā_c of polar block 1. The other information bits of polar block 1 are divided into two parts, the positions before and after Ā_c, which are collected in the sets A_cc and A_cp, respectively. For polar block 1, the set A_cc = ∅ and the set A_cp = {2, 3, 4, 5, 6, 7, 8, 9, 10}. The other polar blocks follow the same fashion in collecting the input bits.

Table III shows another example, where k_l ≥ K_n. In this example, the polar code (32,8) has A_c = {28, 30, 31, 32} with K_c = 4. The parameter k_l = 5 and K_n = K_c + 1 = 5. In total, n_p = n_u × K_n = 3 × 5 = 15 polar blocks are used to transmit K_n = 5 LDPC blocks. For both examples, there are 0s in the lower-left corner, which means that there are polar positions which are not used.
These positions are wasted, which is the cost of the universal CBI design. However, some of these positions can be turned into frozen bits to improve the performance of the overall interleaving scheme. This operation simply takes these bits as frozen bits in the encoding process and therefore does not change the complexity of the decoding.

The pseudocode in Algorithms 1−5 shows a detailed implementation of the proposed interleaving scheme. The top function is given in Algorithm 1, which contains two n_u cycles. Specifically, lines 1 to 17 collect the encoded

TABLE II
The CBI Scheme for LDPC (21,8) and Polar (32,16). Entry (i, j) is the number of bits polar block i takes from LDPC block j.

    Polar \ LDPC    1   2   3   4   5   6   7   8   9  10
         1          7   1   1   1   1   1   1   1   1   1
         2          1   7   1   1   1   1   1   1   1   1
         3          1   1   7   1   1   1   1   1   1   1
         4          1   1   1   7   1   1   1   1   1   1
         5          1   1   1   1   7   1   1   1   1   1
         6          1   1   1   1   1   7   1   1   1   1
         7          1   1   1   1   1   1   7   1   1   1
         8          1   1   1   1   1   1   1   7   1   1
         9          1   1   1   1   1   1   1   1   7   1
        10          1   1   1   1   1   1   1   1   1   7
        11          5   1   1   1   1   1   1   1   1   1
        12          0   4   1   1   1   1   1   1   1   1
        13          0   0   3   1   1   1   1   1   1   1
        14          0   0   0   2   1   1   1   1   1   1
        15          0   0   0   0   1   1   1   1   1   1

TABLE III
The CBI Scheme for LDPC (21,8) and Polar (32,8). Entry (i, j) is the number of bits polar block i takes from LDPC block j.

    Polar \ LDPC    1   2   3   4   5
         1          4   1   1   1   1
         2          1   4   1   1   1
         3          1   1   4   1   1
         4          1   1   1   4   1
         5          1   1   1   1   4
         6          4   1   1   1   1
         7          1   4   1   1   1
         8          1   1   4   1   1
         9          1   1   1   4   1
        10          1   1   1   1   4
        11          4   1   1   1   1
        12          1   4   1   1   1
        13          0   0   3   1   1
        14          0   0   0   2   1
        15          0   0   0   0   1

LDPC bits as the uncorrelated information bits of the polar groups. Lines 18 until the end collect the encoded LDPC bits for the correlated information bits of the polar groups. The principle in collecting the information bits for the polar groups is that the correlated information bits come from different LDPC blocks, while the uncorrelated information bits can come from the same LDPC block. Therefore, the K_uc uncorrelated information bits of one polar block can be taken directly from a continuous chunk of an LDPC block. However, taking the K_c correlated information bits for each polar encoding block needs a finer design, which is controlled by the sets A_cc (line 30) and A_cp (line 33).

The first part of the top-level function in Algorithm 1 (lines 1 to 17, collecting the uncorrelated bits for the polar blocks) has two cases: Case 1, when j == n_u && k_l != 0, and Case 2, when (j != n_u && k_l != 0) || k_l == 0. Case 1 (lines 3 to 10), implemented in detail in Algorithms 2−3, carries out the last cycle of collecting polar bits. Case 1 is further divided into two parts: k_l < K_n (lines 4 to 6) and k_l ≥ K_n (lines 7 to 9). The part k_l < K_n is implemented in Algorithm 2 and the part k_l ≥ K_n in Algorithm 3. Case 2 (lines 11 to 16) carries out the cycle of collecting polar bits when k_l = 0, or the first n_u − 1 cycles when k_l != 0. The second part of Algorithm 1 (lines 18 to 38) follows the


TABLE IV
The Complexity Comparison for the CBI Scheme and the BI Scheme. The Memory Size Is in Terms of the LR Values of the Polar Code.

    Scheme       Memory size    Turnaround time
    CBI scheme   [np, K]        np × N × Ts
    BI scheme    [Nl, K]        Nl × N × Ts

same fashion as the first part.

C. Complexity Analysis

Proposition 3. For the CBI scheme, the memory size is [n_p, K] LR values of the polar code, and the decoder only needs to wait n_p polar blocks to decode the K_n LDPC blocks. The turnaround time is therefore n_p × N × Ts. For the BI scheme, the memory size is [Nl, K] and the turnaround time is Nl × N × Ts.

Proof: For the memory size and the turnaround time of the CBI scheme, the calculation details are as follows.

• Case 1: k_l < K_n && k_l != 0. There are two parts. For the first part, k_l > K_uc: to collect the decoded bits of the first LDPC block, (K_n − 1) × n_d + n_u + d_m polar blocks are needed. From the second to the (d_m + 1)th LDPC block, the decoder does not need to wait and uses the decoded bits of the (K_n − 1) × n_d + n_u + d_m polar blocks to decode directly. For LDPC blocks (d_m + 2) to K_n, only one more polar block is needed. For the second part, k_l ≤ K_uc: to collect the decoded bits of the first LDPC block, (K_n − 1) × n_d + n_u polar blocks are needed. From the second to the k_l th LDPC block, an additional polar block is needed. For LDPC blocks k_l + 1 to K_n, the decoder does not need to wait, since all polar bits of one circulation are obtained already. Therefore, the memory size is [n_p, K] and the decoder only needs to wait n_p polar blocks to decode the K_n LDPC blocks. The turnaround time is therefore n_p × N × Ts.

• Case 2: k_l ≥ K_n. Case 2 is also divided into two parts. For the first part, k_l > K_uc: to collect the decoded bits of the first LDPC block, (K_n − 1) × n_d + n_u + d_m polar blocks are needed. From the second to the (d_m + 1)th LDPC block, the decoder does not need to wait and uses the already-known decoded bits of the polar blocks to decode directly. For LDPC blocks (d_m + 2) to K_n, only one more polar block is needed. For the second part, k_l ≤ K_uc: the number of polar blocks required for the first LDPC block is the same as in part two of Case 1.
For LDPC blocks 2 to K_n, another polar block is required. Therefore, the memory size is [n_p, K] and the decoder only needs to wait n_p polar blocks to decode the K_n LDPC blocks. The turnaround time is the same as in the previous case.
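As a sanity check on this bookkeeping, here is a minimal sketch that reproduces n_p from eq. (8) for the two worked examples of Tables II and III (the function name is ours, not from the paper):

```python
import math

def cbi_block_count(Nl, K, Kc):
    """Number of polar blocks n_p needed to carry K_n = K_c + 1 LDPC blocks,
    following the parameter definitions of eq. (8) in Section III-B."""
    nd = Nl // K            # floor(Nl / K)
    nu = math.ceil(Nl / K)
    kl = Nl % K
    Kn = Kc + 1
    return nu * Kn if kl >= Kn else nd * Kn + kl

# Table II example: LDPC (21,8) + Polar (32,16), Kc = 9 -> kl = 5 < Kn = 10
np_tab2 = cbi_block_count(21, 16, 9)   # nd*Kn + kl = 1*10 + 5 = 15
# Table III example: LDPC (21,8) + Polar (32,8), Kc = 4 -> kl = 5 >= Kn = 5
np_tab3 = cbi_block_count(21, 8, 4)    # nu*Kn = 3*5 = 15
print(np_tab2, np_tab3)
```

In both examples the CBI decoder waits n_p = 15 polar blocks with an [n_p, K] LR buffer, whereas the BI baseline would need Nl = 21 polar blocks and an [Nl, K] buffer (Table IV).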

Algorithm 1 The algorithm of a general correlation-breaking interleaving scheme. Transmitting Kn LDPC blocks needs np groups of polar codes
Input: Nl, K, Kc, Kuc, Ac, Āc, UL;
Output: the input bits UA of polar codes;
 1: for j = 1 to nu do
 2:   // collect bits from LDPC blocks as input bits Āc for the ith polar encoding.
 3:   if j == nu && kl != 0 then
 4:     if kl < Kn then
 5:       UA = Fusn(UL)
 6:     end if
 7:     if kl >= Kn then
 8:       UA = Fubn(UL)
 9:     end if
10:   end if
11:   if (j != nu && kl != 0) || kl == 0 then
12:     for i = (j−1)×Kn + 1 to (j−1)×Kn + Kn do
13:       ls = ((i−1)%Kn) × Nl
14:       UA((i−1)×K + Āc) = UL(ls + i − j + Kuc×(j−1) + 1 : ls + i − j + Kuc×(j−1) + 1 + Kuc − 1)
15:     end for
16:   end if
17: end for
18: for j = 1 to nu do
19:   // collect bits from LDPC blocks as input bits Ac for the ith polar encoding.
20:   if j == nu && kl != 0 then
21:     if kl < Kn then
22:       UA = Fsn(UL)
23:     end if
24:     if kl >= Kn then
25:       UA = Fbn(UL)
26:     end if
27:   end if
28:   if (j != nu && kl != 0) || kl == 0 then
29:     for i = (j−1)×Kn + 1 to (j−1)×Kn + Kn do
30:       Acc = {1, 2, ..., i − Kn×(j−1) − 1}
31:       lc = (Acc − 1) × Nl
32:       UA((i−1)×K + Ac(Acc)) = UL(lc + i − j + Kuc×j)
33:       Acp = {(i − Kn×(j−1) + 1), ..., Kn}
34:       lm = (Acp − 1) × Nl
35:       UA((i−1)×K + Ac(Acp − 1)) = UL(lm + i − (j−1) + Kuc×(j−1))
36:     end for
37:   end if
38: end for

IV. SIMULATION RESULTS

In this section, simulation results are provided to verify the performance of the CBI scheme shown in Algorithm 1. The



(Figure 3: the original input bits are grouped and LDPC-encoded into groups 1, 2, 3, 4, ..., and then interleaved into the Ac and A¯c positions of each polar encoding group.)

Fig. 3. A general correlation-breaking interleaving scheme. Here the set Ac consists of the indices of the correlated bits and the set A¯c is the complementary set of Ac .

Algorithm 2 Function [UA] = Fusn(UL). Collect bits from LDPC blocks as input bits A¯c for the ith polar encoding when kl < Kn.
1: for i = (nu − 1) × Kn + 1 to (nu − 1) × Kn + kl do
2:   ls = ((i − 1)%Kn) × Nl
3:   UA((i − 1) × K + A¯c) = UL(ls + i − nu + Kuc × (nu − 1) + 1 : ls + i − nu + Kuc × (nu − 1) + 1 + (Kuc − i%Kn − 1))
4: end for

Algorithm 3 Function [UA] = Fubn(UL). Collect bits from LDPC blocks as input bits A¯c for the ith polar encoding when kl ≥ Kn.
1: for i = (nu − 1) × Kn + 1 to (nu − 1) × Kn + Kn do
2:   if i = (nu − 1) × Kn + 1 to (nu − 1) × Kn + 1 + dm then
3:     ls = ((i − 1)%Kn) × Nl
4:     UA((i − 1) × K + A¯c) = UL(ls + i − nu + Kuc × (nu − 1) + 1 : ls + i − nu + Kuc × (nu − 1) + 1 + Kuc − 1)
5:   end if
6:   if i = (nu − 1) × Kn + 1 + dm + 1 to (nu − 1) × Kn + Kn then
7:     ls = ((i − 1)%Kn) × Nl
8:     UA((i − 1) × K + A¯c) = UL(ls + i − nu + Kuc × (nu − 1) + 1 : ls + i − nu + Kuc × (nu − 1) + 1 + (Kuc − (i − dm − 1)%Kn − 1))
9:   end if
10: end for

Algorithm 4 Function [UA] = Fsn(UL). Collect bits from LDPC blocks as input bits Ac for the ith polar encoding when kl < Kn.
1: for i = (nu − 1) × Kn + 1 to (nu − 1) × Kn + kl − dn × ǫ(dn) do
2:   Acp = {(i − Kn × (nu − 1) + 1), ..., Kn}
3:   lm = (Acp − 1) × Nl
4:   UA((i − 1) × K + Ac(Acp − 1)) = UL(lm + i − (nu − 1) + Kuc × (nu − 1))
5: end for

Algorithm 5 Function [UA] = Fbn(UL). Collect bits from LDPC blocks as input bits Ac for the ith polar encoding when kl ≥ Kn.
1: for i = (nu − 1) × Kn + 1 to (nu − 1) × Kn + Kn do
2:   if i = (nu − 1) × Kn + 1 to (nu − 1) × Kn + 1 + dm then
3:     Acc = {1, 2, ..., i − Kn × (nu − 1) − 1}
4:     lc = (Acc − 1) × Nl
5:     UA((i − 1) × K + Ac(Acc)) = UL(lc + i − nu + Kuc × nu)
6:     Acp = {(i − Kn × (nu − 1) + 1), ..., Kn}
7:     lm = (Acp − 1) × Nl
8:     UA((i − 1) × K + Ac(Acp − 1)) = UL(lm + i − (nu − 1) + Kuc × (nu − 1))
9:   end if
10:  if i = (nu − 1) × Kn + 1 + dm + 1 to (nu − 1) × Kn + Kn then
11:    Acp = {(i − Kn × (nu − 1) + 1), ..., Kn}
12:    lm = (Acp − 1) × Nl
13:    UA((i − 1) × K + Ac(Acp − 1)) = UL(lm + i − (nu − 1) + Kuc × (nu − 1))
14:  end if
15: end for
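To make the index arithmetic of Algorithm 1 concrete, the following toy Python transliteration (our own sketch, not the authors' implementation) reproduces the un-correlated bit collection of lines 12-15 of Algorithm 1 for a single value of j. The 1-based indexing of the pseudocode is emulated with dictionaries, and the small parameter values in the example are chosen only for illustration:

```python
# Toy transliteration (our sketch, not the authors' code) of Algorithm 1,
# lines 12-15: filling the un-correlated positions A_c-bar of each polar
# block in one pass over j.  UL and UA map 1-based positions to bits.

def collect_uncorrelated(UL, Abar_c, K, Kn, Kuc, Nl, j):
    """Fill UA positions (i-1)*K + Abar_c with Kuc consecutive bits of UL."""
    UA = {}
    for i in range((j - 1) * Kn + 1, (j - 1) * Kn + Kn + 1):
        ls = ((i - 1) % Kn) * Nl                # line 13: source LDPC block offset
        start = ls + i - j + Kuc * (j - 1) + 1  # line 14: first source position
        for t, a in enumerate(Abar_c):          # copy Kuc consecutive bits
            UA[(i - 1) * K + a] = UL[start + t]
    return UA

# Tiny illustration: Kn = 3 polar blocks per circulation, Kuc = 2
# un-correlated positions Abar_c = [1, 2], Nl = 5, K = 4, pass j = 1.
UL = {p: p for p in range(1, 16)}   # 3 LDPC blocks of 5 "bits" (position labels)
UA = collect_uncorrelated(UL, [1, 2], K=4, Kn=3, Kuc=2, Nl=5, j=1)
print(UA[1], UA[2])   # polar block i = 1 takes UL positions 1 and 2
```

Each polar block thus receives a distinct run of Kuc bits from a distinct LDPC block, which is the de-correlating effect the scheme relies on.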

example we take is the same as that of the BI scheme in Fig. 2. The LDPC code used here is the (155,64,20) Tanner code [25]; therefore Nl = 155 and Kl = 64. The polar code has block length N = 256 and code rate R = 1/4. The underlying channel is the AWGN channel. The polar code construction is based on [3], which produces the set A. Then the submatrix GAA is formed from the generator matrix G256. Based on the submatrix GAA and Proposition 2, the correlated set Ac (Kc = 36) and the un-correlated set A¯c (Kuc = 28) are obtained. Algorithm 1 is implemented with the following details.
• Consider the ith polar encoding block for 1 ≤ i ≤ Kc + 1 = 37. The information bits A¯c of the ith polar block are composed of bits i to (i + 28 − 1) of the ((i − 1)%37 + 1)th LDPC block. The information bits Ac for the ith polar block are collected through two sets Acc and Acp, with Acc = {i − 1, i − 2, ..., 1} and Acp = {37, 36, 35, ..., i + 1}. These two sets Acc and Acp are indices of LDPC blocks. The bits of Ac of the ith polar block come from two parts: the (i − 1 + 28)th bit of the LDPC groups in Acc and the ith bit of the LDPC groups in Acp.
• Consider the ith polar group for 38 ≤ i ≤ 74. The information bits A¯c for the ith polar code consist of bits (i − 2 + 28 + 1) to ((i − 2 + 28 + 1) + 28 − 1) of the ((i − 1)%37 + 1)th LDPC block. In this case Acc = {(i − 37) − 1, (i − 37) − 2, ..., 1} and Acp = {37, 36, 35, ..., (i − 37) + 1}. Therefore the bits Ac of the ith polar code come from bit (i − 2 + 28 × 2) of the LDPC groups in Acc and bit (i − 1 + 28) of the LDPC groups in Acp.
• Now consider the ith polar group for 75 ≤ i ≤ 101. The bits A¯c of the ith polar code are made up of bits (i − 3 + 28 × 2 + 1) to ((i − 3 + 28 × 2 + 1) + (28 − i%37) − 1)

Fig. 4. The BER performance of the polar code (256,64) concatenated with an LDPC code in AWGN channels. The LDPC code is the (155,64,20) Tanner code. (The figure plots BER versus SNR in dB for four schemes: LDPC(155,64)+Polar(256,64)SC, LDPC(155,64)+Polar(256,64)BP, LDPC(155,64)+CBI+Polar(256,64)SC, and LDPC(155,64)+BI+Polar(256,64)SC.)
of the ((i − 1)%37 + 1)th LDPC block. In this case there is only Acp = {37, 36, 35, ..., (i − 37 × 2) + 1}. The information bits Ac for the ith polar code come from bit (i − 2 + 28 × 2) of the LDPC groups in Acp.
In this example, the occupied memory size is [101, 64], which is smaller than [155, 64]. In the meantime, the turnaround time is 101 × 256 symbols, which is smaller than 155 × 256 symbols.
The BER performance of the CBI scheme is shown in Fig. 4, where the dashed black line with circles is the performance of the CBI scheme. The legend for this scheme is LDPC(155,64)+CBI+Polar(256,64)SC; here SC means the polar codes are decoded with the SC algorithm. The solid black line with circles is the performance of the polar code directly concatenated with the LDPC code (no interleaving being performed), with the legend LDPC(155,64)+Polar(256,64)SC. Note that compared with the BI scheme (the solid red line with triangles and the legend LDPC(155,64)+BI+Polar(256,64)SC), the CBI scheme achieves a comparable BER performance while having a memory size Nl/np = 1.5 times smaller. At BER = 10−4, the proposed CBI scheme only requires about a 1 dB increase of SNR compared with the BI scheme. Both the CBI and the BI schemes have a better BER performance than the direct concatenation. The advantage of the BI scheme comes from the fact that there is still a possibility for the un-correlated bits to be in error simultaneously (the coupling coefficient is not 1). To compare with a direct concatenation of polar codes (BP decoding) with LDPC codes, a simulation for this scheme is carried out (shown in Fig. 4 by the solid black line with triangles). The legend for this scheme is LDPC(155,64)+Polar(256,64)BP. The proposed CBI scheme of polar codes with SC decoding outperforms the LDPC(155,64)+Polar(256,64)BP scheme (direct concatenation of polar codes with BP decoding) with a lower computational complexity.
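The index bookkeeping of the worked example above (Kn = 37, Kuc = 28, Nl = 155) can be summarized in a short sketch; this is our own illustration and the helper name is hypothetical. For a polar block index i in the first two circulations, it returns which LDPC block supplies the A¯c bits and which 1-based bit range is taken:

```python
# Sketch (our own illustration, not the authors' code) of the A_c-bar bit
# assignment in the worked example: Kn = Kc + 1 = 37, Kuc = 28, Nl = 155.

def uncorrelated_source(i):
    """LDPC block and bit range feeding A_c-bar of polar block i (1-based)."""
    ldpc_block = (i - 1) % 37 + 1
    if 1 <= i <= 37:                 # first circulation: bits i .. i+27
        start = i
    elif 38 <= i <= 74:              # second circulation: bits (i-2+28+1) ..
        start = i - 2 + 28 + 1
    else:
        raise ValueError("third circulation truncates the range (see text)")
    return ldpc_block, (start, start + 28 - 1)

print(uncorrelated_source(1))    # (1, (1, 28))
print(uncorrelated_source(37))   # (37, (37, 64))
print(uncorrelated_source(38))   # (1, (65, 92))
```

The mapping makes visible that consecutive polar blocks draw their un-correlated bits from different LDPC blocks, cycling through all 37 of them.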
To show the advantage of the CBI scheme, simulations with weaker codes are also conducted. Here by "weaker" we mean shorter or higher-rate codes. The results are shown in Fig. 5. The solid red line with diamonds is the performance of the CBI scheme with the polar code (32,8) concatenated with the LDPC code (21,8), with the legend LDPC(21,8)+CBI+Polar(32,8). Compared with the BI scheme (the solid blue line with triangles and the legend LDPC(21,8)+BI+Polar(32,8)), the CBI scheme achieves almost the same performance. In this example, the occupied memory size of the CBI scheme is [15, 8], which is smaller than the [21, 8] of the BI scheme. The same BER performance and the lower occupied memory clearly show the advantage of the CBI scheme over the BI scheme. As discussed in Section III-B, some of the wasted bits in the CBI scheme can be turned into frozen bits to improve the system performance. This is shown by the red starred line in Fig. 5 with the legend Improved LDPC(21,8)+CBI+Polar(32,8). It can be seen that by turning one wasted bit into a frozen bit, the BER performance of the CBI scheme reaches that of the BI scheme while still maintaining the same complexity as the original CBI scheme (legend LDPC(21,8)+CBI+Polar(32,8)). Clearly, the CBI scheme with a stronger LDPC code will have a better BER performance than the CBI scheme with a weaker code in the


(Figure 5 plots BER versus SNR in dB for five schemes: LDPC(155,64)+CBI+Polar(256,64), LDPC(21,8)+CBI+Polar(32,8), Improved LDPC(21,8)+CBI+Polar(32,8), LDPC(21,8)+BI+Polar(32,8), and BCH(15,5)+CBI+Polar(32,8).)
Fig. 5. The BER performance of the concatenation in AWGN channels with SC decoding. The LDPC code is the (21,8,6) Tanner code. The BCH code is (15,5).

large SNR region (beyond 0.5 dB). However, for application scenarios operating with SNRs below 0.5 dB, the CBI scheme with weaker codes is a good choice because of its good performance and low complexity. To further reduce the complexity of the outer codes, BCH codes are employed to replace the LDPC codes. The CBI scheme with a BCH outer code is shown in Fig. 5 by the solid line with circles; the legend for this scheme is BCH(15,5)+CBI+Polar(32,8). At BER = 10−4, this scheme requires about a 1 dB increase in SNR compared with the short and long LDPC outer codes, LDPC(21,8) and LDPC(155,64), which is the cost of the lower decoding complexity.

V. CONCLUSION

In this paper, we first propose the blind interleaving (BI) scheme to completely de-correlate the possible bit errors in the SC decoding process of polar codes. This BI scheme achieves a better BER performance than the state-of-the-art (SOA) concatenation of polar codes with LDPC codes while still maintaining the low complexity of SC decoding. To get a better balance between performance and complexity,



a novel interleaving scheme, the correlation-breaking interleaving (CBI), is also proposed in this paper. Both the BI and CBI schemes have a much better performance than the SOA concatenation schemes. The CBI scheme 1) achieves a BER performance comparable to the BI scheme with a smaller memory size and a shorter turnaround time; and 2) enjoys performance robustness with reduced block lengths. A tradeoff between complexity and performance can therefore be made between the BI scheme and the CBI scheme. Simulation results verify that the concatenation of polar codes with SC decoding under the CBI scheme achieves a better BER performance than the direct concatenation of polar codes with BP decoding. With the CBI scheme, the concatenation of polar codes can achieve a satisfactory performance at very short block lengths, which provides an efficient implementation option for polar codes.

REFERENCES
[1] E. Arıkan, "Channel polarization: A method for constructing capacity-achieving codes for symmetric binary-input memoryless channels," IEEE Trans. Inf. Theory, vol. 55, no. 7, pp. 3051–3073, 2009.
[2] ——, "Systematic polar coding," IEEE Commun. Lett., vol. 15, no. 8, pp. 860–862, Aug. 2011.
[3] I. Tal and A. Vardy, "How to construct polar codes," IEEE Trans. Inf. Theory, vol. 59, no. 10, pp. 6562–6582, Oct. 2013.
[4] R. Mori and T. Tanaka, "Performance of polar codes with the construction using density evolution," IEEE Commun. Lett., vol. 13, no. 7, pp. 519–521, Jul. 2009.
[5] P. Trifonov, "Efficient design and decoding of polar codes," IEEE Trans. Commun., vol. 60, no. 11, pp. 3221–3227, Nov. 2012.
[6] D. Wu, Y. Li, and Y. Sun, "Construction and block error rate analysis of polar codes over AWGN channel based on Gaussian approximation," IEEE Commun. Lett., vol. 18, no. 7, pp. 1099–1102, Jul. 2014.
[7] C. Zhang and K. K. Parhi, "Low-latency sequential and overlapped architectures for successive cancellation polar decoder," IEEE Trans. Signal Process., vol. 61, pp. 2429–2441, May 2013.
[8] G. Sarkis, P. Giard, A. Vardy, C. Thibeault, and W. Gross, "Fast polar decoders: Algorithm and implementation," IEEE J. Sel. Areas Commun., vol. 32, no. 5, pp. 946–957, May 2014.
[9] C. Zhang and K. K. Parhi, "Latency analysis and architecture design of simplified SC polar decoders," IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 61, pp. 115–119, 2014.
[10] S. Hassani, R. Mori, T. Tanaka, and R. Urbanke, "Rate-dependent analysis of the asymptotic behavior of channel polarization," IEEE Trans. Inf. Theory, vol. 59, no. 4, pp. 2267–2276, 2013.
[11] S. H. Hassani, K. Alishahi, and R. Urbanke, "Finite-length scaling of polar codes," IEEE Trans. Inf. Theory, vol. 60, no. 10, pp. 5875–5898, Oct. 2014.
[12] E. Arıkan, "A performance comparison of polar codes and Reed-Muller codes," IEEE Commun. Lett., vol. 12, no. 6, pp. 447–449, 2008.
[13] N. Hussami, S. Korada, and R. Urbanke, "Performance of polar codes for channel and source coding," in Proc. IEEE International Symposium on Information Theory (ISIT), Jun. 2009, pp. 1488–1492.
[14] A. Eslami and H. Pishro-Nik, "A practical approach to polar codes," in Proc. IEEE International Symposium on Information Theory (ISIT), 2011, pp. 16–20.
[15] M. Mondelli, S. H. Hassani, and R. L. Urbanke, "From polar to Reed-Muller codes: A technique to improve the finite-length performance," IEEE Trans. Commun., vol. 62, no. 9, pp. 3084–3091, Sep. 2014.
[16] I. Tal and A. Vardy, "List decoding of polar codes," IEEE Trans. Inf. Theory, vol. 61, no. 5, pp. 2213–2226, May 2015.
[17] K. Chen, K. Niu, and J. Lin, "Improved successive cancellation decoding of polar codes," IEEE Trans. Commun., vol. 61, no. 8, pp. 3100–3107, Aug. 2013.
[18] J. Guo, M. Qin, A. G. i Fabregas, and P. H. Siegel, "Enhanced belief propagation decoding of polar codes through concatenation," in Proc. IEEE International Symposium on Information Theory (ISIT), 2014, pp. 2987–2991.
[19] U. U. Fayyaz and J. R. Barry, "Polar codes for partial response channels," in Proc. IEEE International Conference on Communications (ICC), 2013, pp. 4337–4341.
[20] Y. Meng, L. Li, and Y. Hu, "A novel interleaving scheme for polar codes," in Proc. IEEE Vehicular Technology Conference Fall (VTC-Fall), Sep. 2016, pp. 1–5.
[21] L. Li and W. Zhang, "On the encoding complexity of systematic polar codes," in Proc. IEEE International System-on-Chip Conference (SOCC), Sep. 2015, pp. 508–513.
[22] H. Vangala, Y. Hong, and E. Viterbo, "Efficient algorithms for systematic polar encoding," IEEE Commun. Lett., vol. 20, no. 1, pp. 17–20, Jan. 2016.
[23] L. Li, W. Zhang, and Y. Hu, "On the error performance of systematic polar codes," [Online]. Available: http://arxiv.org/abs/1504.04133, 2015.
[24] J. Yeh, Real Analysis: Theory of Measure and Integration, 2nd ed. World Scientific Publishing Co., 2006.
[25] R. M. Tanner, D. Sridhara, A. Sridharan, T. E. Fuja, and D. J. Costello, "LDPC block and convolutional codes based on circulant matrices," IEEE Trans. Inf. Theory, vol. 50, no. 12, pp. 2966–2984, Dec. 2004.

Ya Meng (S'16) is currently pursuing the B.S. degree in the School of Electronics and Information Engineering, Anhui University. Her research interest is in polar codes; specifically, she studies the error propagation pattern of the SC decoding of polar codes and uses this error pattern to improve the performance of SC decoding. She was awarded Third Place in the Ninth International Students' Innovation and Entrepreneurship Competition (I CAN).

Liping Li (S'07-M'15) is an associate professor at the Key Laboratory of Intelligent Computing and Signal Processing of the Ministry of Education of China, Anhui University. She received her Ph.D. from the Department of Electrical and Computer Engineering at North Carolina State University, Raleigh, NC, USA, in 2009. Her current research interest is in channel coding, especially polar codes. Dr. Li's research topic during her Ph.D. studies was multiple-access interference analysis and synchronization for ultra-wideband communications. She then worked on an LTE indoor channel sounding and modeling project at the University of Colorado at Boulder, collaborating with Verizon. From 2010 to 2013, she worked at Maxlinear Inc. as a staff engineer in the communications group, on SoC designs for the ISDB-T and DVB-S standards, covering modules on OFDM and LDPC. In Sep. 2013, she joined Anhui University, where she has since worked on polar codes.


Chuan Zhang (S'07-M'13) is an associate professor with the National Mobile Communications Research Laboratory, School of Information Science and Engineering, Southeast University, Nanjing, China. He received the B.E. degree in microelectronics and the M.E. degree in VLSI design from Nanjing University, Nanjing, China, in 2006 and 2009, respectively, and the M.S.E.E. and Ph.D. degrees from the Department of Electrical and Computer Engineering, University of Minnesota, Twin Cities (UMN), USA, in 2012. His current research interests include low-power high-speed VLSI design for digital signal processing and digital communication, bio-chemical computation and neuromorphic engineering, and quantum communication. Dr. Zhang is a member of the Seasonal School Committee of the IEEE Signal Processing Society, and of the Circuits and Systems for Communications (CASCOM), VLSI Systems and Applications (VSA), and Digital Signal Processing (DSP) Technical Committees of the IEEE CAS Society. He was a (co-)recipient of the Best Paper Award of the IEEE Asia Pacific Conference on Circuits and Systems (APCCAS) in 2016, the Best Student Paper Award of the IEEE International Conference on Digital Signal Processing (DSP) in 2016, the Excellent Paper Award and Excellent Poster Presentation Award of the International Collaboration Symposium on Information Production and Systems (ISIPS) in 2016, the Best Student Paper Award of the IEEE International Conference on ASIC (ASICON) in 2015, a Best Paper Award Nomination of the IEEE Workshop on Signal Processing Systems (SiPS) in 2015, the Merit Student Paper Award of APCCAS in 2008, a Three-Year University-Wide Graduate School Fellowship of UMN, and a Doctoral Dissertation Fellowship of UMN.
