Polar Codes for Channels with Deletions

Kuangda Tian†, Arman Fazeli‡, Alexander Vardy‡, Rongke Liu†
† School of Electronic and Information Engineering, Beihang University, Beijing, 100191, P. R. China
‡ Department of Electrical & Computer Engineering, University of California San Diego, CA 92093, USA
Email: {kuangda tian, rongke liu}@buaa.edu.cn, {afazelic, avardy}@ucsd.edu

Abstract— A polar coding scheme for channels with deletions has been proposed recently, in which the information bits are pre-coded with a CRC that helps to detect the locations of the deletions with high precision. Successive Cancellation (SC) decoding then treats these symbols as simple erasures. Given d deleted symbols, the decoding algorithm requires checking all C(N, d) combinations of the deleted locations to find one that agrees with the CRC. This escalates the overall decoding complexity to O(N^{d+1} log N), which is not practical even when d is small. In this paper, we propose an alternative decoding method for polar codes in the presence of deletion errors. This method can be regarded as an extension of SC decoding to the deletion channel. The proposed algorithm is based on the recursive structure of polar codes, and it directly operates on the outputs of the deletion channel without any preprocessing. In other words, it is no longer required to check all C(N, d) possible locations of the deletions. Instead, each node in the proposed polar decoder propagates its uncertainty about the deletion pattern to the nodes in the next decoding layer. Eventually, with high probability, the correct deletion pattern becomes visible when the last polar bit-channel is decoded. The resulting decoding complexity is only O(d^2 N log N), which scales polynomially rather than exponentially with the number of deletions.

I. INTRODUCTION

The invention of polar codes by E. Arıkan in 2009 is considered one of the major breakthroughs in coding theory, as they are the first family of practical codes proven to achieve the symmetric capacity of discrete memoryless channels with reasonable encoding and decoding complexities [1]. The butterfly-like structure of the polar graph allows a simple yet very efficient implementation of the Successive Cancellation (SC) decoder, which was the first low-complexity polar decoder. Multiple alternative decoding algorithms have been proposed since then, but eventually it was the Successive Cancellation List (SCL) decoder that outperformed the main competitors, namely LDPC and Turbo codes, and put polar codes on the map [2]. Nowadays, polar codes have even found their way into wireless communication standards such as 5G (see [3]). It is common to pre-code the information with a Cyclic Redundancy Check (CRC) when the polar decoder is equipped with a list decoder. The list decoder can use this extra information to eliminate all codeword candidates that do not check out with the CRC, and thereby achieve a much lower error rate. Even with a very high-rate CRC, it is incredibly rare to find two or more candidates that agree with the CRC. Similarly, one may benefit from the CRC to reveal the correct received vector from the channel when the decoder is

provided with more than one channel observation. A naive application arises when a dummy symbol is inserted into the codeword or when a symbol is missing, which forces the decoder to consider multiple options and apply the decoding algorithm to all of them. Synchronization problems in a communication system may result in losing a few symbols of the received vector, or sometimes in observing a few unwanted random symbols among the received ones. These types of errors are usually referred to as insertion or deletion errors [4]. Channels corrupted by insertions or deletions have memory; hence the techniques developed for memoryless channels with additive noise cannot be applied straightforwardly [5]. While polarization theorems in general hold for some examples of channels with memory (see [6]), they do not immediately apply to the deletion channel. Indeed, even the capacity of the deletion channel is not fully known [4], [7]. E. K. Thomas et al. proposed a decoding scheme for polar codes over the binary erasure channel (BEC) with a limited number of deletion errors [8] that utilizes the pre-coded CRC to detect the correct locations of the deleted symbols. Assume the decoder realizes that d out of N symbols are deleted. As depicted in Figure 1, the decoder first selects one of the C(N, d) possible scenarios for the locations of the deleted symbols, and then proceeds with the conventional SC decoder while treating the deleted symbols as simple erasures. Upon completion of the decoding algorithm, the estimated information bits are checked against the CRC; if the test fails, the decoder starts again with a different deletion pattern. It is now clear why this scheme becomes infeasible as the number of deletions grows, since the overall decoding complexity of polar codes is multiplied by a factor of C(N, d). One can verify that the average asymptotic decoding complexity is O(N^{d+1} log N), where d denotes the number of deletions.
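To make the cost of this enumeration concrete, the following sketch (our own illustration of the scheme in [8], not the authors' code; the erasure marker None and the helper name are ours) generates every candidate received vector by guessing the d deleted positions and marking them as erasures. Each candidate would then be run through SC decoding and checked against the CRC, so the number of SC runs is C(N, d):

```python
from itertools import combinations
from math import comb

def candidate_vectors(y, N):
    """Yield all length-N vectors obtained by inserting erasures (None)
    into the received vector y at every possible set of deleted positions."""
    d = N - len(y)
    for deleted in combinations(range(N), d):
        cand, it = [], iter(y)
        for i in range(N):
            cand.append(None if i in deleted else next(it))
        yield cand

y, N = [1, 0, 1, 1, 0, 1], 8          # N = 8, so d = 2 symbols were deleted
cands = list(candidate_vectors(y, N))
print(len(cands), comb(8, 2))          # 28 candidates = C(8, 2)
```

Already for N = 1024 and d = 4, C(N, d) exceeds 10^10, which is why the scheme breaks down for more than a couple of deletions.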
Based on Reed-Solomon codes, Serge Kas Hanna and Salim El Rouayheb [9] proposed Guess & Check codes for the deletion channel. The basic idea is almost the same as in the above method: enumerate all possible deletion patterns and then perform decoding for each case. The decoding complexity is O(k^{d+2} / log k), where k is the length of the information bits. In this paper, we propose an alternative decoding method for polar codes in the presence of additional deletion errors. We also prove it to be the correct implementation of the polar

u0N 1

uˆ0N 1

Polar Encoder

CRCcheck

x0N 1

y0N  d 1

d-Deletion

SC decoder

yˆ 0N 1

Simulation results are then presented in section V. We also show numerical results that demonstrate the polarization phenomenon in the presence of deletion errors; its proof, however, is left for future work.

II. PRELIMINARIES


[Fig. 1 block diagram: polar encoder → d-deletion channel → guess a deletion pattern and insert d erasures → SC decoder → CRC check; if the check fails, a different pattern is selected, traversing all possible erasure positions.]
Fig. 1. Polar coding scheme proposed in [8] for channels with deletions. The decoding complexity is increased by a factor of C(N, d), which makes it impractical when d grows.

successive cancellation decoder. The proposed algorithm is based on the recursive structure of polar codes, and it directly operates on the outputs of the deletion channel without any preprocessing. In other words, it is no longer required to check all C(N, d) combinations of the deletion patterns. Instead, each node also propagates its uncertainty about the deletion pattern to the next layer, and the correct deletion pattern becomes visible when the last polar bit-channel is decoded. By doing so, we reduce the decoding complexity back to O(d^2 N log N). The secret behind our method lies in the definition of the polar bit-channels. As fully explained in the following sections, when estimating each node in the polar trellis, we only require a subset of the received symbols. More importantly, these symbols always form a consecutive sub-vector of the received vector. Therefore, in order to allocate symbols to these bit-channels, all we need to know is the number of deletions before and after the corresponding interval, which yields at most (d+1)(d+2)/2 different scenarios. We propose an implementation of Successive Cancellation decoding that organizes these nodes in a reusable way to prevent duplicate computation, which reduces the overall decoding complexity to O(d^2 N log N). Besides, the proposed decoding algorithm can easily be combined with the existing improvement techniques developed for conventional SC decoding, e.g., SCL with CRC, to further enhance the decoding performance. A similar approach can be taken to perform SC decoding for channels with insertion noise, or with simultaneous deletion and insertion errors. The simulation results reveal that the polarization phenomenon still occurs in the presence of these types of noise.

Paper Outline. The rest of this paper is organized as follows. In section II, we provide some preliminaries about polar codes, SC decoding, and the deletion channel.
In section III, we discuss the theory of the proposed decoding algorithm in detail, prove its correctness, and provide upper bounds on the decoding complexity. Next, we discuss the implementation of the SC decoder for the deletion channel in section IV by providing high-level pseudo-codes and the memory-management techniques that prevent duplicate calculations.

A. Polar codes

Let us quickly review the encoding structure of an (N, k) polar code, where N = 2^n denotes the code length and k denotes the number of information bits. A polar codeword is defined by x_0^{N−1} = u_0^{N−1} G_N, where x is the polar codeword, u is the length-N uncoded information vector, and G_N denotes the generator matrix of polar codes. The generator matrix is G_N = B_N F^{⊗n}, where F = [1 0; 1 1], n ∈ Z, B_N is an N × N permutation matrix, '⊗' denotes the Kronecker product, and N = 2^n.

Through channel polarization, we obtain N polarized bit-channels [1]. There are two types of bit-channels: some are almost noiseless, while the others are noisy. The almost noiseless ones form the information set A. To determine the level of noise, one can use several different algorithms, such as density evolution (DE) [10], Gaussian approximation (GA) [11], or Monte-Carlo simulation; it has been shown that the construction algorithm proposed in [12] is the most accurate one.

B. Successive Cancellation (SC) Decoding

Let ỹ_0^{N−1} be the outputs of the B-DMCs when the inputs are x_0^{N−1}, and let û_0^{N−1} be the estimates of u_0^{N−1}. According to the SC rule, the likelihood of u_i can be defined as [1]:

W̃_N^{(i)}(ỹ_0^{N−1}, u_0^{i−1} | u_i) = (1 / 2^{N−1}) Σ_{u_{i+1}^{N−1}} Π_{j=0}^{N−1} W̃(ỹ_j | x_j),   (1)

where W̃(ỹ_j | x_j) is the transition probability of the B-DMCs. We can rewrite (1) in joint-probability form:

W̃_N^{(i)}(ỹ_0^{N−1}, u_0^{i−1}, u_i) = Σ_{u_{i+1}^{N−1}} Π_{j=0}^{N−1} W̃(ỹ_j, x_j).   (2)

SC decoding uses the FFT-like structure of polar codes to calculate and propagate the likelihoods recursively over a multi-layer graph, efficiently estimating u_i for i = 0, …, N−1 one by one in ascending order. Therefore, W̃_N^{(i)}(ỹ_0^{N−1}, û_0^{i−1}, u_i) can be recursively calculated by

W̃_{2N}^{(2i)}(ỹ_0^{2N−1}, u_0^{2i−1}, u_{2i})
= Σ_{u_{2i+1}} W̃_N^{(i)}(ỹ_0^{N−1}, u_{0,e}^{2i−1} ⊕ u_{0,o}^{2i−1}, u_{2i} ⊕ u_{2i+1})
· W̃_N^{(i)}(ỹ_N^{2N−1}, u_{0,o}^{2i−1}, u_{2i+1}),   (3)

W̃_{2N}^{(2i+1)}(ỹ_0^{2N−1}, u_0^{2i}, u_{2i+1})
= W̃_N^{(i)}(ỹ_0^{N−1}, u_{0,e}^{2i−1} ⊕ u_{0,o}^{2i−1}, u_{2i} ⊕ u_{2i+1})
· W̃_N^{(i)}(ỹ_N^{2N−1}, u_{0,o}^{2i−1}, u_{2i+1}).   (4)

If i ∈ A^c, we directly set û_i = u_i (the frozen value); otherwise,

û_i = 0 if W̃_N^{(i)}(ỹ_0^{N−1}, û_0^{i−1}, u_i = 0) / W̃_N^{(i)}(ỹ_0^{N−1}, û_0^{i−1}, u_i = 1) > 1, and û_i = 1 otherwise.   (5)

For more details about SC decoding, please refer to [1]. SC decoding can be efficiently represented by a trellis structure. For a polar code of length N, there are N(1 + log N) nodes in the graph. The leftmost column of nodes corresponds to calls of the recursive function with a length-N input vector, the second column to calls with length N/2, and so on [1]. The decoding trellis of a polar code with N = 8 is shown in Figure 2. In the dashed rectangle, the nodes in the upper level are denoted A_1, A_2, and the nodes in the lower level are denoted B_1, B_2. For node A_1, the calculation of the likelihood involves ỹ_4^5. For a memoryless channel, ỹ_4^5 corresponds to x_4^5; we therefore say that the coded bits corresponding to node A_1 are x_4^5. In this way, the coded bits corresponding to A_2, B_1, B_2 are x_4^5, x_4, and x_5, respectively.
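To make the recursions (3)–(5) concrete, the following self-contained sketch (our own illustration, not the authors' code) implements the recursive encoder for the non-permuted kernel F^{⊗n} — so bit indices are bit-reversed relative to the paper's convention — together with a probability-domain SC decoder over a memoryless channel. f_node and g_node correspond to the even recursion (3) and the odd recursion (4):

```python
import random

def polar_encode(u):
    """x = u F^{⊗n}: combine two half-length codewords as (a ⊕ b, b)."""
    if len(u) == 1:
        return u[:]
    h = len(u) // 2
    a, b = polar_encode(u[:h]), polar_encode(u[h:])
    return [ai ^ bi for ai, bi in zip(a, b)] + b

def f_node(pa, pb):
    # Eq. (3): likelihood pair of u1 ⊕ u2, summing over the second bit
    return [(a0 * b0 + a1 * b1, a0 * b1 + a1 * b0)
            for (a0, a1), (b0, b1) in zip(pa, pb)]

def g_node(pa, pb, x1):
    # Eq. (4): likelihood pair of u2, given the hard value of the first branch
    return [((a1 if u else a0) * b0, (a0 if u else a1) * b1)
            for (a0, a1), (b0, b1), u in zip(pa, pb, x1)]

def sc_decode(probs, frozen):
    """probs: per-symbol (P[x=0], P[x=1]); frozen: None marks information bits.
    Returns (decoded u, re-encoded codeword of this subcode)."""
    if len(probs) == 1:
        p0, p1 = probs[0]
        u = frozen[0] if frozen[0] is not None else (0 if p0 >= p1 else 1)
        return [u], [u]
    h = len(probs) // 2
    pa, pb = probs[:h], probs[h:]
    u1, x1 = sc_decode(f_node(pa, pb), frozen[:h])     # even recursion (3)
    u2, x2 = sc_decode(g_node(pa, pb, x1), frozen[h:]) # odd recursion (4)
    return u1 + u2, [a ^ b for a, b in zip(x1, x2)] + x2

u = [random.randint(0, 1) for _ in range(8)]
x = polar_encode(u)
probs = [(1.0, 0.0) if xi == 0 else (0.0, 1.0) for xi in x]  # noiseless channel
uhat, _ = sc_decode(probs, [None] * 8)
print(uhat == u)  # True over a noiseless channel
```

Returning the re-encoded codeword x1 of the left subcode is exactly what makes the g-node update (4) possible without re-walking the trellis.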


The positions of the deletions are unknown to the receiver [4]. Some scholars also define a model in which each coded bit is deleted with probability P_d ∈ (0, 1). The capacity of the deletion channel is unknown. An obvious upper bound is 1 − P_d: if we knew the positions of the deleted symbols, the channel model would become the binary erasure channel. A lower bound is [7]:

C_del > 1 − H(1 − P_d),   (6)

for P_d < 0.5, where H(·) is the binary entropy function. For P_d > 0.5, the lower bound is [4]:

C_del > (1 − P_d)/9.   (7)

Since the receiver knows the number of deleted bits (from the length of the received vector), the deletion channel with deletion probability P_d can also be viewed as a channel that deletes a constant number of bits. In this paper, we focus on the d-deletion channel, but the results can be easily extended to the sticky channel.
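A d-deletion channel is straightforward to simulate; the sketch below (our own illustration; the function name is ours) drops d symbols at uniformly random positions, so that the receiver sees only the surviving N − d symbols and not the positions:

```python
import random

def deletion_channel(x, d, rng=random):
    """Delete d symbols of x at uniformly random positions."""
    deleted = set(rng.sample(range(len(x)), d))
    return [xi for i, xi in enumerate(x) if i not in deleted]

x = [0, 1, 1, 0, 1, 0, 0, 1]
y = deletion_channel(x, 2)
print(len(y))  # 6: the receiver learns d = len(x) - len(y) = 2
```

The output is always a subsequence of the input, which is exactly what distinguishes deletions from erasures.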

III. SUCCESSIVE CANCELLATION DECODING OF POLAR CODES FOR DELETION CHANNEL (THEORY)


For the d-deletion channel, not all N received symbols ỹ_0^{N−1} are available. The conventional approach was to fix one of the C(N, d) possible mappings first, and then begin SC decoding. Instead, we propose conveying the uncertainty of the mapping to the upper layers and using the frozen bits to determine the correct mapping on the go. By doing so, we no longer need to go through all mappings, and hence reduce the overall computational complexity dramatically. Following conventional SC decoding, if we decode u_i from y_0^{N−d−1} directly, we have:

Fig. 2. Decoding trellis of a polar code with length N = 8 as it appears in [1]. The dashed rectangle focuses on the interaction between two layers.

W_N^{(i)}(y_0^{N−d−1}, u_0^{i−1}, u_i)

C. Deletion channel


Fig. 3. Channel model in the presence of d deletions. The input and output are x_0^{N−1} and y_0^{N−d−1} ∈ {0,1}^{N−d}, respectively; the receiver observes only N − d coded symbols.

The d-deletion channel is shown in Figure 3. For the d-deletion channel, the output is y_0^{N−d−1} when the input is x_0^{N−1}: d of the coded bits are deleted by the channel.

= W_N^{(i)}(y_0^{N−d−1}, u_0^{i−1}, u_i | D_{0:N−1}^d)   (8)
= Σ_{u_{i+1}^{N−1}} P(y_0^{N−d−1}, u_0^{N−1} | D_{0:N−1}^d),   (9)

which can also be calculated recursively. The specific method is stated in Theorem 1.

Theorem 1. For any d-deletion channel, let D_{a:b}^d be the event that d coded bits are deleted among x_a^b.

W_N^{(i)}(y_0^{N−d−1}, u_0^{i−1}, u_i) can be calculated recursively by

d_1 + d_2 ≤ d_1 + d_2 + d_3 = d.   (13)

W_{2N}^{(2i)}(y_0^{2N−d−1}, u_0^{2i−1}, u_{2i} | D_{0:2N−1}^d)
= Σ_{t ∈ {0,…,d}} P(D_{0:N−1}^t, D_{N:2N−1}^{d−t}) / P(D_{0:2N−1}^d)
· { Σ_{u_{2i+1}} W_N^{(i)}(y_{N−t}^{2N−d−1}, u_{0,o}^{2i−1}, u_{2i+1} | D_{N:2N−1}^{d−t})
· W_N^{(i)}(y_0^{N−t−1}, u_{0,o}^{2i−1} ⊕ u_{0,e}^{2i−1}, u_{2i} ⊕ u_{2i+1} | D_{0:N−1}^t) },   (10)

All possible scenarios are listed in Table I. To label each of these scenarios, we first fix d_2 and then traverse all possible values of d_1; we then change d_2 and traverse all possible values of d_1 again. For the tth scenario, the label t is given by (14).

W_{2N}^{(2i+1)}(y_0^{2N−d−1}, u_0^{2i}, u_{2i+1} | D_{0:2N−1}^d)
= Σ_{t ∈ {0,…,d}} P(D_{0:N−1}^t, D_{N:2N−1}^{d−t}) / P(D_{0:2N−1}^d)
· { W_N^{(i)}(y_{N−t}^{2N−d−1}, u_{0,o}^{2i−1}, u_{2i+1} | D_{N:2N−1}^{d−t})
· W_N^{(i)}(y_0^{N−t−1}, u_{0,o}^{2i−1} ⊕ u_{0,e}^{2i−1}, u_{2i} ⊕ u_{2i+1} | D_{0:N−1}^t) },   (11)

where

P(D_{0:N−1}^t, D_{N:2N−1}^{d−t}) / P(D_{0:2N−1}^d) = P(D_{0:N−1}^t, D_{N:2N−1}^{d−t} | D_{0:2N−1}^d)
= C(N, t) · C(N, d−t) / C(2N, d)
= [N! / ((N−t)! t!)] · [N! / ((N−d+t)! (d−t)!)] · [d! (2N−d)! / (2N)!],   (12)

which can be considered as a scaling parameter. The recursion terminates when N = 1. The proof of Theorem 1 is given in the Appendix.

The basic recursive structure of the proposed decoding algorithm is almost the same as that of conventional SC decoding, which means that the low-complexity feature is preserved and the proposed algorithm can also be represented by a trellis structure. Given a node in the trellis of conventional SC decoding, assume that the coded bits corresponding to this node are x_a^b, where b − a ≥ d. When d coded bits are deleted, there are at most (d+1)(d+2)/2 mapping rules between x_a^b and the received symbols. We call each mapping rule a scenario.

For the tth scenario, we have

t = (2d + 3 − d_2) d_2 / 2 + d_1 + 1,   (14)

which will be denoted by ⟨d_1, d_2⟩_2 in the following pseudo-codes. The number of scenarios is (d+1)(d+2)/2. However, when b − a + 1 < d or a − 1 < d, the number of possible scenarios is much smaller than (d+1)(d+2)/2. Based on Eq. (13), we can get the following stronger constraint for d_2:

max(0, d − (N − (b − a + 1))) ≤ d_2 ≤ min(d, b − a + 1).   (15)

Therefore, the worst-case decoding complexity is O(d^2 N log N). The exact number of scenarios is the number of solutions of (16).
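The scaling parameters of (12) are probabilities of a partition of the event D_{0:2N−1}^d, so for any fixed d they must sum to one over t (a Vandermonde identity). A quick numerical check (our own sketch, using Python's math.comb):

```python
from math import comb

def scaling_para(t, d, N):
    # Eq. (12): P(D_{0:N-1}^t, D_{N:2N-1}^{d-t} | D_{0:2N-1}^d)
    return comb(N, t) * comb(N, d - t) / comb(2 * N, d)

N, d = 16, 4
total = sum(scaling_para(t, d, N) for t in range(d + 1))
print(abs(total - 1.0) < 1e-12)  # True: the d+1 split events partition D^d
```

This is the quantity that Algorithm 6 precomputes, with N replaced by the half-length of the node being combined.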


Fig. 4. The length-N codeword is partitioned into three segments: x_0^{a−1}, x_a^b, and x_{b+1}^{N−1}. All we need in order to determine the sub-vector of y_0^{N−d−1} at the receiver that belongs to x_a^b is the number of deleted symbols in each segment, i.e., d_1, d_2, and d_3.

Firstly, we split x_0^{N−1} into three parts. In Fig. 4, we have x_0^{N−1} = (x_0^{a−1}, x_a^b, x_{b+1}^{N−1}). Let d_1, d_2, d_3 be the number of deleted symbols among x_0^{a−1}, x_a^b, x_{b+1}^{N−1}, respectively. We have:

d_1 + d_2 + d_3 = d,   (16)
s.t.
0 ≤ d_1 ≤ a,   (17)
0 ≤ d_2 ≤ b − a + 1,   (18)
0 ≤ d_3 ≤ N − b − 1.   (19)

Example 1. The basic structure of the proposed decoding algorithm is shown in Figure 5. The length of the polar code is 8 and one coded bit is deleted. In this case, (d+1)(d+2)/2 = 3, which means that each node in the decoding trellis of the

TABLE I. ALL POSSIBLE SCENARIOS OF A NODE IN THE DECODING TRELLIS. THE TABLE ALSO ESTABLISHES THE LABELING OF THESE SCENARIOS WITH RESPECT TO ⟨d_1, d_2⟩_2.

Scenario       | d_2 | d_1 | Received symbols corresponding to x_a^b
1              | 0   | 0   | y_a^b
2              | 0   | 1   | y_{a−1}^{b−1}
⋮              | ⋮   | ⋮   | ⋮
d+1            | 0   | d   | y_{a−d}^{b−d}
d+2            | 1   | 0   | y_a^{b−1}
d+3            | 1   | 1   | y_{a−1}^{b−2}
⋮              | ⋮   | ⋮   | ⋮
2d+1           | 1   | d−1 | y_{a−d+1}^{b−d}
2d+2           | 2   | 0   | y_a^{b−2}
⋮              | ⋮   | ⋮   | ⋮
d(d+3)/2 − 1   | d−1 | 0   | y_a^{b−d+1}
d(d+3)/2       | d−1 | 1   | y_{a−1}^{b−d}
(d+1)(d+2)/2   | d   | 0   | y_a^{b−d}


Fig. 5. The substitute for the dashed rectangle of Figure 2 in the presence of d = 1 deletion. This small cut of the trellis of the proposed SC decoding algorithm illustrates the re-usability of the calculations: each node serves multiple nodes at the next layer. For example, the computation at node B_{2,2} is reused by both A_{1,2} and A_{1,3}.

conventional SC decoding algorithm has at most 3 scenarios. In Figure 5, A_{1,1}, A_{1,2}, A_{1,3} are the three scenarios of A_1 in Figure 2.

Algorithm 1 Successive cancellation for d-deletion
Require: received vector y_0^{N−d−1}
Ensure: decoded û_0^{N−1}
 1: Initialize matrix P_0[N][(d+1)(d+2)/2][2] with all-zero elements
 2: for β = 0 → N − 1 do   // Initialization
 3:   for d_2 = 0 → 1 do
 4:     for d_1 = 0 → d − d_2 do
 5:       if d_2 = 0 then
 6:         if d_1 ≤ β then
 7:           P_0[⟨0, β⟩][⟨d_1, d_2⟩_2][0] = W(y_{β−d_1} | 0)
 8:           P_0[⟨0, β⟩][⟨d_1, d_2⟩_2][1] = W(y_{β−d_1} | 1)
 9:         end if
10:       else
11:         P_0[⟨0, β⟩][⟨d_1, d_2⟩_2][0] = 1
12:         P_0[⟨0, β⟩][⟨d_1, d_2⟩_2][1] = 1
13:       end if
14:     end for
15:   end for
16: end for
17: for φ = 0 → N − 1 do   // Main loop
18:   recursivelyCalcP(n, φ)
19:   if û_φ is frozen then
20:     set B_n[⟨φ, 0⟩] to the frozen value
21:   else
22:     if P_n[⟨φ, 0⟩][(d+1)(d+2)/2][0] > P_n[⟨φ, 0⟩][(d+1)(d+2)/2][1] then
23:       set B_n[⟨φ, 0⟩] ← 0
24:     else
25:       set B_n[⟨φ, 0⟩] ← 1
26:     end if
27:   end if
28:   if φ mod 2 = 1 then
29:     recursivelyUpdateB(n, φ)
30:   end if
31: end for
32: Output û_0^{N−1} = (B_0[⟨0, β⟩])_{β=0}^{N−1}

Algorithm 2 recursivelyUpdateB(λ, φ)
1: Require: φ is odd
2: set ψ ← ⌊φ/2⌋
3: for β = 0, 1, …, 2^{n−λ} − 1 do
4:   B_{λ−1}[⟨ψ, 2β⟩] ← B_λ[⟨φ − 1, β⟩] ⊕ B_λ[⟨φ, β⟩]
5:   B_{λ−1}[⟨ψ, 2β + 1⟩] ← B_λ[⟨φ, β⟩]
6: end for
7: if ψ mod 2 = 1 then
8:   recursivelyUpdateB(λ − 1, ψ)
9: end if

Algorithm 3 recursivelyCalcP(λ, φ)
 1: if λ = 0 then   // stopping condition
 2:   return
 3: end if
 4: set ψ ← ⌊φ/2⌋
 5: if φ mod 2 = 0 then
 6:   recursivelyCalcP(λ − 1, ψ)
 7: end if
 8: for β = 0, 1, …, 2^{n−λ} − 1 do
 9:   if φ mod 2 = 0 then
10:     UpdatePfnode(P_λ[⟨φ, β⟩], P_{λ−1}[⟨ψ, 2β⟩], P_{λ−1}[⟨ψ, 2β+1⟩], d, 2^{λ−1})
11:   else
12:     set u′ ← B_λ[⟨φ − 1, β⟩]
13:     UpdatePgnode(P_λ[⟨φ, β⟩], P_{λ−1}[⟨ψ, 2β⟩], P_{λ−1}[⟨ψ, 2β+1⟩], d, 2^{λ−1}, u′)
14:   end if
15: end for

Algorithm 4 UpdatePfnode(nup, nlo1, nlo2, num_d, l)
 1: for d_2 = max(0, num_d − (N − 2l)), …, min(num_d, 2l) do   // d_2: number of deletions within this part
 2:   for d_1 = 0, 1, …, num_d − d_2 do   // d_1: number of deletions before this part
 3:     t = ⟨d_1, d_2⟩_2
 4:     for k = max(d_2 − l, 0), …, min(d_2, l) do   // k: number of deletions within the upper half
 5:       t_1 = ⟨d_1, k⟩_2
 6:       t_2 = ⟨d_1 + k, d_2 − k⟩_2
 7:       α = ScalingPara(k, d_2, l)
 8:       nup[t][0] = nup[t][0] + (nlo1[t_1][0] · nlo2[t_2][0] + nlo1[t_1][1] · nlo2[t_2][1]) · α
 9:       nup[t][1] = nup[t][1] + (nlo1[t_1][0] · nlo2[t_2][1] + nlo1[t_1][1] · nlo2[t_2][0]) · α
10:     end for
11:   end for
12: end for

Algorithm 6 ScalingPara(t, d, l)
1: x = C(l, t) · C(l, d−t) / C(2l, d)   // cf. Eq. (12), with N replaced by the half-length l
2: return x

IV. SUCCESSIVE CANCELLATION DECODING OF POLAR CODES FOR DELETION CHANNEL (IMPLEMENTATION)

Algorithms 1–5 provide a high-level implementation of SC decoding for channels with d deletions. For simplicity, we follow the notation and descriptions used in [2]. Algorithm 1 begins by initializing the probabilities associated with the channel observations, i.e., the lowest layer of the polar trellis. We store these probabilities in a 3-dimensional array P_n[N][(d+1)(d+2)/2][2], to be consistent with [2]. The first dimension corresponds to the location of the node among x_0^{N−1}. We sometimes denote the node location by ⟨φ, β⟩, which determines the phase and branch number of the node and can be transformed into a number between 0 and N − 1; we again refer the reader to [2] for the detailed definitions of phase and branch number. The second dimension corresponds to the deletion scenario affecting this node (see Table I). The last dimension corresponds to the value in {0, 1} whose probability is being estimated.

Let us elaborate on steps 2–16, the initialization of the channel-observation layer of the polar trellis. For each index β ∈ {0, …, N − 1}, we have multiple deletion scenarios. Let d_1, d_2, and d_3 denote the number of deletions before β, at β, and after β, respectively. Clearly, if d_2 = 1, then we are allocating a deletion to location β, and hence P_n[β][j][0] = P_n[β][j][1] = 1 for j = d + 1, …, 2d (see steps 11–12). Furthermore, if d_2 = 0, we should map y_{β−d_1} to x_β and initialize the probabilities for x_β with respect to the (β − d_1)th observed symbol (see steps 7–8). In the main loop (step 17 onward), we decode u_φ for φ = 0, …, N − 1. The basic recursive structure is the same as in conventional SC decoding, except for the updating rule for the probability matrix P, which is shown in Algorithm 3.


Algorithm 5 UpdatePgnode(nup, nlo1, nlo2, num_d, l, u′)
 1: for d_2 = max(0, num_d − (N − 2l)), …, min(num_d, 2l) do
 2:   for d_1 = 0, 1, …, num_d − d_2 do
 3:     t = ⟨d_1, d_2⟩_2
 4:     for k = max(d_2 − l, 0), …, min(d_2, l) do
 5:       t_1 = ⟨d_1, k⟩_2
 6:       t_2 = ⟨d_1 + k, d_2 − k⟩_2
 7:       α = ScalingPara(k, d_2, l)
 8:       nup[t][0] = nup[t][0] + nlo1[t_1][u′] · nlo2[t_2][0] · α
 9:       nup[t][1] = nup[t][1] + nlo1[t_1][u′ ⊕ 1] · nlo2[t_2][1] · α
10:     end for
11:   end for
12: end for


Fig. 6. Splitting of the calculation of upper-layer nodes. Note that the channel-observation sub-vectors of the lower nodes, i.e., nlo1 and nlo2, are adjacent, and merge to form the channel-observation sub-vector of the node in the next layer, i.e., nup.

Matrix B, which stores the hard, finalized values of the nodes in the trellis, is updated in a similar fashion to the conventional SC decoding method; hence, Algorithm 2 is an almost-duplicate of Algorithm 4 in [2]. To be consistent with prior work on polar codes, we refer to the two different recursive formulas as the f node and the g node. The calculation of the probabilities in f nodes differs from that in g nodes, where the latter depends on the hard value produced at the f node. These recursive update rules are presented in Algorithms 4 and 5.

Let nup be the node at the upper layer whose probability is calculated from two nodes at the lower layer, denoted nlo1 and nlo2. Assume that the coded bits corresponding to nup are x_a^b. Then the coded bits corresponding to nlo1 and nlo2 are x_a^{(a+b−1)/2} and x_{(a+b+1)/2}^b, respectively. Define k as the number of deletions within x_a^{(a+b−1)/2}; the number of deletions within x_{(a+b+1)/2}^b is then d_2 − k. The number of deletions before x_a^{(a+b−1)/2} is still d_1, and the number of deletions before x_{(a+b+1)/2}^b is d_1 + k. We then calculate all possible scenarios of nup. The splitting process is shown in Figure 6. When combining different deletion scenarios, whether at an f node or a g node, one should note that different scenarios occur with different probabilities. Therefore, we should scale each scenario according to its likelihood prior to adding them up. Algorithm 6 calculates this scaling parameter. Note that, for fixed d, the scaling parameter is independent of y_0^{N−d−1}; we can therefore precompute all scaling parameters involved in the decoding process and store them in memory, further reducing the real-time decoding complexity.

V. SIMULATION RESULTS

In this section, we present simulation results for polar codes under the proposed SC decoder over the deletion channel. Here, we consider deletion errors as the only source of noise. However, as explained earlier, a similar procedure


Fig. 8. Error probability of individual bit-channels, sorted in ascending order, for the deletion channel with fixed deletion probability. Some signs of polarization can be observed: as the code length becomes larger, more channels polarize to the noiseless state.


can be used when deletion errors are combined with additive noise, erasures, or symmetric bit flips. Note that since the deletion channel is not a B-DMC, conventional construction algorithms such as density evolution, Gaussian approximation, or the Tal-Vardy method of [12] cannot be used directly. Here we construct the polar codes with small modifications of the Monte-Carlo method, which in general requires a larger computational budget to achieve the same precision as the modern methods. Also note that since the channel is no longer memoryless, we cannot set all frozen bits to 0. Frozen bits in our simulations are set at random and are known to both encoder and decoder. Figure 7 shows the relation between the rate R and the block error rate (BLER) for polar codes of different code lengths, N = 256, 512, 1024, and 2048. Each coded bit is deleted with probability 0.002, which makes the length of the received vector variable. As expected, polar codes perform better at larger code lengths. Figure 8 illustrates the noise level of the polar bit-channels, sorted in ascending order. It is observed that as the code length increases, more bit-channels become noiseless, which is a sign of the polarization phenomenon. However, proving polarization theorems for this case is far from trivial, since the flexibility in d also increases with the code length. Figures 9 and 10 correspond to a similar simulation setup with the number of deletions fixed to d = 4. It is worth noticing that the polarization phenomenon is more pronounced in this setup, which makes us conjecture its correctness when d is fixed.


Fig. 9. Performance of polar codes under the proposed SC decoder for a channel with a fixed number of deletions. Polar codes with larger lengths perform strictly better at all rates.


Fig. 7. Performance of polar codes under the proposed SC decoder for a channel with fixed deletion probability. Polar codes with larger lengths perform strictly better, particularly at lower rates.


APPENDIX

Proof:

W_{2N}^{(2i)}(y_0^{2N−d−1}, u_0^{2i−1}, u_{2i} | D_{0:2N−1}^d)
= W_{2N}^{(2i)}(y_0^{2N−d−1}, u_0^{2i−1}, u_{2i}, D_{0:2N−1}^d) / P(D_{0:2N−1}^d)   (20)
= Σ_{u_{2i+1}^{2N−1}} P(y_0^{2N−d−1}, u_0^{2N−1}, D_{0:2N−1}^d) / P(D_{0:2N−1}^d)   (21)
= [1 / P(D_{0:2N−1}^d)] · Σ_{t ∈ {0,…,d}} Σ_{u_{2i+1}^{2N−1}} P(y_0^{2N−d−1}, u_0^{2N−1}, D_{0:N−1}^t, D_{N:2N−1}^{d−t})   (22)
= Σ_{t ∈ {0,…,d}} P(D_{0:N−1}^t, D_{N:2N−1}^{d−t}) / P(D_{0:2N−1}^d) · Σ_{u_{2i+1}^{2N−1}} P(y_0^{2N−d−1}, u_0^{2N−1} | D_{0:N−1}^t, D_{N:2N−1}^{d−t})   (23)
= Σ_{t ∈ {0,…,d}} P(D_{0:N−1}^t, D_{N:2N−1}^{d−t}) / P(D_{0:2N−1}^d) · { Σ_{u_{2i+1}^{2N−1}} P(y_0^{N−t−1}, u_{0,e}^{2N−1} ⊕ u_{0,o}^{2N−1} | D_{0:N−1}^t) · P(y_{N−t}^{2N−d−1}, u_{0,o}^{2N−1} | D_{N:2N−1}^{d−t}) }   (24)
= Σ_{t ∈ {0,…,d}} P(D_{0:N−1}^t, D_{N:2N−1}^{d−t}) / P(D_{0:2N−1}^d) · { Σ_{u_{2i+1}} W_N^{(i)}(y_{N−t}^{2N−d−1}, u_{0,o}^{2i−1}, u_{2i+1} | D_{N:2N−1}^{d−t}) · W_N^{(i)}(y_0^{N−t−1}, u_{0,e}^{2i−1} ⊕ u_{0,o}^{2i−1}, u_{2i} ⊕ u_{2i+1} | D_{0:N−1}^t) }.   (25)

The last equality holds because of (9). The proof of (11) is omitted, since (11) can be proven by the same method.

ACKNOWLEDGMENT
This work was supported by the Chinese Scholarship Council under grant 201606020023, during the first author's visit to the University of California, San Diego, and by the National Natural Science Foundation of China under Grant 91438116.

REFERENCES
[1] E. Arıkan, "Channel polarization: A method for constructing capacity-achieving codes for symmetric binary-input memoryless channels," IEEE Trans. Inf. Theory, vol. 55, no. 7, pp. 3051–3073, Jul. 2009.
[2] I. Tal and A. Vardy, "List decoding of polar codes," IEEE Trans. Inf. Theory, vol. 61, no. 5, pp. 2213–2226, May 2015.
[3] "RAN1 meeting #87," http://www.3gpp.org/DynaReport/TDocExMtg--R1-87--31665.htm, accessed: 2017-01-09.
[4] M. Mitzenmacher, "A survey of results for deletion channels and related synchronization channels," Probability Surveys, vol. 6, pp. 1–33, 2009.
[5] H. Mercier, V. K. Bhargava, and V. Tarokh, "A survey of error-correcting codes for channels with symbol synchronization errors," IEEE Communications Surveys & Tutorials, vol. 12, no. 1, 2010.
[6] E. Şaşoğlu, "Polar coding theorems for discrete systems," Ph.D. dissertation, École Polytechnique Fédérale de Lausanne, 2011.
[7] S. N. Diggavi and M. Grossglauser, "On transmission over deletion channels," in Proc. Annual Allerton Conference on Communication, Control and Computing, 2001, pp. 573–582.

[8] E. K. Thomas, V. Y. F. Tan, A. Vardy, and M. Motani, "Polar coding for the binary erasure channel with deletions," IEEE Communications Letters, vol. 21, no. 4, pp. 710–713, 2017.
[9] S. Kas Hanna and S. El Rouayheb, "Guess & check codes for deletions, insertions, and synchronization," arXiv preprint arXiv:1705.09569, 2017.
[10] R. Mori and T. Tanaka, "Performance of polar codes with the construction using density evolution," IEEE Commun. Lett., vol. 13, no. 7, pp. 519–521, Jul. 2009.
[11] P. Trifonov, "Efficient design and decoding of polar codes," IEEE Trans. Commun., vol. 60, no. 11, pp. 3221–3227, Nov. 2012.
[12] I. Tal and A. Vardy, "How to construct polar codes," IEEE Trans. Inf. Theory, vol. 59, no. 10, pp. 6562–6582, Oct. 2013.