
Vol. 24, No. 20 | 03 Oct 2016 | OPTICS EXPRESS 23531

Trellis-based feed-forward carrier recovery for coherent optical systems

MAHDI ZAMANI, HOSSEIN NAJAFI,* DEMIN YAO, JEEBAK MITRA, XUEFENG TANG, CHUANDONG LI, AND ZHUHONG ZHANG

Huawei Technologies, Canada Research Center, Ottawa, ON K2K 3J1, Canada

* [email protected]

Abstract: An efficient trellis-based phase noise mitigation algorithm is proposed to substantially improve the performance of coherent transmission systems, especially with high-order modulation formats. The proposed method targets coherent optical systems whose performance is limited by various sources of phase noise, including laser linewidth, fiber non-linearity, and phase noise induced by the phase-locked loop. Considering the hardware limitations of ultra-high data rate processing in optical systems, a hardware-efficient parallelized and pipelined architecture is used. Experimental results for 200 Gb/s DP-16QAM co-propagated with 10-G channels demonstrate significant performance improvement over existing methods.

© 2016 Optical Society of America

OCIS codes: (060.1660) Coherent communications, (060.2330) Fiber optics communications.

References and links

1. D. Marsella, M. Secondini, and E. Forestieri, “Maximum likelihood sequence detection for mitigating nonlinear effects,” J. Lightwave Technol. 23(5), 908–916 (2014).
2. M. Magarini, A. Spalvieri, F. Vacondio, M. Bertolini, M. Pepe, and G. Gavioli, “Empirical modeling and simulation of phase noise in long-haul coherent optical transmission systems,” Opt. Express 19(23), 22455–22461 (2011).
3. Y. Cai, D. G. Foursa, C. R. Davidson, J.-X. Cai, O. Sinkin, M. Nissov, and A. Pilipetskii, “Experimental demonstration of coherent MAP detection for nonlinearity mitigation in long-haul transmissions,” in OFC/NFOEC (2010), paper OTuE1.
4. X. Zhou, J. Yu, M.-F. Huang, Y. Shao, T. Wang, L. Nelson, P. Magill, M. Birk, P. I. Borel, D. W. Peckham, R. Lingle, and B. Zhu, “64-Tb/s, 8 b/s/Hz, PDM-36QAM Transmission Over 320 km Using Both Pre- and Post-Transmission Digital Signal Processing,” J. Lightwave Technol. 31(7), 999–1005 (2013).
5. A. Spalvieri and M. Magarini, “Wiener’s analysis of the discrete-time phase-locked loop with loop delay,” IEEE Trans. Circuits Sys. 55(6), 596–600 (2008).
6. L. Barletta, M. Magarini, and A. Spalvieri, “Bridging the gap between Kalman filter and Wiener filter in carrier phase tracking,” IEEE Photonics Technol. Lett. 25(11), 1035–1038 (2013).
7. L. Pakala and B. Schmauss, “Extended Kalman filtering for joint mitigation of phase and amplitude noise in coherent QAM systems,” Opt. Express 24(6), 6391–6401 (2016).
8. E. Ip and J. M. Kahn, “Feedforward carrier recovery for coherent optical communications,” J. Lightwave Technol. 25(9), 2675–2692 (2007).
9. A. J. Viterbi and A. Viterbi, “Non-linear estimation of PSK-modulated carrier phase with application to burst digital transmission,” IEEE Trans. Inform. Theory IT-29, 543–551 (1983).
10. X. Zhou, “An improved feed-forward carrier recovery algorithm for coherent receivers with M-QAM modulation format,” IEEE Photonics Technol. Lett. 22(14), 2675–2692 (2010).
11. T. Pfau, S. Hoffmann, and R. Noe, “Hardware-efficient coherent digital receiver concept with feedforward carrier recovery for M-QAM constellations,” J. Lightwave Technol. 27(8), 989–999 (2009).
12. X. Zhou and Y. Sun, “Low-complexity, blind phase recovery for coherent receivers using QAM modulation,” in OFC/NFOEC (2011), paper OMJ3.
13. J. H. Ke, K. P. Zhong, Y. Gao, J. C. Cartledge, A. S. Karar, and M. A. Rezania, “Linewidth-tolerant and low-complexity two-stage carrier phase estimation for dual-polarization 16-QAM coherent optical fiber communications,” J. Lightwave Technol. 30(24), 3987–3992 (2012).
14. K. P. Zhong, J. H. Ke, Y. Gao, and J. C. Cartledge, “Linewidth-tolerant and low-complexity two-stage carrier phase estimation based on modified QPSK partitioning for dual-polarization 16-QAM systems,” J. Lightwave Technol. 31(1), 50–57 (2013).
15. K. Piyawanno, M. Kuschnerov, B. Spinnler, and B. Lankl, “Nonlinearity mitigation with carrier phase estimation for coherent receivers with higher-order modulation formats,” in Topic Meeting in Lasers and Electro-Optics Society (2009), pp. 426–427.
16. W-R. Peng, Z. Li, F. Zhu, Y. Bai, and T. Tsuritani, “Effectiveness of digital fiber nonlinearity mitigations,” in OFC/NFOEC (2014), SW2C.1.

#270932 Journal © 2016

http://dx.doi.org/10.1364/OE.24.023531 Received 21 Jul 2016; revised 22 Sep 2016; accepted 22 Sep 2016; published 30 Sep 2016


17. Z. Tao, W. Yan, S. Oda, T. Hoshida, and J. C. Rasmussen, “A simplified model for nonlinear cross-phase modulation in hybrid optical coherent system,” Opt. Express 17(16), 13860–13868 (2009).
18. X. Liang, S. Kumar, J. Shao, M. Malekiha, and D. V. Plant, “Digital compensation of cross-phase modulation distortions using perturbation technique for dispersion-managed fiber-optic systems,” Opt. Express 22(17), 20634–20645 (2014).
19. S. Lin and D. J. Costello, Error Control Coding (Pearson Prentice Hall, 2004).
20. N. Stojanovic, Y. Huang, F. N. Hauske, Y.-Y. Fang, M. Chen, C. Xie, and Q. Xiong, “MLSE-based nonlinearity mitigation for WDM 112 Gbit/s PDM-QPSK Transmissions with Digital Coherent Receiver,” in OFC/NFOEC (2011), paper OWW6.
21. G. Bosco, I. N. Cano, P. Poggiolini, L. Li, and M. Chen, “MLSE-based DQPSK transmission in 43 Gb/s DWDM long-haul dispersion-managed optical systems,” J. Lightwave Technol. 28(10), 1573–1581 (2010).
22. L. Barletta, F. Bergamelli, M. Magarini, N. Carapellese, and A. Spalvieri, “Pilot-aided trellis-based demodulation,” IEEE Photonics Technol. Lett. 25(13), 1234–1237 (2013).
23. T. Fehenberger, M. P. Yankov, L. Barletta, and N. Hanik, “Compensation of XPM interference by blind tracking of the nonlinear phase in WDM systems with QAM input,” in Proceedings of European Conference on Optical Communications (ECOC) (2015), pp. 1–3.
24. M. P. Yankov, T. Fehenberger, L. Barletta, and N. Hanik, “Low-complexity tracking of laser and nonlinear phase noise in WDM optical fiber systems,” J. Lightwave Technol. 33(23), 4975–4984 (2015).
25. A. J. Viterbi, “Error bounds for convolution codes and an asymptotically optimum decoding algorithm,” IEEE Trans. Inform. Theory IT-13, 260–269 (1967).

1. Introduction

Growing demand for higher spectral efficiency in coherent optical transceivers has motivated the industry to adopt high-order quadrature amplitude modulation (QAM) formats such as 16QAM and 64QAM. Consequently, the impact of phase noise is exacerbated, especially in dispersion-managed links with hybrid transmission, where fiber nonlinearity appears primarily as phase noise [1–4]. Feedback carrier recovery (FBCR) algorithms comprising a decision-directed digital phase-locked loop (DD-PLL) are common in coherent optical receivers to correct the symbol phase error. FBCRs may be implemented based on Wiener filtering [5, 6], Kalman filtering [7], or other filtering methods. However, several sources of phase noise, such as large laser linewidth (LW) and non-linear phase shifts, are not completely mitigated by a DD-PLL with a bandwidth on the order of tens to hundreds of megahertz. To improve on the performance of FBCR, several feed-forward carrier recovery (FFCR) algorithms have been proposed, such as NDA [8, 9], maximum likelihood (ML) based FFCR [10], blind phase search (BPS) [11, 12], two-stage carrier recovery (CR) [13, 14], and cross-coupling phase estimation [15]. However, in phase-noise-sensitive environments, the phase noise can still be a serious impediment. Most published work on fiber nonlinearity compensation either addresses intra-channel nonlinearity mitigation [16] or requires information from neighbouring channels [17, 18]. It is thus of great practical interest to efficiently mitigate wideband nonlinear phase noise.

A trellis structure based on a finite-state machine is a well-known way to address the memory in a sequence of symbols [19]. Trellis-based algorithms have been widely proposed for different problems, such as decoding of convolutional codes, maximum likelihood sequence estimation (MLSE) for inter-symbol interference (ISI) channels, trellis coded modulation (TCM), trellis space-time codes, and trellis shaping.
There are several well-known methods to perform decoding or detection over a trellis, such as the Viterbi algorithm (VA) and the BCJR algorithm [19].

In this paper, we propose a hardware-efficient trellis-based carrier recovery (TCR) to improve the performance and reach of ultra-high data-rate coherent optical single-carrier (COSC) systems. The proposed TCR targets the peculiarity of residual phase noise (RPN) in such systems. Experimental results for 200 Gb/s DP-16QAM co-propagated with 10-G channels demonstrate significant performance improvement over existing methods. The proposed TCR offers robust and accurate phase noise correction in a hardware-efficient implementation. To the best of the authors' knowledge, this is the first time that a trellis-based carrier recovery has been proposed and implemented in coherent optical systems. Notably, the presented algorithm is designed for the phase of the modulated sequence, and it differs from other MLSE-based nonlinearity mitigation methods, which are designed for nonlinear ISI in the modulated sequence (see [1, 20–22] and references therein).

The rest of this paper is organized as follows. Section 2 presents the preliminaries. Section 3 describes the proposed TCR algorithm. Experimental results are provided in Section 4. Finally, Section 5 concludes the paper.

2. Spectral properties of phase noise

In coherent optical systems, laser LW with a Lorentzian lineshape and fiber non-linearity can generate phase noise that deteriorates the system performance. After perfect linear equalization, a simple and ISI-free equivalent discrete-time model of the symbol-spaced samples at the input of the FBCR module is given by

$$r[\ell] = s[\ell]\,e^{j\phi[\ell]} + w[\ell], \tag{1}$$

where $j$ is the imaginary unit, $\ell$ represents the time index, $\{s[\ell]\}$ are the transmitted constellation symbols, and $\{w[\ell]\}$ is zero-mean complex additive white Gaussian noise (AWGN) with variance $\sigma_w^2$. The process $\Phi$ represents the unknown time-varying phase noise due to the laser LW and non-linear phase shift, which is modeled as a random walk [2, 23]

$$\phi[\ell] = \phi[\ell-1] + \delta[\ell], \tag{2}$$

where $\{\delta[\ell]\}$ is independent and identically distributed (i.i.d.) Gaussian with zero mean and variance $\sigma_\Delta^2$. For small $\sigma_\Delta$, it is well established that the spectrum of the complex exponential $e^{j\phi[\ell]}$, namely the multiplicative noise, is the Lorentzian function [2]

$$L_\phi(f) = \frac{4\sigma_\Delta^2 T}{\sigma_\Delta^4 + 16\pi^2 f^2 T^2}, \tag{3}$$

where $T$ is the symbol interval and $f$ is the frequency. The full width at half maximum (FWHM) of the Lorentzian function [6] is given by

$$f_{\mathrm{FWHM}} = \frac{\sigma_\Delta^2}{2\pi T}. \tag{4}$$
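The relation between the random-walk step variance and the linewidth in Eqs. (2)–(4) can be checked numerically. The following is a minimal sketch in Python, not part of the original work: the function names are hypothetical, and the 300 kHz linewidth and 34.4 GBd symbol rate used in the usage note are only borrowed from values quoted elsewhere in this paper for illustration.

```python
import math
import random


def sigma_delta_sq(linewidth_hz: float, baud_rate: float) -> float:
    """Invert Eq. (4): sigma_Delta^2 = 2*pi*T*f_FWHM, with T = 1/baud_rate."""
    return 2.0 * math.pi * linewidth_hz / baud_rate


def random_walk_phase(n: int, var_step: float, rng: random.Random) -> list:
    """Eq. (2): phi[l] = phi[l-1] + delta[l], delta ~ N(0, var_step).

    The walk is started from zero phase (an assumption of this sketch).
    """
    std = math.sqrt(var_step)
    phi = 0.0
    samples = []
    for _ in range(n):
        phi += rng.gauss(0.0, std)
        samples.append(phi)
    return samples


def lorentzian(f: float, var_step: float, symbol_interval: float) -> float:
    """Eq. (3): Lorentzian spectrum of exp(j*phi) for small sigma_Delta."""
    t = symbol_interval
    return 4.0 * var_step * t / (var_step ** 2 + 16.0 * math.pi ** 2 * f ** 2 * t ** 2)
```

For example, `sigma_delta_sq(300e3, 34.4e9)` gives the per-symbol step variance for a 300 kHz Lorentzian linewidth at 34.4 GBd, and evaluating `lorentzian` at half that linewidth (150 kHz) returns half of its value at `f = 0`, which is the FWHM property stated in Eq. (4).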

Since the basic sources of phase noise (from the transmitter, the receiver, and the fiber non-linearity) are well modelled by Eq. (2), as shown in [2, 23, 24], a trellis structure based on a single-memory finite-state machine is developed in the next section to mitigate the RPN.

3. The proposed TCR

In coherent receivers, the TCR is applied after the conventional type-II FBCR module and works on symbol-spaced equalized sequences; two independent TCR modules, one per polarization, are employed (see Fig. 1). The conventional type-II FBCR is a second-order DD-PLL that operates with a delay of several symbols due to hardware restrictions. The TCR processing comprises three units: the branch metric calculation unit (BMCU), the add-compare-select unit (ACSU), and the trace-back unit (TBU). Branch metrics of the trellis are first computed in the BMCU; the ACSU then selects the surviving branch at each state and calculates the accumulated path metrics (APM); finally, the RPN is estimated in the TBU. A simple schematic diagram of the digital signal processing (DSP) components at the receiver, including the proposed TCR, is depicted in Fig. 1. In the presented structure, the TCR operates on the RPN at the output of the FBCR, and it does not compensate cycle slips due to large phase jumps of $\kappa\pi/4$.

[Figure 1: block diagram with the labels Input (H, V), FDE, Adaptive MIMO-FIR (TDE) with outputs X and Y, Coarse Carrier Recovery (FBCR), TCR (Phase Rotators, Slicer, BMCU, ACSU, TBU, Delay, Phase De-rotator), FEC Decoder, and Output.]

Fig. 1. A simple schematic diagram of Rx-DSP including the proposed TCR

In this section, the trellis structure of the TCR is presented first. Then, the metric calculation is elaborated, and various enhancements motivated by high data-rate COSC requirements are proposed. Afterwards, details of the TBU are provided, and finally the performance is discussed.

3.1. Trellis structure

The trellis is developed here based on a single-memory finite-state machine, as discussed in Section 2. Since phase noise is a continuous random variable, the RPN is quantized to M discrete values so that it can be processed digitally in the TCR module with a finite number of trellis states. Each state in the trellis represents a test phase in a time slot, and each branch represents a phase fluctuation. We assume that at each time slot, the phase either stays at the same state as in the previous time slot or jumps to the next or the previous state. Thus, at each time slot, each state is connected to three states of the next time slot.

To realize the TCR processing in high data-rate COSC, the DSP uses a parallel and pipelined structure, where each parallel TCR processes a sub-block of symbols. The discontinuity of data caused by dividing the symbol sequence into parallel sub-blocks is addressed with overlapped sub-block processing. Notably, the TCR is designed to be modulation-format-independent (only the slicer needs to be updated for different modulation formats), which is a significant advantage for future elastic optical networks.

Consider the example illustrated in Fig. 2 with K = 3, M = 12, and L = 16, where K, M, and L are the number of branches in each state, the total number of states in the trellis, and the number of stages in the trellis for processing symbols, respectively. In other words, the implemented trellis in the TCR module comprises twelve states, and in each trellis stage the phase of the symbol is assumed to take one of three phase jump values. Figure 2 also shows that the TCR module uses several symbols for initiation of the trellis, and/or a number of symbols for termination of the trellis in the parallel structure (four symbols in the example of Fig. 2).

Considering the trade-off between complexity and performance, 8 to 16 states is an appropriate choice for 64QAM or lower-order modulation formats (we verified this through experiments in COSC). Since a coarse phase correction is already applied in the FBCR, the required number of test phases for the TCR is lower than that for BPS in [11].
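The state and branch structure described above can be written down compactly. The following Python sketch is illustrative only: the function name is hypothetical, the uniform spacing of the test phases and the ±20° range (the range quoted for the 12-state TCR in Section 4) are assumptions, and edge states are assumed to have only the neighbours that exist, since the treatment of the first and last states is not specified here.

```python
import math


def build_trellis(num_states: int = 12, max_phase_deg: float = 20.0):
    """Return (test_phases, predecessors) for a K = 3 trellis.

    test_phases[m] is the test phase of state m in radians, uniformly
    spaced over [-max_phase_deg, +max_phase_deg] (assumed spacing).
    predecessors[m] lists the states of the previous time slot that are
    connected to state m: the same state and its immediate neighbours.
    Edge states keep only the neighbours that exist (no wrap-around,
    which is also an assumption of this sketch).
    """
    lowest = -math.radians(max_phase_deg)
    step = 2.0 * math.radians(max_phase_deg) / (num_states - 1)
    test_phases = [lowest + m * step for m in range(num_states)]
    predecessors = [
        [p for p in (m - 1, m, m + 1) if 0 <= p < num_states]
        for m in range(num_states)
    ]
    return test_phases, predecessors
```

With the defaults, interior states such as state 5 have the three predecessors 4, 5, and 6, matching K = 3, while the two edge states have two predecessors each.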

[Figure 2: trellis with M states per stage, divided into initiation symbols, processing symbols, and termination symbols.]

Fig. 2. A trellis structure example for TCR

3.2. Metric calculation

In trellis processing using the Viterbi algorithm (VA) [19, 25], branch metrics are calculated in the BMCU. For each state in the trellis, the TCR input (which is the output of the FBCR in our model) is rotated according to the test phase associated with that state. Considering the Gaussian conditional joint probability density function

$$f_{R,\Delta|\Phi}\big(r[\ell], \delta[\ell] \,\big|\, \phi[\ell-1]\big) = E_s\!\left[\frac{1}{2\pi\sigma_\Delta\sigma_w}\, e^{-\frac{\delta[\ell]^2}{2\sigma_\Delta^2}}\, e^{-\frac{\left|r[\ell]e^{j(\phi[\ell-1]+\delta[\ell])} - s[\ell]\right|^2}{2\sigma_w^2}}\right], \tag{5}$$

where $\delta[\ell] = \phi[\ell] - \phi[\ell-1]$ and $E_s[\cdot]$ denotes the expectation over $s$, the maximum-likelihood-based branch metric of the $k$th branch in state $m$ at time slot $\ell$ is given by

$$\Gamma[k,m,\ell] = \underbrace{-\ln \sum_{i=1}^{Q} e^{-\frac{\left|r[\ell]e^{j\phi_m} - s[i]\right|^2}{2\sigma_w^2}}}_{\text{Term 1}} + \underbrace{\frac{|\phi_m - \phi_{k\to m}|^2}{2\sigma_\Delta^2}}_{\text{Term 2}}, \quad k = 1,\dots,K,\; m = 1,\dots,M,\; \ell = 1,\dots,L, \tag{6}$$

where $r[\ell]$ is the equalized symbol at time slot $\ell$, $\phi_m$ is the test phase of state $m$, and $\phi_{k\to m}$ is the test phase of the origin state of branch $k$. $Q$ denotes the total number of constellation points in the underlying modulation format. As mentioned before, $K$, $M$, and $L$ are the number of branches in each state, the total number of states in the trellis, and the number of stages in the trellis, respectively. Term 1 in the branch metric definition in Eq. (6) is referred to as the distance metric $d[m,\ell]$, while Term 2 represents the probability of a phase jump along the $k$th branch. Note that Term 2 is independent of the received signal and hence can be stored as a small look-up table (LUT) for implementation.

To reduce the BMCU complexity, one may consider only the dominant term of the sum inside the logarithm of Term 1 in Eq. (6). As a result, the rotated version of the received signal is mapped to the closest constellation point of the constellation in use. Thereafter, a branch metric is calculated based on a normalized Euclidean distance between the rotated version of the TCR input and the mapped constellation point, as well as the probability of the phase jump. Therefore, Eq. (6) is simplified to

$$\Gamma[k,m,\ell] \approx \underbrace{\left|r[\ell]e^{j\phi_m} - \hat{s}[\ell]\right|^2}_{\text{Term 1}} + \underbrace{\frac{|\phi_m - \phi_{k\to m}|^2}{\sigma^2}}_{\text{Term 2}}, \quad k = 1,\dots,K,\; m = 1,\dots,M,\; \ell = 1,\dots,L, \tag{7}$$

where $\hat{s}[\ell]$ is the hard decision for $r[\ell]e^{j\phi_m}$ and $\sigma = \sigma_\Delta/\sigma_w$.

Equation (7) is developed for phase noise with a Lorentzian spectrum. In reality, however, the phase noise does not follow an exact Lorentzian spectrum because of the different sources of noise in the system. Undesired phase jumps, in conjunction with slicer decision errors induced by additive noise, may lead to some erroneous decisions in the TCR. To address this issue, the normalized Euclidean distance is averaged across several neighbouring symbols in order to smooth the branch metric values and make the algorithm more resilient to unexpected errors. For instance, the branch metric can be modified by including the distance metrics of the previous symbol and the next symbol in each trellis stage. Accordingly, an enhanced branch metric is obtained by replacing the distance metric $d[m,\ell]$ in Term 1 on the right-hand side of Eq. (7) with the average of $d[m,\ell-1]$, $d[m,\ell]$, and $d[m,\ell+1]$.

The trellis processing follows the standard VA [19, 25]. In each trellis stage, the ACSU adds the computed branch metrics to the APM of each state, compares the candidates, selects the path with the smallest metric for each state, and updates the APMs for the next stage. At the last stage, the most probable path, with the minimum path metric, is selected. This surviving path is then used to trace back the most likely phase error sequence and compensate the received modulated symbols accordingly.
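The metric calculation, add-compare-select, and trace-back steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' hardware implementation: it uses the simplified metric of Eq. (7) without smoothing or super-symbols, an unnormalised 16QAM slicer, all first-stage states treated as equally likely, and edge states with only their existing neighbours. The names `slice_16qam` and `tcr_viterbi` are hypothetical.

```python
import cmath
import math

LEVELS = (-3.0, -1.0, 1.0, 3.0)  # unnormalised 16QAM levels (illustrative)


def slice_16qam(x: complex) -> complex:
    """Hard decision: nearest 16QAM point, sliced per axis."""
    re = min(LEVELS, key=lambda a: abs(x.real - a))
    im = min(LEVELS, key=lambda a: abs(x.imag - a))
    return complex(re, im)


def tcr_viterbi(r, test_phases, sigma):
    """Return the surviving sequence of test phases for the block r.

    Branch metric follows Eq. (7): |r e^{j phi_m} - s_hat|^2 plus
    |phi_m - phi_origin|^2 / sigma^2, with K = 3 branches per state.
    """
    num_states = len(test_phases)

    # BMCU, Term 1: distance metric d[stage][m] after rotation and slicing.
    d = []
    for x in r:
        row = []
        for phase in test_phases:
            y = x * cmath.exp(1j * phase)
            row.append(abs(y - slice_16qam(y)) ** 2)
        d.append(row)

    # BMCU, Term 2: depends only on the two states (a LUT in hardware).
    def jump_cost(m, p):
        return (test_phases[m] - test_phases[p]) ** 2 / sigma ** 2

    # ACSU: accumulate path metrics and remember the selected predecessor.
    apm = list(d[0])  # first stage: no incoming branch (assumption)
    back = []
    for stage in range(1, len(r)):
        new_apm, pointers = [], []
        for m in range(num_states):
            candidates = [p for p in (m - 1, m, m + 1) if 0 <= p < num_states]
            best = min(candidates, key=lambda p: apm[p] + jump_cost(m, p))
            new_apm.append(apm[best] + jump_cost(m, best) + d[stage][m])
            pointers.append(best)
        apm = new_apm
        back.append(pointers)

    # TBU: start from the state with the smallest final APM and trace back.
    state = min(range(num_states), key=lambda m: apm[m])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [test_phases[m] for m in path]
```

Following the sign convention of Eq. (7), the returned value for stage ℓ is the rotation applied to the input, so the compensated symbol in this sketch is `r[l] * cmath.exp(1j * phase_hat[l])`.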

3.3. Trace-back in trellis

According to the VA, the trellis is traversed from the first trellis stage to the $L$th trellis stage, and the metrics are accumulated stage by stage. Finally, the surviving path (the most probable path, with the smallest metric) is used to estimate the most likely RPN, and the received modulated signal is compensated accordingly.

In the parallel structure, resource and throughput requirements impose a limited trace-back length. To increase the effective trace-back length and enhance the performance, a super-symbol-based TCR processing is proposed, where a super-symbol is formed by a group of $N$ consecutive symbols. In this case, the branch metric is obtained by replacing the distance metric $d[m,\ell]$ in Eq. (7) with $D[m,\ell]$, the average of the distance metrics of the $N$ symbols constituting the super-symbol, i.e., $d[m,i], \dots, d[m,i+N-1]$. Thereafter, the trellis processing is performed for super-symbols as before, and at the end, a common estimated RPN is used to compensate each of the consecutive symbols constituting the super-symbol. Additionally, the super-symbol structure reduces hardware resources, as the number of parallel processors required to process a given frame is decreased. In Fig. 3, a super-symbol-based TCR is shown processing $4L$ received symbols over a trellis of $L$ stages.

[Figure 3: received symbols s1 … s4L grouped into super-symbols S1 … SL (four symbols each), trellis processing with ACS units, survived and competitive paths selection, and TCR output symbols.]

Fig. 3. Super-symbol-based TCR processing

Figure 4 is a schematic diagram of the enhanced branch metric calculation, which implements the aforementioned branch metric smoothing together with the super-symbol-based TCR processing. In the BMCU shown in Fig. 4, the distance metrics $d[m,i]$, $d[m,i+1]$, $d[m,i+2]$, and $d[m,i+3]$ of four consecutive symbols $s[i]$, $s[i+1]$, $s[i+2]$, and $s[i+3]$ are obtained using the phase rotation and slicing described in Eq. (7). These distance metrics are then averaged to calculate a distance metric $D[m,\ell]$ for the super-symbol $S[\ell]$ corresponding to the four consecutive symbols. In Fig. 4, for a trellis stage $\ell$, the distance metric used in the branch metric is then calculated as another average of the distance metrics $D[m,\ell-1]$, $D[m,\ell]$, and $D[m,\ell+1]$ of the three adjacent super-symbols $S[\ell-1]$, $S[\ell]$, and $S[\ell+1]$, according to the metric averaging described in Section 3.2. That is, in this case, the enhanced branch metric calculation implements super-symbol-based TCR processing with $N = 4$ and branch metric smoothing based on three neighboring super-symbols. According to Fig. 4, in each stage of the trellis, twelve symbols are involved in one branch metric calculation.

Furthermore, the TCR performance can be improved by exploiting the surviving path together with the competitive path(s) (the second or higher-order most probable path(s)), as shown in Fig. 3. Since the trellis represents the RPN, these paths are expected to be close to each other. Therefore, if an incorrect path survives for reasons such as random phase noise jumps, short trace-back length, or an open-ended trellis, the competitive path(s) can be used to improve the phase estimation accuracy. For example, an improved TCR can be realized with a weighted average phase correction from the surviving path and the first and second competitive paths.

3.4. Performance

Here, the performance of the proposed TCR is investigated through simulation for 200 Gb/s DP-16QAM and 400 Gb/s DP-64QAM in back-to-back configuration. The components and fiber are modeled in the simulation platform, which simulates the entire Tx and Rx chains following the real hardware design in a FIFO structure. The required OSNR (ROSNR) at 3% raw BER versus laser LW is depicted in Fig. 5 for the proposed TCR, BPS, and ML-FFCR algorithms. For these simulations, a 12-state TCR module is formed with a 20-stage trellis in order to process a sub-block of 16 received super-symbols together with 4 overlapped super-symbols from the previous TCR module. Figure 5(a) presents ROSNR versus laser LW for 200 Gb/s DP-16QAM. As expected, in the back-to-back case with low laser LW, all three algorithms have similar performance. However, as Fig. 5 shows, the proposed TCR provides significant performance improvement in the presence of large laser LW. ROSNR versus laser LW for 400 Gb/s DP-64QAM is depicted in Fig. 5(b). Since 64QAM is highly sensitive to phase noise, even at 300 kHz laser LW the TCR provides around 2.5 dB ROSNR gain compared to ML-FFCR. Moreover, with the TCR, the ROSNR penalty due to phase noise is less than 0.5 dB at 300 kHz laser LW.
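The block configuration used in these simulations (super-symbols of N = 4 symbols, smoothing over three neighbouring super-symbols, and sub-blocks of 16 super-symbols preceded by 4 overlapped ones) can be sketched as follows. This Python fragment is illustrative only: the function names are hypothetical, and the handling of block edges (fewer neighbours at the first and last stage, a shorter first sub-block, dropping trailing symbols that do not fill a super-symbol) is an assumption, since those details are not specified here.

```python
def super_symbol_metrics(d, n=4):
    """Average per-symbol distance metrics d[l][m] over groups of n symbols.

    Returns D[stage][m]; trailing symbols that do not fill a complete
    super-symbol are dropped (assumption of this sketch).
    """
    num_states = len(d[0])
    groups = len(d) // n
    return [
        [sum(d[g * n + i][m] for i in range(n)) / n for m in range(num_states)]
        for g in range(groups)
    ]


def smooth_metrics(D):
    """Average each stage with its previous and next stage.

    At the first and last stage only the existing neighbours are used
    (assumption of this sketch).
    """
    num_states = len(D[0])
    out = []
    for stage in range(len(D)):
        rows = [D[j] for j in (stage - 1, stage, stage + 1) if 0 <= j < len(D)]
        out.append([sum(row[m] for row in rows) / len(rows) for m in range(num_states)])
    return out


def sub_blocks(num_super, payload=16, overlap=4):
    """Return (start, stop) super-symbol index ranges of the parallel blocks.

    Each block covers `payload` new super-symbols preceded by `overlap`
    super-symbols shared with the previous block.
    """
    return [
        (max(0, first - overlap), min(num_super, first + payload))
        for first in range(0, num_super, payload)
    ]
```

With `payload=16` and `overlap=4`, an interior block spans 20 super-symbols, matching the 20-stage trellis mentioned above, and with `n=4` and three-stage smoothing each smoothed metric draws on twelve received symbols.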

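[Figure 4: each received symbol r_i is multiplied by e^{jφ_m} and sliced, and the magnitude of the difference gives the distance metric d_i; the distance metrics d_1 … d_4L are grouped in fours (d_i … d_{i+3}) into super-symbol metrics D_1 … D_L, and the metrics D_{l−1}, D_l, and D_{l+1} of adjacent trellis stages l−1, l, and l+1 are combined.]

Fig. 4. A schematic diagram of super-symbol metric calculation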

4. Experimental results

The performance of the proposed TCR is investigated through lab experiments and compared with the performance of ML-FFCR [10] and BPS [11, 12]. Figure 6(a) illustrates the schematic of the experimental setup. The results were obtained for a coherent 200 Gb/s channel co-propagated with 10-G channels over 2 to 4 spans, where each span consisted of 80 km of standard single mode fiber (SSMF) and a dual-stage erbium doped fiber amplifier (EDFA). Each SSMF spool was followed by a dispersion-compensating fiber (DCF). As depicted in Fig. 6(b), an 85 GS/s 4-channel DAC was used to generate the 200 Gb/s DP-16QAM channel at a wavelength of approximately 1550 nm. At the output of the link, the 200 Gb/s DP-16QAM channel was filtered by a 50-GHz wavelength-selective switch (WSS). The output signal was detected by a coherent receiver and sampled by a real-time sampling scope at a sampling rate of 80 GS/s. The baud rate was 34.4 GBd, associated with a 25% overhead SD-FEC with a raw BER threshold of 3%. A root-raised-cosine filter with a roll-off factor of 0.2 was used to confine the signal spectrum. Fifteen 10 Gb/s on-off keying (OOK) transceiver line cards were used to co-propagate fifteen 10-G channels with the 200 Gb/s channel, and the system was operated at the optimum launch power. The channel spacing between the 200 Gb/s DP-16QAM channel and the fifteen OOK channels was 500 GHz, and the channel spacing between OOK channels was 100 GHz. The channel spacing in this experiment was chosen for a realistic deployment scenario. If the channel spacing between the 16QAM and OOK channels is decreased, the coherent channel experiences a higher inter-channel nonlinear phase distortion, and hence higher gains are expected from the proposed TCR compared to its counterparts. In the offline DSP, we processed 10 Mbit of data for each measurement point.

In the parallelized and pipelined DSP, following the MIMO-FIR adaptive LMS equalizer, CR was performed by a one-sample-per-symbol optimized type-II FBCR in concert with one of three algorithms: in the first scenario, the proposed TCR with 12 states (covering a range of ±20° RPN) and a sub-block length of 16 super-symbols (triangle-marked red curve); in the second scenario, BPS [11, 12] with 16 test phases and a 31-symbol window length (diamond-marked black curve); and in the third scenario, ML-FFCR [10] with a 31-symbol window length (square-marked blue curve).
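For orientation, the BPS baseline with 16 test phases and a 31-symbol window can be sketched in its generic textbook form: for every test phase the squared distance to the nearest constellation point is summed over a sliding window, and the test phase with the smallest sum is selected per symbol. The Python fragment below is a simplified illustration of that idea as commonly described for [11], not the configuration tuned in these experiments; the test-phase range of [−π/4, π/4), the truncated windows at the block edges, the omission of phase unwrapping, and the names `slice_16qam` and `bps_estimate` are assumptions.

```python
import cmath
import math

LEVELS = (-3.0, -1.0, 1.0, 3.0)  # unnormalised 16QAM levels (illustrative)


def slice_16qam(x: complex) -> complex:
    """Hard decision: nearest 16QAM point, sliced per axis."""
    re = min(LEVELS, key=lambda a: abs(x.real - a))
    im = min(LEVELS, key=lambda a: abs(x.imag - a))
    return complex(re, im)


def bps_estimate(r, num_tests=16, half_window=15):
    """Return one selected test phase per symbol (window of 2*half_window + 1)."""
    # Test phases uniformly spaced over [-pi/4, pi/4) (assumed range).
    phases = [-math.pi / 4 + b * (math.pi / 2) / num_tests for b in range(num_tests)]

    # Squared distance to the closest constellation point per symbol and test phase.
    d = []
    for x in r:
        row = []
        for phase in phases:
            y = x * cmath.exp(1j * phase)
            row.append(abs(y - slice_16qam(y)) ** 2)
        d.append(row)

    estimates = []
    for l in range(len(r)):
        lo = max(0, l - half_window)
        hi = min(len(r), l + half_window + 1)  # window truncated at block edges
        sums = [sum(d[j][b] for j in range(lo, hi)) for b in range(num_tests)]
        best = min(range(num_tests), key=lambda b: sums[b])
        estimates.append(phases[best])
    return estimates
```

Unlike the trellis search, each symbol's estimate here is chosen independently of its neighbours' estimates, which is the structural difference the TCR exploits through its phase-jump term.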

[Figure 5: ROSNR (dB) versus laser LW (kHz) for ML-FFCR, BPS, and TCR. (a) 200 Gb/s DP-16QAM, laser LW from 300 to 2000 kHz. (b) 400 Gb/s DP-64QAM, laser LW from 100 to 300 kHz.]

Fig. 5. Simulation results for ROSNR versus laser LW in back-to-back

[Figure 6: (a) System setup, with the labels 15 × 10 Gb/s OOK (IMDD Tx), 200 Gb/s DP-16QAM CoTx, MUX, EDFA, VOA, 80-km SSMF, DCF, 4X, DeMUX, ICR with Sig and LO inputs, and Real-time Sampling Oscilloscope. (b) Coherent Tx block diagram, with the labels Bit stream, bit mapping, Tx DSP, 85 GS/s 4-CH DAC, RF Driver (XI, XQ, YI, YQ), and LN-QPMZ.]

Fig. 6. System setup of 200 Gb/s DP-16QAM co-propagation with 10 Gb/s OOK channels; DCF: dispersion compensation fiber, EDFA: erbium doped fiber amplifier, SSMF: standard single mode fiber, MUX: multiplexer, DeMUX: demultiplexer, CoTx: coherent transmitter.

Figure 7 presents the transmission experimental results for raw BER versus OSNR (dB/0.1 nm). As Fig. 7 shows, the proposed TCR outperforms both the ML-FFCR and BPS algorithms. It is noteworthy that the parameters of both the ML-FFCR and BPS modules were set to optimize the performance in each scenario. No further improvement was observed by cascading ML-FFCR and BPS or by applying the two-stage CR methods of [13, 14]. After 2 spans, the TCR provides 0.3 dB and 0.75 dB OSNR gain over BPS and ML-FFCR, respectively, at 3% raw BER. After 3 spans, the TCR gain over BPS and ML-FFCR at 3% raw BER is 0.85 dB and 3.2 dB, respectively. Moreover, the TCR outperforms BPS by a noticeable margin after 4 spans at 3% raw BER, while ML-FFCR cannot reach the FEC threshold because of its high error floor. This confirms the importance of the proposed TCR in high phase noise regimes, where a system with a standalone FBCR, or even a combination of FBCR and ML-FFCR, would run into an early error floor. It should also be noted that for transmission of 16QAM or lower-order modulation formats over uncompensated links, the benefits of the proposed TCR over its counterparts are limited, since the amount of non-linear phase shift is lower.

[Figure 7: raw BER versus OSNR (dB) for ML-FFCR, BPS, and TCR after (a) 2 spans, (b) 3 spans, and (c) 4 spans.]

Fig. 7. Experimental results for raw BER versus OSNR in 200 Gb/s DP-16QAM co-propagated with 10-G channels

5. Conclusion

In coherent optical transmission, especially in hybrid co-propagation with 10-G channels, the phase shifts due to fiber non-linearity have a strong impact on the system performance and reach. In particular, cross-phase modulation compensation is a complex procedure with extensive processing requirements, and it depends strongly on the system and link parameters and on the neighboring channels. In this paper, we proposed a trellis-based carrier phase correction algorithm for coherent optical transmission systems. The proposed block-wise algorithm operates at the symbol rate and enables a hardware-efficient parallel processing implementation for real-time modems. Furthermore, we carried out experiments with 200 Gb/s DP-16QAM co-propagated with 10-G channels through 4 × 80 km SSMF and demonstrated that the proposed carrier phase recovery outperforms previous algorithms. Several experiments demonstrated the robustness and superior performance of the proposed algorithm. We expect the performance gain of the proposed algorithm over the conventional algorithms to rise rapidly with increasing order of the modulation format.