Recent Research Summary

Physics-Based Deep Learning for Fiber-Optic Communications
Fiber-optic communication links carry virtually all intercontinental data traffic and are often referred to as the Internet backbone. The signal evolution in an optical fiber is implicitly described by the nonlinear Schrödinger equation, which is schematically illustrated in the above figure. In this research, we explore the close connection between the propagation dynamics in optical fiber and deep learning. Our main observation is that the mathematical structure that results from applying spatial discretization methods (in particular, the so-called split-step method) to the nonlinear Schrödinger equation is very similar to the structure of a conventional deep neural network with multiple layers. In a nutshell, the main idea is to obtain a machine-learning model by properly parameterizing the split-step method and viewing the linear steps as general linear functions, similar to the weight matrices in a neural network. Compared to applying large neural networks (e.g., for signal equalization purposes), such physics-based machine-learning models are appealing because they incorporate important domain knowledge about the propagation dynamics directly into the model structure and obey well-established physical principles like symmetry and locality, which reduces the number of trainable parameters. Moreover, these models also allow us to examine and interpret the learned solutions in order to understand why they perform well.
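The layer-like structure of the split-step method can be seen in a minimal sketch (the fiber parameters `beta2` and `gamma` below are typical single-mode-fiber values chosen purely for illustration, not taken from this work):

```python
import numpy as np

def split_step(u, n_steps, dz, beta2=-21.7e-27, gamma=1.3e-3):
    """Split-step method for the nonlinear Schroedinger equation: each
    step applies a linear (dispersion) operator in the frequency domain
    followed by a pointwise nonlinear (Kerr) phase rotation --
    structurally one 'layer' of linear transform plus nonlinearity."""
    n = len(u)
    w = 2 * np.pi * np.fft.fftfreq(n, d=1e-12)     # angular frequencies (1 ps grid)
    lin = np.exp(0.5j * beta2 * w ** 2 * dz)       # dispersion phase for one step
    for _ in range(n_steps):
        u = np.fft.ifft(lin * np.fft.fft(u))       # linear step ("weight matrix")
        u = u * np.exp(1j * gamma * np.abs(u) ** 2 * dz)  # nonlinear step ("activation")
    return u
```

Replacing the fixed frequency-domain phase `lin` with a trainable linear operator in each step yields the kind of parameterized model described above.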
Capacity via Symmetry

Early work on algebraic codes often paid special attention to the symmetry of the code. With the rise of sparse random codes and iterative decoding, the idea that symmetry alone could lead to good performance started to seem rather implausible. However, in a breakthrough, we have shown that symmetry is actually sufficient to achieve capacity. Specifically, we proved that sequences of error-correcting codes with doubly-transitive permutation groups achieve capacity on erasure channels under symbol-wise maximum a posteriori (MAP) decoding. While the proof holds only for erasure channels, we believe the same result holds for general symmetric channels. This discovery was made independently by two groups: {Santhosh Kumar and myself} and {Shrinivas Kudekar, Marco Mondelli, Eren Sasoglu, and Ruediger Urbanke}. This result solves the open problem of whether Reed-Muller codes achieve capacity for some rate bounded away from 0 and 1. This was a posted open problem from the 2015 Simons Institute program on Information Theory and the subject of a 1994 conjecture by Dumer and Farrell. It has also been discussed by many others, including Arikan, Abbe, Shpilka, and Wigderson. The same result also holds for primitive narrow-sense BCH codes and other classes of highly symmetric algebraic codes. Finally, the result for erasure channels has been extended to the quantum erasure channel and to sequences of cyclic codes whose blocklengths have the proper numerology (e.g., increasing prime lengths suffice).
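As a small illustration of the erasure-channel setting, the sketch below simulates block (whole-codeword) MAP recovery of a Reed-Muller code on the BEC. On the BEC, MAP decoding reduces to linear algebra over GF(2): the codeword is fully recoverable exactly when the unerased columns of the generator matrix have full rank. (The symbol-wise analysis in the result above is more refined; this block-level check is only for intuition.)

```python
import itertools
import numpy as np

def rm_generator(r, m):
    """Generator matrix of RM(r, m): rows are the evaluations of all
    monomials of degree <= r on the points of {0,1}^m."""
    pts = np.array(list(itertools.product([0, 1], repeat=m)), dtype=np.uint8)
    rows = []
    for deg in range(r + 1):
        for subset in itertools.combinations(range(m), deg):
            row = np.ones(len(pts), dtype=np.uint8)
            for i in subset:
                row &= pts[:, i]
            rows.append(row)
    return np.array(rows)

def gf2_rank(mat):
    """Row rank over GF(2) via Gaussian elimination."""
    a = mat.copy() % 2
    rank = 0
    for col in range(a.shape[1]):
        pivot = next((i for i in range(rank, a.shape[0]) if a[i, col]), None)
        if pivot is None:
            continue
        a[[rank, pivot]] = a[[pivot, rank]]
        for i in range(a.shape[0]):
            if i != rank and a[i, col]:
                a[i] ^= a[rank]
        rank += 1
    return rank

def block_map_success_rate(r, m, eps, trials=2000, seed=0):
    """Fraction of BEC(eps) erasure patterns from which MAP decoding
    recovers the whole codeword (unerased columns of G have rank k)."""
    rng = np.random.default_rng(seed)
    g = rm_generator(r, m)
    k, n = g.shape
    ok = sum(gf2_rank(g[:, rng.random(n) > eps]) == k for _ in range(trials))
    return ok / trials
```

For example, RM(1, 3) is the [8, 4] extended Hamming code, so almost every pattern with few erasures is recoverable, while heavy erasure rates are almost never recoverable.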
Simple Proofs of Threshold Saturation for Spatially-Coupled Systems

From 2011 to 2012, Kudekar, Richardson, and Urbanke showed that low-density parity-check (LDPC) convolutional codes (or spatially-coupled codes) can approach capacity on the binary erasure channel (BEC) and binary-input memoryless symmetric channels. The mechanism behind this spectacular performance is the threshold saturation phenomenon, whereby the belief-propagation noise threshold of the spatially-coupled ensemble increases to an intrinsic noise threshold (e.g., the MAP threshold) defined by the uncoupled system. While these results are quite impressive, the proof technique in the above work leads to long and complicated proofs. Our main result is a simple proof technique that establishes threshold saturation for a general class of spatially-coupled systems. It is based on potential functions and both simplifies and generalizes most previous results in this area.
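Threshold saturation itself is easy to reproduce numerically with density evolution on the BEC. The sketch below compares the uncoupled (3,6)-regular recursion with a spatially coupled chain (the coupling width and chain length are illustrative choices): at an erasure rate above the uncoupled BP threshold (about 0.4294) but below the MAP threshold (about 0.4881), the coupled recursion still drives the erasure probability to zero because the known boundary positions seed a decoding wave.

```python
import numpy as np

def uncoupled_de(eps, l=3, r=6, iters=3000):
    """Density evolution for the (l, r)-regular LDPC ensemble on BEC(eps)."""
    x = eps
    for _ in range(iters):
        x = eps * (1.0 - (1.0 - x) ** (r - 1)) ** (l - 1)
    return x

def coupled_de(eps, l=3, r=6, chain=20, w=3, max_iters=20000, tol=1e-10):
    """Density evolution for the (l, r, chain, w) spatially coupled
    ensemble on BEC(eps); positions outside the chain are perfectly
    known, i.e., have erasure probability 0."""
    x = np.zeros(chain + 2 * w)
    x[w:w + chain] = eps
    for _ in range(max_iters):
        x_new = x.copy()
        for i in range(w, w + chain):
            outer = 0.0
            for j in range(w):
                inner = sum(x[i + j - k] for k in range(w)) / w
                outer += 1.0 - (1.0 - inner) ** (r - 1)
            x_new[i] = eps * (outer / w) ** (l - 1)
        if np.max(np.abs(x_new - x)) < tol:
            x = x_new
            break
        x = x_new
    return float(np.max(x[w:w + chain]))
```

The potential-function proof technique described above certifies exactly this behavior for general coupled recursions.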
Universality of Spatially-Coupled Codes for Multiuser Problems

A noisy Slepian-Wolf problem is considered where two correlated sources are separately encoded and transmitted over two independent binary memoryless symmetric channels. Each channel capacity is assumed to be characterized by a single parameter that is not known at the transmitter. The receiver has knowledge of both the source correlation and the channel parameters. We call a system universal if it retains near-capacity performance without channel knowledge at the transmitter. Kudekar et al. recently showed that terminated low-density parity-check (LDPC) convolutional codes (a.k.a. spatially-coupled LDPC ensembles) can have belief-propagation thresholds that approach their maximum a posteriori thresholds. This was proven for binary erasure channels and shown empirically for binary memoryless symmetric channels. They also conjectured that the principle of spatial coupling is very general and that the phenomenon of threshold saturation applies to a very broad class of graphical models. In this research, we derive an area theorem for the joint decoder and empirically show that threshold saturation occurs for this problem. As a result, we demonstrate near-universal performance for this problem using the proposed spatially-coupled coding system. A similar result has also been obtained for the 2-user multiple-access channel.
Combining Queueing and Coding for Finite-State Channels

This research considers the relationship between code-rate selection and queueing performance for communication systems subject to time-varying channel conditions. While error-correcting codes offer protection against channel uncertainties, there exists a natural tradeoff between the enhanced protection of low-rate codes and the rate penalty imposed by additional redundancy. In the limiting regime where codewords are asymptotically long, this tradeoff is well understood and characterized by the Shannon capacity. However, for delay-sensitive communication systems and finite block lengths, a complete characterization of this tradeoff is not fully developed. This research offers a new perspective on the queueing performance of communication systems with finite block lengths operating over correlated erasure channels. A rigorous framework that links code rate to overall system performance for random codes is presented, and guidelines are identified for code-rate selection in delay-sensitive systems.
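The rate tradeoff can be illustrated with a toy model (all parameters below are illustrative assumptions, not from this work): blocks are sent over a two-state Gilbert-Elliott erasure channel, and an idealized MDS-like rate-R code recovers a block whenever at most n(1-R) of its symbols are erased. Lower rates succeed more often but deliver fewer information symbols per block.

```python
import numpy as np

def simulate_erasures(n_blocks, n, p_gb=0.1, p_bg=0.4, eps_g=0.01, eps_b=0.5, seed=0):
    """Erasure counts per length-n block on a two-state Gilbert-Elliott
    channel: a 'good' state erases symbols w.p. eps_g and a 'bad' state
    w.p. eps_b, with per-symbol transition probabilities p_gb and p_bg."""
    rng = np.random.default_rng(seed)
    state = 0                                  # 0 = good, 1 = bad
    counts = np.zeros(n_blocks, dtype=int)
    for b in range(n_blocks):
        for _ in range(n):
            counts[b] += rng.random() < (eps_b if state else eps_g)
            if rng.random() < (p_bg if state else p_gb):
                state = 1 - state
    return counts

def throughput(counts, n, rate):
    """Effective throughput of a rate-R block code that (like an idealized
    MDS code) recovers any pattern of at most n(1-R) erasures."""
    return rate * np.mean(counts <= np.floor(n * (1 - rate)))
```

Feeding the effective throughput into a queueing model exposes the tradeoff discussed above: rates so high that blocks frequently fail, or so low that little data is carried per block, both leave the queue unstable for a given arrival rate.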
Past Research

Spatially-Coupled Codes for ISI Channels

Recently, it has been observed that terminated low-density parity-check (LDPC) convolutional codes (or spatially-coupled codes) appear to approach capacity universally across the class of binary memoryless channels. This is facilitated by the threshold saturation effect, whereby the belief-propagation (BP) threshold of the spatially-coupled ensemble is boosted to the maximum a posteriori (MAP) threshold of the underlying constituent ensemble. We consider the universality of spatially-coupled codes over intersymbol-interference (ISI) channels under joint iterative decoding. More specifically, we empirically show that threshold saturation also occurs in this case. This can be observed by identifying the EXIT/GEXIT curves that naturally obey the general area theorem. From these curves, the corresponding MAP and BP thresholds are numerically obtained and threshold saturation is observed.
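For the simpler memoryless BEC case, both thresholds can be obtained numerically from an EXIT curve in exactly this way. The sketch below uses the parametric form of the (3,6)-regular BP EXIT curve together with the Maxwell construction (the area under the curve down to the MAP threshold equals the design rate); the GEXIT machinery for ISI channels in this work generalizes this kind of computation.

```python
import numpy as np

def bec_thresholds(l=3, r=6, grid=200000):
    """BP and MAP thresholds of the (l, r)-regular LDPC ensemble on the
    BEC from the parametric EXIT curve eps(x) = x / (1-(1-x)^(r-1))^(l-1),
    using the Maxwell construction: integrate the EXIT value h(x) against
    d(eps) from eps = 1 downward until the area equals the design rate."""
    x = np.linspace(1e-6, 1.0, grid)
    f = (1.0 - (1.0 - x) ** (r - 1)) ** (l - 1)
    eps = x / f                                  # channel value fixing x as a DE fixed point
    h = (1.0 - (1.0 - x) ** (r - 1)) ** l        # BP EXIT value at that fixed point
    i_bp = int(np.argmin(eps))                   # BP threshold = min of eps(x)
    es, hs = eps[i_bp:], h[i_bp:]                # stable branch up to eps = 1
    rate = 1.0 - l / r
    area, eps_map = 0.0, es[0]
    for i in range(len(es) - 1, 0, -1):
        area += 0.5 * (hs[i] + hs[i - 1]) * (es[i] - es[i - 1])
        if area >= rate:
            eps_map = es[i - 1]
            break
    return float(eps[i_bp]), float(eps_map)
```

For (3,6) this reproduces the well-known values: a BP threshold near 0.4294 and a MAP threshold near 0.4881.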
Linear Programming Decoding of LDPC Codes on Finite-State Channels

This research considers the joint-decoding problem for finite-state channels (FSCs) and low-density parity-check (LDPC) codes. First, the linear-programming (LP) decoder for binary linear codes is extended to joint decoding of binary-input FSCs. In particular, we provide a rigorous definition of LP joint-decoding pseudocodewords that enables evaluation of the pairwise error probability between codewords and pseudocodewords in AWGN. This leads naturally to a provable upper bound on the decoder failure probability. If the channel is a finite-state intersymbol-interference channel, then the joint LP decoder also has the maximum-likelihood (ML) certificate property and all integer-valued solutions are codewords. After deriving these results, we discovered that some elements were equivalent to earlier work by Flanagan on linear-programming receivers. In the second part, we develop an efficient iterative solver for the joint LP decoder discussed in the first part. Next, we extend the approach of iterative approximate LP decoding, proposed by Vontobel and Koetter and analyzed by Burshtein, to this problem. By taking advantage of the dual-domain structure of the joint-decoding LP, we obtain a convergent iterative algorithm for joint LP decoding whose structure is similar to BCJR-based turbo equalization (TE). The result is a joint iterative decoder whose per-iteration complexity is similar to that of TE but whose performance is similar to that of joint LP decoding. The main advantage of this decoder is that it appears to provide the predictability and superior performance of joint LP decoding with the computational complexity of TE. One expected application is coding for magnetic storage, where the required block-error rate is extremely low and system performance is difficult to verify by simulation.
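For the memoryless case that joint LP decoding builds on, the basic LP decoder is short enough to sketch. Below is Feldman-style LP decoding of the [7,4] Hamming code via `scipy.optimize.linprog`: minimize the LLR-weighted cost over the fundamental polytope, cut out by one "odd-set" inequality per odd-size subset of each check's neighborhood. (The joint decoder in this work additionally couples in the channel trellis, which is not shown.)

```python
import itertools
import numpy as np
from scipy.optimize import linprog

# Parity-check matrix of the [7,4] Hamming code.
H = np.array([[1, 1, 0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]])

def lp_decode(llr):
    """LP decoding: minimize sum_i llr_i * x_i subject to, for each check
    with neighborhood N and each odd subset S of N,
        sum_{i in S} x_i - sum_{i in N\\S} x_i <= |S| - 1,
    with 0 <= x_i <= 1.  llr_i = log p(y_i | 0) / p(y_i | 1)."""
    n = H.shape[1]
    a_ub, b_ub = [], []
    for row in H:
        nbrs = np.flatnonzero(row)
        for size in range(1, len(nbrs) + 1, 2):      # odd-size subsets only
            for s in itertools.combinations(nbrs, size):
                coef = np.zeros(n)
                coef[list(nbrs)] = -1.0
                coef[list(s)] = 1.0
                a_ub.append(coef)
                b_ub.append(len(s) - 1.0)
    res = linprog(llr, A_ub=np.array(a_ub), b_ub=np.array(b_ub),
                  bounds=[(0.0, 1.0)] * n)
    return res.x
```

The ML certificate property mentioned above appears here too: whenever the LP optimum is integral, it is guaranteed to be the ML codeword.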
Soft-Decision Decoding via Multiple Decoding Attempts for Reed-Solomon Codes

One popular approach to soft-decision decoding of Reed-Solomon (RS) codes is based on using multiple trials of a simple RS decoding algorithm in combination with erasing or flipping a set of symbols or bits in each trial. We consider a novel framework based on rate-distortion (RD) theory to analyze these multiple-decoding algorithms. By defining an appropriate distortion measure between an error pattern and an erasure pattern, the successful decoding condition for a single errors-and-erasures decoding trial becomes equivalent to the distortion being less than a fixed threshold. Finding the best set of erasure patterns then turns into a covering problem that can be solved asymptotically by RD theory. Thus, the proposed approach can be used to understand the asymptotic performance-versus-complexity tradeoff of multiple errors-and-erasures decoding of RS codes. This initial result has also been extended in a few directions. The rate-distortion exponent (RDE) is computed to give more precise results for moderate blocklengths. Multiple trials of algebraic soft-decision (ASD) decoding are analyzed using this framework. Analytical and numerical computations of the RD and RDE functions are also presented. Finally, simulation results show that sets of erasure patterns designed using the proposed methods outperform other algorithms with the same number of decoding trials.
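The success condition behind this framework is the classical errors-and-erasures bound: a trial with e unerased errors and f erasures succeeds iff 2e + f < d_min = n - k + 1. One distortion measure consistent with the stated equivalence (an unerased error costs 2, an erasure costs 1; the exact measure used in the work may differ) makes success the same as total distortion below d_min:

```python
def trial_succeeds(n, k, errors, erasures):
    """A single errors-and-erasures RS decoding trial succeeds iff
    2 * (#unerased errors) + (#erasures) < d_min = n - k + 1."""
    return 2 * len(errors - erasures) + len(erasures) < n - k + 1

def distortion(errors, erasures):
    """Distortion between an error pattern and an erasure pattern: an
    unerased error costs 2, any erasure costs 1.  With this measure,
    'trial succeeds' is exactly 'distortion < d_min'."""
    return 2 * len(errors - erasures) + len(erasures)
```

For the [255, 239] RS code (d_min = 17), nine errors alone are uncorrectable (2·9 = 18), but erasing four of the erroneous positions gives 2·5 + 4 = 14 < 17, so that trial succeeds; choosing sets of erasure patterns that keep this distortion small for likely error patterns is the covering problem above.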
Compressed Sensing and Coding

This work began with the observation that compressed sensing can be seen as a real-number extension of syndrome source coding. Since coding over the real numbers is akin to using a very large alphabet, we applied coding techniques designed for large alphabets to compressed sensing. In particular, we used low-density measurement matrices with verification decoding and were able to derive rigorous sparsity thresholds based on density evolution.
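A minimal sketch of verification decoding with a sparse {0,1} measurement matrix. The two rules hold with probability 1 when the nonzero signal values are continuous, since distinct continuous values cannot cancel (exact zero tests are used here; real data would need a tolerance):

```python
import numpy as np

def verification_decode(a, y, max_iters=100):
    """Verification decoding of y = a @ x for a sparse {0,1} measurement
    matrix a (m x n).  Rules, valid w.p. 1 for continuous-valued nonzeros:
      1. if a check's residual is 0, all its unresolved neighbors are 0;
      2. if a check has exactly one unresolved neighbor, the residual is
         that neighbor's value."""
    m, n = a.shape
    x = np.full(n, np.nan)                     # nan marks an unresolved entry
    for _ in range(max_iters):
        progress = False
        for c in range(m):
            nbrs = np.flatnonzero(a[c])
            unresolved = [i for i in nbrs if np.isnan(x[i])]
            if not unresolved:
                continue
            residual = y[c] - sum(x[i] for i in nbrs if not np.isnan(x[i]))
            if residual == 0:                  # rule 1
                for i in unresolved:
                    x[i] = 0.0
                progress = True
            elif len(unresolved) == 1:         # rule 2
                x[unresolved[0]] = residual
                progress = True
        if not progress:
            break
    return x
```

This is the large-alphabet analogue of iterative erasure decoding, which is why density evolution yields rigorous sparsity thresholds for it.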
Achieving Capacity with Bounded Complexity

This work provided the first capacity-achieving code ensembles for the binary erasure channel whose iterative decoding complexity (per information bit) remains bounded as the gap to capacity vanishes. First, we designed irregular repeat-accumulate (IRA) codes where all systematic bits were punctured. Later, we found a systematic construction based on accumulate-repeat-accumulate (ARA) codes.
On the Capacity of Finite-State Channels

This work is based on a simple Monte Carlo algorithm that estimates the information rates of finite-state intersymbol-interference channels. Before this, the best approach was very complex and yielded only loose bounds. This algorithm (which was also discovered independently by Arnold and Loeliger) stimulated enough interest in this area that, within a few years, these capacities could be computed to almost any desired accuracy. This work was later extended to provably design irregular LDPC codes that achieve the i.i.d. capacity of some erasure channels with memory.
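The algorithm is easy to sketch for a toy finite-state channel (the channel choice and parameters below are illustrative). Here, the i.i.d.-input information rate of the dicode ISI channel y_t = x_t - x_{t-1} + noise is estimated by simulating one long realization and computing log p(y) with the forward (sum-product/BCJR) recursion over the two channel states, so that I ≈ (1/n)[log p(y|x) - log p(y)]:

```python
import numpy as np

def gauss(y, mean, var):
    return np.exp(-(y - mean) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

def estimate_info_rate(n=100000, sigma2=1.0, seed=0):
    """Monte Carlo estimate of the i.i.d.-input information rate (bits per
    symbol) of the dicode channel y_t = x_t - x_{t-1} + n_t, x_t in {+1,-1},
    with Gaussian noise of variance sigma2."""
    rng = np.random.default_rng(seed)
    x = rng.choice([-1.0, 1.0], size=n + 1)
    y = x[1:] - x[:-1] + rng.normal(0.0, np.sqrt(sigma2), size=n)

    # log2 p(y | x): with x known, the channel states are known.
    log_pygx = np.sum(np.log2(gauss(y, x[1:] - x[:-1], sigma2)))

    # log2 p(y): forward recursion over the state s = previous input.
    states = np.array([-1.0, 1.0])
    alpha = np.array([0.5, 0.5])
    log_py = 0.0
    for t in range(n):
        new = np.zeros(2)
        for j, xt in enumerate(states):        # next state equals current input
            new[j] = 0.5 * np.sum(alpha * gauss(y[t], xt - states, sigma2))
        scale = new.sum()                      # p(y_t | y_1..t-1)
        log_py += np.log2(scale)
        alpha = new / scale                    # normalize to avoid underflow
    return (log_pygx - log_py) / n
```

The per-step normalization is what keeps the recursion numerically stable over arbitrarily long simulated sequences, which is what lets the estimate converge to the true information rate.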
