Recent Research Summary

Physics-Based Deep Learning for Fiber-Optic Communications

optical fiber


Fiber-optic communication links carry virtually all intercontinental data traffic and are often referred to as the Internet backbone. The signal evolution in an optical fiber is implicitly described by the nonlinear Schrödinger equation, which is schematically illustrated in the above figure.

In this research, we explore the close connection between the propagation dynamics in optical fiber and deep learning. Our main observation is that the mathematical structure that results from applying spatial discretization methods (in particular the so-called split-step method) to the nonlinear Schrödinger equation is very similar to the structure of a conventional deep neural network with multiple layers. In a nutshell, the main idea is to obtain a machine-learning model by properly parameterizing the split-step method and viewing the linear steps as general linear functions, similar to the weight matrices in neural network. Compared to applying large neural networks (e.g., for signal equalization purposes), such physics-based machine-learning models are appealing because they incorporate important domain knowledge about the propagation dynamics directly into the model structure and obey well-established physical principles like symmetry and locality in order to reduce the number of trainable parameters. Moreover, these models also allow us to examine and interpret the learned solutions in order to understand why they perform well.


Capacity via Symmetry

EXIT functions of rate-1/2 Reed-Muller codes

Early work on algebraic codes often paid special attention to the symmetry of the code. With the rise of sparse random codes and iterative decoding, the idea that symmetry alone could lead to good performance started to seem rather implausible. However, in a breakthrough, we have shown that symmetry is actually sufficient to achieve capacity. Specifically, we proved that sequences of error-correcting codes with doubly-transitive permutation groups achieve capacity on erasure channels under symbol-wise maximum a posteriori (MAP) decoding. While the proof only holds for erasure channels, we believe the same result holds for general symmetric channels. This discovery was made independently by two groups: {Santhosh Kumar and myself} and {Shrinivas Kudekar, Marco Mondelli, Eren Sasoglu, and Ruediger Urbanke}.

This result solves the open problem of whether or not Reed-Muller codes achieve capacity for some rate bounded away from 0 and 1. This was a posted open problem from the 2015 Simons Institute program on Information Theory and the subject of a 1994 conjecture by Dumer and Farrell. It has also been discussed by many others including Arikan, Abbe, Shpilka, and Widgerson.

The same result also holds for primitive narrow-sense BCH codes and other classes of highly symmetric algebraic codes. Finally, the result for erasure channels has been extended to the quantum erasure channel and sequences of cyclic codes whose blocklengths have the proper numerology (e.g., increasing prime lengths suffice).

The conference paper was selected as Best Paper at STOC 2016 and the journal paper received the 2021 Information Theory Society Paper Award.


  • S. Kudekar, S. Kumar, M. Mondelli, , H. D. Pfister, E. Sasoglu, and R. Urbanke, “Reed-Muller codes achieve capacity on erasure channels,” accepted to IEEE Trans. Inform. Theory, 2016. arXiv

  • S. Kudekar, S. Kumar, M. Mondelli, , H. D. Pfister, E. Sasoglu, and R. Urbanke, “Reed-Muller codes achieve capacity on erasure channels,” in Proc. of the Annual ACM Symp. on Theory of Comp., 2016. paper slides

  • S. Kudekar, S. Kumar, M. Mondelli, H. D. Pfister, and R. L. Urbanke, “Comparing the bit-MAP and block-MAP decoding thresholds of Reed-Muller codes on BMS channels,” in Proc. IEEE Int. Symp. Inform. Theory, (Barcelona, Spain), pp. 1755-1759, 2016. arXiv

  • S. Kumar, R. Calderbank, and H. D. Pfister, “Reed-Muller codes achieve capacity on the quantum erasure channel,” in Proc. IEEE Int. Symp. Inform. Theory, (Barcelona, Spain), pp. 1750-1754, 2016.

  • S. Kumar, R. Calderbank, and H. D. Pfister, “Beyond double transitivity: Capacity-achieving cyclic codes on erasure channels,” in Proc. IEEE Inform. Theory Workshop, pp. 241-245, Sept 2016.

Simple Proofs of Threshold Saturation for Spatially-Coupled Systems

Potential function for (3,6) LDPC code

From 2011-2012, Kudekar, Richardson, and Urbanke showed that low-density parity-check (LDPC) convolutional codes (or spatially-coupled codes) can approach capacity on the binary erasure channel (BEC) and binary-input memoryless symmetric channels. The mechanism behind this spectacular performance is the threshold saturation phenomenon, which is characterized by the belief-propagation noise threshold of the spatially-coupled ensemble increasing to an intrinsic noise threshold (e.g., the MAP threhold) defined by the uncoupled system. While these results are quite impressive, the proof technique in the above work leads to long and complicated proofs.

Our main result is a simple proof technique that establishes threshold saturation for a general class of spatially-coupled systems. It is based on potential functions and both simplifies and generalizes most previous results in this area.


  • S. Kumar, A. J. Young, N. Macris, and H. D. Pfister, “Threshold saturation for spatially- coupled LDPC and LDGM codes on BMS channels,” submitted to IEEE Trans. on Inform. Theory, Nov. 2013 arxiv

  • A. Yedla, Y.-Y. Jian, P. S. Nguyen, and H. D. Pfister, “A simple proof of Maxwell saturation for coupled scalar recursions,” submitted to IEEE Trans. on Inform. Theory, 2013. arxiv

  • “Spatial coupling, potential functions, and the Maxwell construction” slides
    Annual Workshop on Information Theory and its Applications, San Diego, CA, Feb. 2013

  • “From BP to MAP via Spatial Coupling” slides
    Wireless Networking and Communication Seminar, University of Texas, Austin, Nov. 9th, 2012

  • “A Simple Proof of Threshold Saturation for Coupled Scalar Recursions” slides
    Stanford University, August 17th, 2012,
    Summer Research Institute, Ecole Polytechnique Federale de Lausanne, June 7th, 2012.

  • S. Kumar, A. J. Young, N. Macris, and H. D. Pfister, “A proof of threshold saturation for spatially-coupled LDPC codes on BMS channels,” in Proc. Annual Allerton Conf. on Commun., Control, and Comp., 2012.

  • A. Yedla, Y. Y. Jian, P. S. Nguyen, and H. D. Pfister, “A simple proof of threshold saturation for coupled vector recursions,” in Proc. IEEE Inform. Theory Workshop, 2012. arxiv

  • A. Yedla, Y.-Y. Jian, P. S. Nguyen, and H. D. Pfister, “A simple proof of threshold saturation for coupled scalar recursions,” in Proc. Int. Symp. on Turbo Codes & Iterative Inform. Proc., 2012. arxiv

  • Y.-Y. Jian, H. D. Pfister, and K. R. Narayanan, “Approaching capacity at high rates with iterative hard-decision decoding,” in Proc. IEEE Int. Symp. Inform. Theory, pp. 2696–2700, 2012. arxiv

Universality of Spatially-Coupled Codes for Multiuser Problems

BP Achievable Channel Parameter Region for Noisy Slepian-Wolf Problem using Spatially Coupled LDPC Codes

A noisy Slepian-Wolf problem is considered where two correlated sources are separately encoded and transmitted over two independent binary memoryless symmetric channels. Each channel capacity is assumed to be characterized by a single parameter which is not known at the transmitter. The receiver has knowledge of both the source correlation and the channel parameters. We call a system universal if it retains near-capacity performance without channel knowledge at the transmitter.

Kudekar et al. recently showed that terminated low-density parity-check (LDPC) convolutional codes (a.k.a. spatially-coupled LDPC ensembles) can have belief-propagation thresholds that approach their maximum a-posteriori thresholds. This was proven for binary erasure channels and shown empirically for binary memoryless symmetric channels. They also conjectured that the principle of spatial coupling is very general and the phenomenon of threshold saturation applies to a very broad class of graphical models.

In this research, we derive an area theorem for the joint decoder and empirically show that threshold saturation occurs for this problem. As a result, we demonstrate near-universal performance for this problem using the proposed spatially-coupled coding system. A similar result has also been obtained for the 2-user multiple-access channel.


  • P. S. Nguyen, A. Yedla, H. D. Pfister, and K. R. Narayanan, “On the maximum a posteriori decoding thresholds of multiuser systems with erasures,” in Proc. IEEE Int. Symp. Inform. Theory, (Cambridge, MA), pp. 2711–2715, July 2012.

  • A. Yedla, P. S. Nguyen, H. D. Pfister, and K. R. Narayanan, “Universal codes for the Gaussian MAC via spatial coupling,” in Proc. Annual Allerton Conf. on Commun., Control, and Comp., (Monticello, IL), Sept. 2011. paper slides

  • A. Yedla, H. Pfister, and K. Narayanan, “Universality for the noisy Slepian-Wolf problem via spatial coupling,” in Proc. IEEE Int. Symp. Inform. Theory , (St. Petersburg, Russia), July 2011. paper slides

Combining Queueing and Coding for Finite-State Channels

Combined state diagram of queue and Gilbert-Elliott Channel

This research considers the relationship between code-rate selection and queueing performance for communication systems subject to time-varying channel conditions. While error-correcting codes offer protection against channel uncertainties, there exists a natural tradeoff between the enhanced protection of low-rate codes and the rate penalty imposed by additional redundancy. In the limiting regime where codewords are asymptotically long, this tradeoff is well-understood and characterized by the Shannon capacity. However, for delay-sensitive communication systems and finite block lengths, a complete characterization of this tradeoff is not fully developed. This research offers a new perspective on the queueing performance of communication systems with finite block lengths operating over correlated erasure channels. A rigorous framework that links code rate to overall system performance for random codes is presented and guidelines are identified for code-rate selection in delay-sensitive systems.


  • S. Kumar, J.-F. Chamberland, and H. D. Pfister, “First-passage time and large-deviation analysis for erasure channels with memory,” in IEEE Trans. Inform. Theory, vol. 59, pp. 5547-5565, Sept. 2013.

  • S. Kumar, J.-F. Chamberland, and H. D. Pfister, “Large deviations on empirical service for erasure channels with memory,” in Proc. Annual Allerton Conf. on Commun., Control, and Comp., (Monticello, IL), Sept. 2011. paper slides

  • F. Hamidi-Sepehr, H. D. Pfister, and J. F. Chamberland, “On the queueing behavior of Gilbert-Elliott channels in the rare-transition regime,” in Proc. Conf. on Inform. Sciences and Systems, 2012.

  • F. Hamidi-Sepehr, Y. Cai, H. D. Pfister, and J.-F. Chamberland, “Queueing behavior of the Gilbert-Elliott channel: BCH codes and Poisson arrivals,” in Proc. IEEE Int. Symp. Inform. Theory , (St. Petersburg, Russia), July 2011. paper

  • P. Parag, J.-F. Chamberland, H. D. Pfister, and K. R. Narayanan, “Code-rate selection, queueing behavior, and the correlated erasure channel,” in IEEE Trans. on Inform. Theory, vol. 59, pp. 397-407, Jan. 2013. arxiv

  • P. Parag, J.-F. Chamberland, H. D. Pfister, and K. R. Narayanan, “On the queueing behavior of random codes over a Gilbert-Elliott erasure channel,” in Proc. IEEE Int. Symp. Inform. Theory , (Austin, TX), pp. 1798-1802, June 2010. paper slides

  • P. Parag, J.-F. Chamberland, H. D. Pfister, and K. R. Narayanan, “Code rate, queueing behavior and the correlated erasure channel,” in Proc. IEEE Inform. Theory Workshop , (Cairo, Egypt), Jan. 2010. paper slides

Spatially-Coupled Codes for ISI Channels

Potential function for (3,6) LDPC code

Recently, it has been observed that terminated low-density-parity-check (LDPC) convolutional codes (or spatially-coupled codes) appear to approach the capacity universally across the class of binary memoryless channels. This is facilitated by the threshold saturation effect whereby the belief-propagation (BP) threshold of the spatially-coupled ensemble is boosted to the maximum a-posteriori (MAP) threshold of the underlying constituent ensemble.

We consider the universality of spatially-coupled codes over intersymbol-interference (ISI) channels under joint iterative decoding. More specifically, we empirically show that threshold saturation also occurs in this case. This can be observed by identifying the EXIT/GEXIT curves that naturally obey the general area theorem. From these curves, the corresponding MAP and the BP thresholds are then numerically obtained and threshold saturation is observed.


  • P. S. Nguyen, A. Yedla, H. D. Pfister, and K. R. Narayanan, “Spatially-coupled codes and threshold saturation on intersymbol-interference channels,” to be submitted to IEEE Trans. on Inform. Theory, 2012. arxiv

  • P. S. Nguyen, A. Yedla, H. D. Pfister, and K. R. Narayanan, “Threshold saturation of spatially-coupled codes on intersymbol-interference channels.” IEEE Int. Conf. Commun., Jan. 2012. paper slides

  • C. Wang and H. D. Pfister, “Upper bounds on the MAP threshold of iterative decoding systems with erasure noise,” in Proc. Int. Symp. on Turbo Codes & Related Topics, Lausanne, Switzerland, Sep. 2008. paper slides (this pdf has been updated to correct a typo in Lemma 3)

Linear Programming Decoding of LDPC codes on Finite-State Channels

Potential function for (3,6) LDPC code

This research considers the joint-decoding problem for finite-state channels (FSCs) and low-density parity-check (LDPC) codes. First, the linear-programming (LP) decoder for binary linear codes is extended to joint-decoding of binary-input FSCs. In particular, we provide a rigorous definition of LP joint-decoding pseudo-codewords that enables evaluation of the pairwise error probability between codewords and pseudocodewords in AWGN. This leads naturally to a provable upper bound on decoder failure probability. If the channel is a finite-state intersymbol interference channel, then the joint LP decoder also has the maximum-likelihood (ML) certificate property and all integer-valued solutions are codewords. After deriving these results, we discovered some elements were equivalent to earlier work by Flanagan on linear-programming receivers. In the second part, we develop an efficient iterative solver for the joint LP decoder discussed in the first part.

Next, we extend the approach of iterative approximate LP decoding, proposed by Vontobel and Koetter and analyzed by Burshtein, to this problem. By taking advantage of the dual-domain structure of the joint-decoding LP, we obtain a convergent iterative algorithm for joint LP decoding whose structure is similar to BCJR-based turbo equalization (TE). The result is a joint iterative decoder whose per-iteration complexity is similar to that of TE but whose performance is similar to that of joint LP decoding. The main advantage of this decoder is that it appears to provide the predictability of joint LP decoding and superior performance with the computational complexity of TE. One expected application is coding for magnetic storage where the required block-error rate is extremely low and system performance is difficult to verify by simulation.


  • Byung-Hak Kim and H. D. Pfister, “Joint decoding of LDPC codes and finite-state channels via linear programming,” in IEEE J. Select. Topics in Signal Proc., vol. 5, pp. 1563-1576, Dec. 2011.

  • B.-H. Kim and H. D. Pfister, “An iterative joint linear-programming decoding of LDPC codes and finite-state channels,” in Proc. IEEE Int. Conf. Commun., June 2011. arxiv

  • B.-H. Kim and H. D. Pfister, “On the joint decoding of LDPC codes and finite-state channels via linear programming,” in Proc. IEEE Int. Symp. Inform. Theory , (Austin, TX), pp. 754-758, June 2010. paper slides

Soft-Decision Decoding via Multiple Decoding Attempts for Reed-Solomon Codes

system diagram for multiple decoding trials

One popular approach to soft-decision decoding of Reed-Solomon (RS) codes is based on using multiple trials of a simple RS decoding algorithm in combination with erasing or flipping a set of symbols or bits in each trial. We consider a novel framework based on rate-distortion (RD) theory to analyze these multiple-decoding algorithms. By defining an appropriate distortion measure between an error pattern and an erasure pattern, the successful decoding condition, for a single errors-and-erasures decoding trial, becomes equivalent to distortion being less than a fixed threshold.

diagram showing error patterns covered by decoder erasure paterns

Finding the best set of erasure patterns also turns into a covering problem that can be solved asymptotically by RD theory. Thus, the proposed approach can be used to understand the asymptotic performance-versus-complexity tradeoff of multiple errors-and-erasures decoding of RS codes. This initial result is also extended a few directions. The rate-distortion exponent (RDE) is computed to give more precise results for moderate blocklengths. Multiple trials of algebraic soft-decision (ASD) decoding are analyzed using this framework. Analytical and numerical computations of the RD and RDE functions are also presented. Finally, simulation results show that sets of erasure patterns designed using the proposed methods outperform other algorithms with the same number of decoding trials.


  • P. S. Nguyen, H. D. Pfister, and K. R. Narayanan, “On multiple decoding attempts for Reed-Solomon codes: A rate-distortion approach,” in IEEE Trans. on Inform. Theory, vol. 57, pp. 668-691, Feb. 2011. arxiv

  • P. S. Nguyen, H. D. Pfister, and K. R. Narayanan, “A rate-distortion exponent approach to multiple decoding attempts for Reed-Solomon codes,” in Proc. IEEE Int. Symp. Inform. Theory, (Austin, TX), pp. 1798-1802, June 2010. paper slides

  • P. S. Nguyen, H. D. Pfister, and K. R. Narayanan, “A rate-distortion perspective on multiple decoding attempts for Reed-Solomon codes,” in Proc. 47th Annual Allerton Conf. on Commun., Control, and Comp., (Monticello, IL), pp. 1235-1242, Oct. 2009. paper slides

Compressed Sensing and Coding

Encoder Tanner Graph for Compressed Sensing

This work began with the observation that compressed sensing can be seen as real-number extension of syndrome source-coding. Since coding over the real numbers is akin to using a very large alphabet, we applied coding techniques designed for large alphabets to compressed sensing. In particular, we used low-density measurement matrices with verification decoding and were able to derive rigorous sparsity thresholds based on density evolution.


  • “The effect of spatial coupling on compressive sensing,” in Proc. Annual Allerton Conf. on Commun., Control, and Comp., (Monticello, IL), 2010. arxiv

  • “Information Theory and Coding for Compressed Sensing” slides
    Center for Wireless Communications Ericsson Seminar
    University of California, San Diego, March 2, 2009.

  • F. Zhang and H. D. Pfister, “Verification decoding of high-rate LDPC codes with applications in compressed sensing,” in IEEE Trans. on Inform. Theory, vol. 58, pp. 5042-5058, Aug. 2012. arxiv

  • F. Zhang and H. D. Pfister, “Analysis of verification-based decoding on the q-ary symmetric channel for large q,” in IEEE Trans. on Inform. Theory, vol. 57, pp. 6754-6770, Oct. 2011. arxiv

  • F. Zhang and H. D. Pfister, “On the iterative decoding of high rate LDPC codes with applications in compressed sensing,” in Proc. 46th Annual Allerton Conf. on Commun., Control, and Comp., Monticello, IL, Sep. 2008. paper slides
    S. Kudekar and H. Pfister,

  • F. Zhang and H. D. Pfister, “Compressed Sensing and Linear Codes over Real Numbers,” in Proc. 2008 Workshop on Inform. Theory and Appl., UCSD, La Jolla, CA, Feb. 2008. paper slides

Achieving Capacity with Bounded Complexity

Tanner graph of an accumulate repeat accumulate code

This work provided the first capacity-achieving code ensembles for the binary erasure channel whose iterative decoding complexity (per information bit) is bounded as the gap to capacity vanishes. First, we designed irregular repeat accumulate (IRA) codes where all systematic bits were punctured. Later, we found a systematic construction based on accumulate repeat accumulate (ARA) codes.


  • “Capacity-Achieving Codes for the BEC with Bounded Complexity” slides
    Texas A&M University, April 2006.

  • H. D. Pfister and I. Sason, “Accumulate-repeat-accumulate codes: Capacity-achieving ensembles of systematic codes for the erasure channel with bounded complexity,” in IEEE Trans. on Inform. Theory, vol. 53 (6), pp. 2088-2115, June 2007. paper

  • H. D. Pfister and I. Sason, “Capacity-achieving ensembles of accumulate-repeat-accumulate codes for the erasure channel with bounded complexity,” in Proc. 2006 Workshop on Inform. Theory and Appl., UCSD, La Jolla, CA, Feb. 2006. paper slides

  • H. D. Pfister, I. Sason, and R. Urbanke, “Capacity-achieving ensembles for the binary erasure channel with bounded complexity,” in IEEE Trans. on Inform. Theory, vol. 51 (7), pp. 2352-2379, July 2005. paper

  • H. Pfister, I. Sason and R. Urbanke, “Bounds on the decoding complexity of punctured codes on graphs,” in Proc. 42nd Allerton Conf. on Comm., Control and Computing, Monticello, Illinois, pp. 481-491, Sept. 2004. paper

On the Capacity of Finite-State Channels

i.i.d. capacity of some partial-response recording channels in AWGN

This work is based on a simple Monte Carlo algorithm that estimates information rates of finite-state intersymbol-interference channels. Before this, the best approach was very complex and yielded only loose bounds. This algorithm (which was also discovered independently by Arnold and Loeliger) stimulated enough interest in this area that, a few years later, these capacities can now be computed to almost any desired accuracy. This work was later extended to provably design irregular LDPC codes that achieve the i.i.d. capacity of some erasure channels with memory.


  • H. D. Pfister, “The capacity of finite-state channels in the high-noise regime,” in Entropy of Hidden Markov Processes and Connections to Dynamical Systems (B. Marcus, K. Petersen, and T. Weissman, eds.), Cambridge University Press, 2011. arxiv

  • H. D. Pfister and P. H. Siegel, “Joint iterative decoding of LDPC codes for channels with memory and erasure noise,” in IEEE J. Select. Areas Commun., vol. 26 (2), pp. 320-327, Feb. 2008. paper

  • J. B. Soriaga, H. D. Pfister and P. H. Siegel, “Determining and approaching achievable rates of binary intersymbol interference channels using multistage decoding,” in IEEE Trans. on Inform. Theory, vol. 53 (4), pp. 1416-1429, April 2007. (IEEE COMSOC Data Storage Best Paper 2008) paper

  • J. B. Soriaga, H. D. Pfister, and P. H. Siegel, “On the low-rate Shannon limit for binary intersymbol interference channels,” in IEEE Trans. on Communications, vol. 51 (12), pp. 1962-1964, Dec. 2003. paper

  • “On the Capacity of Finite-State Channels” slides
    Technion Institute, Haifa, Israel, March 2005.

  • H. D. Pfister and P. H. Siegel, “Joint iterative decoding of LDPC codes and channels with memory,” in Proc. 3rd Intl. Symp. on Turbo Codes and Related Topics, Brest, France, pp. 15-18, Sep. 2003. paper slides

  • H. D. Pfister, “The Capacity of Finite State Channels”, PhD Thesis Chapter 4, 2003. pdf

  • H. D. Pfister, “Joint Iterative Decoding of LDPC Codes and Channels with Memory”, PhD Thesis Chapter 5, 2003. pdf

  • H. D. Pfister, J. B. Soriaga and P. H. Siegel, “On the achievable information rates of finite state ISI channels,” in Proc. IEEE Globecom, San Antonio, TX, pp. 2992-2996, Nov. 2001. paper slides