Coset Codes-Part I: Introduction and Geometrical Classification
G. DAVID FORNEY, JR., FELLOW, IEEE
Invited Paper
Abstract
Practically all known good constructive coding techniques for band-limited channels, including lattice codes and various recently proposed trellis-coded modulation schemes, can be characterized as coset codes. A coset code C(Λ/Λ′; C) is defined by a lattice partition Λ/Λ′ and by a binary encoder C that selects a sequence of cosets of the lattice Λ′.
The fundamental coding gain γ(C) of a coset code, as well as other important parameters such as the error coefficient N₀(C), the decoding complexity, and the constellation expansion factor, are purely geometric parameters determined by Λ/Λ′ and C.
The known types of coset codes, as well as a number of new classes that systematize and generalize known codes, are classified and compared in terms of these parameters.
I. Introduction
A. History
THE FACT that the channel capacity of band-limited channels with white Gaussian noise is some 9 dB beyond what can be achieved with simple pulse amplitude modulation (PAM) was an immediate consequence of Shannon's original work (see [1]). It is therefore remarkable that approximately three decades passed before serious work began on developing constructive coding techniques that could achieve sizable fractions of this potential gain.
The field was not completely inactive during this period. Shannon's work had indicated that there must be sphere packings in spaces of high dimension with sufficiently high density to approach channel capacity. In the 1950's and 1960's mathematicians developed some constructive dense sphere packings based on lattices in spaces of moderate to high dimension, notably the Barnes-Wall lattices [2] in dimensions that are powers of 2, and the ultradense 24-dimensional Leech lattice [3]. The earliest advocate of the use of lattices for communications appears to have been Lang in Canada in the early 1960's (see, e.g., [4, preface]). In an interesting interplay between mathematics and communications, it seems that Lang's calculations of bounds on maximum lattice density in high dimensions helped to motivate Leech to
discover his now-famous lattice [3, p. 265]. Another longtime proponent of lattices in communications, also in Canada, has been deBuda, who proved that the coding theorem applies to lattice codes [5]. Connections between lattice theory and coding theory were made by Leech and Sloane [6], and Sloane has since continued to develop bridges between these disciplines. This work is authoritatively and comprehensively summarized in [7].
It is probably fair to say, however, that it was the trellis-coded modulation schemes of Ungerboeck [8] that captured the attention of the modulation community and inspired widespread practical application as well as intensified research. Through a technique called "mapping by set partitioning" of one- and two-dimensional signal constellations, combined with binary convolutional codes, Ungerboeck showed how coding gains of the order of 3 dB could be obtained with simple four-state codes, while gains of up to 6 dB could be obtained with more complex (128-state) codes. A variant [9] of Ungerboeck's eight-state two-dimensional scheme has been adopted in international standards for both private-line modems and switched-network modems and is coming into wide commercial use.
More recently, trellis-coded modulation schemes using multidimensional signal constellations have been developed. A simple four-dimensional scheme of Gallager was presented by Forney et al. in [1], and a similar scheme was discovered independently by Calderbank and Sloane [10]. Wei [11] has developed a class of multidimensional schemes that are highly suited for implementation, one of which is used in a Codex 19.2-kbit/s modem. Calderbank and Sloane [12], [13] have also developed a variety of classes of new trellis codes.
In an earlier paper [1] Forney et al. pointed out that all schemes known at that time, including the most important lattice codes, could be generated by the same basic elements:
- a conventional binary encoder, block or convolutional, operates on certain of the data bits to provide a larger number of coded bits;
- these coded bits select one of the subsets of a partitioned signal constellation;
- additional uncoded bits select an individual signal point from the selected subset.
This way of looking at coding schemes has several important consequences:
- The code distance properties, and thus the fundamental coding gain, are determined by the binary encoder and the subset partitioning, which are largely decoupled from the choice of signal constellation in the third step. In this last step there is a trade-off between optimal shaping of the signal constellation and implementation simplicity, but this is almost independent of the fundamental coding scheme and has only a minor effect on the overall coding gain. Finally, in decoding, the first operation can always be to determine the best signal point in each subset and its metric; after that step, decoding depends again only on the fundamental code structure determined by the first two encoding operations.
- Different classes of codes can be readily compared and contrasted in this common framework. In this paper, which is to a large extent a sequel to [1], we concern ourselves only with the code structure imposed by the first two encoding operations, regarding the constellation shaping as peripheral; in our view this clarifies the similarities and differences between various schemes.
B. Introduction to Coset Codes
Calderbank and Sloane [13] have made the important observations that the signal constellation should be regarded as a finite set of points taken from an infinite lattice, and that the partitioning of the constellation into subsets corresponds to the partitioning of that lattice into a sublattice and its cosets. This lattice/coset language is both illuminating and powerful, and we have found that all of the good coded modulation schemes mentioned above can be put into this framework. (These codes may be more specifically characterized as lattice-type coset codes, which are the only type considered in this paper, apart from a brief mention of phase-modulated coset codes in the next section.)
We call this general class of coded modulation schemes coset codes. They seem to provide a general approach to the construction of implementable codes for band-limited channels that approach channel capacity, just as conventional codes (both block and convolutional) do for the power-limited case.
We now give a quick preview of the elements of coset codes, and of key terms and concepts that will figure in the rest of the paper. Fig. 1 illustrates the general structure of an encoder for a coset code C(Λ/Λ′; C), embodying the three principal elements just described and using the language of lattices
Fig. 1. General structure of encoder for coset code C(Λ/Λ′; C).
and cosets (compare [1, fig. 10] or [13, fig. 2]). The main ingredients are as follows:
- An N-dimensional lattice Λ, which we can think of as an infinite regular array of points in N-space. The signal points will all be taken from a finite subset of points lying within a translate (coset) of Λ, and the set of all possible signal points is called the signal constellation.
- A sublattice Λ′ of Λ, i.e., a subset of the points of Λ which is itself an N-dimensional lattice. The sublattice Λ′ induces a partition Λ/Λ′ of Λ into |Λ/Λ′| cosets of Λ′, where |Λ/Λ′| is the order of the partition; when Λ and Λ′ are binary lattices, the order of the partition is a power of 2, say 2^(k+r), and correspondingly, the partition divides the signal constellation into 2^(k+r) subsets, each corresponding to a distinct coset of Λ′.
- A rate-k/(k+r) binary encoder C, which takes in k bits per N dimensions and puts out k+r coded bits; the latter select one of the 2^(k+r) cosets of Λ′ in the partition Λ/Λ′. The redundancy r(C) of C is r bits per N dimensions; the normalized redundancy per two dimensions is 2r/N.
The coset code C(Λ/Λ′; C) is the set of all sequences of signal points that lie within a sequence of cosets of Λ′ that could be specified by a sequence of coded bits from C. Some lattices, including the most useful ones, can be generated as lattice codes C(Λ/Λ′; C), where Λ and Λ′ are lattices of lower dimension and C is a binary block code. If C is a convolutional encoder, then C(Λ/Λ′; C) is a trellis code.
The fundamental coding gain of a coset code C is denoted by γ(C) and is defined by two elementary geometrical parameters: the minimum squared distance d²_min(C) between signal point sequences in C, and the fundamental volume per N dimensions, which is equal to 2^(r(C)+ρ(Λ)), where the redundancy r(C)+ρ(Λ) is equal to the sum of the redundancy r(C) of the encoder C and the redundancy ρ(Λ) = log₂ V(Λ) of the lattice Λ. In fact,

γ(C) = d²_min(C) / 2^ρ(C)

where the normalized redundancy (per two dimensions) is equal to ρ(C) = 2[r(C) + ρ(Λ)]/N.
To transmit b bits per N dimensions, the signal constellation must consist of 2^(b+r) points from a coset of Λ, partitioned into 2^(k+r) subsets, each consisting of 2^(b-k) points from a different coset of Λ′. Given a selected coset of Λ′, the b−k uncoded bits select a particular signal point from that coset. The constellation expansion factor (compared to an uncoded constellation of 2^b points from a coset of Λ) is thus 2^r per N dimensions, or 2^(2r/N) per two dimensions. This translates into an average power cost of a factor of 2^(2r/N) (or about 3(2r/N) dB), which is reflected in the formula for the fundamental coding gain just given.
The total coding gain is the product of the fundamental coding gain γ(C) with the shape gain γ_s of the finite constellation (γ_s is approximately equal to the ratio of the normalized second moment [7] of an N-cube to that of the region of N-space in which the constellation is contained). If the constellation is the set of points in a coset of Λ that lie within an N-cube, then γ_s ≈ 1 and the total coding gain is approximately γ(C). If the constellation is chosen more like an N-sphere to reduce average power, γ_s measures this reduction. Thus γ_s, unlike γ(C), reflects finite constellation effects. Even for the same code, γ_s will in general vary with the transmission rate, unlike γ(C). γ_s is usually much smaller than γ(C), being upper-bounded by the shape gain of an N-sphere (see next paragraph); however, calculating γ_s (or the total coding gain) is usually more cumbersome than calculating γ(C). Finally, γ_s is determined not by the code but by the choice of constellation boundary; in general, a similar gain can be achieved in uncoded systems by choosing a similar boundary for an uncoded constellation in N dimensions. For all these reasons we prefer to focus on the fundamental coding gain γ(C) in this paper. (Multidimensional constellations will be considered in [14].) We feel that the various values for coding gain that have appeared in the prior literature have confused rather than clarified the fundamental properties of these codes and the comparisons between them. (Calderbank and Sloane [13] also calculate an asymptotic coding gain whose value is independent of finite constellation effects. This gain is the combination of our "fundamental coding gain" and the shape gain of an N-sphere over an N-cube, which for even N = 2n is π(n+1)/6(n!)^(1/n); thus the sphere shape gain is 0.20 dB for N = 2, 0.46 dB for N = 4, and 0.73 dB for N = 8, with a limit of πe/6 (1.53 dB) as N → ∞ [1].)
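As a numerical check on the sphere shape gains quoted above, they can be computed directly from normalized second moments (a sketch; the function names are ours, and the standard N-ball volume and second-moment formulas are assumed):

```python
from math import pi, gamma, e, log10

def sphere_shape_gain(N):
    """Shape gain of an N-sphere over an N-cube (linear ratio).

    The normalized second moment of an N-cube is 1/12; for the N-ball it is
    1/((N + 2) * V_N^(2/N)), where V_N is the volume of the unit N-ball.
    """
    V_N = pi ** (N / 2) / gamma(N / 2 + 1)        # unit N-ball volume
    G_ball = 1.0 / ((N + 2) * V_N ** (2.0 / N))   # normalized second moment
    return (1.0 / 12.0) / G_ball

def dB(x):
    return 10 * log10(x)

# The shape gain grows with dimension toward the limit pi*e/6 (about 1.53 dB).
for N in (2, 4, 8, 16, 24):
    print(N, round(dB(sphere_shape_gain(N)), 2))
print("limit:", round(dB(pi * e / 6), 2))
```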
As a simple example of a coset code, let us consider the four-state two-dimensional Ungerboeck code illustrated in Fig. 2, transmitting 5 bits per two dimensions with the "square" 64-point constellation of Fig. 3(a). In this case the lattice Λ is the two-dimensional integer lattice Z², i.e., the set of all integer 2-tuples (the signal constellation is actually chosen from its translate Z² + (1/2, 1/2) for symmetry). Thus the scale is such that the minimum squared distance between points in the constellation is one. The sublattice Λ′ is the two-dimensional lattice 2Z², i.e., the set of all even integer 2-tuples. The partition Z²/2Z² has order 4, i.e., Z² is the union of four cosets of 2Z², which correspond to the points labeled a, b, c, and d in Fig. 3. There are 16 points in the constellation from each of the four cosets. The minimum squared distance between points in any coset of 2Z² is four. The encoder C is a rate-1/2 four-state convolutional encoder. Thus the redundancy per two dimensions is ρ(C) = 1. One of the five input bits per two dimensions goes into the encoder, and the two resulting coded bits select one of the
Fig. 2. Four-state two-dimensional Ungerboeck code.
Fig. 3. Two 64-point signal constellations based on the partition Z²/2Z². (a) Square. (b) Cross.
four cosets a, b, c, and d; the remaining four uncoded bits select one of the 16 points from the selected coset.
As we show below, C can be chosen so that d²_min(C) = 4; since ρ(Z²) = 0 (the integer lattice has zero redundancy), the fundamental coding gain is γ(C) = 4/2 = 2, or 3.01 dB (the minimum squared distance gain is a factor of four, but this is offset by a constellation expansion power cost of a factor of two, leaving a net gain of a factor of 2). Because the constellation is square, this is also the total coding gain. The total coding gain could be improved slightly by use of a more circular constellation, e.g., the 64-point "cross" constellation of Fig. 3(b), which is about 0.1 dB better [1].
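The arithmetic of this example can be verified directly: the four cosets of 2Z² are distinguished by coordinate parities, the intra-coset minimum squared distance is 4, and the gain follows from the coding-gain formula (a small sketch with helper names of our own choosing):

```python
from itertools import product
from math import log10

# Points of Z^2 in a small window, partitioned into the four cosets of 2Z^2
# by coordinate parities (the four subsets of Fig. 3).
pts = list(product(range(-4, 5), repeat=2))
cosets = {}
for (x, y) in pts:
    cosets.setdefault((x % 2, y % 2), []).append((x, y))

def min_d2(points):
    """Minimum squared distance between distinct points of a set."""
    return min((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
               for a in points for b in points if a != b)

# Four cosets, each with intra-coset minimum squared distance 4.
assert len(cosets) == 4
assert all(min_d2(c) == 4 for c in cosets.values())

# With d^2_min(C) = 4 and one redundant bit per two dimensions:
gain = 4 / 2 ** 1
print(gain, "=", round(10 * log10(gain), 2), "dB")   # 2.0 = 3.01 dB
```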
C. Other Coset Codes
The coset codes described in this paper are based on partitions of binary lattices. More generally, a coset code C(G/G′; C) can be defined whenever G is some set of discrete elements that forms an algebraic group, with some distance measure between elements of G; G′ is a subgroup of G such that the quotient group G/G′ has finite order; and C is an appropriate code whose codewords select sequences of cosets of G′ in the partition G/G′.
For instance, G can be a binary block code, with G′ a subcode, and Hamming distance as the distance measure. We shall see some examples of the construction of convolutional codes in this way in this paper (other such constructions appear in [15]). In [18] we shall see how such codes as the Reed-Muller and Golay codes can be built up from short codes in this way. In [16], we shall show how ternary codes, lattices, and trellis codes can be constructed as coset codes, where G and G′ are ternary block codes or lattices, and C is a block or convolutional code over the ternary field GF(3).
Finally, phase-modulated codes can be constructed as coset codes as follows. The signal constellation for M-ary PSK (M-PSK) can be regarded as the complex Mth roots of unity, which form a group under complex multiplication. If M′ divides M, the M′-PSK constellation is a subgroup. Thus, for example, 16PSK/8PSK/4PSK/2PSK/1PSK is a chain of two-way partitions. Although the minimum squared distances within these constellations are somewhat different from those in lattice partitions (e.g., 0.152/0.586/2/4/∞ for these constellations, compared to 1/2/4/8/16 for the comparable two-dimensional lattice partition chain Z²/RZ²/2Z²/2RZ²/4Z²), similar constructions to those presented here often yield good phase-modulated codes (e.g., the phase-modulated codes of Ungerboeck [8] or of LaFanchere et al. [17]).
In general, these code constructions rely very little on the linearity properties of the groups (e.g., lattices, sublattices) on which they are based, and the codes so constructed are often not linear, particularly the trellis codes. The essential properties of these sets seem to be their partition structure and related distance properties, which of course were the basis for Ungerboeck's constructions via 'mapping by set partitioning.' The primary benefit of starting with sets that are groups seems to be that their subgroups naturally induce useful partitions via coset decompositions.
D. Outline
The primary subject of this paper is the categorization of various lattice-type coset codes in terms of their key parameters and, ultimately, in terms of their performance as measured by their fundamental coding gains.
Since these codes are based on partitions of binary lattices, we begin with an introduction to such lattices in Section II, which is intended to be self-contained. A more comprehensive introduction to this family of lattices is given in the companion paper [18], hereafter referred to as part II. (By far the best general reference on lattices is the forthcoming encyclopedic book by Conway and Sloane [7].) Section III summarizes those results from part II most relevant to this paper, particularly the performance of lattices as lattice codes, their partition/distance structure, and the decoding complexity of lattices and lattice partitions as measured by the number of binary operations of the trellis-based decoding algorithms given in part II.
Section IV then extends lattice theory to trellis codes. Section V characterizes the principal trellis codes that have been introduced to date as coset codes. Section VI introduces some generic classes of trellis codes whose main parameters can be easily determined and which include codes similar to (and in many cases equivalent to) the principal known codes. Finally, Section VII compares and contrasts all of these codes in terms of performance vs. complexity.
It is intended that this paper and part II may be read independently; as a result, there is some overlap. The reader who desires to read both papers in the most logical order is advised to skim this paper quickly through Section II, omitting proofs; then to read part II, with primary focus on the material relating to Barnes-Wall lattices; and then to return to the rest of this paper. The mathematical level is kept as elementary as possible; for the more mathematically inclined reader, we recommend learning about lattices by reading Conway and Sloane [7].
II. A Lattice Primer
A. Definitions
A real lattice Λ is simply a discrete set of vectors (points, N-tuples) in real Euclidean N-space R^N that forms a group under ordinary vector addition, i.e., the sum or difference of any two vectors in Λ is in Λ. Thus Λ necessarily includes the all-zero N-tuple 0, and if λ is in Λ, then so is its additive inverse −λ. The vectors in a lattice may possibly span fewer than N dimensions; however, this will not be the case for any lattice considered here, so there will be no confusion if we call a lattice of real N-tuples an N-dimensional real lattice.
As an example, the set Z of all integers is essentially the only one-dimensional real lattice, up to scaling, and the prototype of all lattices. The set Z^N of all integer N-tuples is an N-dimensional real lattice for any N.
Lattices have only two principal structural characteristics. Algebraically, a lattice is a group; this property leads to the study of subgroups (sublattices) and partitions (coset decompositions) induced by such subgroups. Geometrically, a lattice is endowed with the properties of the space in which it is embedded, such as the Euclidean distance metric and the notion of volume in R^N. The following two sections are concerned with these two aspects of lattice structure.
Lattices closely related to a given real N-dimensional lattice Λ are obtained by the following operations.
- Scaling: If r is any real number, then rΛ is the lattice consisting of all multiples rλ of vectors λ in Λ by the scalar r.
- Orthogonal Transformation: More generally, if T is any scaled orthogonal transformation of N-space, then TΛ is the lattice consisting of all transformations Tλ of vectors λ in Λ by T. We say that TΛ is a version of Λ.
- Cartesian Product: The K-fold Cartesian product of Λ with itself, i.e., the set of all KN-tuples (λ₁, λ₂, …, λ_K) where each λ_k is in Λ, is a KN-dimensional lattice denoted by Λ^K.
For example, Z^N is the N-fold Cartesian product of Z with itself, and rZ^N is a scaled version of Z^N for any r and N. The two-dimensional lattice Z² is illustrated in Fig. 4.
Fig. 4. Lattice Z² and its sublattice RZ² (black dots).
The most important scaled orthogonal transformation for our purposes is the rotation operator R, defined by the matrix

R = [ 1   1 ]
    [ 1  -1 ].
RZ² is a version of Z² obtained by rotating Z² by 45° and scaling by √2, and is also illustrated in Fig. 4. The points in RZ² are a subset of the points in Z², meaning that RZ² is a sublattice of Z². Note that R² = 2I, where I is the identity operator (in two dimensions), so that R²Z² = 2Z².
We can define a 2n-dimensional rotation operator by letting R operate on each pair of coordinates in a 2n-tuple; with a slight abuse of notation, we denote by R any such rotation operator. For instance, in four dimensions,

R = [ 1   1   0   0 ]
    [ 1  -1   0   0 ]
    [ 0   0   1   1 ]
    [ 0   0   1  -1 ].
Note that R² = 2I for any N, where I is the identity operator in N dimensions, so that R²Λ = 2Λ for any real N-dimensional lattice Λ.
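These properties of R can be checked directly, using one concrete 2×2 choice with R² = 2I (an assumption on the exact entries; any version differing by a rotation or reflection behaves the same):

```python
# A concrete 2x2 choice for the rotation operator R, satisfying R^2 = 2I.
R = ((1, 1),
     (1, -1))

def mat_mul(A, B):
    """2x2 integer matrix product."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def apply(A, v):
    """Apply a 2x2 matrix to a 2-vector."""
    return tuple(sum(A[i][k] * v[k] for k in range(2)) for i in range(2))

# R^2 = 2I, so R(R Lambda) = 2 Lambda.
assert mat_mul(R, R) == ((2, 0), (0, 2))

# R doubles norms: |Rx|^2 = 2 |x|^2.
x = (3, -2)
Rx = apply(R, x)
assert sum(c * c for c in Rx) == 2 * sum(c * c for c in x)

# R maps Z^2 into the checkerboard sublattice (even coordinate sum).
for v in ((0, 0), (1, 0), (0, 1), (2, -1)):
    assert sum(apply(R, v)) % 2 == 0
```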
B. Group Properties
A coset of a lattice Λ, denoted by Λ + c, is the set of all N-tuples of the form λ + c, where λ is any point in Λ and c is some constant N-tuple that specifies the coset. Geometrically, the coset Λ + c is therefore a translate of Λ by c (if c is in Λ, then Λ + c = Λ). Two N-tuples are equivalent modulo Λ if their difference is a point in Λ. Thus the coset Λ + c is the set of all points equivalent to c modulo Λ.
A sublattice Λ′ of a lattice Λ is a subset of the elements of Λ that is itself a lattice, i.e., Λ′ is a subgroup of the additive group Λ. Thus, by elementary group theory, a sublattice Λ′ induces a partition (denoted by Λ/Λ′) of Λ into equivalence classes modulo Λ′ (the equivalence classes may be added modulo Λ′ and form the quotient group Λ/Λ′). We shall say that the order |Λ/Λ′| of the partition (or quotient group) is the number of such equivalence classes (in the mathematical literature, |Λ/Λ′| is usually called the index of Λ′ in Λ). Each equivalence class is a coset of Λ′ (one being Λ′ itself), or, geometrically, a translate of Λ′. For example, the partition Z²/RZ² has order 2, and Fig. 4 illustrates Z² as the union of two cosets of RZ². Of course, any N-dimensional integer lattice is a sublattice of Z^N.
If we take one element from each equivalence class, we obtain a system of coset representatives for the partition Λ/Λ′, denoted by [Λ/Λ′]. (In general, there are many ways of selecting such a system [Λ/Λ′], so the notation does not entirely specify the system.) Then every element of Λ can be written uniquely as a sum λ′ + c, where c is the coset representative of the equivalence class in which the element lies, and λ′ is an element of Λ′ (because the difference is in Λ′). This is called a coset decomposition of Λ and will be written here as

Λ = Λ′ + [Λ/Λ′].
For example, the two 2-tuples (0, 0) and (1, 0) are a system of coset representatives for the partition Z²/RZ², and every element of Z² may be written as the sum of one of these two 2-tuples with an element of RZ², i.e., Z² is the union of RZ² and RZ² + (1, 0) (the black dots and white dots in Fig. 4, respectively).
As another example, if m is any integer, the lattice mZ of integer multiples of m is a sublattice of Z. The partition Z/mZ is the partition of the integers into equivalence classes modulo m (mod m), and the order of the partition is m. The integers {0, 1, …, m−1} form a system of coset representatives [Z/mZ] for the partition Z/mZ, and every integer n can be written uniquely as n = mq + r, where mq is an element of mZ and r is in {0, 1, …, m−1} (thus [Z/mZ] is essentially the ring Z_m of integers modulo m). In particular, the partition Z/2Z has order 2 and divides the integers into two subsets, 2Z (the even integers) and 2Z + 1 (the odd integers).
More generally, for any N, the lattice mZ^N of all N-tuples of integer multiples of m is a sublattice of Z^N of order m^N, and the set {0, 1, …, m−1}^N of all N-tuples with coordinates in {0, 1, …, m−1} is a system of coset representatives for Z^N/mZ^N; hence Z^N = mZ^N + {0, 1, …, m−1}^N.
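The decomposition n = mq + r, and its componentwise extension to N-tuples, can be sketched in a few lines (the helper names are ours):

```python
def coset_decompose(n, m):
    """Write the integer n uniquely as n = m*q + r with r in {0, ..., m-1},
    i.e., m*q in the sublattice mZ and r the coset representative in Z/mZ."""
    r = n % m           # Python's % already returns a value in {0, ..., m-1}
    q = (n - r) // m
    return q, r

# Every integer falls in exactly one of the m cosets mZ + r.
assert coset_decompose(13, 8) == (1, 5)     # 13 = 8*1 + 5
assert coset_decompose(-3, 8) == (-1, 5)    # -3 = 8*(-1) + 5

def coset_rep(v, m):
    """Componentwise coset representative of an N-tuple in Z^N / mZ^N."""
    return tuple(x % m for x in v)

assert coset_rep((7, -2, 4), 2) == (1, 0, 0)
```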
A partition Λ/Λ′ also induces a coset decomposition of any coset of Λ, say Λ + a; for Λ + a = Λ′ + [Λ/Λ′] + a.
A partition chain Λ/Λ′/Λ″/⋯ is a sequence of lattices such that each is a sublattice of the previous one (in other words, Λ ⊇ Λ′ ⊇ Λ″ ⊇ ⋯). For example, Z/2Z/4Z/8Z/⋯ is an infinite sequence of two-way partitions of the integers.
A partition chain induces a multiterm coset decomposition chain, with a term corresponding to each partition; e.g., if Λ/Λ′/Λ″ is a partition chain, then

Λ = Λ″ + [Λ′/Λ″] + [Λ/Λ′]

meaning that every element of Λ can be expressed as an element of Λ″ plus a coset representative from [Λ′/Λ″] plus a coset representative from [Λ/Λ′]. For example, the chain Z/2Z/4Z/8Z/⋯ leads to the standard binary representation of an integer n:

n = b₀ + 2b₁ + 4b₂ + ⋯

where each b_j is 0 or 1, b₀ specifies the coset in the partition Z/2Z, b₁ specifies the coset in the partition 2Z/4Z, and so forth. That is,

Z = 2Z + {0, 1} = 4Z + 2{0, 1} + {0, 1} = 8Z + 4{0, 1} + 2{0, 1} + {0, 1} = ⋯.
For a related example with a finite chain, we can specify one of the eight cosets of 8Z in the partition Z/8Z (one of the equivalence classes of integers modulo 8) by 3 bits (b₀, b₁, b₂), where b₀ + 2b₁ + 4b₂ is the standard binary representation of the coset representative.
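The standard binary representation can thus be computed by peeling off one coset representative per level of the chain (a sketch for nonnegative integers; the function name is ours):

```python
def binary_coset_chain(n, depth):
    """Peel off coset representatives along the chain Z / 2Z / 4Z / ...:
    at step j, b_j in {0, 1} selects the coset of 2^(j+1)Z within 2^j Z.
    Assumes n >= 0."""
    bits = []
    for _ in range(depth):
        bits.append(n % 2)   # coset representative at the current level
        n //= 2              # descend to the (rescaled) sublattice
    return bits

# 13 = 1 + 0*2 + 1*4 + 1*8, so (b0, b1, b2, b3) = (1, 0, 1, 1).
assert binary_coset_chain(13, 4) == [1, 0, 1, 1]

# Three bits specify one of the eight cosets of 8Z in Z: 13 mod 8 = 5 = 1 + 4.
assert binary_coset_chain(13 % 8, 3) == [1, 0, 1]
```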
We may illustrate such a decomposition chain by a partition tower, as shown in Fig. 5(a). Each block in the tower represents one partition Λ/Λ′ in the chain, and the input to that block is a variable which selects one of the cosets of Λ′ in that partition (or, equivalently, one of the coset representatives in [Λ/Λ′]). The standard binary representation is illustrated in this way in Fig. 5(b) (note that the "least significant bit" appears at the top).

Fig. 5. Partition towers illustrating coset decomposition chains induced by lattice partition chains. (a) Λ/Λ′/Λ″/⋯. (b) Z/2Z/4Z/⋯.
C. Geometric Properties
The geometry of a real lattice Λ arises from the geometry of real Euclidean N-space R^N. The two principal geometrical parameters of Λ are the minimum squared distance d²_min(Λ) between its points and its fundamental volume V(Λ); these determine its fundamental coding gain γ(Λ).
The norm ‖x‖² of a vector x in R^N is the sum of the squares of its coordinates. Norms are nonnegative and in fact nonzero unless x = 0. The squared distance between two vectors x and y is the norm ‖x − y‖² of their difference.
Because a lattice consists of discrete points, the norms of all lattice points form an infinite set of discrete values that can be enumerated in ascending order. We call this the weight distribution of the lattice (theta series, in the lattice literature). The weight distribution is also the squared distance distribution between any point in the lattice and all other points, since any point λ in Λ can be taken as the origin by translation of Λ by −λ (looking out from any point in Λ, the lattice looks the same).
The minimum nonzero norm is thus the minimum squared distance d²_min(Λ) between any two points in Λ. The number of elements of Λ with this norm is the number of nearest neighbors of any lattice point (also called the kissing number, or multiplicity), and will be called here the error coefficient N₀(Λ).
For example, for any N, the integer lattice Z^N has d²_min(Z^N) = 1. The set of all integer N-tuples of norm 1 is the set of all permutations and sign changes of the vector (1, 0, ⋯, 0), so N₀(Z^N) = 2N.
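A brute-force enumeration confirms this error coefficient (illustrative code of ours; a search window of {−1, 0, 1} per coordinate suffices for norm 1):

```python
from itertools import product

def norm1_vectors(N):
    """All points of Z^N with norm 1: the permutations and sign changes
    of (1, 0, ..., 0)."""
    return [v for v in product((-1, 0, 1), repeat=N)
            if sum(x * x for x in v) == 1]

# The error coefficient (kissing number) of Z^N is 2N.
for N in (1, 2, 3, 4):
    assert len(norm1_vectors(N)) == 2 * N
```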
Loosely, the fundamental volume V(Λ) is the volume of N-space per lattice point, or the reciprocal of the number of lattice points per unit volume. More precisely, if we can partition N-space into regions of equal volume, one associated with each lattice point, then V(Λ) is the volume of each such region. For example, it is easy to see that we may partition N-space into N-cubes of side 1, one associated with each point of Z^N, so V(Z^N) = 1.
To treat the general case, note that R^N is itself a group under ordinary vector addition (but not a lattice, because its points are not discrete). Any real N-dimensional lattice Λ is a subgroup of R^N. Thus there is a partition R^N/Λ of N-space into equivalence classes modulo Λ (cosets of Λ) (in our original definition of a coset of Λ, implicitly we meant a coset in the partition R^N/Λ). Define a fundamental region R(Λ) as a region of N-space that contains one and only one point from each such equivalence class modulo Λ; thus R(Λ) is a system of coset representatives for the partition R^N/Λ. Every point of R^N is thus uniquely representable as a sum of a point in Λ and a point in R(Λ), i.e., there is a coset decomposition R^N = Λ + R(Λ). Geometrically, this is a tessellation of N-space by translates of fundamental regions of Λ. While there is no unique fundamental region, every fundamental region must have the same volume (if it is measurable), since it is congruent to any other fundamental region modulo Λ; this uniquely defines the fundamental volume V(Λ).
For example, one fundamental region of the one-dimensional integer lattice Z is the half-open interval [0, 1); another is the half-open interval (0, 1]. Whatever fundamental region we take, however, its volume (length) must be one. Similarly, for Z^N, we may take as the fundamental region the half-open N-cube [0, 1)^N or (0, 1]^N; again the volume is one for any N.
Happily, the computation of the fundamental volume of an integer lattice Λ may be completely avoided by use of the following lemma, if we know the order of the partition Z^N/Λ.
Lemma 1: If Λ′ is a sublattice of Λ of order |Λ/Λ′|, then V(Λ′) = |Λ/Λ′| V(Λ).
Proof: Since only one of every |Λ/Λ′| points in Λ is in Λ′, the fundamental volume of Λ′ must be |Λ/Λ′| times larger than that of Λ for its fundamental regions to fill N-space. In fact, we may take a union of |Λ/Λ′| fundamental regions of Λ as a fundamental region of Λ′, one associated with each member of a system [Λ/Λ′] of coset representatives for Λ/Λ′, in view of the decomposition chain R^N = Λ′ + [Λ/Λ′] + R(Λ).
Corollary: If Λ is an N-dimensional integer lattice, then V(Λ) = |Z^N/Λ|. Notice that Lemma 1 does not essentially depend on Λ being a lattice; as long as Λ is a union of some number |Λ/Λ′| of cosets of Λ′, the points of Λ are |Λ/Λ′| times as dense in N-space as those of Λ′.
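The density statement in Lemma 1 can be checked numerically by counting lattice points in a large box (a rough sketch; the window size and tolerances are arbitrary choices of ours):

```python
from itertools import product

# Count points in the box [-50, 50]^2: a sublattice of order M has 1/M the
# point density of Z^2, hence M times the fundamental volume.
B = 50
z2 = [(x, y) for x, y in product(range(-B, B + 1), repeat=2)]
checkerboard = [(x, y) for (x, y) in z2 if (x + y) % 2 == 0]    # RZ^2, order 2
even = [(x, y) for (x, y) in z2 if x % 2 == 0 and y % 2 == 0]   # 2Z^2, order 4

ratio_r = len(z2) / len(checkerboard)   # ~2  ->  V(RZ^2) = 2
ratio_2 = len(z2) / len(even)           # ~4  ->  V(2Z^2) = 4
assert abs(ratio_r - 2) < 0.01
assert abs(ratio_2 - 4) < 0.1           # boundary effects, so a looser bound
```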
From the two geometrical parameters d²_min(Λ) and V(Λ), we define the fundamental coding gain γ(Λ) of a lattice Λ as follows:

γ(Λ) = d²_min(Λ) / V(Λ)^(2/N)

(in the mathematical literature this is called Hermite's parameter and is also denoted by the symbol γ). The fundamental coding gain is a normalized measure of the density of a lattice in the following various senses.

a) It is dimensionless. Both d²_min(Λ) and V(Λ)^(2/N) have the dimensions of a two-dimensional volume (area). We shall often find that the most appropriate normalization of other parameters is to two dimensions.

b) The fundamental coding gain is invariant to scaling, γ(rΛ) = γ(Λ), because d²_min(rΛ) = r²d²_min(Λ) and V(rΛ)^(2/N) = r²V(Λ)^(2/N).

c) More generally, the fundamental coding gain is invariant to any scaled orthogonal transformation T with scale factor α, γ(TΛ) = γ(Λ), because d²_min(TΛ) = α²d²_min(Λ) and V(TΛ) = |det T| V(Λ) = α^N V(Λ), where det T is the determinant of T. Thus any version of Λ has the same fundamental coding gain.

d) The fundamental coding gain is invariant to the Cartesian product operation, γ(Λ^K) = γ(Λ), because d²_min(Λ^K) = d²_min(Λ) and V(Λ^K)^(2/KN) = V(Λ)^(2/N). Thus if Λ is an N-dimensional lattice and Λ₁ is an N₁-dimensional lattice, then Λ^(N₁) in NN₁-space (where the two can be compared directly) has a greater or lesser density of points per unit volume than does Λ₁^N according to whether γ(Λ) is greater or less than γ(Λ₁), provided that they are scaled so that their minimum squared distances are the same. We therefore say that Λ is denser than Λ₁ if γ(Λ) > γ(Λ₁), regardless of whether Λ and Λ₁ have the same dimension.

e) For any N, γ(Z^N) = 1. An uncoded system may be defined as one that uses constellations based on Z^N (e.g., PAM uses constellations based on Z, and narrow-sense quadrature amplitude modulation (QAM) uses constellations based on Z²). Thus the fundamental coding gain of an arbitrary lattice Λ may be considered to be the gain using constellations based on Λ over an uncoded system using constellations based on Z^N.

f) More concretely, suppose we form a 2^b-point N-dimensional constellation by taking all points in an N-dimensional lattice Λ (or a coset of Λ) that lie within an N-sphere with radius chosen just large enough to enclose the desired number of points. The volume of the sphere must then be about 2^b V(Λ) for large 2^b. For the integer lattice Z^N, the volume of such a sphere must be about 2^b.
The ratio of the volumes of the two spheres is thus about V(Λ), so the ratio of their radii is about V(Λ)^(1/N) (this dimensional argument in fact holds for constellation boundaries of any shape). If we scale Λ so that d²_min(Λ) = d²_min(Z^N) = 1, then the minimum distance is the same as for Z^N, but we achieve an average power reduction of about V(Λ)^(2/N), or a power gain of 1/V(Λ)^(2/N) = γ(Λ), by using the constellation based on Λ rather than that based on Z^N. Thus γ(Λ) is normalized properly to measure a power gain, and we will often give its value in decibels.
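The definition and its invariances can be sketched in a few lines (the function names are ours):

```python
from math import log10

def coding_gain(d2_min, V, N):
    """Fundamental coding gain (Hermite's parameter): d2_min / V^(2/N)."""
    return d2_min / V ** (2.0 / N)

def dB(x):
    return 10 * log10(x)

# Z^N is the baseline: d2_min = 1, V = 1, gain 1 (0 dB) in any dimension.
assert coding_gain(1, 1, 4) == 1.0

# Invariance to scaling by a: d2_min -> a^2 d2_min, V -> a^N V.
a, N = 3.0, 4
assert abs(coding_gain(a ** 2, a ** N, N) - coding_gain(1, 1, N)) < 1e-12

# RZ^2: d2_min = 2, V = 2, so the gain is 2/2 = 1, the same as Z^2.
assert coding_gain(2, 2, 2) == 1.0

# D4: d2_min = 2, V = 2, N = 4 gives sqrt(2), about 1.51 dB.
print(round(dB(coding_gain(2, 2, 4)), 2), "dB")   # 1.51 dB
```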
For example, RZ² is a version of Z² with d²_min(RZ²) = 2 (the rotation operator R always doubles norms, in any number of dimensions). The partition Z²/RZ² has order 2, and thus V(RZ²) = 2. Thus we verify that γ(RZ²) = 2/2 = 1 = γ(Z²).
As an example of a denser lattice, the Schläfli lattice D₄ may be defined as the four-dimensional integer lattice consisting of all integer 4-tuples with an even number of odd coordinates or, equivalently, with even norm. The order of the partition Z⁴/D₄ is two, because Z⁴ is the union of D₄ and its coset D₄ + (0, 0, 0, 1) (the set of all integer 4-tuples with an odd number of odd coordinates or, equivalently, with odd norm). Thus V(D₄) = 2. Clearly, d²_min(D₄) = 2; therefore, the fundamental coding gain of D₄ is

γ(D₄) = 2/2^(1/2) = 2^(1/2).

Thus D₄ is denser than Z² or Z⁴ by a factor of 2^(1/2) (or 1.51 dB). The elements of D₄ of norm 2 are the 24 points obtained by permutations and coordinate sign changes of the 4-tuple (1, 1, 0, 0), so the error coefficient N₀(D₄) is 24.
Another way of comparing the density of D₄ to that of Z⁴ is the following. The lattice RZ⁴ is a version of Z⁴ with d²_min(RZ⁴) = 2 (since R doubles norms) and with V(RZ⁴) = 4 (since |Z⁴/RZ⁴| = 4). Moreover, RZ⁴ is a sublattice of D₄, which must be of order 2 (by Lemma 1); in fact, D₄ is the union of RZ⁴ (the lattice of integer 4-tuples with even norms in both pairs of coordinates) and its coset RZ⁴ + (1, 0, 1, 0) (the set of integer 4-tuples with odd norms in both pairs of coordinates). However, RZ⁴ has the same minimum squared distance as D₄. Thus it is possible to take a version of Z⁴ and insert a translate of that version into the interstices between points without reducing the minimum distance. So D₄ has twice as many points as RZ⁴ per unit volume, with no decrease in d²_min; hence D₄ is twice as dense as Z⁴ in four dimensions, or 2^(1/2) times as dense per two dimensions, which is the normalization used in the definition of γ.
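These facts about D₄ are easy to verify by enumeration (a sketch; the search window {−2, …, 2} per coordinate is large enough to catch all minimum-norm points):

```python
from itertools import product
from math import sqrt, log10

# Enumerate small points of the Schlafli lattice D4: integer 4-tuples with an
# even coordinate sum (equivalently, even norm, since x^2 = x mod 2).
pts = [v for v in product(range(-2, 3), repeat=4) if sum(v) % 2 == 0]
norms = sorted(sum(x * x for x in v) for v in pts if any(v))

d2_min = norms[0]
N0 = norms.count(d2_min)
assert d2_min == 2 and N0 == 24      # minimum norm 2, kissing number 24

# gamma(D4) = d2_min / V^(2/N), with V(D4) = 2 and N = 4.
g = d2_min / 2 ** (2 / 4)
assert abs(g - sqrt(2)) < 1e-12
print(round(10 * log10(g), 2), "dB")  # prints 1.51 dB
```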
We see that Z⁴/D₄/RZ⁴/RD₄/2Z⁴ is a chain of two-way partitions with distances 1, 2, 2, 4, 4 (1/2/2/4/4 for short). However, since D₄ is a sublattice of Z⁴, it follows that RD₄ is a sublattice of RZ⁴, 2D₄ is a sublattice of 2Z⁴, and so forth. Hence Z⁴/D₄/RZ⁴/RD₄/2Z⁴/2D₄/2RZ⁴/… is an infinite chain of two-way partitions, with distances 1/2/2/4/4/8/8/….
D. Complex Lattices and Gaussian Integers
A complex lattice Λ is a discrete set of points in complex Euclidean n-space C^n that forms a group under ordinary (complex) vector addition. Again, we stipulate that the only such lattices to be considered here will actually span n dimensions, so we shall feel free to call such a Λ an n-dimensional complex lattice.
An obvious isomorphism (written Λ ↔ Λ′) exists between any 2n-dimensional real lattice Λ and a corresponding n-dimensional complex lattice Λ′, formed by taking each pair of coordinates of Λ to specify the real and imaginary parts of each coordinate of Λ′, or vice versa. Addition of two points gives the same result in either case. Sublattices, cosets, and all such group properties carry over. Even the norm of two corresponding vectors is the same, so distances are not affected. Thus for most purposes it makes no difference whether we consider a lattice to be real or complex. For all parameters previously defined (e.g., d²_min, V, γ_c), we may define the values for a complex lattice to be the same as those for the corresponding real lattice.
The only difference of any significance arises when we consider multiplicative operations, such as scaling, or the taking of inner products. A complex lattice may be scaled by either a real number α or a complex number α, the latter operation involving an equal phase rotation of each coordinate by the phase of α (as well as a scaling of lengths by |α|, or norms by |α|²). The inner product ⟨x, y⟩ of two real vectors x and y is the sum of the products of their coordinates and must be real; the (Hermitian) inner product ⟨x, y⟩ of two complex vectors x and y is the sum of the products of the coordinates of x with the complex conjugates of the coordinates of y and may be complex. Thus there may arise differences in definitions of orthogonality, duality, and so forth. In general, for the lattices considered in this paper, we shall prefer the complex definitions.
The simplest example of a complex lattice is the one-dimensional complex lattice G corresponding to the two-dimensional real lattice Z². The point (a, b) in Z² corresponds to the point a + bi in G, where a and b may be any pair of integers. The set G = {a + bi: a, b ∈ Z} is called the set of Gaussian integers.
The Gaussian integers actually form a system of complex integers analogous to the ordinary real integers Z. Multiplication of two elements of G (using complex arithmetic) yields another element of G, which cannot be 0 unless one of the two elements is 0 (in fact, their norms multiply as real integers). Thus G is a ring and, in fact, an integral domain. Indeed, we have unique factorization in G: every element of G can be expressed uniquely as a product of primes, up to units, where the units (invertible elements) are ±1 and ±i, and the primes are the elements that have no divisors other than themselves, up to units. The primes of G, in order of increasing norm, are 1 + i, 2 ± i, 3, 3 ± 2i, …, with norms 2, 5, 9, 13, …. We denote the prime of least norm by φ = 1 + i. (Note that 2 = −iφ², and thus two is not a prime in G.)
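A quick numerical sanity check of these arithmetic facts (a sketch of ours, modeling G with Python complex numbers; the helper name is hypothetical):

```python
phi = 1 + 1j          # the Gaussian prime of least norm

def gnorm(g):
    """Norm |g|^2 of a Gaussian integer, always an ordinary integer."""
    return round(g.real) ** 2 + round(g.imag) ** 2

# Norms multiply as real integers: N(gh) = N(g) N(h).
g, h = 2 + 1j, 3 + 2j
assert gnorm(g * h) == gnorm(g) * gnorm(h) == 5 * 13

# Two is not prime in G: 2 = -i * phi^2 (since phi^2 = 2i).
assert -1j * phi ** 2 == 2

print(gnorm(phi))   # -> 2
```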
We may scale G by any element λ ∈ G and obtain a sublattice λG of G. By Lemma 1, the partition G/λG must have order |λ|² (the norm of λ). There are thus |λ|² equivalence classes of G modulo λ.
For example, φG is a sublattice of G of order 2 and, in fact, is the complex lattice corresponding to the real lattice RZ². As with RZ², φG consists of all the elements of G with even norm, its coset φG + 1 consists of all the elements of G with odd norm, and the union of φG and φG + 1 is G (Fig. 4 may equally well be taken to illustrate this partition G/φG). The coset representatives may thus be taken as {0, 1}, and are isomorphic to GF(2) using modulo-φ arithmetic (since 2 ≡ 0 modulo φ).
More generally, φ^m G is a sublattice of G of order 2^m and, in fact, is the complex lattice corresponding to the real lattice R^m Z², which is equal to 2^{m/2}Z² for m even and to 2^{(m−1)/2}RZ² for m odd. As with R^m Z², φ^m G consists of all the elements of G whose norms are multiples of 2^m, and thus d²_min(φ^m G) = 2^m. There is then an infinite chain G/φG/φ²G/… of two-way partitions with distances 1/2/4/…, corresponding to the real chain Z²/RZ²/2Z²/…. In analogy to the chain Z/2Z/4Z/…, this chain suggests a complex binary representation of a Gaussian integer g:

g = a₀ + a₁φ + a₂φ² + ⋯

where a_j ∈ {0, 1}; a₀ specifies the coset of φG in the partition G/φG, a₁ specifies the coset of φ²G in the partition φG/φ²G, and so forth. That is, the complex binary representation is based on the coset decomposition

G = {0, 1} + φ{0, 1} + φ²{0, 1} + ⋯.
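The digits can be computed one coset at a time: a Gaussian integer lies in φG exactly when the sum of its real and imaginary parts is even, and division by φ is exact once the digit is subtracted. The following sketch (our own; the helper name is hypothetical) extracts the first m digits, which identify the coset of φ^m G containing g:

```python
def phi_digits(g, m):
    """First m digits of the complex binary representation of a Gaussian
    integer g; digit a_j selects a coset at level j of the chain
    G / phi G / phi^2 G / ... (phi = 1 + i)."""
    digits = []
    for _ in range(m):
        a = (round(g.real) + round(g.imag)) % 2   # g - a is then in phi*G
        digits.append(a)
        g = (g - a) * (1 - 1j) / 2                # exact division by phi
    return digits

d = phi_digits(3 + 2j, 4)
print(d)   # -> [1, 0, 0, 1]

# The digits identify g modulo phi^4: the residue is divisible by phi^4 = -4.
rep = sum(a * (1 + 1j) ** j for j, a in enumerate(d))
q = ((3 + 2j) - rep) / (1 + 1j) ** 4
assert q.real.is_integer() and q.imag.is_integer()
```

Note that the digits label cosets rather than reconstruct g exactly; a finite digit string determines g only modulo φ^m.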
For any lattice Λ, if λ is any lattice point and m is any integer, then mλ is a lattice point, so mΛ is a sublattice of Λ, and Λ (like any additive group) is a module over the ring Z of ordinary integers. However, a complex lattice is not necessarily a module over the ring G of Gaussian integers (for example, the two-dimensional hexagonal lattice is not). It is so if and only if λ ∈ Λ implies iλ ∈ Λ; for then if a + bi is any Gaussian integer, (a + bi)λ = aλ + b(iλ) is a lattice point. Then gΛ is a sublattice of Λ for any g ∈ G. In particular, φΛ is a sublattice of Λ; but since 2Λ = −iφ²Λ is a sublattice of φΛ, in fact Λ/φΛ/2Λ is a partition chain. When necessary, we shall call such a complex lattice a G-lattice.
In general, multiplication of a G-lattice Λ by the complex scalar φ has much the same effect as a transformation of the corresponding real lattice by the rotation operator R. The correspondence is not exact because R includes a reflection as well as a rotation and scaling, so that R² = 2I, whereas φ² = 2i. (We could have avoided this difficulty by exchanging columns in the definition of R.) However, if λ ∈ Λ implies λ̄ ∈ Λ, where λ̄ is the complex conjugate of λ, as will be true for all lattices to be considered here, then φΛ corresponds to the real lattice RΛ. The difference is slight, but we regard multiplication by the complex scalar φ as fundamentally a more natural operation than rotation by R.
E. Binary Lattices
Binary lattices have proved to be the most useful class of lattices in applications. On the one hand, this is because they are a natural extension of binary block codes and are well suited to the bit-oriented real world. On the other hand, in many cases they give the best performance, both as lattices and as the basis of trellis codes, a result which would have been harder to predict a priori. For instance, the densest known lattices in 8, 16, and 24 dimensions (among others) are binary lattices. We provide a brief introduction here; Part II discusses binary lattices in more detail.
A real n-dimensional lattice Λ is a binary lattice if it is an integer lattice (a sublattice of Z^n) that has 2^μ Z^n as a sublattice for some μ. The least such μ is called the 2-depth of the lattice. Thus Z^n/Λ/2^μ Z^n is a partition chain. It turns out that all of the binary lattices that have proved to be useful to date have 2-depth equal to one or two; we shall call such lattices mod-2 and mod-4 lattices, respectively.
A complex n-dimensional lattice Λ is a binary lattice if it is a Gaussian integer G-lattice (a sublattice of G^n) that has φ^ν G^n as a sublattice for some ν. The least such ν is called the φ-depth of the lattice. Thus G^n/Λ/φ^ν G^n is a partition chain.
If Λ is a 2n-dimensional real binary lattice, then the corresponding n-dimensional complex lattice is also a complex binary lattice (if it is a G-lattice), and vice versa, since 2^μ Z^{2n} corresponds to 2^μ G^n = (−i)^μ φ^{2μ} G^n. So we may speak of the φ-depth of a real 2n-dimensional binary lattice. A real 2n-dimensional binary lattice with 2-depth μ has φ-depth 2μ or 2μ − 1; thus the φ-depth is twice as fine-grained a parameter, and we shall henceforth call it simply the depth of a binary lattice. A mod-2 binary lattice thus has depth 1 or 2, and a mod-4 binary lattice has depth 3 or 4. For example, since G/φG/2G is a partition chain, φG is a mod-2 binary lattice with depth 1.
Since the order of the partition Z^{2n}/2^μ Z^{2n} (resp. G^n/φ^ν G^n) is a power of two, the orders of Z^{2n}/Λ and Λ/2^μ Z^{2n} (resp. G^n/Λ and Λ/φ^ν G^n) must be powers of two, since their product is 2^{2nμ} (resp. 2^{nν}). The redundancy r(Λ) of a binary lattice is defined as the binary logarithm of the order of Z^{2n}/Λ (resp. G^n/Λ), so that |Z^{2n}/Λ| = 2^{r(Λ)}. In view of the corollary to Lemma 1, the fundamental volume of a binary lattice is therefore V(Λ) = 2^{r(Λ)}, and the fundamental coding gain is

γ_c(Λ) = d²_min(Λ)/2^{ρ(Λ)},
where ρ(Λ) = 2r(Λ)/N is the normalized redundancy (per two dimensions) of Λ, where N is the dimension of Λ as a real lattice, or N = 2n if n is the dimension of Λ as a complex lattice.
If we choose a constellation of (say) 2^b points from Λ, they will occupy a volume in N-space approximately 2^{r(Λ)} times as large as the same number of points chosen from Z^N would. We therefore say that the constellation expansion factor is 2^{r(Λ)} in N dimensions, or 2^{ρ(Λ)} in two dimensions. As previously discussed, this translates into a power cost due to constellation expansion of a factor of 2^{ρ(Λ)}, or about 3ρ(Λ) dB. The formula for fundamental coding gain just given therefore has the following interpretation: the minimum squared distance gain of a factor of d²_min(Λ) (relative to Z^N) is partially offset by a constellation expansion power cost of a factor of 2^{ρ(Λ)}, leaving a net coding gain of γ_c(Λ) = d²_min(Λ)/2^{ρ(Λ)}.
The order of G^n/Λ is 2^{r(Λ)}, and the order of Λ/φ^ν G^n is 2^{nν − r(Λ)}. We may give k(Λ) = nν − r(Λ) the somewhat ugly but dual name of the 'informativity' of Λ, where n is the dimension of Λ as a complex lattice; the normalized informativity (per two dimensions) of Λ is κ(Λ) = k(Λ)/n = 2k(Λ)/N, where n is the dimension of Λ as a complex lattice, or N = 2n is the dimension of Λ as a real lattice. Thus the depth ν of a binary lattice is the sum of its normalized redundancy and informativity:

ν = ρ(Λ) + κ(Λ).
For example, the depth of D₄ is 1, its redundancy and informativity are both equal to 1, and its normalized redundancy and informativity are both equal to 1/2.
Moreover, if a sublattice Λ′ of a binary lattice Λ is also a binary lattice, then Λ/Λ′ is a partition whose order is a power of two, since G^n/Λ/Λ′/φ^ν G^n is a partition chain for some ν. We define the depth of a partition Λ/Λ′ of binary lattices as the depth of Λ′; the redundancy as the redundancy of Λ; and the informativity as the informativity of Λ′. It follows that the order of the partition is

|Λ/Λ′| = 2^{Nν/2 − r(Λ) − k(Λ′)},

where N is the dimension of Λ as a real lattice and ν is the depth of the partition.
F. Labelings of Partitions of Binary Lattices
If Λ and Λ′ are binary lattices such that Λ′ is a sublattice of Λ, then the partition Λ/Λ′ has order 2^m for some integer m. Any map from binary m-tuples to unique cosets of Λ′ in this partition is called a labeling. The labeling may be defined by a function c(b), where b = (b₁, b₂, …, b_m) is any binary m-tuple, called a label, and c(b) is a coset representative of the coset of Λ′ specified by b. We always assume that c(0) = 0, i.e., that the zero label maps to the zero coset, namely, Λ′ itself.
The following lemma shows, first, that any such partition can be broken up into a chain of two-way partitions Λ = Λ₀/Λ₁/⋯/Λ_m = Λ′; second, that there is then a labeling

c(b) = Σ_j b_j g_j,

where b_j is the jth coordinate of the binary m-tuple b and g_j is an element of Λ_{j−1} but not of Λ_j, such that the two vectors {0, g_j} are a system of coset representatives for the cosets of Λ_j in the two-way partition Λ_{j−1}/Λ_j. Thus the binary linear combinations of the generators g_j are a system of coset representatives for the partition Λ/Λ′ (this is a special case of a general result for groups of order 2^m, called binary groups, discussed in Part II).
In the coding context, we call such a labeling an Ungerboeck labeling.
Lemma 2: Let Λ and Λ′ be binary lattices such that Λ′ is a sublattice of Λ, and let |Λ/Λ′| = 2^m. Then there is a sequence of lattices Λ₁, Λ₂, …, Λ_{m−1} such that Λ = Λ₀/Λ₁/⋯/Λ_m = Λ′ is a lattice partition chain in which each partition Λ_{j−1}/Λ_j is two-way, and Λ has the coset decomposition

Λ = Λ′ + {0, g₁} + {0, g₂} + ⋯ + {0, g_m},

where g_j ∈ Λ_{j−1} and {0, g_j} is a system of coset representatives for Λ_{j−1}/Λ_j.
Sketch of proof (by induction, from j = m down to j = 1): Assume that Λ_j is a binary lattice such that Λ/Λ_j/Λ′ is a partition chain; Λ_m = Λ′ is certainly such a lattice. If Λ_j ≠ Λ, then a vector g_j exists in Λ that is not in Λ_j but has order 2, i.e., such that 2g_j ∈ Λ_j (see Part II). Let Λ_{j−1} then be the union of Λ_j and its coset Λ_j + g_j; Λ_{j−1} is clearly a lattice, which by construction is a sublattice of Λ and has Λ_j (and thus Λ′) as a sublattice. The two vectors {0, g_j} are a system of coset representatives for the cosets of Λ_j in the two-way partition Λ_{j−1}/Λ_j. By induction, the binary linear combinations Σ_j b_j g_j are a system of coset representatives for Λ/Λ′. The induction terminates when Λ_{j−1} = Λ.
Fig. 6 portrays an Ungerboeck labeling in two ways: as a partition tower, as in Fig. 5, and as a binary partition tree. In the tower, the first bit b₁ in the label selects one
Fig. 6. Illustration of Ungerboeck labeling by partition tower and by partition tree.
of the two cosets of Λ₁ in the partition Λ/Λ₁, the second bit b₂ selects one of the two cosets of Λ₂ in the partition Λ₁/Λ₂, and so forth. In the tree, these are shown as two-way branches. Thus an Ungerboeck labeling is nested, in the sense that the first j bits of the labeling of the partition Λ/Λ′ are a labeling of the partition Λ/Λ_j. If λ is a vector in Λ in the coset of Λ′ whose label is b, then λ is in a coset of Λ_j determined by the first j bits of b.
Consequently, we have the lattice/coset version of the Ungerboeck distance bound: if λ and λ′ are two different points of Λ that lie in cosets of Λ′ whose labels agree in their first j bits, then ‖λ − λ′‖² ≥ d²_min(Λ_j), because both λ and λ′ are in the same coset of Λ_j (this is a special case of the partition distance lemma of Part II).
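As a concrete illustration (ours, not from the text), the chain Z²/RZ²/2Z²/2RZ², with distances 1/2/4/8, yields a three-bit Ungerboeck labeling of Z²: the parity of x + y selects a coset of RZ², and applying R^{-1} to the residue recurses down the chain.

```python
def label(p, levels=3):
    """Ungerboeck label of an integer pair p for Z^2 / RZ^2 / 2Z^2 / 2RZ^2.
    At each level, b = (x + y) mod 2 picks a coset of R(current lattice);
    subtracting the representative (1, 0) and applying R^{-1} recurses."""
    x, y = p
    bits = []
    for _ in range(levels):
        b = (x + y) % 2
        bits.append(b)
        x, y = (x - b + y) // 2, (y - (x - b)) // 2   # R^{-1} of (x - b, y)
    return tuple(bits)

print(label((1, 0)))   # -> (1, 0, 0)
print(label((2, 0)))   # -> (0, 0, 1); agrees with label((0, 0)) = (0, 0, 0)
                       #    in its first 2 bits, and ||(2, 0)||^2 = 4 = 2^2
```

Brute-force checking confirms the distance bound: two points whose labels agree in their first j bits are at squared distance at least 2^j.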
G. Decomposition of Binary Lattices
This section shows that the structure of binary lattices (whether real or complex) is a generalization of that of binary block codes. Mod-2 binary lattices are essentially isomorphic to linear binary block codes, as we can see in the following lemma and proof (this is "Construction A" of Leech and Sloane [6]).
Recall that an (n, k) linear binary block code C is any k-dimensional subspace of the n-dimensional linear vector space of all binary n-tuples, where the coordinates of each of the 2^k codewords are regarded as elements of the binary field GF(2) = {0, 1}. The codewords may be expressed as linear combinations Σ_j b_j g_j of a set of k binary n-tuples g_j, 1 ≤ j ≤ k, called generators. The set of generators is called the generator matrix. The minimum Hamming distance d of the code is the minimum number of nonzero coordinates in any nonzero codeword; we shall sometimes call an (n, k) code with minimum Hamming distance d an (n, k, d) code. (Unless otherwise specified, a code will always mean a linear binary (n, k, d) code.)
Lemma 3: An n-dimensional real lattice Λ is a mod-2 binary lattice if and only if it is the set of all integer n-tuples that are congruent modulo 2 to one of the codewords in a linear binary (n, k) block code C. The redundancy of Λ is then r(Λ) = n − k, and its minimum squared distance is d²_min(Λ) = min {d, 4}, where d is the minimum Hamming distance of the code C.
Proof: If Λ is a mod-2 binary lattice, then Z^n/Λ/2Z^n is a partition chain, Λ/2Z^n has order 2^k for some integer k, and Z^n/Λ then has order 2^{n−k}; so the redundancy of Λ is n − k. Λ is thus the union of 2^k cosets of 2Z^n (in the partition Λ/2Z^n); any such coset of 2Z^n has a binary coset representative c, i.e., an n-tuple of ones and zeros, and the coset 2Z^n + c consists of all integer n-tuples congruent to c modulo 2. By Lemma 2, Λ = 2Z^n + {0, g₁} + ⋯ + {0, g_k}, where the g_j can be taken as binary n-tuples, 1 ≤ j ≤ k, and addition may be taken modulo 2Z^n, i.e., modulo 2. The set of coset representatives Σ_j b_j g_j is thus a linear binary (n, k) block code C. Conversely, if C is such a code, then the set of all integer n-tuples that are congruent modulo 2 to codewords in C is easily seen to be a lattice, with 2Z^n as a sublattice (the set of all integer n-tuples congruent to 0 modulo 2). If c ≠ 0, the minimum norm in the coset 2Z^n + c is the norm of c itself, which is also the Hamming weight of c. The minimum such weight is the minimum Hamming distance of the code, d. If c = 0, the minimum nonzero norm in 2Z^n itself is four. Thus d²_min(Λ) = min {d, 4}.
Lemma 3 says that any mod-2 lattice Λ may be constructed as a coset code, where C is a linear binary (n, k) block code. We can express Λ by the code formula (coset decomposition) Λ = C + 2Z^n. This explicitly exhibits Λ as the union of the 2^k cosets 2Z^n + c of 2Z^n. The labeling is linear in the sense that c(b ⊕ b′) ≡ c(b) + c(b′) modulo 2Z^n. This picture of a mod-2 lattice as a coset code is illustrated in Fig. 7.
Fig. 7. Illustration of mod-2 binary lattice Λ = C + 2Z^n as union of cosets of 2Z^n, each coset corresponding to a codeword in the linear binary code C.
For example, the mod-2 lattices Z², RZ², and 2Z² correspond to the (2, 2, 1), (2, 1, 2), and (2, 0, ∞) binary codes, respectively, and consequently have minimum squared distances 1, 2, and 4. The mod-2 lattices D₄, RZ⁴, and RD₄ correspond to the (4, 3, 2), (4, 2, 2), and (4, 1, 4) binary codes, with distances 2, 2, and 4.
In general, the lattice corresponding to any single-parity-check (n, n − 1, 2) code is a mod-2 lattice, the so-called "checkerboard lattice" D_n, with minimum squared distance 2 and redundancy 1, a sublattice of Z^n of order 2. The lattice D₂ is the lattice RZ².
Mod-2 lattices are limited to minimum squared distances of four or less because n-tuples in 2Z^n such as (2, 0, …, 0) are lattice points. Binary block codes with Hamming distance 4 are therefore of special interest. The Gosset lattice E₈ is the mod-2 lattice corresponding to the (8, 4, 4) Reed-Muller code; thus E₈ has d²_min(E₈) = 4, ρ(E₈) = 1, and thus γ_c(E₈) = 4/2 = 2 (3.01 dB). (E₈ has many remarkable properties; it is perhaps the second most important lattice in lattice theory.)
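Construction A is easy to exercise numerically. The sketch below (our own illustration; helper names are ours) builds a small linear code from its generators and finds d²_min of C + 2Z^n by brute force; the (4, 3, 2) single-parity-check code yields D₄ (d² = 2), and the (8, 4, 4) Reed-Muller code yields E₈ (d² = 4):

```python
from itertools import product

def span(gens):
    """All binary linear combinations of the generators: a linear code."""
    n = len(gens[0])
    words = {(0,) * n}
    for g in gens:
        words |= {tuple(w[i] ^ g[i] for i in range(n)) for w in words}
    return words

def d2min_mod2(code):
    """Brute-force min squared distance of the mod-2 lattice C + 2Z^n."""
    n = len(next(iter(code)))
    best = 16
    for p in product(range(-2, 3), repeat=n):
        if any(p) and tuple(x % 2 for x in p) in code:
            best = min(best, sum(x * x for x in p))
    return best

spc4 = span([(1, 1, 0, 0), (0, 1, 1, 0), (0, 0, 1, 1)])   # (4, 3, 2)
rm13 = span([(1,) * 8, (0, 1) * 4,
             (0, 0, 1, 1) * 2, (0,) * 4 + (1,) * 4])      # (8, 4, 4)

print(d2min_mod2(spc4), d2min_mod2(rm13))   # -> 2 4
```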
If Λ′ is a sublattice of Λ, where both are mod-2 lattices, it is easy to see that the code C′ associated with Λ′ must be a subcode of the code C associated with Λ.
In general, if Λ is a mod-4 lattice, then 4Z^n is a sublattice, and Lemma 2 allows us to write

Λ = 4Z^n + {0, g₁} + ⋯ + {0, g_k},

where each g_j may be taken as a coset representative of Λ/4Z^n, i.e., as an n-tuple of integers modulo 4, but where the label b is still a {0, 1}-valued integer k-tuple. Addition may be taken modulo 4. This exhibits Λ as a union of 2^k cosets of 4Z^n, where the coset representatives are c(b) = Σ_j b_j g_j. If we take the coordinates of c(b) to be from the set {−1, 0, 1, 2}, then c(b) is also a coset leader of its coset, i.e., an element of minimum norm.
For a further refinement, let Λ′ be the set of all points in Λ whose coordinates are all even. Then Λ′ is a lattice, a sublattice of Λ, with 4Z^n as a sublattice, so Λ/Λ′/4Z^n is a partition chain, and there is a coset decomposition of the form Λ = Λ′ + {0, g₁} + ⋯ + {0, g_{k₁}}. The lattice Λ′ is clearly a mod-2 lattice scaled by a factor of 2; consequently, Λ′ = 2C₂ + 4Z^n, where C₂ is a binary (n, k₂) code for some k₂, by Lemma 3; in other words, the coset representatives of Λ′/4Z^n may be taken as 2c₂, c₂ ∈ C₂, or 2Σ_j b_j h_j, where the h_j constitute a set of binary generators for the code C₂. Thus we may take k₂ of the generators of Λ to be the 2h_j, and we may write

Λ = 4Z^n + 2C₂ + {0, g₁} + ⋯ + {0, g_{k₁}},

where the g_j, 1 ≤ j ≤ k₁, are n-tuples of integers modulo 4 that are not all even, such that {Σ_j b_j g_j} is a system of coset representatives for Λ/Λ′. The generators g_j generate a lattice that is a sublattice of Λ, whose elements are congruent modulo 2 to codewords in a binary (n, k₁) block code C₁ that is a subcode of the code C₂, so C₁ ⊆ C₂.
It is sometimes possible to find a set of generators g_j such that each g_j is an n-tuple of ones and zeros and such that the "carries" (the twos-coefficients in the vector sum) of any sum Σ_j b_j g_j are a codeword in C₂; then we say that Λ is decomposable. The coset decomposition then becomes the code formula Λ = C₁ + 2C₂ + 4Z^n, where C₁ is a subcode of C₂. This means that Λ consists of all integer n-tuples whose coordinate ones-coefficients in the standard binary representation form a binary n-tuple that is a codeword in C₁, and whose coordinate twos-coefficients form a binary n-tuple that is a codeword in C₂ (this is the "coordinate array" idea of Leech and Sloane [6]).
The minimum squared distance of a mod-4 decomposable real lattice Λ = C₁ + 2C₂ + 4Z^n is

d²_min(Λ) = min {d₁, 4d₂, 16},

because, on the one hand, there are n-tuples in 4Z^n of norm 16, in 2C₂ + 4Z^n of norm 4d₂, and in C₁ + 2C₂ + 4Z^n of norm d₁; on the other hand, if λ ∈ Λ and c₁ ≠ 0, then ‖λ‖² ≥ d₁; if c₁ = 0 but c₂ ≠ 0, then ‖λ‖² ≥ 4d₂; finally, if c₁ = c₂ = 0, then ‖λ‖² ≥ 16 (if λ ≠ 0). This suggests that we shall want to choose codes C₁ and C₂ for which the Hamming distances d₁ and d₂ are in the ratio 4:1.
Most of the mod-4 binary lattices useful in practice are decomposable. For example, the lattice RE₈ is a version of the Gosset lattice E₈ which is mod-4 and decomposable, with the code formula RE₈ = (8, 1, 8) + 2(8, 7, 2) + 4Z⁸. The standard binary representation of an integer modulo 4 is a more elementary example of a decomposition of this type, with the code formula Z = (1, 1) + 2(1, 1) + 4Z; this simply means that every integer modulo 4 can be expressed as a binary linear combination of the two generators 2 and 1, where the combination can be specified by a 2-bit label.
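The distance formula can be checked coset by coset: every coset of 4Z^n has a representative with coordinates in {0, 1, 2, 3}, and a coordinate v contributes min(v², (4 − v)²) to the minimum norm of its coset. The sketch below (ours; helper names are hypothetical) confirms d² = 8 = min(8, 4·2, 16) for RE₈ = (8, 1, 8) + 2(8, 7, 2) + 4Z⁸:

```python
def span(gens):
    """All binary linear combinations of the generators: a linear code."""
    n = len(gens[0])
    words = {(0,) * n}
    for g in gens:
        words |= {tuple(w[i] ^ g[i] for i in range(n)) for w in words}
    return words

def d2min_mod4(C1, C2):
    """d^2_min of C1 + 2*C2 + 4Z^n via the minimum norm of each nonzero coset."""
    best = 16                                   # norm of (4, 0, ..., 0) in 4Z^n
    for c1 in C1:
        for c2 in C2:
            if any(c1) or any(c2):
                v = [a + 2 * b for a, b in zip(c1, c2)]   # coords in {0, 1, 2, 3}
                best = min(best, sum(min(x * x, (4 - x) ** 2) for x in v))
    return best

rep8 = {(0,) * 8, (1,) * 8}                                        # (8, 1, 8)
spc8 = span([tuple(1 if i in (j, j + 1) else 0 for i in range(8))
             for j in range(7)])                                   # (8, 7, 2)

print(d2min_mod4(rep8, spc8))   # -> 8
```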
The simplest example of an indecomposable mod-4 lattice is two-dimensional: C₂ is the (2, 1, 2) code with generator (1, 1), so 2C₂ has generator (2, 2), but the generator g₁ must be taken as (1, −1) or (−1, 1); however, this lattice can be made decomposable by inverting the sign of one coordinate. The Leech lattice Λ₂₄ is an example of a fundamentally indecomposable mod-4 lattice.
Fig. 8(a) illustrates the coset decomposition of a mod-4 binary lattice Λ that is used above. The k₂-bit label b₂ specifies a coset of 4Z^n in the partition Λ′/4Z^n, and the k₁-bit label b₁ specifies a coset of Λ′ in the partition Λ/Λ′. Altogether, therefore, the k-bit label (b₁, b₂) specifies a coset of 4Z^n in the partition Λ/4Z^n. Fig. 8(b) illustrates the same decomposition when Λ is a decomposable mod-4 binary lattice with code formula Λ = C₁ + 2C₂ + 4Z^n; the form is then that of a coset code. (If C₂ is the (n, n, 1) code, then Λ is actually a mod-2 lattice, and Fig. 8 reduces to Fig. 7.)
Fig. 8. Illustration of mod-4 binary lattice Λ as union of cosets of 4Z^n, each coset corresponding to binary label (b₁, b₂). (a) General case. (b) Case where Λ is decomposable.
It should be clear how to extend Fig. 8 to real binary lattices of any 2-depth μ, using a partition chain of sublattices of Λ consisting of all elements of Λ whose coordinates are multiples of 2, of 4, and so forth. The decomposition of complex binary lattices is similar. Mod-2 complex binary G-lattices are always decomposable, with a code formula of the form Λ = C₁ + φC₂ + 2G^n, as we can see from the following lemma. Complex binary lattices with depths of three or more are not necessarily decomposable, but the ones that we are concerned with generally are.
Lemma 4: An n-dimensional complex G-lattice Λ is a mod-2 binary lattice if and only if it is the set of all Gaussian integer n-tuples that are congruent modulo 2 to an n-tuple of the form c₁ + φc₂, where c₁ is a codeword in a binary (n, k₁) code C₁, and c₂ is a codeword in a binary (n, k₂) code C₂, of which C₁ is a subcode. The redundancy of Λ is then r(Λ) = 2n − k₁ − k₂, and its minimum squared distance is

d²_min(Λ) = min {d₁, 2d₂, 4}.
Proof: If Λ is a mod-2 binary lattice, then G^n/Λ/2G^n is a partition chain, and Λ/2G^n has order 2^k for some integer k. Since G^n/2G^n has order 2^{2n}, G^n/Λ has order 2^{2n−k}, so the redundancy of Λ is 2n − k. Λ is the union of 2^k cosets of 2G^n, and we may take the coordinates of the coset representative of any such coset to be of the form c₁ + φc₂, where c₁ and c₂ are binary n-tuples. Sums of coset representatives may be taken modulo 2G^n (i.e., modulo 2).
Consider the set Λ′ of all points in Λ whose coordinates are all multiples of φ. Then Λ/Λ′/2G^n is a partition chain, and there is a coset decomposition of the form Λ = Λ′ + {0, g₁} + ⋯. The generators of the lattice Λ′ modulo 2G^n may be taken as φh_j, 1 ≤ j ≤ k₂, for some k₂, where each h_j is a binary n-tuple; consequently, Λ′ = φC₂ + 2G^n, where C₂ is the binary (n, k₂) code generated by these h_j. Similarly, the generators of the lattice Λ modulo Λ′ may be taken as g_j, 1 ≤ j ≤ k₁, where each g_j is a binary n-tuple; consequently, Λ = C₁ + φC₂ + 2G^n, where C₁ is the binary (n, k₁) code generated by the g_j. Since Λ is a G-lattice, c₁ ∈ Λ implies φc₁ ∈ Λ. Therefore, C₁ must be a subcode of C₂.
The minimum distance expression arises from the fact that there are n-tuples of norm 4 in 2G^n, of norm 2d₂ in φC₂ + 2G^n, and of norm d₁ in C₁ + φC₂ + 2G^n; conversely, if λ ∈ Λ and c₁ ≠ 0, then ‖λ‖² ≥ d₁; if c₁ = 0 but c₂ ≠ 0, then ‖λ‖² ≥ 2d₂; finally, if c₁ = c₂ = 0, then ‖λ‖² ≥ 4 (if λ ≠ 0).
For the converse, let Λ be the set of all Gaussian integer n-tuples congruent modulo 2 to some c₁ + φc₂. To show that Λ is a G-lattice, we must show that if λ and λ′ are in Λ, then λ + λ′ ∈ Λ, and also that if λ ∈ Λ, then −λ ∈ Λ and iλ ∈ Λ. The first two propositions follow immediately from the linearity of C₁ and C₂. The third follows from φ(c₁ + φc₂) = φc₁ + 2c₂ ≡ φc₁ modulo 2, where the fact that φc₁ is a valid coset representative depends on C₁ being a subcode of C₂. Λ is a Gaussian integer lattice with 2G^n as a sublattice (by construction).
A mod-2 complex G-lattice has depth 1 if and only if the code C₂ of Lemma 4 is the (n, n, 1) universe code; then its code formula can be simplified to Λ = C₁ + φG^n. Otherwise, it has depth 2.
For example, as a complex lattice, the mod-2 depth-1 Schläfli lattice D₄ is decomposable with the code formula

D₄ = (2, 1, 2) + φG².

Thus r(D₄) = 1, k(D₄) = 1, and d²_min(D₄) = 2, in agreement with the values obtained earlier for D₄ as a real lattice. As a complex lattice, the mod-2 depth-2 Gosset lattice E₈ is decomposable with the code formula

E₈ = (4, 1, 4) + φ(4, 3, 2) + 2G⁴

(see Part II). Thus r(E₈) = 4, k(E₈) = 4, and d²_min(E₈) = 4, in agreement with the values obtained earlier for E₈ as a real lattice. The complex binary representation of a Gaussian integer is a more elementary example of a decomposition of this type.
As in the real case, complex binary lattices that are not mod-2 are not necessarily decomposable, i.e., expressible purely in terms of binary codes. However, the lattices that are useful in applications generally are. Let C₁, C₂, …, C_ν be a nested chain of binary (n, k_j, d_j) codes, i.e., C_j is a subcode of C_{j+1}. Then let Λ be the complex binary lattice whose elements are the set of Gaussian integer n-tuples that are congruent to c₁ + φc₂ + ⋯ + φ^{ν−1}c_ν modulo φ^ν, where each c_j is a codeword in the code C_j, i.e., the coefficient n-tuples in the complex binary representations of the coordinates are codewords in the corresponding codes. We represent this by the (complex) code formula

Λ = C₁ + φC₂ + ⋯ + φ^{ν−1}C_ν + φ^ν G^n.
This is the coordinate array idea again, but using the complex binary representation rather than the standard binary representation.
Some of the principal properties of such a complex decomposable lattice are as follows. a) Λ is a G-lattice, because if λ ∈ Λ, then φλ ∈ Λ, in view of the subcode structure of the code formula; thus φΛ is a sublattice of Λ. b) Λ has depth ν (assuming that C_ν is not the (n, n, 1) universe code). c) The order of the partition Λ/φ^ν G^n is the product of the 2^{k_j}; the informativity of Λ is k(Λ) = Σ_j k_j; the redundancy of Λ is r(Λ) = nν − Σ_j k_j; and the normalized redundancy of Λ is ρ(Λ) = ν − Σ_j k_j/n. d) The minimum squared distance of Λ is

d²_min(Λ) = min {d₁, 2d₂, …, 2^{ν−1}d_ν, 2^ν},
because, on the one hand, we can exhibit n-tuples with all of these norms by appropriate choice of codewords; on the other hand, if j is the smallest index such that c_j ≠ 0 in the above congruence, then λ has at least d_j coordinates with norm at least 2^{j−1}. This suggests that we shall want to choose a code partition chain for which the Hamming distances are in the ratio d₁ : d₂ : ⋯ : d_ν = 2^{ν−1} : 2^{ν−2} : ⋯ : 1.
A decomposable complex lattice of depth ν may be depicted as a coset code, analogously to Fig. 8(b), where the coset of φ^j G^n in the partition φ^{j−1}G^n/φ^j G^n is determined by a codeword from C_j, 1 ≤ j ≤ ν.
H. Dual Lattices
If Λ is a binary lattice of depth ν, either 2n-dimensional real or n-dimensional complex, we define its dual lattice Λ* as the set of all Gaussian integer n-tuples λ* that are orthogonal to all vectors λ ∈ Λ modulo φ^ν, where we use the complex inner product ⟨λ, λ*⟩. That is, ⟨λ, λ*⟩ must be a Gaussian integer in the lattice φ^ν G, i.e., a multiple of φ^ν. For example, it is easy to verify that D₄ is self-dual, using the fact that D₄ is the set of all pairs of Gaussian integers that are both even (in φG) or both odd (in φG + 1).
If Λ is a binary lattice of depth ν, we may regard G^n/Λ/φ^ν G^n and G^n/Λ*/φ^ν G^n as dual partition chains, since Λ* is the lattice of all vectors orthogonal to all vectors in Λ modulo φ^ν, and vice versa. It is straightforward to verify that the order of G^n/Λ* is the same as the order of Λ/φ^ν G^n, which implies that the order of Λ*/φ^ν G^n is the same as the order of G^n/Λ. Therefore, the redundancy of Λ* is the informativity of Λ, and vice versa. Moreover, if Λ/Λ′ is a partition of binary lattices of depth ν, then G^n/Λ/Λ′/φ^ν G^n and G^n/Λ′*/Λ*/φ^ν G^n are dual partition chains, and Λ′*/Λ* is a partition of binary lattices with the same order and depth as Λ/Λ′.
If Λ is a decomposable n-dimensional complex binary lattice of depth ν, with code formula

Λ = C₁ + φC₂ + ⋯ + φ^{ν−1}C_ν + φ^ν G^n,

then its dual lattice is a decomposable complex binary lattice of depth ν, with code formula

Λ* = C_ν^⊥ + φC_{ν−1}^⊥ + ⋯ + φ^{ν−1}C₁^⊥ + φ^ν G^n,

where C^⊥ is the dual code to C. (Recall that the dual code to a linear binary (n, k) block code C is an (n, n − k) code C^⊥ and that, if C is a subcode of C′, then C′^⊥ is a subcode of C^⊥.) This proposition may be verified by noting that C_ν^⊥, C_{ν−1}^⊥, …, C₁^⊥ is a nested chain of codes, that every generator of Λ* is orthogonal modulo φ^ν to every generator of Λ, and that the dimensions are such that the informativity of Λ* is equal to the redundancy of Λ and vice versa.
For example, the Schläfli lattice D₄ is self-dual, because its complex code formula is D₄ = (2, 1, 2) + φG², and the (2, 1, 2) code is self-dual. The Gosset lattice E₈ is self-dual, because its complex code formula is E₈ = (4, 1, 4) + φ(4, 3, 2) + 2G⁴, and the (4, 1, 4) and (4, 3, 2) codes are duals. An alternative definition of the dual of an N-dimensional real lattice Λ with 2-depth μ is the lattice consisting of all integer N-tuples orthogonal to all N-tuples in Λ modulo 2^μ, using the real inner product. For the lattices that we will be considering, this definition coincides with the definition given above when the depth (φ-depth) ν of Λ is even, so that ν = 2μ; when ν is odd, so that ν = 2μ − 1, the dual lattice under this definition will be a version of the dual lattice as defined earlier, rotated by the rotation operator R. When a real lattice is decomposable, the code formula for its dual has the same relation to its code formula as is given in the complex case above. For example, the Gosset lattice E₈ has depth 2 and has real code formula E₈ = (8, 4, 4) + 2Z⁸; its dual under this alternative definition is still E₈ itself, because the (8, 4, 4) code is self-dual. The Schläfli lattice D₄ has depth 1 and has real code formula D₄ = (4, 3, 2) + 2Z⁴; its dual under this alternative definition is RD₄, which has code formula RD₄ = (4, 1, 4) + 2Z⁴, since the (4, 3, 2) and (4, 1, 4) codes are duals.
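The code-level duality underlying these statements is easy to verify directly; the following sketch (ours, with hypothetical helper names) computes binary dual codes by brute force and checks that the (4, 3, 2) and (4, 1, 4) codes are duals and that the (8, 4, 4) Reed-Muller code is self-dual:

```python
from itertools import product

def dual(code):
    """All binary n-tuples orthogonal modulo 2 to every codeword of the code."""
    n = len(next(iter(code)))
    return {v for v in product((0, 1), repeat=n)
            if all(sum(a * b for a, b in zip(v, c)) % 2 == 0 for c in code)}

spc4 = {c for c in product((0, 1), repeat=4) if sum(c) % 2 == 0}   # (4, 3, 2)
rep4 = {(0, 0, 0, 0), (1, 1, 1, 1)}                                # (4, 1, 4)
print(dual(spc4) == rep4)   # -> True

gens = [(1,) * 8, (0, 1) * 4, (0, 0, 1, 1) * 2, (0,) * 4 + (1,) * 4]
rm13 = {tuple(sum(b * g[i] for b, g in zip(bits, gens)) % 2 for i in range(8))
        for bits in product((0, 1), repeat=4)}                     # (8, 4, 4)
print(dual(rm13) == rm13)   # -> True: the (8, 4, 4) code is self-dual
```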
III. Useful Lattices and Their Partitions
In this section we list the lattices that have proved useful in applications. These are primarily the sequence of Barnes-Wall lattices and their principal sublattices, all of which are closely interrelated. Another close relative is the Leech lattice, probably the most important lattice in lattice theory. We give the principal properties of these lattices, including their geometrical parameters and their partition properties. For further details, see part II.
The Barnes-Wall lattices are a family of binary lattices in N = 2^{m+1} real dimensions, m ≥ 0. The mth member of the family, which we denote here by BW_m, may be regarded as either a 2^m-dimensional complex G-lattice or a 2^{m+1}-dimensional real lattice. The first few members of the family are BW₀ = G (the Gaussian integer lattice, or the two-dimensional real integer lattice Z²), BW₁ = D₄ (the Schläfli lattice), BW₂ = E₈ (the Gosset lattice), BW₃ = Λ₁₆ (the 16-dimensional Barnes-Wall lattice), and BW₄ = Λ₃₂ (the 32-dimensional Barnes-Wall lattice). These are the densest lattices known in 4, 8, and 16 dimensions (and, until recently, 32) [7].
The Barnes-Wall lattices are decomposable, with code formulas that involve the family of Reed-Muller codes. Recall that the Reed-Muller code RM(r, m), 0 ≤ r ≤ m, is a code of length 2^m, minimum distance 2^{m−r}, and with k(r, m) = Σ_{0≤j≤r} C(m, j) information bits, where C(m, j) is the combinatorial coefficient m!/[j!(m − j)!]; further, the Reed-Muller codes of a given length are nested, in the sense that RM(m, m)/RM(m − 1, m)/⋯/RM(0, m) is a code partition chain.
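The stated parameters of RM(r, m), and the nesting, can be checked by generating each code from evaluations of Boolean monomials of degree at most r (a standard construction; the sketch and names below are ours):

```python
from itertools import combinations, product

def rm_code(r, m):
    """RM(r, m): binary linear span of evaluations of all monomials of
    degree <= r in m Boolean variables, as 2^m-tuples."""
    pts = list(product((0, 1), repeat=m))
    gens = [tuple(int(all(p[i] for i in vs)) for p in pts)
            for deg in range(r + 1) for vs in combinations(range(m), deg)]
    words = {(0,) * (1 << m)}
    for g in gens:
        words |= {tuple(a ^ b for a, b in zip(w, g)) for w in words}
    return words

C = rm_code(1, 3)
k = len(C).bit_length() - 1               # log2 of the code size
d = min(sum(w) for w in C if any(w))      # minimum Hamming distance
print(k, d)                               # -> 4 4   (the (8, 4, 4) code)
print(rm_code(0, 3) <= C <= rm_code(2, 3))   # -> True: RM codes are nested
```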
The Barnes-Wall lattice BW_m has depth equal to m and may be defined as the complex G-lattice that has the code formula

BW_m = RM(0, m) + φRM(1, m) + ⋯ + φ^{m−1}RM(m − 1, m) + φ^m G^n,

where n = 2^m. For example, as we have already seen, BW₁ = D₄ = (2, 1, 2) + φG², BW₂ = E₈ = (4, 1, 4) + φ(4, 3, 2) + 2G⁴, and so forth. Thus the complex code formula involves all Reed-Muller codes of length 2^m (this construction is due to Cusack [19]). There are similar real code formulas involving alternate Reed-Muller codes of length 2^{m+1}, as we shall tabulate below.
From the properties of decomposable binary lattices and Reed-Muller codes, the redundancy and informativity of BW_m are both equal to m·2^{m−1}, the normalized redundancy and informativity are both equal to m/2, and the minimum squared distance is equal to 2^m. Consequently, the fundamental coding gain is γ_c(BW_m) = 2^{m/2}. Because RM(r, m) and RM(m − r − 1, m) are dual codes, the dual of BW_m has the same code formula as BW_m itself, and BW_m is self-dual for any m.
The principal sublattices of the Barnes-Wall lattices are a family of lattices Λ(r, m), which may be defined as decomposable 2^m-dimensional complex lattices of depth m − r with the code formulas

Λ(r, m) = RM(r, m) + φRM(r + 1, m) + ⋯ + φ^{m−r−1}RM(m − 1, m) + φ^{m−r}G^n,

where n = 2^m. Thus Λ(m, m) is G^n, or the integer lattice Z^N, and Λ(0, m) is the Barnes-Wall lattice BW_m as previously defined. The lattice Λ(m − 1, m) is the "checkerboard lattice" D_N, with code formula Λ(m − 1, m) = RM(m − 1, m) + φG^n, where N = 2^{m+1} and RM(m − 1, m) is the single-parity-check code of length 2^m.
In view of the general formula for the dual of a decomposable complex G-lattice and the duality properties of Reed-Muller codes, the duals of the principal sublattices are the decomposable 2^m-dimensional complex G-lattices
TABLE I Useful Binary Lattices
| (N, ν) | Λ | Real Code Formula | Complex Code Formula |
|---|---|---|---|
| (2, 0) | Z² | Z² | G |
| (4, 0) | Z⁴ | Z⁴ | G² |
| (4, 1) | D₄ | (4, 3, 2) + 2Z⁴ | (2, 1, 2) + φG² |
| (8, 0) | Z⁸ | Z⁸ | G⁴ |
| (8, 1) | D₈ | (8, 7, 2) + 2Z⁸ | (4, 3, 2) + φG⁴ |
| (8, 2) | E₈ | (8, 4, 4) + 2Z⁸ | (4, 1, 4) + φ(4, 3, 2) + 2G⁴ |
| (16, 0) | Z¹⁶ | Z¹⁶ | G⁸ |
| (16, 1) | D₁₆ | (16, 15, 2) + 2Z¹⁶ | (8, 7, 2) + φG⁸ |
| (16, 2) | Λ(1, 3) | (16, 11, 4) + 2Z¹⁶ | (8, 4, 4) + φ(8, 7, 2) + 2G⁸ |
| (16, 3) | Λ₁₆ | (16, 5, 8) + 2(16, 15, 2) + 4Z¹⁶ | (8, 1, 8) + φ(8, 4, 4) + φ²(8, 7, 2) + φ³G⁸ |
| (32, 0) | Z³² | Z³² | G¹⁶ |
| (32, 1) | D₃₂ | (32, 31, 2) + 2Z³² | (16, 15, 2) + φG¹⁶ |
| (32, 2) | Λ(2, 4) | (32, 26, 4) + 2Z³² | (16, 11, 4) + φ(16, 15, 2) + 2G¹⁶ |
| (32, 3) | Λ(1, 4) | (32, 16, 8) + 2(32, 31, 2) + 4Z³² | (16, 5, 8) + φ(16, 11, 4) + φ²(16, 15, 2) + φ³G¹⁶ |
| (32, 4) | Λ₃₂ | (32, 6, 16) + 2(32, 26, 4) + 4Z³² | (16, 1, 16) + φ(16, 5, 8) + φ²(16, 11, 4) + φ³(16, 15, 2) + φ⁴G¹⁶ |
| (24, 0) | Z²⁴ | Z²⁴ | G¹² |
| (24, 1) | D₂₄ | (24, 23, 2) + 2Z²⁴ | (12, 11, 2) + φG¹² |
| (24, 2) | — | — | — |
| (24, 3) | H₂₄ | (24, 12, 8) + 2(24, 23, 2) + 4Z²⁴ | (indecomposable) |
| (24, 4) | Λ₂₄ | (indecomposable) | (indecomposable) |
of depth m − r, with the code formulas

Λ(r, m)* = RM(0, m) + φRM(1, m) + ⋯ + φ^{m−r−1}RM(m − r − 1, m) + φ^{m−r}G^n,

where n = 2^m.
Thus Λ(m, m)* = G^n, Λ(0, m)* = BW_m, and Λ(m − 1, m)* is the dual of the checkerboard lattice D_N, with code formula Λ(m − 1, m)* = RM(0, m) + φG^n, where N = 2^{m+1} and RM(0, m) is the repetition code of length 2^m.
Table I gives the real and complex code formulas of the Barnes-Wall lattices and their principal sublattices for up to 32 real dimensions (16 complex dimensions). In addition to the designations already introduced, certain of these lattices are given conventional designations in the table. For reference, we also give "code formulas" for the Leech lattice Λ₂₄ and its principal sublattices, including the Leech half-lattice H₂₄, which are closely related to their 32-dimensional relatives. (Λ₂₄ and its principal sublattice H₂₄ are indecomposable complex binary lattices of depths 4 and 3, respectively; in these two cases the "code formula" notation means the set of all binary linear combinations modulo 4 of a set of generators whose coordinates are integers modulo 4, such that the minimum nonzero norm in any coset with such a representative meets the stated value.)
Table II gives additional information on these lattices. The informativity k and the normalized informativity κ follow from the dimensionality of the Reed-Muller codes in the complex code formula, and from k (or from the code formulas) we can compute the redundancy r and the normalized redundancy ρ. For all of these lattices, the minimum squared distance is equal to 2^ν, where ν is the depth; it follows that the fundamental coding gain is given by γ_c = 2^{ν−ρ} = 2^κ. (This expression for the fundamental coding gain has the following interpretation: both Λ and φ^ν G^n have the same minimum squared distance 2^ν, but Λ is the union of 2^{k(Λ)} cosets of φ^ν G^n and is therefore 2^{k(Λ)} times as dense as φ^ν G^n in N-space. A constellation based on Λ will therefore be a factor of 2^{k(Λ)} smaller in N dimensions, or 2^{κ(Λ)} in two dimensions, which translates into power savings of a factor of 2^{κ(Λ)}.) We also give the normalized error coefficient (normalized to two dimensions), which may be obtained from the lattice literature or from the weight distributions of Reed-Muller codes. Finally, we give the number of states in the trellis diagrams for Λ that are derived in Part II, as well as the corresponding normalized decoding complexity per two dimensions, based on the number of binary operations (additions or comparisons of two numbers) required to decode (find the closest element of Λ to an arbitrary point in N-space), using the trellis-based algorithms of Part II. (Note that the constellation expansion factor is almost a factor of two smaller for the duals of the principal sublattices than for the principal sublattices themselves, while the fundamental coding gain is only slightly inferior. The error coefficient and decoding complexity are also about a factor of two smaller. Therefore, these dual lattices may be attractive alternatives to the densest lattices in practical applications. In addition,
TABLE II Useful Binary Lattices
| Λ | k(Λ) | κ(Λ) | r(Λ) | ρ(Λ) | d²min | γ(Λ) | γ (dB) | N₂ | states | complexity |
|---|---|---|---|---|---|---|---|---|---|---|
| Z² | 0 | 0 | 0 | 0 | 1 | 1 | 0.00 | 4 | 1 | 1 |
| Z⁴ | 0 | 0 | 0 | 0 | 1 | 1 | 0.00 | 4 | 1 | 1 |
| D4 | 1 | 1/2 | 1 | 1/2 | 2 | 1.41 | 1.51 | 12 | 2 | 3.5 |
| Z⁸ | 0 | 0 | 0 | 0 | 1 | 1 | 0.00 | 4 | 1 | 1 |
| D8 | 3 | 3/4 | 1 | 1/4 | 2 | 1.68 | 2.26 | 28 | 2 | 5.75 |
| E8 | 4 | 1 | 4 | 1 | 4 | 2 | 3.01 | 60 | 4 | 11.75 |
| Z¹⁶ | 0 | 0 | 0 | 0 | 1 | 1 | 0.00 | 4 | 1 | 1 |
| D16 | 7 | 7/8 | 1 | 1/8 | 2 | 1.83 | 2.63 | 60 | 2 | |
| H16 | 11 | 11/8 | 5 | 5/8 | 4 | 2.59 | 4.14 | 284 | 8 | ~32 |
| Λ16 | 12 | 3/2 | 12 | 3/2 | 8 | 2.83 | 4.52 | 540 | 16 | ~64 |
| Z³² | 0 | 0 | 0 | 0 | 1 | 1 | 0.00 | 4 | 1 | 1 |
| D32 | 15 | 15/16 | 1 | 1/16 | 2 | 1.92 | 2.82 | 124 | 2 | 7.5 |
| Q32 | 26 | 13/8 | 6 | 3/8 | 4 | 3.08 | 4.89 | 1244 | 16 | ~76 |
| H32 | 31 | 31/16 | 17 | 17/16 | 8 | 3.83 | 5.83 | 5084 | 128 | ~792 |
| Λ32 | 32 | 2 | 32 | 2 | 16 | 4 | 6.02 | 9180 | 256 | ~1584 |
| Z²⁴ | 0 | 0 | 0 | 0 | 1 | 1 | 0.00 | 4 | 1 | 1 |
| D24 | 11 | 11/12 | 1 | 1/12 | 2 | 1.89 | 2.76 | 92 | 2 | 7.25 |
| Q24 | 18 | 3/2 | 6 | 1/2 | 4 | 2.83 | 4.52 | 508 | 8 | ~42 |
| H24 | 23 | 23/12 | 13 | 13/12 | 8 | 3.78 | 5.77 | 8188 | 128 | ~632 |
| Λ24 | 24 | 2 | 24 | 2 | 16 | 4 | 6.02 | 16380 | 256 | ~1264 |
the Leech half-lattice H24 is decomposable as a real lattice, with the (24, 12, 8) Golay code appearing in the code formula, whereas the Leech lattice itself is not.)
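The coding-gain formula above can be spot-checked numerically against Table II. The following sketch is illustrative (the function name is ours, not the paper's); it computes γ = d²min · 2^(−ρ) in decibels from the minimum squared distance and the normalized redundancy per two dimensions.

```python
import math

def fundamental_coding_gain_db(d2_min, rho):
    """Fundamental coding gain gamma = d2_min * 2**(-rho), in dB.

    d2_min: minimum squared distance (= 2**mu for a binary lattice of
    depth mu); rho: normalized redundancy per two dimensions."""
    return 10 * math.log10(d2_min * 2 ** (-rho))

# Spot checks against Table II entries:
assert round(fundamental_coding_gain_db(2, 0.5), 2) == 1.51   # D4
assert round(fundamental_coding_gain_db(4, 1.0), 2) == 3.01   # E8
assert round(fundamental_coding_gain_db(8, 1.5), 2) == 4.52   # Lambda16
assert round(fundamental_coding_gain_db(16, 2.0), 2) == 6.02  # Lambda32, Lambda24
```

Note that the gain depends only on the pair (d²min, ρ), which is why the tables need not list the fundamental volume explicitly.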
Because of the nested character of Reed-Muller codes, the code formulas show that is a partition chain of -dimensional complex lattices of depths and with distances (for short). Also, we may verify that is a partition chain of -dimensional complex lattices of depths and with distances . Similarly, is a partition chain of -dimensional complex lattices of depths and with distances , and is a partition chain of -dimensional complex lattices of depths and with distances .
Fig. 9 is an illustration of these partition chains in dimensions 2, 4, 8, and 16, extended indefinitely using the lattices R^ν Z^N for all ν. The lattices are arranged in columns according to depth, where for the purposes of this diagram we regard R^ν Z^N as having depth ν for any ν. One unit of vertical distance corresponds to a two-way partition. The lines indicate sublattice relationships. From this diagram, we can easily find the least ν for which R^ν Z^N is a sublattice of any given lattice, and thus verify the depths of lattices and partition chains.
In the rest of this paper, we will be considering coset codes based on partitions of lattices that appear in Fig. 9. The partitions that we will use are generally those with at least as dense as , depths no greater than four, and orders no greater than . Table III summarizes some of the principal properties of such partitions, including: the (real) dimension ; the order
Fig. 9. Partition chains involving Barnes-Wall lattices, principal sublattices, and duals of principal sublattices in 2, 4, 8, and 16 dimensions.
of the partition; its depth μ, normalized informativity κ, and normalized redundancy ρ; the minimum squared distances of Λ and Λ′; and the normalized (per two dimensions) complexity of decoding the partition, where the complexity is the number of binary operations required by the trellis-based decoding algorithms of Part II to determine the closest element of each of the cosets of Λ′ in the partition to an arbitrary point in N-space. (Note: if Λ/Λ′ is a partition, then so is its dual; if the dual is simpler to decode than Λ/Λ′, the lesser complexity is given.) The final column gives the ratio of the complexity to the order of the partition, to show that the complexity is approximated by c times the order, where c is a small number in the range from one to six.
TABLE III Useful Lattice Partitions
| Λ/Λ′ | N | order | μ | κ | ρ | d²min(Λ) | d²min(Λ′) | complexity | c |
|---|---|---|---|---|---|---|---|---|---|
| Z²/RZ² | 2 | 2 | 1 | 0 | 0 | 1 | 2 | 2 | 1 |
| Z²/2Z² | 2 | 4 | 2 | 0 | 0 | 1 | 4 | 4 | 1 |
| Z²/2RZ² | 2 | 8 | 3 | 0 | 0 | 1 | 8 | 8 | 1 |
| Z²/4Z² | 2 | 16 | 4 | 0 | 0 | 1 | 16 | 16 | 1 |
| Z⁴/D4 | 4 | 2 | 1 | 1/2 | 0 | 1 | 2 | 5 | 2.5 |
| D4/RD4 | 4 | 4 | 2 | 1/2 | 1/2 | 2 | 4 | 10 | 2.5 |
| Z⁴/RD4 | 4 | 8 | 2 | 1/2 | 0 | 1 | 4 | 16 | 2 |
| D4/2D4 | 4 | 16 | 3 | 1/2 | 1/2 | 2 | 8 | 32 | 2 |
| Z⁴/2D4 | 4 | 32 | 3 | 1/2 | 0 | 1 | 8 | 56 | 1.75 |
| D4/2RD4 | 4 | 64 | 4 | 1/2 | 1/2 | 2 | 16 | 112 | 1.75 |
| Z⁴/2RD4 | 4 | 128 | 4 | 1/2 | 0 | 1 | 16 | 208 | 1.6 |
| Z⁸/D8 | 8 | 2 | 1 | 3/4 | 0 | 1 | 2 | 6.5 | 3.25 |
| D8/E8 | 8 | 8 | 2 | 1 | 1/4 | 2 | 4 | 30 | 3.75 |
| Z⁸/E8 | 8 | 16 | 2 | 1 | 0 | 1 | 4 | 44 | 2.75 |
| E8/RE8 | 8 | 16 | 3 | 1 | 1 | 4 | 8 | 60 | 3.75 |
| | 8 | 32 | 3 | 1 | 3/4 | 2 | 8 | 88 | 2.75 |
| D8/RE8 | 8 | 128 | 3 | 1 | 1/4 | 2 | 8 | 280 | 2.2 |
| Z⁸/RE8 | 8 | 256 | 3 | 1 | 0 | 1 | 8 | 504 | |
| E8/2E8 | 8 | 256 | 4 | 1 | 1 | 4 | 16 | 560 | 2.2 |
| D8/2E8 | 8 | | 4 | 1 | 1/4 | 2 | 16 | 3792 | ~2 |
| Z⁸/2E8 | 8 | | 4 | 1 | 0 | 1 | 16 | 7376 | ~2 |
| Z¹⁶/D16 | 16 | 2 | 1 | 7/8 | 0 | 1 | 2 | 7.25 | 3.6 |
| D16/H16 | 16 | 16 | 2 | 11/8 | 1/8 | 2 | 4 | 74 | 4.6 |
| Z¹⁶/H16 | 16 | 32 | 2 | 11/8 | 0 | 1 | 4 | 104 | 3.25 |
| H16/Λ16 | 16 | 128 | 3 | 3/2 | 5/8 | 4 | 8 | 776 | |
| D16/Λ16 | 16 | | 3 | 3/2 | 1/8 | 2 | 8 | 8440 | |
| Z¹⁶/Λ16 | 16 | | 3 | 3/2 | 0 | 1 | 8 | 16376 | ~4 |
| Λ16/RΛ16 | 16 | 256 | 4 | 3/2 | 3/2 | 8 | 16 | 1552 | ~6 |
IV. Trellis Codes
A. Introduction to Trellis Codes
A trellis code is a coset code as shown in Fig. 1, in which C is a rate-k/n convolutional code. In this paper C will always be a binary convolutional code, and Λ and Λ′ binary lattices, generally mod-2 or mod-4.
The codewords in a rate-k/n convolutional code may be expressed as sequences (y_j) of binary n-tuples y_j, which serve as labels that select cosets of Λ′ in the partition Λ/Λ′. The code sequences in a trellis code therefore consist of the sequences of elements of Λ that are congruent to some coset representative sequence (c(y_j)) modulo Λ′, where (y_j) is a codeword in C.
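The chain of maps just described — information bits to labels, labels to coset representatives, representatives to transmitted points — can be sketched in a few lines. The specific ingredients below (a four-state rate-1/2 encoder with octal generators 5 and 7, the partition Z²/2Z², and representatives in {0,1}²) are illustrative assumptions, not choices made in the text.

```python
def conv_encode(bits):
    """Rate-1/2 feedforward encoder with taps 101 and 111 (octal 5, 7);
    the state is the last two input bits, so there are four states."""
    s1 = s2 = 0
    labels = []
    for u in bits:
        labels.append((u ^ s2, u ^ s1 ^ s2))   # 2-bit label y = (y1, y2)
        s1, s2 = u, s1
    return labels

def transmit(bits, uncoded):
    """Map each label to the coset 2Z^2 + (y1, y2) of the partition
    Z^2 / 2Z^2, then pick one point of that coset with 'uncoded' data."""
    return [(y1 + 2 * a, y2 + 2 * b)
            for (y1, y2), (a, b) in zip(conv_encode(bits), uncoded)]

pts = transmit([1, 0, 1], [(0, 0), (-1, 1), (2, 0)])
# Every transmitted point is congruent to its label modulo 2:
assert [(x % 2, y % 2) for (x, y) in pts] == conv_encode([1, 0, 1])
```

The final assertion is exactly the congruence condition in the paragraph above: a code sequence is any sequence of lattice points congruent to the coset representative sequence modulo Λ′.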
For technical reasons, all sequences (a_j) are assumed to have a definite starting time j₀, although they may continue indefinitely. We may associate with any such sequence a formal power series in the delay operator D,

a(D) = Σ_{j ≥ j₀} a_j D^j,

where j₀ may be any integer, positive or negative. Thus a coset code maps a label sequence y(D) to a coset representative sequence c(D).
The important properties of a convolutional code are linearity and time-invariance. Linearity means that the mod-2 sum of any two codewords is a codeword. Time-invariance means that the time shift of any codeword is a codeword, i.e., if y(D) is a codeword, then so is Dy(D). (It follows that a convolutional code is a vector space over the field of binary formal Laurent series; the dimension of this vector space is k, and any codeword can be written as y(D) = x(D)G(D), where the generators g_i(D), 1 ≤ i ≤ k, form a generator matrix G(D); see [20].) We assume that the labeling function is time-invariant; then a trellis code is also time-invariant, although, as we shall see in more detail below, not necessarily linear.
A convolutional code has a well-defined state space, which is a vector space over the binary field of some dimension ν. The parameter ν is called the overall constraint length, or just constraint length, of C. The code can be generated by a linear (binary) finite-state machine with k inputs, n outputs, and ν binary memory elements; such an encoder has 2^ν states.
A trellis diagram for a 2^ν-state, rate-k/n convolutional code C is an extended state transition diagram for the encoder that generates C. For each time j, it has 2^ν nodes, or states, representing the possible states at time j. For each possible state transition, it has a branch connecting the two corresponding nodes. There are 2^k branches leaving and entering each node, and each is labeled with the n-tuple y that represents the encoder output associated with that state transition. Thus we may obtain a trellis diagram for a trellis code by taking a trellis diagram for C and replacing each label y by the corresponding coset representative c(y), representing the coset Λ′ + c(y).
The minimum Hamming distance d_H(C) of a convolutional code C is the minimum Hamming distance between any two codewords in C, i.e., the minimum number of coordinate differences between the outputs on any two paths in the trellis that start and end in a common state. Because C is linear, this is also the minimum Hamming weight of any nonzero codeword, which is the minimum weight of any path that starts and ends on the zero state.
The minimum squared distance d²min of a trellis code is the minimum squared distance between any two code sequences, which is the lesser of a) the minimum squared distance between sequences that correspond to two distinct paths that begin and end in a common state; and b) the minimum squared distance d²min(Λ′) between elements of Λ′, corresponding to "parallel transitions" associated with any given branch. (If e is a code sequence, then so is e + λ for any sequence λ of elements of Λ′.)
For example, the four-state Ungerboeck code shown in Figs. 2 and 3 uses the four-state rate-1/2 convolutional code whose encoder and trellis diagram are illustrated in Fig. 10. Contrary to convention, the encoder is shown in coset code form, using the partition (2,2)/(2,1)/(2,0) of binary codes of length 2. Let a be a coset representative for the nonzero coset in the partition (2,2)/(2,1), e.g., a = [01], and let b be the coset representative for the nonzero coset in the partition (2,1)/(2,0), i.e., b = [11].
The two bits y₁ and y₂ select a 2-tuple y₁a ⊕ y₂b, representing one of the four cosets of the (2,0) code (the single codeword [00]) in the four-way partition (2,2)/(2,0). In the trellis diagram, branches are labeled by both y and c(y).
Fig. 10. Convolutional encoder for the four-state Ungerboeck code of Figs. 2 and 3, and trellis diagram labeled with both y and c(y).
The minimum Hamming distance of this code (taking the outputs as binary 2-tuples) is five, because the distance between distinct paths is at least two where they diverge, two where they merge, and one somewhere in between. (This is because the difference between paths is a codeword of the (2,1,2) code where they merge and diverge, and a codeword in the (2,2,1) code somewhere in between; that is, we are exploiting the Ungerboeck distance bound for this code partition chain.)
The trellis code of Figs. 2 and 3 is obtained by replacing the code partition chain (2,2)/(2,1)/(2,0) by the corresponding partition chain of lattices, Z²/RZ²/2Z². Note that a and b are still coset representatives for the nonzero cosets of RZ² in the partition Z²/RZ² and of 2Z² in the partition RZ²/2Z², respectively, if we regard them as integer 2-tuples modulo 2. The trellis diagram of Fig. 10 then continues to represent this trellis code, where we now regard the 2-tuples as coset representatives of cosets of 2Z².
It is easy to see that the minimum squared distance between code sequences corresponding to distinct paths in the trellis is the minimum Hamming distance between sequences of coset representatives and thus is equal to 5. However, since d²min(2Z²) = 4 < 5, the minimum squared distance of the trellis code is 4. If e is any code sequence, the only code sequences at distance 4 from e are the sequences e + λ, where λ is one of the four elements of 2Z² of norm 4, namely (±2, 0) and (0, ±2). These are special cases of general results for trellis codes based on partitions of mod-2 lattices that will be given below.
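The path-distance argument can be checked with a small search. As an illustrative assumption (the text gives the encoder only in coset-code form), the sketch below uses the standard four-state rate-1/2 code with octal generators (5, 7), which also has minimum Hamming distance five; it finds the least-weight path that leaves and returns to the zero state, then applies the parallel-transition bound min(d_free, 4).

```python
import heapq

# Trellis of the four-state rate-1/2 code with taps (5, 7) octal.
def step(state, u):
    s1, s2 = state
    return (u ^ s2) + (u ^ s1 ^ s2), (u, s1)   # (output weight, next state)

def free_distance():
    """Dijkstra over the state graph: least weight of a path that leaves
    the zero state and returns to it; by linearity this is the minimum
    Hamming weight of a nonzero codeword."""
    w0, start = step((0, 0), 1)                # force divergence
    dist = {start: w0}
    heap = [(w0, start)]
    while heap:
        w, s = heapq.heappop(heap)
        if s == (0, 0):
            return w
        if w > dist.get(s, float("inf")):
            continue
        for u in (0, 1):
            dw, t = step(s, u)
            if w + dw < dist.get(t, float("inf")):
                dist[t] = w + dw
                heapq.heappush(heap, (w + dw, t))
    return None

d_free = free_distance()
assert d_free == 5
# Parallel transitions lie in a coset of 2Z^2, at squared distance 4, so
# the trellis code's minimum squared distance is min(d_free, 4) = 4.
assert min(d_free, 4) == 4
```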
B. Geometrical Parameters
As with lattices, the two principal geometrical parameters of a trellis code are the minimum squared distance d²min between its code sequences and its fundamental volume V (per N dimensions); these determine its fundamental coding gain γc.
We have already introduced d²min and noted that it is the lesser of d²min(Λ′) and the minimum squared distance between code sequences that correspond to two distinct paths that begin and end in a common state in the code trellis. If, as with convolutional codes, the distribution of distances from a given code sequence to all other code sequences does not depend on the given code sequence, then d²min is the minimum squared distance from the all-zero code sequence to any other code sequence, i.e., the minimum norm of any code sequence. We call such codes distance-invariant. All codes in this paper are distance-invariant.
The error coefficient is the average number of code sequences that differ from a given sequence by d²min and that first differ from the given sequence at a given time j. By time-invariance, this number does not depend on the time j, and we may take j = 0, say. If the code is distance-invariant, then the number of code sequences that differ from a given sequence by d²min does not depend on the given sequence, and we may take the given sequence to be the all-zero sequence. Thus in a distance-invariant code the error coefficient is the number of code sequences of norm d²min that start at time zero.
The fundamental volume V is a trickier concept. Intuitively, since the code conveys k more bits of information per unit time (per N dimensions) than Λ′ and has r bits of redundancy relative to Λ, it is clear that in some sense the trellis code is 2^k times as dense as Λ′ and a factor of 2^r less dense than Λ, per N dimensions. Therefore, the fundamental volume should be equal to 2^(−k)V(Λ′), or to 2^r V(Λ).
To substantiate this proposition, we argue as follows. Define C_j as the set of all code sequences that start at time j or later. By time-invariance, all such sets are isomorphic to each other and to a particular such set, say C₀. However, C_j′ is also a proper subset of C_j if j′ > j; e.g., C₁ is a proper subset of C₀.
Define two code sequences as equivalent modulo C_j if their first difference is at time j or later. Two sequences in C₀ are then equivalent modulo C₁ if and only if their first element is the same. Let Λ₀ then be the set of all possible first elements; that is, Λ₀ = {c(y₀) + Λ′}, where y₀ is a possible time-zero output from the encoder given that all previous outputs were zero or, equivalently, given that the encoder starts in the zero state. There are 2^k such y₀, and thus Λ₀ is the union of 2^k cosets of Λ′. Ordinarily, Λ₀ is a lattice, which we call the time-zero lattice. Since Λ ⊇ Λ₀ ⊇ Λ′ and the order of Λ₀/Λ′ is 2^k, we also have V(Λ₀) = 2^(−k)V(Λ′).
In an appropriate sense, therefore, the equivalence classes of C₀ modulo C₁, which we may write as C₀/C₁, are isomorphic to the time-zero lattice Λ₀. The set C₀ has the decomposition

C₀ ≅ (C₀/C₁) × (C₁/C₂) × (C₂/C₃) × ⋯,

which by time-invariance is isomorphic to the Cartesian product Λ₀ × Λ₀ × Λ₀ × ⋯. In other words, the code fills space as densely as does Λ₀. Thus it is reasonable to define the fundamental volume of the trellis code per N dimensions as V = V(Λ₀) = 2^(−k)V(Λ′).
Now we may define the fundamental coding gain of a trellis code in the same way as we did for a lattice:

γc = d²min / V^(2/N).
Let us now write k(C) and r(C) for the parameters k and n − k of a convolutional code C, and κ(C) and ρ(C) for the normalized informativity and redundancy 2k(C)/N and 2r(C)/N, respectively. Since V = 2^r(C) V(Λ), we also have the expressions

γc = d²min · 2^(−ρ(C)) / V(Λ)^(2/N) = d²min · 2^(−ρ),

where ρ = ρ(C) + ρ(Λ) and V(Λ)^(2/N) = 2^ρ(Λ) for a binary lattice Λ. Thus if we define the normalized redundancy, informativity, and depth of the trellis code as the sums of the corresponding quantities for the code C and the partition Λ/Λ′, where we regard the depth of C as 0, then we get expressions analogous to those that we obtained for lattices.
The following lemma is both useful in itself and also gives an intuitive explanation of these formulas.
Lemma 5: If a trellis code is based on a partition Λ/Λ′ of binary lattices, where the depth of Λ′ is μ, and on a 2^ν-state, rate-k/n convolutional code C, then there is an equivalent trellis code based on the partition Z^N/R^μ Z^N, where the code is a 2^ν-state, rate-k′/(μN/2) convolutional code C′, and N/2 is the dimension of Λ and Λ′ as complex lattices.
Proof: If Λ′ is a binary lattice of depth μ, then Λ/Λ′/R^μ Z^N is a partition chain. In view of the coset decomposition, we may select a coset of R^μ Z^N in the partition Λ/R^μ Z^N by the following set of bits: an all-zero tuple, which selects the zero coset of Λ in the partition Λ/Λ, namely Λ itself; an n-tuple y, which selects the coset of Λ′ in the partition Λ/Λ′ as in the original trellis code; and finally, a tuple of "uncoded bits," which selects one of the cosets of R^μ Z^N whose union is the selected coset of Λ′. These bits can be regarded as the outputs of an augmented convolutional encoder C′, which has more information and redundant bits, but the same number of states as the original encoder for C, as illustrated in Fig. 11. The set of code sequences that may be generated by this augmented encoder is the same as that of the original code.
Fig. 11. Augmented encoder of Lemma 5.
An alternative form of Lemma 5 is as follows. If the 2-depth of Λ′ is μ₂, then the trellis code is equivalent to a code based on the partition Z^N/R^μ₂ Z^N, where the code is a 2^ν-state convolutional code, and N is the dimension of Λ and Λ′ as real lattices. The proof is essentially the same. Indeed, if μ₂ is even, then the partition Z^N/R^μ₂ Z^N is the same as Z^N/2^(μ₂/2) Z^N, and the augmented encoder is the same; if μ₂ is odd, then the partition is an extension by one further two-way partition per two dimensions, and the augmented encoder just uses more uncoded bits.
The coding gain may now be related to that of the lattices Λ, Λ′, and Z^N as follows. Relative to Λ′, the gain is greater by a factor of 2^κ(C) due to the fact that the trellis code conveys κ(C) more bits of information per two dimensions, offset by a distance loss factor of d²min(Λ′)/d²min (if any).

Relative to Λ, the gain is greater by the distance gain factor of d²min/d²min(Λ), offset by a power loss of 2^ρ(C) due to constellation expansion.

If Λ′ = R^μ Z^N, then γ(Λ′) = 1 and d²min(Λ′) = 2^μ, so γc is simply 2^κ(C), offset by a distance loss factor of 2^μ/d²min (if any). If Λ = Z^N, then γ(Λ) = 1 and d²min(Λ) = 1, so γc is simply d²min, offset by a power loss of 2^ρ(C) due to constellation expansion.
This last expression is the simplest and shows that we need to know only the minimum squared distance d²min and the normalized redundancy ρ to compute the fundamental coding gain γc, where the normalized redundancy is simply the sum of the normalized redundancies of the code C and the lattice Λ. If Λ = Z^N, then it suffices to know the normalized redundancy ρ(C) of the code C. A small normalized redundancy is thus desirable both to minimize constellation expansion and to maximize coding gain for a given d²min, as was recognized by Wei [11].
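As a sketch of this observation, both figures of merit discussed above follow from d²min and ρ alone; the sample values below are read from Tables IV and VII, and the identification of the constellation expansion factor (per two dimensions) with 2^ρ follows the preceding discussion.

```python
import math

def figures_of_merit(d2_min, rho):
    """Fundamental coding gain (dB) and constellation expansion factor,
    both computed from d2_min and the normalized redundancy rho."""
    gain_db = 10 * math.log10(d2_min * 2 ** (-rho))   # gamma_c in dB
    expansion = 2 ** rho                              # per two dimensions
    return round(gain_db, 2), expansion

assert figures_of_merit(9, 2) == (3.52, 4)           # 1D four-state Ungerboeck code
assert figures_of_merit(4, 1) == (3.01, 2)           # 2D four-state Ungerboeck code
assert figures_of_merit(4, 0.5) == (4.52, 2 ** 0.5)  # a Wei-type 4D code
```

The third line illustrates Wei's point: halving ρ from 1 to 1/2 at the same d²min both raises the gain and shrinks the constellation expansion.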
Lemma 5 shows that all trellis codes based on partitions of binary lattices are equivalent to trellis codes based on partitions Z^N/R^μ Z^N or, by extension, partitions of the integer lattice Z^N, so that in principle only these kinds of partitions need to be considered to discover all binary trellis codes. In practice, consideration of more general partitions both facilitates the search for good codes and simplifies their encoding and decoding.
C. Linear Trellis Codes
In general, if e and e′ are two code sequences in a trellis code, it is not necessarily true that e + e′ and e − e′ are also code sequences. When this property does hold, we say that the trellis code is linear. In this section we give some examples of linear trellis codes, including the important case where Λ/Λ′ is a partition of mod-2 binary lattices.
If a trellis code is linear, then it is a lattice, albeit an infinite-dimensional lattice. Because it is a time-invariant infinite-dimensional lattice, it is usually possible to define its parameters on a per-unit-time or per-two-dimensions basis, so that they are perfectly finite and analogous to the parameters of a finite-dimensional lattice, as we have already seen in the previous section. In this sense, linear trellis codes are to finite-dimensional lattices as convolutional codes are to block codes. (To extend this analogy, nonlinear trellis codes are a generalization of nonlattice finite-dimensional sphere packings.)
A trivial example of a linear trellis code is the repeated use of a lattice code. If Λ is any lattice, let Λ^∞ be defined as the set of all sequences (λ_j), where λ_j ∈ Λ for all j. This may be regarded as a trellis code based on the "dummy partition" Λ/Λ, where the convolutional encoder disappears. The trellis associated with Λ^∞ has one state for each time j, and the branch in each time interval is labeled by Λ.
If a trellis code C is linear, it is a sublattice of Λ^∞ and has (Λ′)^∞ as a sublattice, so Λ^∞/C/(Λ′)^∞ is a partition chain; the chain extends further according to the depth (or 2-depth) of the partition Λ/Λ′.
A trellis code has little chance of being linear unless the mapping from encoder output n-tuples (labels) y to coset representatives c(y) is linear modulo Λ′, as in Lemma 2; i.e., c(y) = yG, where G is a generator matrix whose n rows are vectors of Λ that span Λ modulo Λ′. The following lemma shows that when Λ/Λ′ is a partition of mod-2 lattices and the labeling map is linear, the trellis code is linear, and indeed isomorphic to a binary convolutional code, in the same sense as a mod-2 binary lattice is isomorphic to a binary block code, described in Lemma 3 (recall that if the labeling is an Ungerboeck labeling, it is linear modulo Λ′).
Lemma 6: If Λ′ is a mod-2 lattice, C is a 2^ν-state, rate-k/n convolutional code, and the labeling map is linear modulo Λ′, then the trellis code is the set of all sequences of integer N-tuples that are congruent modulo 2 to one of the codewords in a 2^ν-state, rate-k′/N convolutional code C′.
The redundancy of C′ is r(C′) = r(C) + r(Λ), and the minimum squared distance of the trellis code is min[d_H(C′), 4].
Sketch of proof: By the extension of Lemma 5, the code is equivalent to a code based on the partition Z^N/2Z^N, where the augmented encoder has N output bits and redundancy r(C) + r(Λ). In the augmented encoder, the n-tuple y and the uncoded bits specify a coset of 2Z^N in the partition Z^N/2Z^N, which may be specified by a binary N-tuple. The mapping may be taken to be linear if the labeling is linear modulo Λ′. Thus the trellis code is the set of all sequences of integer N-tuples that are congruent modulo 2 to one of the codewords in the convolutional code C′. The minimum squared distance between code sequences corresponding to distinct codewords of C′ is d_H(C′), and d²min(2Z^N) = 4.
The four-state Ungerboeck code shown in Figs. 2, 3, and 10 is an example of a code of this type. The encoder of Fig. 10 is of the form of the augmented encoder of Fig. 11. Many of the important known codes to be listed in the next section, including Gallager-Calderbank-Sloane (GCS)-type codes and most of the Wei codes, are of this type.
Even when a code is not linear, it may still be regular in the sense of Calderbank and Sloane [13]. A labeling is defined as regular if the minimum squared distance between points in two cosets c(y) + Λ′ and c(y′) + Λ′ is a function only of the mod-2 sum y ⊕ y′ of their labels or, equivalently, if the minimum norm in the coset c(y) − c(y′) + Λ′ is equal to the minimum norm in the coset c(y ⊕ y′) + Λ′. For example, the 2-bit standard binary representation is a regular labeling of the four cosets in the partition Z/4Z. If a code is based on a partition with a regular labeling, the minimum squared distance between code sequences in the coset sequences corresponding to codewords y(D) and y′(D) is then equal to the minimum norm of any code sequence in the coset sequence corresponding to the codeword y(D) ⊕ y′(D). Therefore, the distribution of distances from any given code sequence to all other code sequences is the same as the norm distribution of code sequences, and the code is thus distance-invariant.
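The Z/4Z example is easy to verify exhaustively. A minimal brute-force sketch (not from the paper; a small window of integer offsets suffices for these minima):

```python
from itertools import product

def min_sq_dist(a, b):
    """Minimum squared distance between the cosets 4Z + a and 4Z + b."""
    return min((a + 4 * i - (b + 4 * j)) ** 2
               for i in range(-4, 5) for j in range(-4, 5))

def min_norm(c):
    """Minimum norm in the coset 4Z + c."""
    return min((c + 4 * i) ** 2 for i in range(-4, 5))

# Regularity of the 2-bit standard binary labeling: the inter-coset
# distance depends only on the XOR of the labels.
for a, b in product(range(4), repeat=2):
    assert min_sq_dist(a, b) == min_norm(a ^ b)
```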
A labeling is regular under any of the following conditions: a) if it is linear, in the sense that c(y ⊕ y′) = c(y) + c(y′) modulo Λ′, e.g., whenever Λ and Λ′ are mod-2 lattices; b) if it is an Ungerboeck labeling and the Ungerboeck distance bound always holds with equality, e.g., any Ungerboeck labeling for a one-dimensional partition; c) if Λ and Λ′ are m-fold Cartesian products Λ₁^m and (Λ₁′)^m, and the labeling is the m-fold Cartesian product of a regular labeling for Λ₁/Λ₁′, e.g., when the partition is (Z/4Z)^m or (Z²/2Z²)^m.
In fact, regular labelings (although not necessarily regular Ungerboeck labelings) exist for all partitions used in all the codes covered in this paper.
V. Known Classes of Trellis Codes
We shall now categorize the principal classes of trellis codes that have so far appeared in the literature according to the parameters of the previous sections. In the next section, we give further generic classes. We shall then compare and contrast all of those schemes, including lattice codes.
Ungerboeck [8] developed classes of one- and two-dimensional trellis codes using rate-k/(k+1) binary convolutional codes. From the viewpoint of this paper, his one-dimensional schemes are based on the four-way partition Z/4Z of the integers into the four residue classes modulo 4, in combination with a binary rate-1/2 convolutional encoder to select cosets of 4Z. His two-dimensional schemes for rectangular constellations, which have the greatest practical importance, are based on either the four-way partition Z²/2Z² in combination with a rate-1/2 convolutional encoder, or the eight-way partition Z²/2RZ² with a rate-2/3 convolutional encoder. He also gives codes using phase-modulated constellations that are based on similar principles and may be regarded as coset codes (see Section I-C), but those will not be covered here.
Table IV gives the characteristics of the Ungerboeck one- and two-dimensional schemes. The codes achieve increasing d²min as the number of states increases from 4 to 512, up to the maximum possible value of d²min(Λ′). (Note: We use the codes listed in [21], where minor corrections have been made to the earlier code tables.) The depth is 4 for the one-dimensional schemes, while in two dimensions it is 2 or 3. The redundancy is one for both classes, but the normalized redundancy ρ (per two dimensions) is thus two in the one-dimensional case, versus one in the two-dimensional case. The fundamental coding gain is given by the formula γc = d²min · 2^(−ρ) and is also given in decibels. N₂ is the error coefficient normalized to two dimensions. The complexity figure is the number of decoding operations using the trellis-based decoding algorithms of the partition, whose complexity is given in Table III, followed by a conventional Viterbi algorithm for the convolutional code, normalized per two dimensions. (For each unit of time, for each of the 2^ν states, the Viterbi algorithm requires 2^k additions and a comparison of 2^k numbers, or 2^k − 1 binary comparisons, so that its complexity is about 2^(ν+k+1) binary operations; 2^(ν+k) is the number of branches per stage of the trellis, which is the measure of complexity used by Ungerboeck [21], following Wei [11].)
The error coefficient reduces the effective coding gain by an amount that depends on the steepness of the error probability curve. In this paper, we will use the rule of thumb that every factor of two increase in the error coefficient reduces the coding gain by about 0.2 dB (at moderately low error rates); this will enable us to compute an effective coding gain (in dB), normalized for the error coefficient N₂. In principle, the error coefficients at every distance ought to be considered, and the effective coding gain evaluated in the same way for each; if
TABLE IV Ungerboeck Codes
| N | states | rate | ρ | d²min | γc | γc (dB) | N₂ | complexity |
|---|---|---|---|---|---|---|---|---|
| One-dimensional | | | | | | | | |
| 1 | 4 | 1/2 | 2 | 9 | 2.25 | 3.52 | 8 | 24 |
| 1 | 8 | 1/2 | 2 | 10 | 2.5 | 3.98 | 8 | 48 |
| 1 | 16 | 1/2 | 2 | 11 | 2.75 | 4.39 | 16 | 96 |
| 1 | 32 | 1/2 | 2 | 13 | 3.25 | 5.12 | 24 | 192 |
| 1 | 64 | 1/2 | 2 | 14 | 3.5 | 5.44 | 72 | 384 |
| 1 | 128 | 1/2 | 2 | 16 | 4 | 6.02 | 132 | 768 |
| 1 | 256 | 1/2 | 2 | 16 | 4 | 6.02 | 4 | 1536 |
| 1 | 512 | 1/2 | 2 | 16 | 4 | 6.02 | 4 | 3072 |
| Two-dimensional | | | | | | | | |
| 2 | 4 | 1/2 | 1 | 4 | 2 | 3.01 | 4 | 16 |
| 2 | 8 | 2/3 | 1 | 5 | 2.5 | 3.98 | 16 | 64 |
| 2 | 16 | 2/3 | 1 | 6 | 3 | 4.77 | 56 | 120 |
| 2 | 32 | 2/3 | 1 | 6 | 3 | 4.77 | 16 | 232 |
| 2 | 64 | 2/3 | 1 | 7 | 3.5 | 5.44 | 56 | 456 |
| 2 | 128 | 2/3 | 1 | 8 | 4 | 6.02 | 344 | 902 |
| 2 | 256 | 2/3 | 1 | 8 | 4 | 6.02 | 44 | 1800 |
| 2 | 512 | 2/3 | 1 | 8 | 4 | 6.02 | 4 | 3592 |
they grow too large too rapidly, they can dominate performance. We will not go beyond the error coefficient in this paper, except for Ungerboeck-type codes, where we can present results of Eyuboglu and Li (unpublished) that take into account the next two normalized coefficients as well.
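A first-order version of this rule of thumb can be written down directly. The sketch below ignores the higher-order coefficients and takes uncoded two-dimensional modulation (N₂ = 4) as the baseline; with these assumptions it reproduces several of the Table V entries.

```python
import math

def effective_gain_db(gamma_db, n2):
    """Effective coding gain: each doubling of the normalized error
    coefficient n2 relative to the baseline of 4 costs about 0.2 dB."""
    return gamma_db - 0.2 * math.log2(n2 / 4)

# Spot checks against Table V (Ungerboeck one-dimensional codes):
assert round(effective_gain_db(3.52, 8), 2) == 3.32    # 4 states
assert round(effective_gain_db(4.39, 16), 2) == 3.99   # 16 states
assert round(effective_gain_db(5.12, 24), 2) == 4.60   # 32 states
```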
Honig [22] has performed a search for one-dimensional Ungerboeck-type codes based on the four-way partition Z/4Z, using a criterion that includes the effect of the error coefficient, and has obtained an improvement at 64 states (the apparently improved 16-state code is actually catastrophic). Similarly, Pottie and Taylor [23] have searched for two-dimensional Ungerboeck-type codes based on the eight-way partition Z²/2RZ² and have obtained improvements at 64 and 128 states; the 128-state code has a lower fundamental coding gain but a greater effective coding gain due to its much lower error coefficient. Finally, Eyuboglu and Li have made a reasonably exhaustive search for the best codes of both classes in terms of the effective coding gain criterion, for up to 256 states, with modest improvements at as few as 16 states.
Table V gives the performance parameters and the consequent effective coding gains for a number of the codes of Ungerboeck, Honig, Pottie and Taylor, and Eyuboglu and Li. When the dominant error coefficient is other than the coefficient at d²min, it is starred. The parity-check polynomials for these codes are also given, in the octal notation of Ungerboeck [8], [21]. All parameters not given (including decoding complexity) are the same as for the Ungerboeck code with the same number of states.
Fig. 12(a) plots the effective coding gain versus the normalized complexity for the best of these Ungerboeck-type one- and two-dimensional codes. We see that the graphs are fairly linear on this log-log plot over most of their range. An increase of a factor of two in complexity yields an increase of about 0.4 dB in coding gain, until the effective coding gain passes 5 dB. The one-dimensional codes are of the order of 0.2 dB better over this linear range (however, the two-dimensional codes have been generally preferred in practice because their constellation expansion factor is only two, not four).
The first multidimensional code seems to have been developed by Gallager [1]. In this code, an eight-state rate-3/4 convolutional encoder selects two successive cosets from the four-way two-dimensional partition Z²/2Z² or, equivalently, one coset from the 16-way four-dimensional lattice partition Z⁴/2Z⁴ (equivalently, four successive cosets from the two-way one-dimensional partition Z/2Z). The basic idea, as in Lemma 6, is that with such partitions the minimum squared distance between code sequences is simply the minimum Hamming distance of the binary code, as long as that distance does not exceed 4. Quite independently, Calderbank and Sloane [10] discovered a very similar code, although with improved error coefficient due to the choice of an eight-state binary code with a lower error coefficient. We shall call codes based
TABLE V Effective Coding Gains
| states | h² | h¹ | h⁰ | d²min | γc | γc (dB) | N₂ | N₂ (next) | N₂ (next) | γeff (dB) | type |
|---|---|---|---|---|---|---|---|---|---|---|---|
| One-dimensional codes | | | | | | | | | | | |
| 4 | - | 2 | 5 | 9 | 2.25 | 3.52 | 8 | 16 | 32 | 3.32 | U |
| 8 | - | 04 | 13 | 10 | 2.5 | 3.98 | 8 | 16 | 32 | 3.78 | U |
| 16 | - | 04 | 23 | 11 | 2.75 | 4.39 | 16 | 16 | 32 | 3.99 | U |
| 16 | - | 10 | 23 | 11 | 2.75 | 4.39 | 8 | 16 | 48 | 4.19 | EL |
| 32 | - | 10 | 45 | 13 | 3.25 | 5.12 | 24 | 56 | 112 | 4.60 | U |
| 64 | - | 024 | 103 | 14 | 3.5 | 5.44 | 72 | 0 | 180 | 4.61 | U |
| 64 | - | 054 | 161 | 14 | 3.5 | 5.44 | 16 | 132 | | 4.94 | H |
| 128 | - | 126 | 235 | 16 | 4 | 6.02 | 132 | 0 | 512 | 5.01 | U |
| 128 | - | 160 | 267 | 15 | 3.75 | 5.74 | 16 | 68 | *200 | 5.16 | EL |
| 128 | - | 124 | 207 | 14 | 3.5 | 5.44 | 8 | 16 | 28 | 5.24 | EL |
| 256 | - | 362 | 515 | 16 | 4 | 6.02 | 4 | 64 | *160 | 5.47 | U |
| 256 | - | 370 | 515 | 15 | 3.75 | 5.74 | 8 | 12 | *80 | 5.42 | EL |
| 512 | - | 0342 | 1017 | 16 | 4 | 6.02 | 4 | 0 | *112 | 5.57 | U |
| Two-dimensional codes | | | | | | | | | | | |
| 4 | - | 2 | 5 | 4 | 2 | 3.01 | 4 | 32 | 128 | 3.01 | U |
| 8 | 04 | 02 | 11 | 5 | 2.5 | 3.98 | 16 | 72 | 320 | 3.58 | U |
| 16 | 16 | 04 | 23 | 6 | 3 | 4.77 | 56 | 160 | 820 | 4.01 | U |
| 32 | 10 | 06 | 41 | 6 | 3 | 4.77 | 16 | 104 | 404 | 4.37 | U |
| 32 | 34 | 16 | 45 | 6 | 3 | 4.77 | 8 | *128 | 404 | 4.44 | EL |
| 64 | 064 | 016 | 101 | 7 | 3.5 | 5.44 | 56 | 260 | 1008 | 4.68 | U |
| 64 | 060 | 004 | 143 | 7 | 3.5 | 5.44 | 48 | 292 | 1184 | 4.72 | PT |
| 64 | 036 | 052 | 115 | 7 | 3.5 | 5.44 | 40 | 252 | 992 | 4.78 | EL |
| 128 | 042 | 014 | 203 | 8 | 4 | 6.02 | 344 | 0 | 5900 | 4.74 | U |
| 128 | 056 | 150 | 223 | 8 | 4 | 6.02 | 172 | 624 | 2568 | 4.94 | EL |
| 128 | 024 | 100 | 245 | 7 | 3.5 | 5.44 | 8 | *188 | 968 | 4.91 | PT |
| 128 | 164 | 142 | 263 | 7 | 3.5 | 5.44 | 8 | *132 | 752 | 5.01 | EL |
| 256 | 304 | 056 | 401 | 8 | 4 | 6.02 | 44 | *304 | 1316 | 5.28 | U |
| 256 | 370 | 272 | 417 | 8 | 4 | 6.02 | 36 | *308 | 1224 | 5.28 | EL |
| 256 | 274 | 162 | 401 | 7 | 3.5 | 5.44 | 4 | *64 | 248 | 5.22 | EL |
| 512 | 0510 | 0346 | 1001 | 8 | 4 | 6.02 | 4 | 128 | *700 | 5.50 | U |
Fig. 12. Performance versus complexity. (a) For Ungerboeck-type one-dimensional and two-dimensional codes (as improved by Eyuboglu and Li). (b) For Wei codes. (c) For Calderbank-Sloane-type codes.
on partitions of the type Z^N/2Z^N "GCS-type" codes. Table VI gives the parameters of the GCS-type code just described, with the error coefficient for the Calderbank-Sloane (CS) version (as given in [13]).
Wei [11] has developed a variety of multidimensional codes. We shall say that a code is "Wei-type" if it is based on a lattice partition Λ/Λ′ in which Λ′ is a denser lattice than the corresponding R^μ Z^N. Wei stresses that Λ′ should be dense to maximize coding gain and to simplify code construction, but that the code should have low redundancy to maximize coding gain and to minimize the constellation expansion factor. Wei is also willing to increase the number of states (and decoding complexity) to minimize the error coefficient N₂, even when the fundamental coding gain is not improved thereby. Table VII gives the parameters for the codes discussed in [11]; Wei also mentions that any of his codes that are based on one 16-way partition can be translated into a code based on another 16-way partition, with the same error coefficient, but with twice the constellation size. (One of Wei's lattices is here given a different name, in keeping with the notation of Section III.) Note that one of the 32-state codes is a GCS-type code. Indeed, all of the codes in which Λ′ is a lattice of depth 2 are equivalent to GCS-type codes in view of Lemma 6; in particular, the eight-state code is equivalent to the CS version of the GCS-type code just described, although its decoding complexity is less because of the symmetries of the lattice (the blank error coefficients in Table VII are unknown but large).
Ungerboeck [21] has found additional 128-state Wei-type codes that extend the four- and eight-dimensional families; these codes are listed in Table VIII. Fig. 12(b) plots the effective coding gain versus the normalized complexity for the Wei codes for which the error coefficient is known, with the codes of Fig. 12(a) as benchmarks.
Calderbank and Sloane [12], [13] have also developed a number of multidimensional codes. We shall say that a
TABLE VI GCS-Type Code
| dB | |||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | 8 | 4 | 4.52 | 44 | 64 | 3.82 |
TABLE VII Wei Codes
| Λ | dB | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | 8 | 2/3 | 1/2 | 4 | 4.52 | 44 | 44 | 3.82 | |||
| 4 | 16 | 2/3 | 1/2 | 4 | 4.52 | 12 | 72 | 4.20 | |||
| 4 | 32 | 3/4 | 1/2 | 4 | 4.52 | 4 | 244 | 4.52 | |||
| 4 | 64 | 4/5 | 1/2 | 5 | 5.48 | 72 | 1048 | 4.65 | |||
| 8 | 16 | 3/4 | 1/4 | 4 | 5.27 | 316 | 104 | 4.01 | |||
| 8 | 32 | 3/4 | 1/4 | 4 | 5.27 | 124 | 164 | 4.28 | |||
| 8 | 64 | 3/4 | 1/4 | 4 | 5.27 | 60 | 284 | 4.49 | |||
| 16 | 32 | 4/5 | 1/8 | 4 | 5.64 | 228 | |||||
| 16 | 64 | 4/5 | 1/8 | 4 | 5.64 | 796 | 352 | 4.12 | |||
| 16 | 128 | 4/5 | 1/8 | 4 | 5.64 | 412 | 600 | 4.31 | |||
| 8 | 32 | 4/5 | 1 | 8 | 4 | 6.02 | 336 | ||||
| 8 | 64 | 4/5 | 1 | 8 | 4 | 6.02 | 316 | 584 | 4.76 | ||
| 8 | 128 | 4/5 | 1 | 8 | 4 | 6.02 | 124 | 1080 | 5.03 |
TABLE VIII Further Wei-Type Codes (Ungerboeck)
| dB | |||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | 128 | 6 | 6.28 | 728 | 2040 | 4.77 | |||||
| 8 | 128 | 4 | 5.27 | 28 | 1032 | 4.71 |
TABLE IX Calderbank-Sloane Codes
| Λ | dB | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 4 | 2/4 | 2 | 8 | 2 | 3.01 | 4 | 44 | 3.01 | ||
| 2 | 8 | 2/4 | 2 | 11 | 2.75 | 4.39 | 32 | 72 | 3.79 | ||
| 2 | 16 | 2/4 | 2 | 12 | 3 | 4.77 | 48 | 128 | 4.05 | ||
| 2 | 64 | 2/4 | 2 | 14 | 3.5 | 5.44 | 48 | 464 | 4.72 | ||
| 2 | 128 | 2/4 | 2 | 16 | 4 | 6.02 | 228 | 912 | 4.85 | ||
| 4 | 16 | 3/4 | 1 | 6 | 3 | 4.77 | 152 | 152 | 3.72 | ||
| 4 | 64 | 3/4 | 1 | 8 | 4 | 6.02 | 828 | 512 | 4.48 | ||
| 8 | 8 | 3/4 | 5/4 | 8 | 5.27 | 764 | 90 | 3.75 | |||
| 8 | 16 | 3/4 | 5/4 | 8 | 5.27 | 316 | 120 | 4.01 | |||
| 8 | 32 | 3/4 | 5/4 | 8 | 5.27 | 124 | 180 | 4.28 | |||
| 8 | 64 | 3/4 | 5/4 | 8 | 5.27 | 60 | 300 | 4.49 |
Note added in proof: J. Chow (private communication) has obtained values of and for the 16-state code, and of , and for the 64-state code.
TABLE X Further Codes (Eyuboglu)
| Λ | dB | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 4 | 2/4 | 2 | 9 | 2.25 | 3.52 | 8 | 44 | 3.32 | ||
| 2 | 16 | 2/4 | 2 | 12 | 3 | 4.77 | 16 | 128 | 4.37 | ||
| 2 | 32 | 2/4 | 2 | 12 | 3 | 4.77 | 4 | 240 | 4.77 | ||
| 2 | 32 | 2/4 | 2 | 13 | 3.25 | 5.12 | 16 | 240 | 4.72 |
code is "CS-type" if is a lattice partition where and are versions of the same lattice. Some of their codes are shown in Table IX. (They also consider the following: Ungerboeck-type codes based on partitions and , but without improvement over Ungerboeck either in fundamental coding gain or in error coefficient , except for the case also found by Pottie and Taylor; the GCS-type code based on the partition , as previously mentioned; and codes using the nonbinary two-dimensional hexagonal lattice , for which the results are not particularly encouraging. The last three codes appear to be equivalent to the aforementioned translation of Wei's codes.)
Finally, Eyuboglu has also searched for codes based on the 16-way two-dimensional partition . The additional codes found that improve on codes already listed are summarized in Table X.
Fig. 12(c) plots the effective coding gain versus the normalized complexity for the Calderbank-Sloane codes, as improved by the codes in Table X (up to 32 states), again with the codes of Fig. 12(a) for comparison. Note that the codes ought to be compared to the one-dimensional codes, since they have the same redundancy and depth, and in fact include the latter as a subset; their performance improvement is in fact small, and, taking complexity into account, they are no better.
VI. Further Classes of Trellis Codes
In this section we present a number of additional generic classes of codes that can be described relatively simply. Our objective is more to round out the picture than to improve on earlier results; in general, these codes have parameters comparable to those of the known codes of the previous section (or indeed of lattice codes). Our main point, in fact, is that "there are many ways to modulate," and that the complexity of the encoder and decoder required to achieve a given coding gain and error coefficient remains remarkably constant across a wide variety of codes.
We describe eight different classes of codes , based on all possible choices of the three following binary characteristics. The codes are based either on a lattice partition with minimum squared distances in the ratio , or on a partition chain with distances in the ratio (in the latter case, we use an Ungerboeck labeling and exploit the Ungerboeck distance bound). The convolutional code is either a rate- code, with or else , or a rate- code, with or else and . The constraint length of is either or , with each of the input bits being held in memory for either one or two time units. All codes are noncatastrophic, i.e., there is no infinite sequence of nonzero inputs that leads to a finite sequence of nonzero outputs.
The resulting codes have many characteristics similar to those of the Wei and Calderbank-Sloane codes, as well as the four-state two-dimensional Ungerboeck code. Except for error coefficient, the rate- codes also have parameters resembling those of the Barnes-Wall lattices.
Class I Codes
Let be a -way lattice partition with . Let be a "unit-memory" rate- binary convolutional encoder, as shown in Fig. 13(a). In each time
Fig. 13. Encoders. (a) For Class I code. (b) For Class II codes. (c) For Class III and IV codes. (d) For Class V codes. (e) For Class VI codes. (f) For Class VII and VIII codes.
unit information bits enter the encoder and are stored for one time unit; the encoder output is the combination of the current and previous bits, or bits altogether, which select one of the cosets of . The encoder has states, and the code has a trellis diagram in which every current state is connected to every next state, so there are a total of branches in each time unit, one corresponding to each coset of . The code is thus not catastrophic, because only one branch corresponds to the zero coset of ( itself). Any two paths through the trellis must differ in at least two time units, so the minimum squared distance between paths is , which is the same as the minimum squared distance within any coset of ; thus .
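The label generation of this unit-memory encoder can be sketched in a few lines (an illustrative toy of my own, with hypothetical names; the lattice itself is abstracted away, and each 2k-bit label stands for one of the 2^(2k) cosets of the sublattice):

```python
def class1_encode(inputs, k):
    """Unit-memory rate-k/2k encoder sketch: each 2k-bit output label is
    the current input k-tuple followed by the previous one; the label
    selects one of the 2^(2k) cosets of the sublattice."""
    prev = (0,) * k            # encoder state: previous k input bits
    labels = []
    for u in inputs:
        labels.append(u + prev)
        prev = u               # one time unit of memory
    return labels

# A single nonzero input affects the labels of two successive symbols,
# so any two distinct paths differ in at least two time units.
print(class1_encode([(1,), (0,), (0,)], k=1))  # [(1, 0), (0, 1), (0, 0)]
```

The two-time-unit property visible in the output is exactly what the distance argument above relies on.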
The multiplicity of sequences at squared distance d²min from a given sequence that start at a given time is N0 + (2^k − 1)N1², where N0 is the number of points of weight d²min in Λ′, and N1 is the number of points of weight d²min/2 in any nonzero coset of Λ′ (if it is the same for all such cosets).
Table XI gives the parameters for Class I codes based on the partitions , and , which have orders , and 256, and depths , and 4, respectively (the code was developed as a phase-modulated code by Divsalar and Simon [24]). These codes are closely related to the lattices , and , and have the same principal parameters , and . In the case of the two-state code based on , it is possible to make the minimum squared distance between distinct paths equal to (use the partition chain , and let one of the two bits control each of these two-way partitions; see Class III), so that the error coefficient achieves its minimum possible value, .
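The distance computation can be made concrete with a brute-force check (a toy of my own construction, not from the paper) for a Class I code with k = 1 based on the four-way partition Z²/2Z², assuming the natural labeling in which the bit pair (a, b) selects the coset (a, b) + 2Z²:

```python
def coset_min_weight(a, b, exclude_zero=False):
    """Minimum squared norm over the coset (a, b) + 2Z^2, searched on a
    small patch of the lattice (ample for these minima)."""
    best = None
    for x in range(-2, 3):
        for y in range(-2, 3):
            px, py = a + 2 * x, b + 2 * y
            if exclude_zero and (px, py) == (0, 0):
                continue
            n = px * px + py * py
            best = n if best is None else min(best, n)
    return best

# Parallel event: same label sequence, points differing by a nonzero
# vector of 2Z^2, so the weight is d^2(2Z^2) = 4.
parallel = coset_min_weight(0, 0, exclude_zero=True)

# Diverge/merge event: a single nonzero input gives difference labels
# (1, 0) then (0, 1), each selecting a nonzero coset of weight 1.
event = coset_min_weight(1, 0) + coset_min_weight(0, 1)

print(parallel, event, min(parallel, event))  # 4 2 2
```

Under this toy labeling the diverge/merge events dominate, giving a minimum squared distance of 2 against a within-coset distance of 4.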
Class II Codes
Let again be a -way lattice partition with . Let be a rate- -state convolutional encoder as shown in Fig. 13(b), with information bits entering in each time unit, two units of memory, and output bits, representing the inputs one time unit earlier, and representing the mod-2 sum of the current inputs with those two time units earlier, which together select one of the cosets of .
The trellis diagram has states, with branches leaving and entering every state. Because the distance between paths is at least when they diverge, when they merge, and in some other time unit, the distance between distinct paths is at least . Thus , the distance within cosets of , and the error coefficient is the same as that of (the minimum possible for any coset code based on the partition ). The code is noncatastrophic because every nonzero input creates a nonzero output one time unit later.
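A sketch of this label generation (again an illustrative toy with hypothetical names, following the description above: the 2k output bits at each time are the input one unit earlier together with the mod-2 sum of the current input and the input two units earlier):

```python
def class2_encode(inputs, k):
    """Rate-k/2k encoder sketch with two units of memory: the 2k-bit
    label is the input one time unit earlier, followed by the mod-2 sum
    of the current input with the input two time units earlier."""
    s1 = s2 = (0,) * k         # s1 = u[t-1], s2 = u[t-2]
    labels = []
    for u in inputs:
        x = tuple(a ^ b for a, b in zip(u, s2))
        labels.append(s1 + x)
        s1, s2 = u, s1
    return labels

# One nonzero input yields nonzero labels at three successive times
# (diverge, middle, merge), so distinct paths differ in at least three
# time units and the within-coset distance sets the minimum.
print(class2_encode([(1,), (0,), (0,), (0,)], k=1))
# [(0, 1), (1, 0), (0, 1), (0, 0)]
```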
TABLE XI
| Λ | 2v | dB | |||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| Class I codes | |||||||||||
| 2 | 2 | 1/2 | 1/2 | 2 | 1.51 | 4 | 5 | 1.51 | |||
| 2 | 2 | 1 | 4 | 2 | 3.01 | 44 | 13 | 2.32 | |||
| 4 | 4 | 2/4 | 3/2 | 8 | 4.52 | 252 | 67 | 3.24 | |||
| 8 | 16 | 4/8 | 2 | 16 | 4 | 6.02 | 1614 | ||||
| Class II codes | |||||||||||
| 4 | 4 | 1/2 | 1 | 4 | 2 | 3.01 | 12 | 16 | 2.69 | ||
| 8 | 16 | 2/4 | 3/2 | 8 | 4.52 | 60 | 88 | 3.73 | |||
| 16 | 256 | 4/8 | 2 | 16 | 4 | 6.02 | 540 | 2544 | 4.61 |  |  |
| Class III codes |  |  |  |  |  |  |  |  |  |  |  |
| 2 | 2 | 1/2 | 1 | 3 | 3/2 | 1.76 | 8 | 8 | 1.56 | ||
| 4 | 4 16 | 2/4 | 3/2 | 6 | 3.27 | 46 | |||||
| 8 | 16 | 4/8 | 2 | 16 | 4 | 6.02 | 1020 | 684 | 4.42 | ||
| Class IV codes | |||||||||||
| 2 | 4 | 1/2 | 1 | 4 | 2 | 3.01 | 4 | 16 | 3.01 | ||
| 4 | 16 | 2/4 | 3/2 | 8 | 4.52 | 12 | 88 | 4.20 | |||
| 8 | 256 | 4/8 | 2 | 16 | 4 | 6.02 | 60 | 2544 | 5.24 | ||
| Class V codes | |||||||||||
| 4 | 2 | 1/2 | 1/2 | 2 | 1.51 | 4 | 5 | 1.51 | |||
| 4 | 2 | 1/2 | 1 | 4 | 2 | 3.01 | 44 | 13 | 2.32 | ||
| 8 | 8 | 3/4 | 5/4 | 8 | 5.27 | 764 | 90 | 3.75 | |||
| 16 | 128 | 7/8 | 13/8 | 16 | 7.15 | 5632 | |||||
| 8 | 4 | 2/3 | 1/2 | 4 | 4.52 | 316 | 37 | 3.25 | |||
| 16 | 8 | 3/4 | 1/4 | 4 | 5.27 | 1692 | 89 | 3.52 | |||
| 16 | 64 | 6/7 | 3/4 | 8 | 6.77 | 1792 | |||||
| Class VI codes | |||||||||||
| 4 | 4 | 1/2 | 1 | 4 | 2 | 3.01 | 12 | 16 | 2.69 | ||
| 8 | 64 | 3/4 | 5/4 | 8 | 5.27 | 60 | 300 | 4.49 | |||
| 16 | 7/8 | 13/8 | 16 | 7.15 | 540 | 5.73 | |||||
| 8 | 16 | 2/3 | 1/2 | 4 | 4.52 | 60 | 58 | 3.73 | |||
| 8 | 64 | 3/4 | 1/4 | 4 | 5.27 | 284 | 194 | 4.04 | |||
| 16 | 6/7 | 3/4 | 8 | 6.77 | 540 | 5.36 | |||||
| Class VII codes | |||||||||||
| 2 | 2 | 1/2 | 1 | 3 | 3/2 | 1.76 | 8 | 8 | 1.56 | ||
| 4 | 4 | 2/3 | 1/2 | 3 | 3.27 | 24 | 30 | 2.75 | |||
| 8 | 8 | 3/4 | 1/4 | 3 | 4.02 | 74 | |||||
| 16 | 16 | 4/5 | 1/8 | 3 | 4.39 | 166 | |||||
| 8 | 16 | 4/5 | 1 | 6 | 3 | 4.77 | 212 | ||||
| Class VIII codes | |||||||||||
| 2 | 4 | 1/2 | 1 | 4 | 2 | 3.01 | 4 | 16 | 3.01 | ||
| 4 | 16 | 2/3 | 1/2 | 4 | 4.52 | 12 | 72 | 4.20 | |||
| 8 | 64 | 3/4 | 1/4 | 4 | 5.27 | 60 | 284 | 4.49 | |||
| 16 | 256 | 4/5 | 1/8 | 4 | 5.64 | 284 | 1096 | 4.41 | |||
| 8 | 256 | 4/5 | 1 | 8 | 4 | 6.02 | 60 | 2072 | 5.24 |
Table XI gives the parameters for Class II codes based on the partitions , and . Their parameters, including coding gain, are the same as those of Class I codes, except that the increase of the number of states to equal the order of the partition results in a reduction of the error coefficient to its minimum value. The decoding complexity increases only modestly because it is dominated by the complexity of decoding .
Class III and IV Codes
Let now be a chain of two -way partitions with distances . For a Class III code (resp. Class IV), let be the same rate- encoder as in Class I (resp. Class II), but now let the first set of output bits select the coset of in the partition , and the second set of output bits select the coset of in the partition , as shown in Fig. 13(c). These encoders are still noncatastrophic.
Because the label selecting the coset of in the partition is zero at the time of the first nonzero input, the set of possible initial code symbols (the time-zero lattice ) is , so the minimum squared distance between distinct paths is where they first diverge. In the case of Class III codes, the minimum squared distance between distinct paths is where they merge, so . In the case of Class IV codes, the set of possible final code symbols is also , so that the minimum squared distance between distinct paths is where they merge, and also at least at some other time, so that the minimum squared distance between distinct paths is . This means that , and furthermore the normalized error coefficient is the same as that of .
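The two-level chains used here can be illustrated with the standard two-dimensional chain Z²/RZ²/2Z², taking R as the rotation-scaling operator [[1, -1], [1, 1]] (my choice of convention; a check of my own, with a search patch large enough to be safe). The minimum squared distances come out in the ratio 1 : 2 : 4:

```python
def min_sq_norm(points):
    """Minimum squared norm over the nonzero points of a finite patch."""
    return min(x * x + y * y for (x, y) in points if (x, y) != (0, 0))

B = 4  # half-width of the search patch
Z2    = [(x, y) for x in range(-B, B + 1) for y in range(-B, B + 1)]
RZ2   = [(x - y, x + y) for (x, y) in Z2]   # R = [[1, -1], [1, 1]] applied to Z^2
twoZ2 = [(2 * x, 2 * y) for (x, y) in Z2]

print(min_sq_norm(Z2), min_sq_norm(RZ2), min_sq_norm(twoZ2))  # 1 2 4
```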
Table XI gives the parameters for Class III codes based on the partitions , and , which have orders 4, 16, and 256, and depths 2, 3, and 4, respectively. The first is a noncatastrophic Ungerboeck-type two-state code with (this code appears as a phase-modulated code in Divsalar et al. [25]). The others are CS-type codes. For the code, we use the fact that there exists an alternative partition , where , like , is a version of with , such that the system of coset representatives is also a system of coset representatives for (see part II); this ensures that when paths merge as well as when they diverge, so that .
Table XI gives the parameters for Class IV codes based on the same partitions , and . The first is Ungerboeck's four-state code, which is the prototype of this class. These are CS-type codes that are closely related to Class III codes, except that they have twice the constraint length and the minimum squared distance and coding gain (except in the last case, where only the error coefficient improves). Again, these codes are closely related to the lattices , and and to the corresponding Class I and II codes and have the same principal parameters. In fact, even the decoding complexity is the same as that of Class II codes, but the error coefficient is still further reduced.
Class V Codes
Let now be a -way partition with distances , and let be a rate- convolutional encoder as shown in Fig. 13(d), with information bits entering in each time unit, and output bits generated as follows. Let be a linear (modulo 2) circuit with input bits, namely the -tuple of input bits delayed by one time unit, and output bits. One output bit goes directly to the coset selector; the remaining bits are added (modulo 2) to the -tuple , and the -bit sum forms the remaining inputs to the coset selector.
The circuit need have only the following two properties: a) its outputs are all-zero only when its inputs are all-zero, and b) there is no infinite input sequence ( ) into that generates a finite output sequence from (so that the code is noncatastrophic). A simple circuit with these properties is the one whose output is simply the ( )-tuple ( ), where the leftmost bit is the one that goes directly into the coset selector. Property a) is obvious. Property b) follows from the fact that if , then there is no sequence ( ) such that , etc., since can only match the low-order bits in , can then only match the low-order bits in , etc., and so eventually the highest-order nonzero bit in "shifts" to the highest-order position, where it cannot be matched.
Property a) ensures that the minimum squared distance is at least 2d²(Λ), because two distinct paths differ by at least d²(Λ) where they diverge and where they merge. The multiplicity of sequences at distance d²min from any given sequence starting at any given time is N0 plus a multiple of N1², where N0 is the number of points of weight d²min in Λ′, and N1 is the number of points of weight d²min/2 in any nonzero coset of Λ′ (if it is the same for all such cosets).
The coefficient of follows from the observation that in the code trellis, starting from a given zero state and ending at some later zero state, there are nonzero paths of length 2, nonzero paths of length 3, and so forth, up to a single nonzero path of the greatest length, so that the total number of nonzero paths is (this is generally true for any noncatastrophic rate- encoder; see Forney [26]).
Table XI gives the parameters for Class V codes based on the partitions , and , as in Class I, as well as , and . The first two are just the two-state Class I codes again, since . The third code is a code equivalent to the eight-state CS code based on , which may be considered as the prototype of this class. The fourth is a 128-state code with a coding gain in excess of 7 dB, but with a huge error coefficient and decoding complexity. The last three further illustrate that Class V codes attain large coding gains with relatively few states (and thus small decoding complexity) but with outsize error coefficients: the fifth code attains with only four states, while the last gets considerably beyond 6 dB with only 64 states. (In unpublished work, Wei had earlier constructed codes based on such 16-dimensional partitions as , with comparable coding gains.) Even if the effective number of states is taken as the order of the partition rather than (since the decoding complexity is dominated by decoding ), so that the effective number of states doubles, these are still very good (ignoring ).
Class VI Codes
Once again, let be a -way partition with distances , but now let be a rate convolutional encoder as shown in Fig. 13(e). This is the same as the Class V encoder, including the circuit , except that there is a second memory element, and its output is further added to the -tuple output of the encoder of Fig. 13(d).
Property b) of circuit again ensures that the code is noncatastrophic.
Furthermore, it ensures that if the input sequence has a finite number of nonzero , then the encoder outputs are nonzero at at least three different times: once when the sequence begins, once at some intermediate time when has a nonzero output, and once when the last nonzero input finally leaves the encoder. Consequently, the minimum distance between two distinct paths is at least , so that the minimum distance of the code is
, and the error coefficient is the same as that of .
Table XI gives the parameters for Class VI codes based on the same partitions as for Class V (except for , where no improvement is achieved). The four-state code is again a Class II code, and the 64-state code is equivalent to the corresponding Wei/Calderbank-Sloane code. The remaining codes illustrate that Class VI codes have the same coding gains as Class V codes, but with reasonable error coefficients, at the cost of increased decoding complexity (vastly increased for those codes with gains of more than 6 dB).
Class VII and VIII Codes
Now let be a two-level partition chain with distances and orders 2 and . Let be a rate- encoder as in Fig. 13(d) and (e), but with one output bit selecting one of the two cosets of in the partition , and the remaining bits selecting a coset of in the partition , as shown in Fig. 13(f).
Again, the codes are noncatastrophic. With Class VII codes, as with Class III, two distinct paths differ by at least where they diverge and where they merge, so that . With Class VIII codes, as with Class IV, two distinct paths differ by at least where they diverge, by at least at some intermediate time, and by at least where they merge, so that the minimum squared distance between distinct paths is at least . Hence , and the error coefficient is the same as that for .
Table XI gives the parameters for Class VII and Class VIII codes based on the partitions , and , which have orders , and 32, and depths , 2, and 3, respectively. The first Class VII code is the two-state Class III code again, and the first Class VIII code is again Ungerboeck's four-state code. The remaining codes are Wei-type codes. In particular, the second and third Class VIII codes correspond to Wei's 16-state four-dimensional and 64-state eight-dimensional codes, which are the prototypes of this class, and the last two Class VIII codes are 256-state elaborations of codes that Wei investigated for , and 128. The Class VII codes are closely related to Class VIII codes, except that they have half the state space dimension and the minimum squared distance and coding gain.
VII. Discussion
A large number of codes have been discussed in a common framework in this paper. In this section we draw what conclusions seem warranted by the evidence.
1) Trellis codes and lattice codes are comparable with respect to fundamental parameters such as fundamental coding gain versus number of states. Considering the sequence of Barnes-Wall lattices, we see that it takes two states to get , four states to get (3.01 dB), 16 states to get , and 256 states to get . The depths of these lattices are , and 4; their redundancies are , and 2; and their minimum squared distances are , and 16.
These properties are shared by the generic trellis codes that we have called Class I, II, and IV, which include a two-state code based on the partition and the four-state Ungerboeck code based on the partition , both with minimal error coefficient .
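The gains quoted for the Barnes-Wall sequence can be checked numerically from the fundamental-coding-gain relation γc = d²min/2^ρ, with ρ the normalized redundancy per two dimensions; the values ρ = 1/2, 1, 3/2, 2 and d²min = 2, 4, 8, 16 used below are my reading of the partially garbled list above (redundancy equal to half the depth, as noted later for self-dual lattices):

```python
import math

def coding_gain_db(d2min, rho):
    """Fundamental coding gain in dB: 10 log10(d2min / 2**rho)."""
    return 10 * math.log10(d2min / 2 ** rho)

# Assumed (d2min, rho) pairs for the four Barnes-Wall lattices.
for d2, rho in [(2, 0.5), (4, 1.0), (8, 1.5), (16, 2.0)]:
    print(round(coding_gain_db(d2, rho), 2))
# 1.51, 3.01, 4.52, 6.02
```

These reproduce the 1.5/3/4.5/6 dB progression cited in the folk theorem below.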
All of the trellis codes that achieve require 16 states, except for the GCS/Wei eight-state four-dimensional code, which has an error coefficient of , and the four-state Class I and V codes, whose error coefficients are very large and whose decoding complexity is not that much less than that of the Wei/Class VIII 16-state four-dimensional code, for example.
Note also that the lattices and achieve and with 8 and 16 states, respectively, but with and , like the Wei codes.
There is a nearby cluster of codes that achieve (4.77 dB), with either and , or , and ; e.g., the 16- and 32-state two-dimensional Ungerboeck codes, the 16-state CS/Eyuboglu codes, or the 16-state Class III and Class VII codes.
There is another cluster of codes at . While there are codes that achieve this fundamental coding gain with as few as eight states (e.g., the eight-dimensional CS code, or two of the Class V codes), it seems to take 32 or 64 states to get reasonable error coefficients (e.g., the Wei or CS eight-dimensional codes).
To achieve , all of the trellis codes with reasonable error coefficient ( ) require 256 states. There are such codes with as few as 16 states (e.g., Class I) but with very large error coefficients and without as much saving in decoding complexity as the low number of states would suggest (because most of the complexity occurs in decoding the lattice partition). There are a number of good 128-state codes, but there are also lattices ( and ) that achieve nearly 6 dB with 128 states. The 256-state one-dimensional Ungerboeck code is remarkable: it obtains with minimal error coefficient , and with quite low decoding complexity. Note that it has and , like and .
In summary, we propose a folk theorem: it takes two states to get 1.5 dB, four states to get 3 dB, 16 states to get 4.5 dB, perhaps 64 states to get 5.25 dB, and 256 states to get 6 dB, as long as we require a reasonably small error coefficient (for trellis codes).

2) Trellis codes are better than lattice codes, if we consider effective coding gain versus decoding complexity. Granted, our measure of effective coding gain is based on a rule of thumb that is only approximately valid for moderately low error rates and that generally does not take into account neighbors other than the nearest; granted also, our measure of decoding complexity is based specifically on the algorithms of part II and is highly implementation-dependent. We have not even given an effective coding gain for lattice codes, because our rule of thumb is questionable when the number of nearest neighbors is so high. Nonetheless, it seems clear that the very large error coefficients of lattice codes will mean that their effective performance will be significantly inferior to that of comparable trellis codes. For codes with the same parameters, due to the many symmetries of the lattice codes, the decoding complexity does seem to increase slightly as we go from a lattice code to Class I to Class II to Class IV, i.e., as the code becomes "more convolutional", but this slight effect is very much outweighed by the large reduction in error coefficient.

3) It is best to keep the redundancy as small as possible, within reasonable limits. The densest lattices are all self-dual, so their redundancy is equal to half their depth (the comparable trellis codes use rate- encoders). Ungerboeck [8] made the point, using channel capacity arguments, that there is little to be gained by going beyond 1 bit of redundancy per symbol, i.e., by using rate- ( ) encoders with .
Wei [11] recognized that, by going beyond two dimensions, the normalized redundancy could be reduced below one, and thus that good codes could be obtained with small constellation expansion. The evidence of the codes presented here is that, while the very best codes (e.g., Ungerboeck's four-state two-dimensional code, or all of his one-dimensional codes) may have informativity equal to redundancy, like the best lattices, there is very little loss if redundancy is reduced, as long as we do not go to extremes (e.g., Wei's 16-dimensional codes, with ). Compare, for example, the Ungerboeck-type two-dimensional codes with the one-dimensional; or the codes (or lattices) with that achieve , versus those with .

4) The Ungerboeck codes are still the benchmark. Comparing all codes shown in Fig. 12(a)-(c), we see that little improvement has been achieved over Ungerboeck's original results. The one-dimensional codes are generally slightly better than the two-dimensional codes, but this is offset by their normalized redundancy of , which gives a constellation expansion factor of four, versus the two-dimensional redundancy of , which gives a constellation expansion factor of two. Some of the codes found by Eyuboglu are slightly better, but this is not surprising, because any -state rate- code can also be regarded as a -state rate- code. Some of Wei's four-dimensional codes are also slightly better; this is more surprising and significant because these codes also have normalized redundancy . (The Wei eight-dimensional codes are also in the vicinity of the two-dimensional Ungerboeck codes but with ; the comparable Wei/Calderbank-Sloane codes have slightly higher decoding complexity and, more importantly, .)
Of all the codes we have considered, a few stand out as "special." The four-state two-dimensional Ungerboeck code is certainly in this category, because it is the unique code with and and because of its symmetries and close relationship to the special lattice . As mentioned before, the 256-state one-dimensional Ungerboeck code is also special because it is a code with and , which makes it the trellis cousin of the very special lattice . The 16-state four-dimensional Wei code is the single code that most clearly improves on the Ungerboeck-type codes; note that it has the same parameters as the lattice .
(However, could there be a 16-state code with , i.e., with the same parameters as , that achieves or 8? Or a 16-state, code that achieves , or 8?)
These results suggest that there is little likelihood of finding significantly better codes in terms of the parameters that we have considered. Above 5 dB , where the curves for the Ungerboeck-type codes begin to tail off, there could be codes with 64 or more states that are superior to those known, although it is also possible that this is close enough to channel capacity that the performance/complexity curve will tend to saturate for all codes. We do not expect to need depths more than three to four in this range, so in view of Lemma 5 a systematic search of codes should settle the question. (Ternary codes may also be attractive in this region; see [16].)
VIII. Conclusion
We have defined coset codes in such a way as to embrace all of the good known codes and to suggest a large variety of extensions. Their characterization in terms of geometrical parameters like the fundamental coding gain turns out to be quite simple and allows us to sort out from the variety of schemes that have been proposed those that seem to have the best combinations of coding gain, decoding complexity, and constellation expansion.
With respect to those parameters, Ungerboeck's original codes continue to stand up very well vis-à-vis the rest of the codes considered. Wei's codes probably represent the most significant improvement, particularly because they reduce the constellation expansion factor below two, while achieving some gains in coding gain and decoding complexity. While the codes of Calderbank and Sloane do not rise to the top in any of our comparisons, their introduction of the lattice/coset viewpoint has clearly been the most significant conceptual contribution since Ungerboeck.
In the opinion of the author, while many of the best codes may have already been discovered, the fields of coset codes and trellis codes are no further developed than that of ordinary coding theory in the early 1960's. There may well be better codes still to be discovered in the range, as indicated in the previous section. Suboptimal decoders should be investigated, as well as codes specifically tailored for such decoders. The design of good sphere packings in large dimensions is a topic of active mathematical interest, and the development of still more powerful trellis codes is wide open. Codes which combine good coding gain with other properties, such as rotational invariance and decoding delay, will be important for applications. The theory of phase-modulated coset codes should be brought along in parallel with that of the lattice-type codes. The combination of these codes with spectral shaping requirements, e.g., signalling for partial-response or other band-limited channels, is an important topic. The vector quantization problem is dual to the sphere packing problem and in the block case has been attacked successfully with lattices; there should also be good trellis quantizers. The question of how to design good multidimensional constellations is not closed. Finally, it seems that mathematicians should be interested in trellis codes, particularly as infinite-dimensional generalizations of finite-dimensional sphere packings. As in other parts of information theory, the interplay between theoretically and practically motivated research is likely to prove fruitful for some time.
Acknowledgment
This work was directly stimulated by the work of L.-F. Wei on multidimensional trellis codes. Remarks by A. R. Calderbank and preprints of the Calderbank-Sloane papers were most helpful in pointing the way to the lattice/coset viewpoint. G. R. Lang kindly provided references to the early history of lattices in communications. I am indebted to M. V. Eyuboglu for providing some of the code parameters and for permission to publish the improved codes cited in the text. I am grateful for comments on earlier versions of this paper by J. B. Anderson, A. R. Calderbank, R. G. Gallager, M. I. Klun, G. Ungerboeck, and L.-F. Wei.
References
[1] G. D. Forney, Jr., R. G. Gallager, G. R. Lang, F. M. Longstaff, and S. U. Qureshi, "Efficient modulation for band-limited channels," IEEE J. Select. Areas Commun., vol. SAC-2, pp. 632-647, 1984.
[2] E. S. Barnes and G. E. Wall, "Some extreme forms defined in terms of Abelian groups," J. Australian Math. Soc., vol. 1, pp. 47-63, 1959.
[3] J. Leech, "Notes on sphere packings," Can. J. Math., vol. 19, pp. 251-267, 1967.
[4] H. S. M. Coxeter, Twelve Geometric Essays. Carbondale, IL: Southern Illinois Univ. Press, 1968.
[5] R. deBuda, "The upper bound of a new near optimal code," IEEE Trans. Inform. Theory, vol. IT-21, pp. 441-445, 1975.
[6] J. Leech and N. J. A. Sloane, "Sphere packings and error-correcting codes," Can. J. Math., vol. 23, pp. 718-745, 1971.
[7] J. H. Conway and N. J. A. Sloane, Sphere Packings, Lattices and Groups. New York: Springer-Verlag, 1988.
[8] G. Ungerboeck, "Channel coding with multilevel/phase signals," IEEE Trans. Inform. Theory, vol. IT-28, pp. 55-67, 1982.
[9] L. F. Wei, "Rotationally invariant convolutional channel coding with expanded signal space. Part II: Nonlinear coding," IEEE J. Select. Areas Commun., vol. SAC-2, pp. 672-686, 1984.
[10] A. R. Calderbank and N. J. A. Sloane, "Four-dimensional modulation with an eight-state trellis code," AT&T Tech. J., vol. 64, pp. 1005-1018, 1985.
[11] L. F. Wei, "Trellis-coded modulation with multidimensional constellations," IEEE Trans. Inform. Theory, vol. IT-33, pp. 483-501, 1987.
[12] A. R. Calderbank and N. J. A. Sloane, "An eight-dimensional trellis code," Proc. IEEE, vol. 74, pp. 757-759, 1986.
[13] __, "New trellis codes based on lattices and cosets," IEEE Trans. Inform. Theory, vol. IT-33, pp. 177-195, 1987.
[14] G. D. Forney, Jr., and L. F. Wei, "Multidimensional signal constellations," in preparation, 1989.
[15] F. Pollara, R. J. McEliece, and K. Abdel-Ghaffar, "On finite-state codes," submitted to IEEE Trans. Inform. Theory, 1987.
[16] G. D. Forney, Jr., "Coset codes - Part III: Ternary codes, lattices, and trellis codes," in preparation, 1989.
[17] A. LaFanchere, R. H. Deng, and D. J. Costello, Jr., "Multidimensional trellis coded phase modulation using unit-memory and partial-unit-memory convolutional codes," submitted to IEEE Trans. Inform. Theory, 1987.
[18] G. D. Forney, Jr., "Coset codes-Part II: Binary lattices and related codes," IEEE Trans. Inform. Theory, this issue, pp. 1152-1187.
[19] E. L. Cusack, "Error control codes for QAM signalling," Electron. Lett., vol. 20, pp. 62-63, 1984.
[20] G. D. Forney, Jr., "Convolutional codes I: Algebraic structure," IEEE Trans. Inform. Theory, vol. IT-16, pp. 720-738, 1970.
[21] G. Ungerboeck, "Trellis-coded modulation with redundant signal sets. Part II: State of the art," IEEE Commun. Mag., vol. 25, no. 2, pp. 12-21, 1987.
[22] M. L. Honig, "Optimization of trellis codes with multilevel amplitude modulation with respect to an error probability criterion," IEEE Trans. Commun., vol. COM-34, pp. 821-825, 1986.
[23] G. J. Pottie and D. P. Taylor, "An approach to Ungerboeck coding for rectangular signal sets," IEEE Trans. Inform. Theory, vol. IT-33, pp. 285-290, 1987.
[24] D. Divsalar and M. K. Simon, "Multiple trellis-coded modulation (MTCM)," preprint, 1986.
[25] D. Divsalar, M. K. Simon, and J. H. Yuen, "Trellis coding with asymmetrical modulations," IEEE Trans. Commun., vol. COM-35, pp. 130-141, 1987.
[26] G. D. Forney, Jr., "Structural analysis of convolutional codes via dual codes," IEEE Trans. Inform. Theory, vol. IT-19, pp. 512-518, 1973.
[^0]: Manuscript received September 2, 1986; revised September 18, 1987. This paper was presented at the 1987 Information Theory Workshop, Bellagio, Italy, June 24, 1987.
The author is with the Codex Corporation, 7 Blue Hill River Road, Canton, MA 02021.
IEEE Log Number 8824503.