SEMI-ANNUAL STATUS REPORT

Investigation of Bandwidth-Efficient Coding and Modulation Techniques

Period Covered July 1991 - January 1992

NASA Grant No. NAG 5-1392

Principal Investigator: Dr. William P. Osborne

New Mexico State University
Electrical and Computer Engineering
Box 30001 - Dept. 3-0
Las Cruces, New Mexico 88003

EXECUTIVE SUMMARY

New Mexico State University (NMSU) is studying for the National Aeronautics and Space Administration (NASA) the necessary technology to improve the bandwidth efficiency of the space-to-ground communications network using the current capabilities of that network as a baseline. This study is aimed at making space payloads, for example the Hubble Space Telescope, more capable without the need to completely re-design the link.

In particular, the study addresses:

a) Determine the requirements for converting an existing standard 4-ary phase shift keying communications link to one that can support, as a minimum, 8-ary phase shift keying with error correction applied.

b) Determine the feasibility of using the existing equipment configurations with additional signal processing equipment to realize the higher order modulation and coding schemes.

These efforts are being undertaken by the faculty and students of New Mexico State University. Continued direct interactions with personnel at Goddard Space Flight Center and the White Sands Ground Terminal will be required for successful completion of the project.

Results During the Period July 1991 to January 1992

During this period, the studies on the Grant have been focused in two areas. These are:

(1) One of the drawbacks to introducing any new modulation scheme into NASA's space network is the need for new modem equipment and the potential for supporting both old and new modulation types for many years. To minimize these logistics problems, NMSU has been studying multi-mode modem/codec units. These designs have complexity essentially equal to that of a normal coded QPSK modem while processing coded QPSK as well as coded 8PSK and coded 16PSK signal sets. The study of these modems is discussed in Section 1.

(2) 8PSK Carrier Synchronization - For TCM to become practical, and to support the TDRSS TCM demonstration being performed by the NMSU Telemetry Center at the WSGT, a carrier tracking loop capable of supporting coded 8PSK must be designed. During the studies performed under this grant a MAP phase estimator for 8PSK has been found and analyzed. The performance analysis of this estimator is discussed in Section 2.

Section 1 - Multi-Mode Modem/Codec Designs

Introduction

In many applications, it is economical for a single modem or codec unit to receive multiple modulation and coding formats. In particular, the use of BPSK and QPSK with constraint length 7, rate one half convolutional encoding and Viterbi decoding has become a de facto standard in many military and NASA systems. However, there is significant interest in employing 8PSK and 16PSK trellis coded modulation (TCM) in these same systems today in order to conserve bandwidth while retaining the advantage of coding gain.

This study addresses the design of an integrated modem/codec unit which can receive coded and uncoded BPSK and QPSK using the de facto standard coding schemes as well as 8PSK-TCM and 16PSK-TCM. This design is totally compatible with today's modulation schemes and capable of processing tomorrow's TCM codes. This is accomplished in the modem by using quadrature channel carrier recovery processing and a version of the MAP phase detector algorithm. The symbol synchronization is accomplished with a derivative of an early-late gate designed to accommodate multi-level signals. The coding scheme is the de facto standard for the BPSK and QPSK cases, using an off-the-shelf Viterbi decoder chip. The TCM decoding uses this same chip with unique outboard circuitry to convert the I and Q signals resulting from the inphase and quadrature correlations of the received 8/16PSK signals into a signal set that can be decoded with the off-the-shelf decoder chip.

An overview of the modem/codec design is given next. Details of the design and the performance of the carrier loop, symbol synchronizer and codec units are given in the following part. The multimode modem/codec unit described in this brief is capable of demodulating coded BPSK, QPSK, 8PSK and 16PSK. The coding gains achieved are 5.5 dB, 5.5 dB, 3.0 dB and 3.0 dB, respectively. The overall complexity of the unit is very similar to the complexity of a standard BPSK/QPSK design.

Overview

The overall block diagram of the proposed multimode modem/codec unit is shown in Figure 1. Note that only the receive side functions are shown, as the transmit side is a trivial extension of current QPSK units. The received IF signal consists of the sum of an encoded MPSK signal (M = 2, 4, 8 or 16) and wideband Gaussian noise. This signal is first processed by a carrier tracking loop and demodulator unit. This unit is a generalized Costas crossover loop of the variety suggested in [11]. The carrier loop supplies the output of the integrate and dump matched filters in a quantized soft decision format to the codec unit and the analog signals from the coherent detection process to the symbol synchronizer.

The symbol synchronizer supplies the received clock for the carrier loop integrate and dump filters as well as the codec. The symbol synchronizer is a generalized early/late gate design adapted to track the multi-level signals found at the demodulator output when 8/16 PSK is applied to its input.

The codec unit is designed to provide soft-decision decoding of 2, 4, 8 and 16PSK. The coding employed for BPSK and QPSK is the industry standard R=1/2, K=7 convolutional code. The coding for 8/16PSK is the "pragmatic" TCM approach recently suggested by Viterbi [2]. This "pragmatic" scheme employs the same code as the BPSK/QPSK signal set, with set partitioning and outboard decisions as appropriate for 8/16PSK transmission (see the codec part for further explanation). Using this approach it is possible to employ an existing Viterbi decoder chip and a minimal amount of outboard logic to decode all four modulation formats as described later, i.e., this unit, like the others, becomes a generalized MPSK processor.

Carrier Tracking Loop and Coherent Demodulator

In a multi-modulation scheme environment such as the modem proposed in this paper it is advantageous to consider the use of a single carrier tracking system. The complexity of synchronous communication systems alone warrants this consideration. Further, concern for limited space, cost, and power strengthens the argument in favor of a single tracking system. The Costas crossover loop is typically used in the tracking of BPSK and QPSK signals. However, the hard-limiters employed by the loop make it inappropriate for use in higher order modulation schemes such as 8 and 16PSK systems, where the modulation data takes on values other than ±1.

The use of decision directed loops that employ maximum a posteriori estimation techniques has been proposed for tracking of MPSK modulation [1]. The technique employs a quadrature channel carrier recovery loop and a polar phase estimator. Using the output of the quadrature channel matched filters, the polar phase estimator makes a hard decision as to what modulation data was transmitted during the last symbol period. This estimate is then used in conjunction with the filter outputs to generate an error signal. The error signal is passed to a loop filter and VCO which generates the local carrier reference for demodulation. Figure 2 shows the block diagram of the MAP estimation loop.

The MAP estimator performs several functions in making its decision as to what was transmitted. First it obtains the phase angle that is conveyed with I and Q by taking the arctangent of the ratio Q/I. The angle is then compared with each possible modulation angle (e.g., in 8PSK the eight modulation angles, spaced $\pi/4$ apart, could be chosen as $\pi/8$, $3\pi/8$, $5\pi/8$, $7\pi/8$, $9\pi/8$, $11\pi/8$, $13\pi/8$, and $15\pi/8$). The modulation angle that is closest to the received angle is selected as the maximum a posteriori estimate of the transmitted angle. The cosine and sine of the estimate are taken and used to generate an error signal.

To form the error signal, the outputs of the I and Q matched filters are multiplied by the sine and cosine of the angle estimate, respectively, and the difference between the two products is taken. This difference is shown in Figure 2 as the input to the loop filter. The first part of this difference will, in a high SNR environment, average to zero. This follows since under high SNR conditions the a posteriori estimate of the received value of I is expected to equal the transmitted value of I, and similarly for the Q channel. The second part of the error signal will reduce to $\sin(\phi)$ in a high SNR environment since $I^2 + Q^2$ will be equal to 1. This is the traditional PLL tracking error quantity which occurs with a mixing phase detector and pure sinewave inputs. With a loop filter whose Laplace transform is $1+a/s$, and in the absence of symbol errors, this tracking system performs identically to a second-order PLL.
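
The decision and error-signal steps described above can be illustrated with a short sketch. It is not the modem's implementation: the function names are hypothetical, the noise-free outputs are assumed to be I = cos(θ + φ) and Q = sin(θ + φ), and the modulation angles are assumed to lie at 2πk/M. Under those assumptions the difference of the two cross products reduces to sin(φ) whenever the hard decision is correct.

```python
import math

def map_phase_error(i_out, q_out, m):
    """Decision-directed phase error for M-ary PSK (illustrative sketch).

    Assumes I = cos(theta + phi), Q = sin(theta + phi) and modulation
    angles at 2*pi*k/m; both conventions are assumptions of this example.
    """
    step = 2.0 * math.pi / m
    received = math.atan2(q_out, i_out)       # angle conveyed by I and Q
    theta_hat = round(received / step) * step  # nearest modulation angle
    # Difference of the two cross products; sin(phi) for a correct decision.
    return q_out * math.cos(theta_hat) - i_out * math.sin(theta_hat)

def demo(m=8, phi=0.05):
    """Error signal for every transmitted symbol at a fixed phase offset."""
    errors = []
    for k in range(m):
        theta = 2.0 * math.pi * k / m
        errors.append(
            map_phase_error(math.cos(theta + phi), math.sin(theta + phi), m))
    return errors
```

Calling `demo(8, 0.05)` returns the same value, sin(0.05), for all eight symbols, which shows that the data modulation is removed as long as the phase error stays below π/M.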

The MAP estimator in an MPSK modem can be easily implemented using digital logic. The I and Q channels are sampled and the digital information is passed to a set of erasable programmable read-only memories (EPROMs). The EPROMs are used not only to generate the MAP estimates of I and Q but to perform the error signal calculation as well. The current modem design utilizes digital integrate and dump filters, EPROMs, a digital loop filter, and a numerically controlled oscillator (NCO). The use of digital circuitry has greatly reduced the complexity of the overall modem design.
Selecting the modulation type is simply a matter of selecting the correct EPROM addresses for the phase detector output.
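
A table-lookup phase detector of this kind might be sketched as follows. The seven address bits per channel follow the simulator description in this section; the quantizer range, the address layout, the signal convention (I = cos, Q = sin) and all names are assumptions made only for illustration, and a real EPROM would hold integer words rather than the floating point values stored here.

```python
import math

BITS = 7                  # seven MSBs of each channel address the table
LEVELS = 1 << BITS
FULL_SCALE = 1.5          # assumed quantizer range for a unit-amplitude signal

def quantize(x):
    """Map an analog sample to a 7-bit code (assumed uniform quantizer)."""
    code = int((x + FULL_SCALE) / (2 * FULL_SCALE) * LEVELS)
    return min(max(code, 0), LEVELS - 1)

def dequantize(code):
    """Mid-point amplitude represented by a 7-bit code."""
    return (code + 0.5) / LEVELS * 2 * FULL_SCALE - FULL_SCALE

def build_table(m):
    """Precompute the phase detector output for every (I, Q) address."""
    step = 2 * math.pi / m
    table = [0.0] * (LEVELS * LEVELS)
    for ic in range(LEVELS):
        for qc in range(LEVELS):
            i_val, q_val = dequantize(ic), dequantize(qc)
            theta_hat = round(math.atan2(q_val, i_val) / step) * step
            table[(ic << BITS) | qc] = (q_val * math.cos(theta_hat)
                                        - i_val * math.sin(theta_hat))
    return table

# One table per modulation type; choosing the mode selects the table.
TABLES = {m: build_table(m) for m in (2, 4, 8, 16)}

def phase_error(i_out, q_out, m):
    return TABLES[m][(quantize(i_out) << BITS) | quantize(q_out)]
```

With these assumptions each table has 16384 entries, and the looked-up error differs from sin(φ) only by the quantization error of the two inputs.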

To test the digital implementation technique a simulator was constructed using the C programming language. A baseband model of the carrier tracking system was developed for the simulation. The simulator quantizes the I and Q data channels, which are generated directly. The quantizing mimics proposed hardware designs for the modem. The I and Q data are quantized to 8 bits each. The seven most significant bits of each are used by the aforementioned EPROMs. Since this part of the modem design is the most novel, much of the simulation effort focused on it. To demonstrate that the simulations were accurate, normalized step responses of the carrier tracking loop simulator for BPSK, QPSK, 8PSK, and 16PSK formats and random data were obtained. Figure 3 shows the normalized step response for 16PSK. The damping factor of the second order loop is 0.5. These simulator responses were compared to theoretical responses [3, p. 49] and determined to be accurate.
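
A step response test of this kind can be sketched in a few lines. This is not the C simulator described above: it is a noise-free, unquantized baseband model written in Python, with an assumed natural frequency of 0.02 rad per symbol, an assumed 0.1 rad phase step, and hypothetical function names. Only the damping factor of 0.5 and the loop filter form 1 + a/s are taken from the text.

```python
import math
import random

def phase_detector(i_out, q_out, m):
    """Decision-directed detector (assumes I = cos, Q = sin, angles 2*pi*k/m)."""
    step = 2 * math.pi / m
    theta_hat = round(math.atan2(q_out, i_out) / step) * step
    return q_out * math.cos(theta_hat) - i_out * math.sin(theta_hat)

def simulate_step(m=16, phase_step=0.1, wn=0.02, zeta=0.5,
                  n_symbols=2000, seed=1):
    """Phase detector output per symbol after a phase step, random data."""
    rng = random.Random(seed)
    gain = 2 * zeta * wn        # loop gain K, from 2*zeta*wn = K
    a = wn / (2 * zeta)         # filter constant, from wn^2 = K*a
    nco_phase, integ, errors = 0.0, 0.0, []
    for _ in range(n_symbols):
        theta = 2 * math.pi * rng.randrange(m) / m      # random data symbol
        angle = theta + phase_step - nco_phase
        e = phase_detector(math.cos(angle), math.sin(angle), m)
        errors.append(e)
        integ += a * e                  # integral branch of 1 + a/s
        nco_phase += gain * (e + integ)  # NCO accumulates the filter output
    return errors

def theory(k, phase_step=0.1, wn=0.02, zeta=0.5):
    """Linear second-order loop error response to a phase step."""
    wd = wn * math.sqrt(1 - zeta ** 2)
    return phase_step * math.exp(-zeta * wn * k) * (
        math.cos(wd * k) - zeta * wn / wd * math.sin(wd * k))
```

With a damping factor of 0.5 the error overshoots through zero before settling, and the simulated samples stay close to the linear second-order response as long as the phase error remains within the decision region of the detector.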

The most important performance parameter associated with carrier tracking loops is the phase jitter in the loop versus SNR, since the jitter results in detection performance loss. The variance is due to two factors. The first occurs in all PLLs and reflects the presence of noise in the incoming signal. The second factor is attributable to the use of a particular type of carrier tracking loop, in this case a decision directed loop. This second factor is most often referred to as the squaring loss. Its theoretical calculation for the modem’s MPSK MAP estimation carrier tracking loops is presented in Section 2.

The simulator was used to measure the variance in the phase error versus SNR, and these results are shown in Figures 4 and 5. Figure 4 shows the variance in phase error for BPSK and QPSK; the theoretical approximations for the variances, using the approach in Section 2, are plotted as well. Figure 5 shows the variance in phase error for 8PSK and 16PSK. The quantity $E_s/N_o$ in the plots is the ratio of PSK symbol energy to noise spectral density. Figure A-4 shows the theoretical squaring loss calculations for all four modulation techniques. Note the very large differences in squaring loss between the various schemes. At an SNR where squaring loss may be negligible for BPSK it can be prohibitive for 16PSK. Operating at an $E_s/N_o$ of 10 dB incurs no loss in loop performance due to squaring loss for BPSK, but for 16PSK there is a loss of more than 40 dB in loop performance that must be considered.

One aspect of the simulation that is worth noting is the change in loop noise bandwidth that occurs in the simulator as the SNR changes with constant signal level (perfect AGC). Since the phase detector gain is a function of error rate and hence SNR, as the SNR drops the phase detector gain drops and the corresponding loop noise bandwidth gets smaller. This in effect lowers the amount of jitter that is present when compared to a similar calculation made with a fixed loop bandwidth. The theoretical calculations of phase jitter based upon squaring loss and a linear model must take into account the changing loop noise bandwidth that occurs in the simulator (and in most actual loops) if they are to be compared with the phase error variances of the simulator. This was done for our theoretical results in Figures 4 and 5. It is interesting to note that this aspect of the simulator mimics practice more accurately than the theory does. As a rule, the phase detector gain of a carrier tracking loop is not modified on the fly to account for a changing SNR operating condition in an attempt to keep the loop noise bandwidth constant.

Symbol Synchronization

The matched filters of the I and Q quadrature data channels, shown in Figure 2, require an accurate symbol timing reference to optimally demodulate the data. This timing reference can be achieved by the use of a closed loop symbol synchronizer. Early-late gate symbol synchronizers are frequently employed when bipolar data symbols such as those of PSK are being used [3, p. 235]. This type of closed loop synchronizer uses two integrators to generate relative timing offsets which can be used to generate an error signal for a VCO to track on. The two integrators, designated the early and the late integrators, both perform gated integrations on one of the data channels. The width of the gates and their positions in time relative to the local estimate of the received clock signal can vary. The synchronizer employed by this modem can track symbols from any of the four PSK schemes mentioned previously. Figure 6 shows the symbol synchronizer being used. The gate width is T, the symbol period. The offsets of the gates, relative to the VCO, are -T/4 and +T/4 for the early and late integrators, respectively. These gate timings are those available on a commercial BPSK and QPSK symbol synchronizer chip that will be used to track 8PSK and 16PSK as well, as long as the input to the symbol synchronizer is not limited or forced to be ±1 in some other way. The symbol sync employed in this modem utilizes a linear input and true integration.
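
The gate arrangement described above (width T, offsets -T/4 and +T/4) can be illustrated with a small sketch. It assumes ideal rectangular pulses, no noise, and an error signal formed as the magnitude of the early integral minus the magnitude of the late integral; the synchronizer of Figure 6 may form its error differently, and all names here are hypothetical. Time is measured in symbol periods, and `tau` is the amount by which the local clock lags the true symbol boundary.

```python
import math
import random

def gated_integral(symbols, k, start):
    """Integral over one symbol period of a rectangular-pulse waveform,
    beginning `start` symbol periods after the true start of symbol k
    (valid for -1 < start < 1)."""
    if start >= 0:
        return (1 - start) * symbols[k] + start * symbols[k + 1]
    return -start * symbols[k - 1] + (1 + start) * symbols[k]

def early_late_error(symbols, k, tau):
    """|early| - |late| for gates offset by -1/4 and +1/4 of a symbol."""
    early = gated_integral(symbols, k, tau - 0.25)
    late = gated_integral(symbols, k, tau + 0.25)
    return abs(early) - abs(late)

def mean_error(levels, tau, n=4000, seed=7):
    """Average timing error signal over n random symbols."""
    rng = random.Random(seed)
    symbols = [rng.choice(levels) for _ in range(n + 2)]
    return sum(early_late_error(symbols, k, tau) for k in range(1, n + 1)) / n

# Assumed I-channel levels for an 8PSK constellation offset by pi/8.
LEVELS_8PSK = [math.cos((2 * j + 1) * math.pi / 8) for j in range(8)]
```

For bipolar ±1 data the average error is close to 2·tau for small offsets. For the multi-level 8PSK case the error keeps the same sign as the timing offset but with a smaller slope, which is in line with the reduced detector gain discussed later in this part.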

The simulations of this type of early-late gate synchronizer have demonstrated the successful tracking of 8/16PSK signals. One way to measure the performance of the early-late gate symbol synchronizer is to measure the variance of the timing jitter, as was done for the carrier tracking loop. Figure 7 shows the variance of the timing error of the simulated synchronizer for QPSK and 8PSK. As with the carrier tracking loop, this data reflects the increase in jitter with decreasing SNR. Loop noise bandwidth, once again, plays an important role in evaluating performance. The phase detector gain changes for the different modulation schemes. The reason for this is the occurrence of relatively small differential amplitude transitions with higher modulation types. For BPSK the transitions that occur are always between +1 and -1. For QPSK the transitions for I or Q are also between +1 and -1. However, when 8PSK or 16PSK is employed the average amplitude change when a transition occurs is almost half that of BPSK and QPSK. The result is that the phase detector gain, and thus the loop noise bandwidth, is twice as large for BPSK and QPSK as it is for 8PSK and 16PSK.

Codec Unit

The codec unit is built around an off-the-shelf Viterbi decoder chip that was originally designed for soft-decision decoding of the standard R=1/2, K=7 convolutional code. The decoding capability of this chip is extended to TCM codes by employing the "pragmatic" approach to TCM and adding a small amount of external logic to implement this approach.

Pragmatic TCM, introduced by Viterbi [6], is illustrated here in Figure 8. One of k data bits, referred to as the convolutional bit, is fed into a rate 1/2 convolutional encoder, which generates two codebits. The convolutional encoder is the industry standard 64-state encoder shown here in Figure 9. The uncoded data bit(s), referred to as the outboard bits, together with the codebits, select the signal vector, so the code rate of pragmatic TCM is always k/(k+1). (Ungerboeck [4, 5, 6] has pointed out that for TCM in general, little is to be gained by reducing the code rate to less than k/(k+1).) In this brief, the term partition is used to refer to a set of signal vectors having the same codebits but different outboard bits. When pragmatic TCM is received, the decoded sequence is used to identify the most likely partition, then threshold decisions are used to identify the most likely vector from the selected partition. At SNRs at which operation is practical, the probability of incorrectly decoding the convolutional bit becomes insignificant, so the overall probability of error reduces to the probability of making an incorrect outboard decision. To minimize the probability of error, the signal vectors of a partition are made to be as far apart as the signal constellation will allow. As an example, the signal constellation for rate 2/3 TCM is shown in Figure 10. As can be seen, there are four partitions: \(\{000, 100\}\), \(\{001, 101\}\), \(\{011, 111\}\), and \(\{010, 110\}\). Each partition consists of two vectors which differ by 180 degrees of phase, the maximum distance possible in the 8-PSK constellation. The constellation for 16-PSK is shown in Figure 11. In this case, a partition consists of four vectors separated by 90 degrees of phase.
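
The partition structure can be checked with a short sketch. The bit-to-phase mapping used here is an assumption, since Figures 10 and 11 are not reproduced: the two least significant bits are taken to be the codebits, Gray-ordered over four adjacent sectors, and the remaining bits are taken to be the outboard bits, which step the vector around the circle in units of four sectors. All names are hypothetical.

```python
import math

GRAY = [0b00, 0b01, 0b11, 0b10]   # assumed codebit order over adjacent sectors

def label_to_phase(label, m=8):
    """Phase of a signal vector under the assumed mapping."""
    code = label & 0b11            # two codebits (assumed to be the LSBs)
    outboard = label >> 2          # one outboard bit (8-PSK) or two (16-PSK)
    sector = GRAY.index(code) + 4 * outboard
    return 2 * math.pi * sector / m

def partitions(m=8):
    """Group labels that share codebits but differ in outboard bits."""
    groups = {}
    for label in range(m):
        groups.setdefault(label & 0b11, []).append(label)
    return groups

def min_partition_distance(m=8):
    """Smallest distance between two vectors of one partition (Es = 1)."""
    best = float("inf")
    for members in partitions(m).values():
        for a in members:
            for b in members:
                if a < b:
                    dphi = label_to_phase(a, m) - label_to_phase(b, m)
                    best = min(best, abs(2 * math.sin(dphi / 2)))
    return best
```

Under this mapping `partitions(8)` reproduces the four partitions listed above, with members 180 degrees apart at a distance of 2 (that is, the square root of 4Es), while for 16-PSK the members of a partition are 90 degrees apart with a minimum distance equal to the square root of 2.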

The benefit of coding is determined by comparing the error probability of the coded system to that of an uncoded system with the same number of bits per symbol, so that the two systems have equal spectral efficiency. For example, coded 8-PSK and uncoded QPSK both carry two bits per symbol. As explained earlier, the probability of error for pragmatic TCM reduces to the probability of an outboard decision error. In the case of 8-PSK, this is the probability of incorrectly selecting between two vectors at a distance of \(\sqrt{4E_s}\) from each other. The most likely error in QPSK is that of incorrectly selecting between two vectors which differ by \(\sqrt{2E_s}\). Therefore, the performance of coded 8-PSK is expected to be approximately equivalent to that of uncoded QPSK with twice the energy. Since the number of bits per symbol is the same, and therefore the energy per symbol is equivalent, this amounts to a coding gain of 3 dB. The approximation is not exact because 8-PSK has a nonzero probability of incorrectly decoding the convolutional bit, and QPSK has more than one vector at a distance \(\sqrt{2E_s}\) from the correct vector. Simulation, however, shows that pragmatic TCM does indeed achieve very close to 3 dB of coding gain at higher signal to noise ratios. Coded 16-PSK is compared to uncoded 8-PSK, as both carry 3 bits per symbol. In uncoded 8-PSK there are two nearest neighbor vectors at a distance of \(0.76537 \sqrt{E_s}\). In pragmatic coded 16-PSK there are two outboard decision vectors at a distance of \(\sqrt{2E_s}\), meaning that there is an asymptotic coding gain of \(\sqrt{2} / 0.76537\), or 5.3 dB. In the case of 16-PSK, the asymptotic coding gain is overly optimistic, due to the fact that there is a non-zero probability of error in decoding the convolutionally encoded bit. Simulations demonstrate a coding gain of about 3.2 dB for 16PSK TCM over uncoded 8PSK.
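
The distance arithmetic behind these asymptotic gains can be reproduced directly. The sketch below is only a worked calculation with hypothetical helper names; it normalizes the symbol energy to one and expresses each gain as 20 log10 of the ratio of the relevant distances.

```python
import math

def min_distance(m):
    """Nearest-neighbour distance in unit-energy M-PSK: 2*sin(pi/M)."""
    return 2 * math.sin(math.pi / m)

def gain_db(d_coded, d_uncoded):
    """Asymptotic gain from a ratio of Euclidean distances."""
    return 20 * math.log10(d_coded / d_uncoded)

# Coded 8-PSK (outboard distance sqrt(4Es)) against uncoded QPSK (sqrt(2Es)).
gain_8psk = gain_db(2.0, min_distance(4))

# Coded 16-PSK (outboard distance sqrt(2Es)) against uncoded 8-PSK.
gain_16psk = gain_db(math.sqrt(2), min_distance(8))
```

Here `min_distance(8)` evaluates to about 0.76537, `gain_8psk` to about 3.01 dB and `gain_16psk` to about 5.33 dB, matching the figures quoted above before the effect of convolutional bit errors is taken into account.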

The argument in favor of using pragmatic TCM as opposed to the best found TCM code is as follows: pragmatic TCM is straightforward to implement, uses a currently available industry standard decoder, and uses the same decoder for a variety of signal constellations, while sacrificing very little in coding gain compared to the optimal code. From the analytical work of Viterbi [2], the performance of the pragmatic approach is within 0.6 dB of the best known Ungerboeck code [3, 4] for 8PSK TCM and 0.1 dB for 16PSK TCM. Using the same Viterbi decoder for a variety of signal constellations is important since the Viterbi decoder represents a relatively large ASIC.

Implementation of 8-PSK Pragmatic TCM

The Viterbi decoder chip will accept inputs in either of two modes: hard decision, in which the receiver makes a binary determination that the received codebit is a "0" or a "1", or soft decision, in which the receiver indicates, on some specified scale, the relative likelihood that the received codebit is a zero or a one. When Viterbi decoding is used with binary signalling, the use of soft decisions can improve performance by as much as 2 dB over hard decisions [7]. Typically, the soft decision is generated by the quantization of an antipodal signal received in the presence of additive white Gaussian noise. Usually, a scale of 0 through 7 (3-bit soft decision) would be used, although decoders which use a scale of 0 through 15 (4-bit soft decision) are currently available. The decoder uses the soft decisions to calculate a branch metric to associate with each combination of codebits resulting from a state transition of the convolutional encoder. Some Viterbi decoder VLSI chips allow externally generated branch metrics to be supplied as alternatives to those which the decoder calculates internally from the soft decisions. This feature is provided so that adaptive metrics may be used; however, it is also extremely valuable in making the decoder work for a variety of signal sets. The branch metrics (external or internal) are then used by the decoder to determine the maximum likelihood sequence.

In order to make a binary decoder work for a variety of signal sets it is preferable to use externally generated metrics, calculated especially for the channel to be used. If external metric inputs are not available, a workable compromise can be accomplished by making special use of the soft decision inputs. The use of hard decisions will result in suboptimal performance on the BPSK and QPSK channels, but will result in unsatisfactory performance on the 8-PSK and 16-PSK channels. The key to making the system work is to understand that the decoder will perform well as long as the soft decision inputs or externally supplied branch metrics are reasonably accurate indications of the symbol likelihoods. Since the decoder requires the metrics to be discrete, a reasonable approach is to quantize the signal set space, then assign a pair of soft decision weights or a branch metric to each quantization point. An implementation using phase quantization and soft decision weights is presented in [8, 9, 10]. The system described in this paper uses 4-bit quantization of the I and Q components of the signal vector.

The multimode decoder is shown in Figure 12. Pragmatic TCM, in any of the modes described earlier, is transmitted over an additive white Gaussian noise channel and received by a demodulator with 4-bit (16-level) quantized outputs for the I and Q components. The I and Q components are either fed directly to the soft decision inputs (for BPSK or QPSK operation) or used to determine the branch metrics (for 8-PSK or 16-PSK operation). The inputs M1 and M0 select the mode of operation: 00=BPSK, 01=QPSK, 10=8-PSK, and 11=16-PSK. The mode select inputs determine whether the metric and outboard decision units for 16-PSK or 8-PSK will be enabled. If the BPSK or QPSK mode is selected, XSEL (external metric select) is non-asserted, meaning that the decoder will use the soft decisions, which are fed directly from the quantized I and Q components. If the BPSK mode is selected, SEQ (sequence) is asserted, meaning that the two code bits are received in series; in all other modes, SEQ is non-asserted, meaning that the two code bits are received in parallel. The outboard decision requires the decoded sequence, as well as the location of the received vector. Because a Viterbi decoder has a latency period of between 35 and 80 symbols, depending on the specific decoder model, the vector used to make the outboard decision must be delayed to match the data delay introduced by the Viterbi decoder. The Viterbi decoder is common to all four modes of operation, but the metrics and the outboard decisions are specific to the signal set. For BPSK and QPSK, the soft decisions on I and Q serve as the optimal metric. For 8-PSK and 16-PSK, special branch metrics must be provided. Outboard decisions are applicable to 8-PSK and 16-PSK only.
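
The mode selection logic described above amounts to a small truth table, sketched here for reference. The M1/M0 codes and the SEQ behavior are taken from the text; the assertion of XSEL in the 8-PSK and 16-PSK modes is inferred from the statement that those modes use externally determined branch metrics, and the type and function names are hypothetical.

```python
from typing import NamedTuple

class ModeConfig(NamedTuple):
    name: str
    xsel: bool           # external metric select
    seq: bool            # code bits received in series
    outboard_bits: int   # uncoded bits decided outside the Viterbi decoder

MODES = {
    (0, 0): ModeConfig("BPSK", xsel=False, seq=True, outboard_bits=0),
    (0, 1): ModeConfig("QPSK", xsel=False, seq=False, outboard_bits=0),
    (1, 0): ModeConfig("8-PSK", xsel=True, seq=False, outboard_bits=1),
    (1, 1): ModeConfig("16-PSK", xsel=True, seq=False, outboard_bits=2),
}

def configure(m1, m0):
    """Control settings for the mode selected by inputs M1 and M0."""
    return MODES[(m1, m0)]
```

For example, `configure(1, 0)` selects 8-PSK with external metrics enabled, parallel code bits, and one outboard bit per symbol.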

The branch metrics and outboard decisions are most readily implemented by letting the quantized I and Q components address lookup table ROMs. Because there are 16 levels of I and 16 levels of Q, each ROM must have 256 addresses. Since a 4-bit metric must be provided for each of four branches, the metric table must be 16 bits wide. A table must be provided for both the 8-PSK and 16-PSK modes of operation. The outboard decision table also requires 256 addresses, with a width of 8 bits for 16-PSK and 4 bits for 8-PSK. This is because an outboard decision (2 bits for 16-PSK, 1 bit for 8-PSK) is made for each of the four partitions; then, when the most likely partition is determined after Viterbi decoding, the system selects the outboard bits which were determined for the maximum likelihood partition.
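
One way such tables might be generated for the 8-PSK mode is sketched below. The table dimensions (256 addresses, four 4-bit metrics per word, one outboard bit per partition) follow the text. Everything else is an assumption made for illustration: the quantizer range, the address and bit packing, the constellation mapping, and a metric defined as a scaled squared Euclidean distance in which a smaller value means a more likely branch. The metric convention of an actual decoder chip may differ.

```python
import math

GRAY = [0b00, 0b01, 0b11, 0b10]   # assumed codebit order over adjacent sectors
FULL_SCALE = 1.5                  # assumed quantizer range, unit signal amplitude
METRIC_SCALE = 5.0                # assumed scaling of squared distance to 0..15

def quantize(x):
    """4-bit (16-level) code for an I or Q sample."""
    code = int((x + FULL_SCALE) / (2 * FULL_SCALE) * 16)
    return min(max(code, 0), 15)

def level(code):
    """Mid-point amplitude represented by a 4-bit code."""
    return (code + 0.5) / 16 * 2 * FULL_SCALE - FULL_SCALE

def point(code_bits, outboard):
    """Signal vector for given codebits and outboard bit (assumed mapping)."""
    angle = 2 * math.pi * (GRAY.index(code_bits) + 4 * outboard) / 8
    return math.cos(angle), math.sin(angle)

def build_roms():
    metric_rom, outboard_rom = [], []
    for address in range(256):
        i_val, q_val = level(address >> 4), level(address & 0xF)
        metric_word = outboard_word = 0
        for code_bits in range(4):            # one branch per codebit pair
            d2 = []
            for outboard in (0, 1):
                p_i, p_q = point(code_bits, outboard)
                d2.append((i_val - p_i) ** 2 + (q_val - p_q) ** 2)
            best = 0 if d2[0] <= d2[1] else 1  # outboard decision per partition
            metric = min(15, round(METRIC_SCALE * d2[best]))
            metric_word |= metric << (4 * code_bits)
            outboard_word |= best << code_bits
        metric_rom.append(metric_word)         # 16-bit word
        outboard_rom.append(outboard_word)     # 4-bit word
    return metric_rom, outboard_rom

METRIC_ROM, OUTBOARD_ROM = build_roms()

def lookup(i_val, q_val):
    """Four branch metrics and the packed outboard decisions for a sample."""
    address = (quantize(i_val) << 4) | quantize(q_val)
    word = METRIC_ROM[address]
    return [(word >> (4 * c)) & 0xF for c in range(4)], OUTBOARD_ROM[address]
```

For a received vector near (+1, 0) the smallest metric is the one for codebits 00 and its outboard bit is 0, while near (-1, 0) the same partition wins with an outboard bit of 1. The 16-PSK tables would follow the same pattern with two outboard bits per partition, giving the 8-bit outboard word mentioned above.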

The bit error rate performance of the multimode system is illustrated in Figure 13. The BPSK/QPSK curve is as given in the manufacturer's data sheet for the Viterbi decoder; the performance of the 8-PSK and 16-PSK modes was determined by simulation. The 8-PSK and 16-PSK modes are affected by the 16-level quantization of the I and Q components, as well as the 16-level quantization of the branch metrics, an effect which the simulations were designed to reflect. Ideally, the Viterbi algorithm would use continuous I/Q and continuous metrics, but the improvement of 0.2 dB over 4-bit quantization does not justify the computational complexity. The multimode system, consisting of the existing standard Viterbi decoder and a small amount of additional hardware, achieves meaningful coding gain in all modes of operation. At a bit error rate of $10^{-5}$, coded 16-PSK gains about 3.1 dB over uncoded 8-PSK. At the same bit error rate, coded 8-PSK gains essentially 3 dB over uncoded QPSK. Thus it can be seen that the pragmatic approach to TCM can lead to decoding schemes with minimal hardware complexity that address a wide range of applications.

Section 2 - Carrier Tracking Loop Squaring Loss

One of the more popular techniques for analyzing the jitter performance of carrier tracking loops is to linearize the loop and evaluate the resulting variance of the phase error. For the MAP loops of interest in this paper the resulting phase error variance can be expressed as

$$\sigma_\phi^2 = \frac{N_o}{2E_s} \cdot \frac{BL}{SR} \cdot \frac{1}{SL} \quad \text{(A-1)}$$

where $E_s/N_o$ is the ratio of the energy per symbol to the noise spectral density, SR is the symbol rate, BL is the loop noise bandwidth, and SL is the "squaring loss" of the phase detector.
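
As a numerical illustration of (A-1), the sketch below evaluates the phase error variance for given operating conditions. The function name is hypothetical and the ratio BL/SR = 0.01 used in the example is an assumed value, not one taken from the modem design. The squaring loss enters exactly as written in (A-1), with SL = 1 corresponding to a loop with no loss.

```python
import math

def phase_error_variance(es_no_db, bl_over_sr, squaring_loss=1.0):
    """Phase error variance in rad^2 from (A-1)."""
    es_no = 10 ** (es_no_db / 10)            # convert dB to a power ratio
    return (1.0 / (2.0 * es_no)) * bl_over_sr / squaring_loss

# Example: Es/No = 10 dB, BL/SR = 0.01 (assumed), no squaring loss.
variance = phase_error_variance(10.0, 0.01)
rms_degrees = math.degrees(math.sqrt(variance))
```

With these assumed values the variance is 5e-4 rad^2, or roughly 1.3 degrees rms.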

The "squaring loss" (the term was originally applied to BPSK) is the increase in phase jitter within the loop over a conventional PLL of the same bandwidth due to the nonlinearity involved in the phase detection process, i.e., the phase detector output PD is given by

\[ PD(\phi) = Q_h I - I_h Q \quad (A-2) \]

where I and Q are the analog outputs of the I and Q channel matched filters, I_h and Q_h are the hard decision channel outputs, and \(\phi\) is the phase error.

The squaring loss for BPSK and QPSK can be found in the literature [11, 12]. For 8 and 16 PSK no analytical result is readily available. The squaring loss is generated from two physical actions and these are:

(1) The phase detector gain (even at constant signal levels) depends upon the SNR through I_h and Q_h and goes down as the error rate goes up. This increases the jitter in the loop because there is less signal to track at a given SNR.

(2) The variance of the equivalent noise term in the loop is affected by the presence of errors also. Generally, this effect lessens the phase error by a slight amount.

The effects upon the variance of the noise are very secondary, as we will soon show for a QPSK loop. It can be shown by linear loop analysis that the squaring loss, neglecting the effects of errors on the noise term, is given by

\[ SL \,(SNR) = \frac{1}{G^2} \]

where G is the gain of the phase detector at zero phase error, normalized to one at high SNR.

The model used for evaluating phase detector gain is shown in Figure A-1. With this model I and Q are readily shown to be independent Gaussian random variables with statistics given by

\[ \mu_I = \cos(\theta_m + \phi) \quad \sigma_I^2 = \frac{N_0}{2E_S} \]

\[ \mu_Q = -\sin(\theta_m + \phi) \quad \sigma_Q^2 = \frac{N_0}{2E_S} \]

The phase detector characteristic, PD(\(\varphi\)), is the expected value of the error signal shown in Figure A-1, i.e.,

\[
PD(\varphi) = E(I\sin\theta_h - Q\cos\theta_h) \quad (A-3)
\]

To see how the phase detector works consider the no noise case and a static phase error of \(\varphi\) which is small compared to \(2\pi/M\). Then

\[
I = \cos (\theta_m + \varphi)
\]

\[
Q = -\sin (\theta_m + \varphi)
\]

\[
\theta_h = -\theta_m
\]

where

\[
\theta_m = \frac{2\pi}{M} m \quad m = 0, 1, 2, \ldots, M-1
\]

and \(\theta_h\) carries a minus sign because the polar estimator measures the angle of the vector (I, Q), which is \(-(\theta_m + \varphi)\) under the sign convention for Q used here. PD(\(\varphi\)) is then given by

\[
PD(\varphi) = -\cos (\theta_m + \varphi) \sin \theta_m + \sin (\theta_m + \varphi) \cos \theta_m
\]

or

\[
PD(\varphi) = \sin(\varphi) \text{ for no noise.} \quad (A-5)
\]

Of course, if the phase error is larger than \(\pi/M\) then the hard decision will select the modulation phase nearest to \(\theta_m + \varphi\) rather than \(\theta_m\), and thus the phase detector characteristic is periodic in \(\varphi\) with a period of \(2\pi/M\).

The phase detector characteristic in A-3 can be evaluated by averaging over the noise and the data as follows

\[
PD(\varphi) = E \left( \left[ \cos(\theta_m + \varphi) + N_I \right] \sin \theta_h + \left[ \sin(\theta_m + \varphi) + N_Q \right] \cos \theta_h \right)
\]

or

\[
PD(\varphi) = E \left( \sin(\varphi - \frac{2\pi i}{M}) \right) \quad (A-6)
\]

where \(i\) is the hard decision at the output of the polar estimator. It follows that

\[
PD(\varphi) = \sum_{i=0}^{M-1} Pr(i|0) \sin(\varphi - \frac{2\pi i}{M}) \quad (A-7)
\]

where we have fixed the transmitted data symbol at zero since \(PD(\varphi)\) is totally symmetrical with respect to transmitted symbols and \(Pr(i|0)\) is the probability of a hard decision on phase being in the \(i\)th sector given \(\theta_m = 0\). This expression can be evaluated for all forms of MPSK using the fact that the density function of phase is given by [13]

\[
r(\varphi) = \frac{1}{2\pi} e^{-\frac{Es}{No}} \left[ 1 + Z \sqrt{2\pi} e^{\frac{Z^2}{2}} Q(-Z) \right]
\]

\[
Z = \sqrt{\frac{2Es}{No}} \cos(\varphi) \quad (A-8)
\]

and performing the indicated integration of A-8 numerically. Here Q(·) denotes the Gaussian tail probability function, not the quadrature channel output.
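
The numerical procedure can be sketched as follows. This is an illustrative reconstruction rather than the program used for the figures: the function names, the Simpson integration, and the finite difference step are assumptions. The sector probabilities in (A-7) are obtained by integrating the density (A-8), shifted by the phase error, over each decision sector. The density is coded with the two exponentials of (A-8) combined into exp(-(Es/No) sin²φ), which is algebraically identical and avoids overflow at high SNR. The loss is reported as 10 log10(1/G²), following the expression for the squaring loss in terms of G given earlier in this section.

```python
import math

def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def phase_density(eta, es_no):
    """Density of the received phase, equation (A-8), in overflow-safe form."""
    z = math.sqrt(2.0 * es_no) * math.cos(eta)
    return (math.exp(-es_no)
            + z * math.sqrt(2.0 * math.pi)
            * math.exp(-es_no * math.sin(eta) ** 2) * q_func(-z)
            ) / (2.0 * math.pi)

def simpson(f, a, b, n=400):
    """Composite Simpson rule with n (even) intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def detector_characteristic(phi, m, es_no):
    """PD(phi) from (A-7), transmitted symbol fixed at zero."""
    half = math.pi / m
    total = 0.0
    for i in range(m):
        centre = 2.0 * math.pi * i / m
        # Probability that the hard decision falls in sector i.
        prob = simpson(lambda psi: phase_density(psi - phi, es_no),
                       centre - half, centre + half)
        total += prob * math.sin(phi - centre)
    return total

def squaring_loss_db(m, es_no_db, h=1e-3):
    """Loss 10*log10(1/G^2), with G the slope of PD at zero phase error."""
    es_no = 10 ** (es_no_db / 10)
    gain = (detector_characteristic(h, m, es_no)
            - detector_characteristic(-h, m, es_no)) / (2 * h)
    return -20 * math.log10(gain)
```

Evaluated at an Es/No of 10 dB, this sketch gives a negligible loss for BPSK and a loss above 40 dB for 16PSK, consistent with the behavior described in the carrier tracking part of Section 1. At very low SNR the computed gain becomes extremely small, and a finer integration grid may be needed.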

A typical set of phase detector characteristics is shown in Figure A-2 for an 8PSK loop. The periodicity and the gain reduction with decreasing SNR are apparent.

The gain of the phase detector is obtained by calculating the slope of PD(\(\varphi\)) at \(\varphi=0\). The resulting squaring loss for QPSK is shown in Figure A-3 for this technique and for a more exact analysis from [12].

As discussed previously, the squaring loss given by our approximation and the result including the effects of noise correlation are within 5dB in \(E_s/N_0\) of each other for all reasonable values of \(E_s/N_0\). The squaring loss calculated by this technique for all four loop types is shown in Figure A-4; this result is used in the section on carrier tracking to provide a theoretical estimate of the loop phase jitter.
REFERENCES


