Performance evaluation of block acquisition and tracking algorithms using an open source GPS receiver platform

Ganesh K. Ramachandran, David Akopian
The University of Texas at San Antonio
Gregory W. Heckler, Luke B. Wintemitz
NASA Goddard Space Flight Center

BIOGRAPHY

Ganesh Kumar Ramachandran received his M.S degree in Electrical Engineering from The University of Texas at San Antonio, in 2010 and B.E. degree in Electrical and Electronics Engineering from Anna University, India, in 2006. Prior to joining UTSA he was a Junior Test Engineer at M/s Quintegra Solutions Ltd for 2.5 years. His current research interests include real-time software GPS receiver implementation on various open source platforms and receiver testing modes.

David Akopian is an Associate Professor at the University of Texas at San Antonio (UTSA). He joined the UTSA in 2003 where he founded Software Communication and Navigation Systems Laboratory. He received the M.Sc. degree in radio-electronics from the Moscow Institute of Physics and Technology in 1987 and Ph.D. degree in electrical engineering from the Tampere University of Technology (TUT), Finland, in 1997. From 1999 to 2003 he was a Senior Engineer and Specialist with Nokia Corporation. Prior to joining Nokia in 1999 he was a member of teaching and research staff of TUT. His current research interests include digital signal processing algorithms for communication and navigation receivers, positioning methods and mobile applications.

Gregory W. Heckler is an aerospace engineer in the Communications Systems Branch (Code 567) at NASA Goddard Space Flight Center. He holds a B.S. and M.S. degree in Aerospace Engineering from Purdue University. His current research interests include GPS space applications and software radio.

Luke B. Wintemitz is an electrical engineer in the Component and Hardware Systems Branch (Code 596) at NASA Goddard Space Flight Center in Greenbelt, Maryland. He has worked at Goddard for nine years primarily in the development of GPS receiver technology. He received Ph.D. degree in electrical engineering from the University of Maryland, College Park, in 2010. His research interests include GPS space applications and software radio.

ABSTRACT

Location technologies have many applications in wireless communications, military and space missions, etc. US Global Positioning System (GPS) and other existing and emerging Global Navigation Satellite Systems (GNSS) are expected to provide accurate location information to enable such applications. While GNSS systems perform very well in strong signal conditions, their operation in many urban, indoor, and space applications is not robust or even impossible due to weak signals and strong distortions. The search for less costly, faster and more sensitive receivers is still in progress.

As the research community addresses more and more complicated phenomena there exists a demand on flexible multimode reference receivers, associated SDKs, and development platforms which may accelerate and facilitate the research. One of such concepts is the software GPS/GNSS receiver (GPS SDR) which permits a facilitated access to algorithmic libraries and a possibility to integrate more advanced algorithms without hardware and essential software updates. The GNU-SDR and GPS-SDR open source receiver platforms are such popular examples.

This paper evaluates the performance of recently proposed block-correlator techniques for acquisition and tracking of GPS signals using open source GPS-SDR platform.

I. INTRODUCTION

In conventional GPS [1,2], satellites orbiting the earth transmit direct sequence spread spectrum (DSSS) ranging signal on the carrier frequencies of L1 (1575.42 MHz) and L2 (1227.60 MHz).

The GPS receiver measures the time-of-transmission (TOT) of the satellite ranging code. Based on this and its own time, it determines the time required for the signal to propagate from the satellite to the receiver. This time is converted to a distance when it is multiplied by the speed of light. The receiver also decodes the navigation data from satellite messages, which contains orbital parameters, correction data, time stamps, etc. Using navigation data and transmission times the receiver estimates the locations of GPS satellites. Having the satellite locations, and satellite-to-receiver distances, one can unambiguously determine the position of the receiver using trilateration techniques. The estimation can also resolve for the receiver clock error as it is generally...
biased. Least Squares (LS) or closed form methods can be used to solve GPS trilateration equations.

In the conventional civilian receiver each satellite modulates the sinusoidal L1 carrier signal with a unique coarse acquisition (C/A) code known at the receivers. The C/A code is a binary pseudorandom noise (PRN) code sequence comprising 1023 chips repeating each millisecond. The C/A code is 1ms long containing values of -1 and +1.

The L1 signal is further modulated with the navigation data at a bit rate of 50 bit/s. Thus, the transmitted signal is a product of a data component, a PRN code component and a sinusoidal carrier component. The primary measurement tasks at the receiver include calculation of a range (satellite-to-receiver distance), range rate, and demodulation of the navigation data. For range and range-rate measurements the receiver has to synchronize locally generated `replica' code with the received signal in order to de-spread the code and estimate delays. The synchronization is done in two phases; coarse synchronization (acquisition) and fine synchronization (tracking).

In both modes, correlators are used to find the best alignment of the received signal and replica code sequence and thus to find their relative signal shift (delay) called `code-phase'. The notion of the code-phase is due to the DSSS signal structure, which has the same pseudorandom signal pattern periodically repeating in time. The signals are aligned when the edges of the code periods are aligned. Figure 1 illustrates the structure of a conventional GPS receiver. Acquisition and tracking modules may reuse correlators, and most of the state-of-the-art receivers use dedicated hardware accelerators for the correlators. With the development of faster processing units and more reconfiguration capability needs the correlators can be completely implemented in software, a concept known as software defined radio or receiver (SDR) [8]-[11].

Dedicated hardware has clear advantages in faster processing but the functionality is limited to the particular design of each chip. Software implementations are becoming more and more attractive due to their flexibility to adapt to GPS signal modernization, new Global Navigation Satellite System (GNSS) signals [2], and demand of multimode weak signal processing algorithms [2],[3] to handle many possible scenarios and multi-sensor integrations. Examples of software GPS receiver implementations are [4]-[8]. Tight integration of baseband and positioning algorithms of the GPS receiver along with other sensors is a big advantage of software receivers.

Software implementations may reduce the cost of the positioning systems as new versions can be simply installed as software upgrades.

To make use of the advantages and potential of software implementations, computationally efficient algorithms should be used to implement massive correlators employed in the state-of-the-art receivers for high sensitivity. This is a very big challenge for software implementations. GPS receivers process signals in three dimensional uncertainty space: `available satellites', 'code-phase of each satellite', and `Doppler frequency shifts'. Doppler shifts are due to relative satellite-receiver movement and clock inaccuracies. The Doppler shift causes changes both in code rate and carrier frequency. For example, in terrestrial applications, the maximum change in the apparent received carrier frequency due to Doppler is estimated to be less between ±1kHz.

This multidimensional search requires significant computational resources. In addition, unlike conventional software communication systems, positioning receivers deal with weak signals and need long integration times (on the order of seconds) to detect signals in difficult environments. Thus, new algorithms with significantly reduced computational complexities are required for “software” implementations.

Recently, block-correlator algorithms have been suggested for drastic computational reductions both in the frequency [12], [13] and time domain [14]. In these algorithms, many operations are shared and computational redundancy is minimized, resulting in significantly faster processing. Due to computational architecture specifics, the arithmetic operations reduction may not necessarily result in faster processing, as it depends on many factors related to data flow. This paper describes implementations of two block-correlator concepts for acquisition and tracking on an open source platform to validate theoretical performance acceleration estimates on a real-time GPS receiver platform. It is demonstrated that indeed this performance improvement is achieved as compared to alternative methods.

The paper is organized as follows. Section II concisely presents the GPS-SDR open source software receiver platform. Section III describes fast FFT-based block-correlator for signal acquisition. Section IV provides a
time-domain block-correlator for tracking. Algorithm performance results for GPS-SDR receiver are shown in Section V and conclusions are made in Section VI.

II. GPS-SDR OPEN SOURCE PLATFORM

The GPS-SDR [8] open source project is a popular real-time C/C++-based GPS receiver which can be used to evaluate the overall performance of the block-correlator algorithms by replacing appropriate blocks. The GPS-SDR is compatible with popular RF front-ends, [11] and SiGe [4]. Real-time signal processing is achieved through the use of carefully coded low-level processing routines. The GPS-SDR is highly modular and multithreaded enabled. Currently it processes GPS L1 C/A signals, and can handle certain weak signal conditions. This receiver runs on a PC/laptop which connects to the front-end through a USB 2.0 port. Other extensions such as L2 signal processing are being implemented. The receiver has a built-in GUI that allows the user to conveniently interact with the receiver during the operation. This receiver also has advanced capabilities to determine the velocity and time information during motion. The acquisition unit of the receiver is designed and developed for both strong and weak signal acquisition with the capability for both coherent and non-coherent integration over different GPS signal durations of GPS signal. For faster processing, the original correlations are performed using assembly level functions to meet critical processing speeds for real-time receiver operation. Figure 2 illustrate the architecture of the modified GPS-SDR project with incorporated new advanced acquisition and tracking modules highlighted. While many components of GPS-SDR employ conventional approaches, novel technical solutions such as optimized FIFO make the real-time processing feasible.

III. FAST BLOCK-CORRELATOR ALGORITHM FOR ACQUISITION

The acquisition is the first step of the synchronization and it is typically the most computationally intensive stage as compared to tracking. Block correlators algorithms are proposed to reduce arithmetic complexity in [12],[13]. Both approaches employ two stage Doppler frequency compensation: (1) coarse frequency compensation in integer kHz steps and (2) sub-kHz fine frequency compensation. The GPS-SDR platform originally implemented the approach in [13] in which fine Doppler frequency shift compensation is implemented after the correlators. This results in correlation peak degradations

Figure 2. New GPS-SDR Architecture [8]

Figure 3. (Left) Block diagram of Fast FFT based Acquisition [12]; (Right) Correlation peak using of Fast FFT based acquisition technique

Comment [LBW22]: Should define all acronyms when first used.
As a result, the coarse frequency spacing needs to be more dense, e.g., about 250 kHz or 500 kHz to be able to acquire weak signals. This paper employs the method introduced in [12] (Figure 3) as it is free from peak degradation phenomena (Figure 4b).

Computational savings are achieved through block-processing by optimizing the coherent processing stage. Initially, a coarse Doppler frequency compensation is performed with frequencies as integer kHz. This can be performed by multiplying the input with a sinusoid at a compensating frequency (e.g., \( p \) kHz, \( p = -10, -9, \ldots, 9, 10 \)), or equivalently by cyclically shifting the FFT of the replica signal by \( p \) samples. Sub-kHz Doppler frequencies are processed jointly. Let input samples \( x_n \) be arranged in a matrix \( X \) filled column-by-column. In other words, the matrix \( X \) contains elements \( x_{n,n_2} = x_{n_1,n_2} \), where \( n_1 = 0, \ldots, N_1 - 1 \) and \( n_2 = 0, \ldots, N_2 - 1 \). \( N_1 \) is the code period length in samples, and \( N_2 \) is the number of coherently combined code periods. The frequency resolution will be \( 1/N_2 \) kHz. Combining multiple code periods of the input signal coherently with Doppler compensation can be performed as:

\[
Z^{coh} = C \ast (XF^T)
\]

where \( F \) is the DFT matrix notation \( \ast \) denotes element-wise multiplication, and \( C \) has elements

\[
C_{n,k} = e^{-j 2\pi \frac{n_k}{N_2}}
\]

Columns of the matrix \( Z^{coh} \) are coherently combined epochs with a candidate Doppler frequency wiped-off. Each column corresponds to a certain frequency from the group of frequencies defined by the index \( k \):

\[
f_k = f_s \frac{k}{N_2}, \text{ where } k = 0, 1, \ldots, N_2 - 1, \text{ and } f_s \text{ is the sampling frequency.}
\]

Equation (1) states that all the frequencies corresponding to the values of index \( k = 0, 1, \ldots, N_2 - 1 \) are processed jointly using the DFT.

Each of the columns of the matrix \( Z^{coh} \) is then correlated with the replica code at all possible code phases. In the frequency domain, using the convolution theorem, one can obtain for the output of the coherent stage

\[
z^{coh} = R_N \ast (XZ^{coh}F_N C \ast \left[XF^T_N \right])
\]

Here, \( R_N \) is a diagonal matrix, \( \text{diag}(R_N) = 1/N \). \( F_N \) is the DFT matrix of size \( N \). It can be shown that in (3), the fragment \( R_N F_N C \ast \left[XF^T_N \right] \) can be implemented by a single DFT (implemented using the FFT algorithm). Thus one FFT is used for a joint processing of multiple Doppler frequencies and code-phases. The overall processing structure is shown in Figure 3.

IV. A FAST BLOCK-CORRELATOR ALGORITHM FOR TRACKING

The tracking loop follows the incoming signal and adjusts itself to de-spread and de-modulate the incoming signal. Two tracking loops are used to track the incoming GPS signal: a delay-locked loop (DLL) to hack the code and a frequency (FLL) and/or phase locked loop (PLL) to track the frequency/phase of the incoming signal. See Figure 5. Here, the DLL consists of early, prompt, and late code generators, filters and discriminators. The early and late codes are half a chip (or less) time shifted versions of the prompt code. The incoming signal is correlated with early
and late C/A codes to produce two outputs which are fed to a discriminator. A control signal is generated based on discriminator's output to adjust the rate of the locally generated C/A code to match the C/A code of the incoming signal. Different discriminators can be used to address multipath and non-triangular autocorrelation shapes.

The navigation data is finally extracted by de-spreading the received GPS signal with the locally generated prompt code. Advanced DLL tracking loops may use more correlators to address multipath and non-triangular autocorrelation shapes.

Figure 5. Block diagram of the combined tracking loop.

As for the frequency tracking: phase and/or frequency locked loop (PLL/FLL) can be used [1],[2]. Conventional PLL loops generate a local carrier signal that is driven to alignment with the incoming signal. In the block correlation approach there is no need to generate the local carrier for every incoming sample. Sine and cosine tables are generated and stored in memory once, and using the feedback from the PLL loop, the local sinusoid is designed to neither slow nor fast [15]. For the GPS-SDR, a Costas Loop is used (Figure 6) for the implementation of a PLL feedback loop. Costas loops are insensitive to ISO phase transitions due to navigation data bits and they are typically preferred choice in GPS receivers.

![Figure 6. Costas PLL for carrier tracking](image)

In the tracking block-correlator all three correlations are computed together. As opposed to conventional correlators, block correlators use fixed groups of replica sequences. The errors from the tracking loop just switches replica selections from one group to another when needed.

The joint block correlation is described next following the original approach in [14]. For performing correlation operations, samples of the received signal are stored in memory. This set of samples is denoted as: \( s_{-1}, s_0, s_1, \ldots, s_{N-1} \), where \( N \) is the number of samples. In this paper we compute \( K \) correlations in block processing mode. The replica code sequence is set as \( r_{-1}, r_0, r_1, \ldots, r_{K-1}, r_K \), where \( k \in \{0, \ldots, K-1\} \) identifies the respective replica code sequence and \( r_k \in \{-1, +1\} \).

Denoting the consecutive correlation values as \( C_k \), the operation of conventional correlators is defined as:

\[
C_k = \sum_{j=0}^{N-1} r_k x_j , \quad k \in \{0, \ldots, K-1\} .
\]  

For the block processing method (Figure 7), the received samples which multiply the same set of replica samples in a group of correlators are grouped together. Let the index variable \( j \) identify received samples and \( J_{b_k} \) be the set of indices belonging to the same group. Formally \( j \in J_{b_k} \) if \( \{r_{j'}, \ldots, r_{j''}\} = \{b_1, \ldots, b_K\} \), \( b_k \in \{-1, +1\} \). There are \( 2^K \) such groups. Then (4) becomes

\[
C_k = \sum_{j \in J_{b_k}} \sum_{k=0}^{K-1} r_k x_j .
\]

The fast block processing algorithm is based on the idea that samples from each group will be used only once for computing sub-sums \( S_{b_{k_1} \ldots b_{k_{K-1}}} = \sum_{j \in J_{b_{k_1} \ldots b_{k_{K-1}}}} x_j \). These sub-sums are then used for computing \( K \) correlations. The number of additions is reduced almost \( K \) times if the number of sub-sums is significantly less than the number of samples. For \( K = 3 \) there are eight groups \( J_{b_1 b_2 b_3} \) to \( J_{+1+1+1} \), each identifying a group of samples. A register is assigned to each group to store sub-sum \( S_{b_1 b_2 b_3} \). All such registers are also indexed as \( (b_1, b_2, b_3) \). For illustration purposes our figures also use an equivalent notation for \( (b_1, b_2, b_3) \) where signed binary values of \( b_k \) are replaced with \((0, 1) \) binary values or \((b_1, b_2, b_3)\) is replaced by an integer. Example: \((+1, -1, +1) \rightarrow (1, 0, 1) \rightarrow 5 \). Here \((1, 0, 1) \) is the binary codeword of \( 5 \).

For calculating the sub-sum \( S_{b_1 b_2 b_3} \) for each group and for each of the correlator iterations, all \( 2^K \) registers are initialized to zero. Then, the algorithm processes the stored received samples \( x_j \) one after the other forming the sub-sums. In Figure 7a, one of the received samples is...
denoted as \( x_n \). The samples of the three replica code sequences having the same index \( n \) are \( r_1^n = -1 \), \( r_2^n = +1 \), \( r_3^n = -1 \). These samples correspond to the register address (010) or “2”. The adder adds the received value \( x_n \) to the sub-sum \( S_2 \) and stores the new value of sub-sum \( S_2 \) into the register again. This procedure is performed analogously for all \( N \) received samples. The combining unit of Figure 7a then combines all stored sub-sums \( S_b \) to obtain the \( K \) correlation values \( C_k \) at once as a result of block processing. Note that multiplications \( b_k S_{b_i} \) are just sign changes as \( b_k \) is either +1 or -1.

Figure 7a illustrates an example of how the sub-sums are formed while Figure 7b presents an example how the sub-sums are combined to produce three parallel outputs corresponding to the correlations with three parallel replicas. In the example of Figure 7, correlation values \( C_1, C_2, C_3 \) have to be calculated for \( K = 3 \) replica code sequences, and the number of groups is eight. The complexity reduction estimate is about 3 times for three joint correlations. The integration of the block correlator into the tracking loop is illustrated in Figure 8 [15].

The block-correlator described above has been integrated into the GPS-SDR. Once satellites are acquired, the initial estimates of code phases and Doppler frequencies of each satellite are provided to the tracking channels dedicated to individual satellites. The tracking loops continuously track the variations in received frequency and code phase due to the line of sight (LOS) dynamics between the GPS satellites and the receiver.

To implement the code-phase corrections, the GPS-SDR’s original implementation has several versions of replicas corresponding to different code phases. Depending on the DLL output error, the appropriate replica is selected and multiplied against the next 1 ms signal fragment. The next fragment is chosen according to the code phase error obtained after tracking the previous 1 ms duration of the GPS signal. To validate the block correlation algorithm with the GPS-SDR, binary indices described in the algorithm itself are created for all the versions of original replicas. Later, the binary index values for a particular replica version is chosen for tracking using block correlation algorithm according to the code phase error.

V. PERFORMANCE EVALUATION

The block correlator requires only additions since all the multiplications in the equations can be implemented with sign changes. The computational performance of block-correlators for acquisition and tracking are evaluated in the GPS-SDR. The GPS-SDR receiver is implemented in C++ with some lower level functions such as the FFT and 1ms correlations implemented in assembly code. The same assembly level FFT is used for the acquisition block correlator.
The performance of the algorithms described in this paper have been estimated using the GPS-SDR running on a PC with the following parameters: Intel Core 2 Duo Processor, 2.00GHz clock, 2GB RAM, LINUX-UBUNTU OS. ‘OProfile’ profiling tool is used for CPU load estimations [16]. ‘OProfile’ is a system-wide profiler for Linux systems with a capability of profiling all codes with little overhead [16]. ‘OProfile’ is released under the GNU GPL, and sample data are collected using a kernel driver and a daemon. The collected data are later converted to useful information using several post-profiling tools of ‘OProfile’, ‘Ocontrol’ and ‘Oreport’ scripts are mainly used to profile the C++ codes using ‘OProfile’. The ‘Ocontrol’ script is run to: start profiling, and a profiling session, dump profile data, and set up the profiling parameters. The ‘Oreport’ script is used to output binary image summaries, or per-symbol data from ‘OProfile’ profiling sessions, which includes the CPU percentage timings of processes running in the PC.

Table 1. ‘OProfiler’-based ‘CPU Load’ estimations for acquisition

<table>
<thead>
<tr>
<th></th>
<th>Block-cotrelator [12]</th>
<th>Conventional Block-cotrelator [13]</th>
</tr>
</thead>
<tbody>
<tr>
<td>CPU Load (% of acquisition as a fraction of overall GPS-SDR load)</td>
<td>30.6%</td>
<td>17%</td>
</tr>
</tbody>
</table>

Table 2. ‘OProfiler’ CPU average load estimations for tracking

<table>
<thead>
<tr>
<th></th>
<th>Block-cotrelator [14]</th>
<th>Conventional Correlator, Assembly</th>
<th>Conventional Correlator, C++</th>
</tr>
</thead>
<tbody>
<tr>
<td>CPU Load</td>
<td>11%</td>
<td>17%</td>
<td>33.79%</td>
</tr>
</tbody>
</table>

The performance improvement is partially explained by the profiler in offline mode for the duration of acquisition. One can apparently see the efficiency of the block correlator from [12].

The performance evaluation of the tracking algorithms was performed on captured data (up to 30 seconds) for 12 channels on the same PC. For 30 seconds of recorded signal, the computation time of the block-cotrelator approach [16] was found to be 13.25 seconds compared to the original GPS-SDR’s conventional correlation performed in assembly level, which was about 19 seconds. The conventional correlation performed in C++ language was about 23 seconds. Figure 10 demonstrates the performances for various signal durations. Table 2 shows CPU loads by the block-cotrelator approach (about 11%), the conventional correlation implemented in C++ (about 33.79%) and GPS-SDR assembly code implementations (about 17%).

VI. CONCLUSIONS

This paper describes the implementation of two recently proposed block-correlators for acquisition and tracking on the open source GPS receiver platform GPS-SDR. For acquisition, computational gains are achieved using special FFT based processing. For tracking, a time-domain block correlator exploits joint processing of early, prompt and late correlations. Performance benchmarking results demonstrate significant improvement over conventional tracking correlators and an alternative block-cotrelator acquisition algorithm.

REFERENCES


