MARK III CORRELATOR HARDWARE AND SOFTWARE

Alan R. Whitney

NEROC Haystack Observatory

ABSTRACT

The Mark III correlator system is based on a modular philosophy in a manner similar to the Mark III data acquisition system. Each “correlator module” independently processes the data from one track pair. Therefore, 28 modules are necessary to complete a full one-baseline processor and 84 modules for a full 3-baseline processor. Each correlator module has two interfaces: (1) data and clock from each of the two tracks to be correlated and (2) Computer Automated Measurement and Control (CAMAC) dataway interface to the computer. The processor is organized around the IEEE “CAMAC” standard architecture, housing 15 correlator modules in each of 6 “crates.” This allows one-pass processing of a full 3-baseline 28-track observation or a 6-baseline (4 station) 14-track observation. The correlator architecture allows easy expansion for up to 8 stations. The computer system is an HP1000 system utilizing a 16-bit minicomputer with disc and tape peripherals. The processing software is also organized in a modular fashion with many independent but cooperative programs controlling the operation of the Mark III processor. Processing time through the correlator is normally real time or faster, with graphics displays providing real-time monitor and control of the processing operation.

DESIGN PHILOSOPHY

In keeping with the philosophy of the Mark III data acquisition terminal, the Mark III correlator has also been designed as a highly modular system. In fact, the correlator system is modular to the extent that the entire correlator is made up of identical modules. Each correlator module is responsible for cross-correlating the data from one track-pair, so that 28 modules are required to process a single 28-track single-baseline observation.

BASIC CORRELATOR MODULE CHARACTERISTICS

Figure 1 shows a simplified view of the input/output interfaces of the correlator module. Input data to the module consists entirely of the data and clock signals from each of the two tracks to be correlated. Approximately once every 2 seconds the host computer provides processing parameters to the module and receives the processed data from the preceding interval of approximately 2 seconds. As explained in the paper presented by Ed Nesman, this computer interface was selected to conform to the CAMAC standard.

Figure 2 lists the basic characteristics of the correlator module. As indicated in figure 1, the interface of the recorded data is very simple, consisting only of the data and clock signals from each of the two data streams to be cross-correlated. The X-clock signal is used as a basic clock within the correlator module, so that normal processing will take place regardless of the data rate from the tape recorders; in other words, no control signals indicating data rate to the module are necessary. The reproduce circuitry within the tape recorders is designed in such a manner that the clock signals will always "flywheel" even in the absence of data, so that processing will continue normally even through deep data dropouts.

The cross-correlator section of the module performs processing for 8 complex lags during normal processing or 16 real lags for auto-correlation processing. The accumulation registers are 23 bits wide, allowing the accumulation of slightly more than 2 seconds of data at the 4 Mbps sample rate.

Fringe rotation is accomplished through the use of a 24-bit phase register, thus achieving 0.4 microradian resolution. A three-level rotation scheme, similar to that used by the Mark I correlation system, is used to optimize the signal-to-noise ratio of the processed data. The phase-rate resolution of the rotator is selectable in powers of two under software control and is normally set to 14.9 mHz at the 4 Mbps sample rate. The selection of rotator phase-rate resolution must be traded against the maximum fringe rate that can be processed; selecting a resolution of 14.9 mHz at the 4 Mbps sample rate limits the maximum rotation speed to approximately 250 kHz. Higher fringe rates may be processed by coarsening the phase-rate resolution, at the possible expense of shortening the basic correlator integration period due to degraded phase-tracking of the rotator. Also included in the fringe-rotator section of the correlator module is a simple phase-acceleration compensation mechanism, which is designed to be sufficient to compensate for any expected fringe-phase acceleration during the normal correlator integration period.
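
The register widths quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, not a description of the hardware: the names are illustrative, and the power-of-two divisor (2^28) is an assumption chosen because it gives a phase-rate resolution near 0.0149 Hz at 4 Mbps and, over the 2^24 steps of a 24-bit word, a maximum rate of about 250 kHz.

```python
import math

SAMPLE_RATE_HZ = 4.0e6   # 4 Mbps sample rate quoted in the text
ACCUM_BITS = 23          # accumulation register width quoted in the text
PHASE_BITS = 24          # fringe-rotator phase register width quoted in the text
RATE_SHIFT = 28          # ASSUMPTION: power-of-two divisor for the rate resolution

# Longest accumulation before a 23-bit count could overflow, in seconds.
max_accum_seconds = 2**ACCUM_BITS / SAMPLE_RATE_HZ

# Phase resolution of a 24-bit register spanning one full turn, in microradians.
phase_resolution_urad = 2 * math.pi / 2**PHASE_BITS * 1e6

# Phase-rate resolution, and the largest rate if the rate word has 2**24 steps
# (ASSUMPTION: this step count is inferred from the quoted numbers).
rate_resolution_hz = SAMPLE_RATE_HZ / 2**RATE_SHIFT
max_fringe_rate_hz = rate_resolution_hz * 2**PHASE_BITS

if __name__ == "__main__":
    print(f"accumulator capacity : {max_accum_seconds:.3f} s")
    print(f"phase resolution     : {phase_resolution_urad:.3f} microradian")
    print(f"rate resolution      : {rate_resolution_hz * 1e3:.1f} mHz")
    print(f"maximum fringe rate  : {max_fringe_rate_hz / 1e3:.0f} kHz")
```

Halving the divisor doubles both the rate step and the maximum rate, which is the trade-off described above.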

A 4000-bit addressable buffer is provided in the path of the Y-data and Y-clock signal streams so that the X and Y tape drives need be synchronized only so well as to maintain the a priori bit delay within this 4000-bit window. Synchronization of the tape drives to this level is a relatively easy task.

The basic integration period of the correlator module is software-controllable from 1 frame to 512 frames, or equivalently, from 5 msec to 2.56 seconds (tape-time) at 4 Mbps sample rate. For most normal geodetic processing, an integration period of 1 to 2 seconds is used. This provides a search range of 1.0 to 0.5 Hz, respectively, which is quite adequate for all normal processing. Since the computer must service the correlator module once each integration period, the load on the host computer varies approximately inversely as the integration period. As the integration period shortens, a point is reached where the number of modules that may be operating simultaneously must be reduced or tape-playback speed reduced in order that the computer may keep up.
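
The relations among frame count, integration period, search range, and host load can be sketched as follows. The function names and the playback_speedup parameter are hypothetical; the 20000-bit frame length is derived from the quoted 5 msec frame at 4 Mbps, and the search range is taken as the reciprocal of the integration period, which matches the 1.0 and 0.5 Hz figures above.

```python
SAMPLE_RATE_HZ = 4.0e6
FRAME_SECONDS = 5e-3                                      # one frame at 4 Mbps
BITS_PER_FRAME = round(SAMPLE_RATE_HZ * FRAME_SECONDS)    # derived: 20000 bits

def integration_period(frames):
    """Integration period in seconds of tape-time for 1 to 512 frames."""
    if not 1 <= frames <= 512:
        raise ValueError("integration period must be 1 to 512 frames")
    return frames * FRAME_SECONDS

def search_range_hz(period_s):
    """Search range taken as the reciprocal of the integration period."""
    return 1.0 / period_s

def services_per_second(n_modules, period_s, playback_speedup=1.0):
    """Module services the host must perform per wall-clock second.

    playback_speedup is a hypothetical factor: 1.0 means tape played at the
    recording speed, 2.0 means twice as fast.
    """
    return n_modules * playback_speedup / period_s

if __name__ == "__main__":
    for frames in (1, 200, 400, 512):
        t = integration_period(frames)
        print(frames, t, search_range_hz(t), services_per_second(84, t))
```

Because the service rate scales as the module count divided by the period, shortening the period by some factor can be offset by reducing the module count or the playback speed by the same factor.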

Not indicated in figure 2 is the availability of a "pulsar" processing mode, which may be used for special applications. In this mode, a start bit number and stop bit number may be specified within an integration period so that only those bits within the start-stop window are processed. This mode is primarily designed for processing of pulsar data, but may be used to process any pulse-type data or for other special-purpose processing.
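
The start-stop window can be illustrated with a small sketch. This is not the module's logic: the function name is hypothetical, the window is assumed to be inclusive at both ends, and a single zero-lag agreement count stands in for the full correlation.

```python
def gated_correlation(x_bits, y_bits, start_bit, stop_bit):
    """Correlate one-bit samples only inside [start_bit, stop_bit].

    Returns (sum, count): +1 is added for each agreeing pair and -1 for each
    disagreeing pair; count is the number of bits that were processed.
    ASSUMPTION: the window includes both the start and the stop bit.
    """
    if len(x_bits) != len(y_bits):
        raise ValueError("streams must cover the same integration period")
    if not 0 <= start_bit <= stop_bit < len(x_bits):
        raise ValueError("window must lie inside the integration period")
    total = 0
    for i in range(start_bit, stop_bit + 1):
        total += 1 if x_bits[i] == y_bits[i] else -1
    return total, stop_bit - start_bit + 1

if __name__ == "__main__":
    x = [1, 0, 1, 1, 0, 0, 1, 0]
    y = [1, 1, 1, 1, 0, 1, 1, 0]
    print(gated_correlation(x, y, 2, 4))   # on-pulse window only
    print(gated_correlation(x, y, 0, 7))   # whole period
```

Restricting the sum to the window keeps off-pulse noise out of the accumulation, which is why the mode suits pulsed sources.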

Phase-calibration (phase-cal) processing is also provided within the correlator module. Separate processors are used for the X and Y data streams, although the frequency of the phase-calibration rail must be the same for both X and Y. A much-simplified rotation scheme is used for phase-cal processing compared to the rotator described above. Rotator phase-rates are constrained so that each quarter-period of the phase-cal signal contains an integral number of sample bits. This condition is met with the phase-cal signals that are normally used. Also, only a two-level rotation scheme is used, since signal-to-noise of the phase-calibration signal is not normally of concern. By imposing these conditions and constraints, the circuitry used for phase-cal processing is considerably simplified compared to that of the main lobe rotator and cross-correlator.
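
The quarter-period constraint and the two-level rotation can be sketched as below. The function names are hypothetical, and the 10 kHz tone used in the example is only an illustrative value, not one taken from the text.

```python
def samples_per_quarter_period(sample_rate_hz, phase_cal_hz):
    """Return sample bits per quarter phase-cal period, or raise if not integral."""
    n, remainder = divmod(sample_rate_hz, 4 * phase_cal_hz)
    if remainder != 0:
        raise ValueError("quarter period is not an integral number of bits")
    return int(n)

def two_level_quadrature(n_quarter, n_samples):
    """Square-wave cosine and sine references taking only the values +1 and -1."""
    cos_ref, sin_ref = [], []
    for i in range(n_samples):
        quarter = (i // n_quarter) % 4                    # which quarter-period
        cos_ref.append(1 if quarter in (0, 3) else -1)    # cosine sign
        sin_ref.append(1 if quarter in (0, 1) else -1)    # sine sign
    return cos_ref, sin_ref

if __name__ == "__main__":
    # Hypothetical example: a 10 kHz tone sampled at 4 Mbps.
    n = samples_per_quarter_period(4_000_000, 10_000)
    cos_ref, sin_ref = two_level_quadrature(n, 4 * n)
    print(n, cos_ref[:2], sin_ref[:2])
```

With an integral quarter-period, the references change sign only on sample boundaries, so a simple counter is enough to generate them.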

Internal testing of the correlator module is possible through the use of an on-board quasi-random signal generator. Partial testing is possible off-line with no computer servicing necessary. Complete testing using the internal test generator is possible when a host computer is available.

SIGNAL FLOW THROUGH CORRELATOR MODULE

Figure 3 shows a block diagram of the Mark III correlator module. Both X and Y data streams are first decoded by their respective DECODERs. The X and Y data streams may be independently selected from any one of eight data streams which may be applied to the module (not shown in figure 3). The decoder has the responsibility of synchronizing itself to the data stream (by use of the sync words embedded in the data streams), extracting the time and auxiliary-data fields, checking the time/aux-data field for errors (using the embedded 12-bit CRCC check character), stripping parity bits and counting parity errors. The decoded X-data is then passed to the X PHASE-CAL DETECTOR and the X-DATA ROTATION-BLANKING. Rotation is controlled by the FRINGE RATE GENERATOR, which generates a three-level quadrature rotation signal according to the parameters received from the host computer. The quadrature-rotation signals from the rate generator are processed by the MODE SELECTOR before actually being applied to the X-data. The mode selector may apply an additional +/-90*n degree phase-shift in conjunction with a shift in the X-Y bit delay to accomplish "automatic fractional-bit-error correction," which is described with the Y-data path below. After rotation, the X-data stream is applied to the CORRELATOR for cross-correlation with Y-data.

Figure 3. Mark III processor internal block diagram.

After emerging from the Y-decoder, the Y-data stream and accompanying flags are routed to a BUFFER MEMORY capable of storing 4000 bits of data. Data is removed from this buffer memory according to a BIT OFFSET COUNTER which has been set according to the a priori parameters received from the host computer, so that the proper Y-data bits are made available to the correlator. The output from the buffer memory is applied directly to the Y-PHASE-CAL DETECTOR for phase-cal processing, and to a 0-15 BIT PROGRAMMABLE DELAY preceding actual correlation with the rotated X-data. This programmable delay is used to shift the Y-data stream one bit at a time relative to the X-data stream as the a priori delay changes within an integration period. Up to 15 such bit shifts may be made within a single integration period, according to the parameters provided by the host computer. Each time such a bit shift takes place, the MODE SELECTOR imposes an additional +/-90 degree phase shift to the signals emerging from the FRINGE RATE GENERATOR. Through the use of this “automatic fractional-bit error correction” technique, correlation may be carried through many changes in the X-to-Y bit delay so long as the model contained in the FRINGE RATE GENERATOR is sufficiently accurate to track the fringe phase.
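
The pairing of a one-bit shift with a 90-degree phase step can be motivated numerically. The sketch below assumes each track carries a baseband channel sampled at the Nyquist rate, so that the bandwidth is half the sample rate; that assumption and the function names are not taken from the text.

```python
def phase_step_per_bit_deg(sample_rate_hz, bandwidth_hz=None):
    """Phase change at band center caused by a one-sample delay shift, in degrees."""
    if bandwidth_hz is None:
        bandwidth_hz = sample_rate_hz / 2.0    # ASSUMPTION: Nyquist sampling
    band_center_hz = bandwidth_hz / 2.0        # baseband channel from 0 to bandwidth
    # phase = 360 degrees * frequency * delay, with delay = 1 / sample_rate
    return 360.0 * band_center_hz / sample_rate_hz

def mode_selector_phase_deg(n_shifts, direction=1):
    """Accumulated correction after n one-bit shifts, modulo a full turn."""
    return (direction * 90 * n_shifts) % 360

if __name__ == "__main__":
    print(phase_step_per_bit_deg(4.0e6))                         # one bit at 4 Mbps
    print([mode_selector_phase_deg(n) for n in range(6)])        # delay increasing
    print([mode_selector_phase_deg(n, -1) for n in range(6)])    # delay decreasing
```

Under these assumptions the step is a quarter turn regardless of the sample rate, which is consistent with the +/-90*n degree shifts applied by the mode selector.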

The CORRELATOR section of the module does the actual correlation between the rotated X-data stream and the unrotated Y-data stream. Eight complex lags are normally processed, although the correlator may also be configured under software control to do auto-correlation processing of either the X or Y data stream.
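
A software analogue of the eight-lag complex correlation might look like the following. This is a sketch only: the function name, the lag direction, and the representation of samples (Y as +1/-1, rotated X as -1/0/+1 cosine and sine products) are assumptions made for illustration.

```python
def correlate_complex_lags(x_cos, x_sin, y, n_lags=8):
    """Accumulate cosine and sine sums at n_lags consecutive Y offsets.

    x_cos, x_sin : X samples already multiplied by the three-level rotator
                   outputs (each value is -1, 0, or +1).
    y            : unrotated one-bit Y samples coded as +1 or -1.
    Returns a list of (cos_sum, sin_sum) pairs, one per lag.
    """
    n = len(x_cos)
    if len(x_sin) != n:
        raise ValueError("cosine and sine streams must have equal length")
    if len(y) < n + n_lags - 1:
        raise ValueError("Y stream too short for the requested lags")
    sums = []
    for lag in range(n_lags):
        cos_sum = sum(x_cos[i] * y[i + lag] for i in range(n))
        sin_sum = sum(x_sin[i] * y[i + lag] for i in range(n))
        sums.append((cos_sum, sin_sum))
    return sums

if __name__ == "__main__":
    x_cos = [1, 0, -1, 1]
    x_sin = [0, 1, 0, -1]
    y = [1, -1, 1, 1, -1]
    print(correlate_complex_lags(x_cos, x_sin, y, n_lags=2))
```

In the hardware each lag has its own pair of accumulators running in parallel; the loop over lags here is only a convenience.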

The determination of tape-recorder synchronization is made by latching the Y bit count (within a frame) at the beginning of the first X frame of an integration period. Integration period boundaries are constrained to start at X-data frame boundaries, so that with knowledge of the Y-data stream time and LATCH COUNT at the beginning of an integration period, the tape-recorder synchronization may be computed and the necessary commands sent to the tape transports.

All communications with the host computer are “double-buffered” so that computer servicing of the correlator module is completely asynchronous with respect to the X and Y data streams except that servicing must take place once each integration period in order to update processing parameters and extract processed data. Each correlator module communicates individually with the host computer. During a normal computer service, approximately 21 bytes are sent computer-to-module and 114 bytes module-to-computer.
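
The byte counts above give a rough estimate of the CAMAC traffic the host must sustain. The sketch assumes playback at the recording speed, so that one integration period of tape-time equals the same wall-clock time; the function name is illustrative.

```python
BYTES_TO_MODULE = 21      # computer-to-module, per service (approximate)
BYTES_FROM_MODULE = 114   # module-to-computer, per service (approximate)

def host_bytes_per_second(n_modules, integration_period_s):
    """Approximate dataway traffic when every module is serviced once per period.

    ASSUMPTION: tape is played at the recording speed.
    """
    per_service = BYTES_TO_MODULE + BYTES_FROM_MODULE
    return n_modules * per_service / integration_period_s

if __name__ == "__main__":
    for period in (2.0, 1.0, 0.5):
        print(period, host_bytes_per_second(90, period))
```

For 90 modules and a 2-second period this comes to roughly 6 kbytes per second, and the figure doubles each time the period is halved.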

CORRELATOR SYSTEM CONFIGURATION

The Mark III correlator system as currently built is designed for one-pass processing of 3-station/28-track data or 4-station/14-track data. Ninety correlator modules are housed in a single rack of six CAMAC crates, fifteen modules per crate. Each of the six crates is identical and receives an identical set of data signals from the four tape transports, distributed as follows:

Module 1 - Receives signals from both tracks #1 and #2 from each of the four tape drives (recall that each module may select X and Y from among eight inputs to the module).

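Module 2 - Receives signals from both tracks #3 and #4 from each of the four tape drives.

Modules 3 through 13 - Continue the same pattern, each receiving the next pair of tracks from each of the four tape drives.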

Module 14 - Receives signals from both tracks #27 and #28 from each of the four tape drives.

Module 15 - This module is dubbed a “floating module” whose input may be selected to be from any track of any of the four tape drives. Track selection is actually made within the control electronics of the tape drive. Tape drive selection is made at the input to the correlator module itself.

Figure 4 shows the resulting correlator configurations for processing of typical 3 and 4 station data. For the case of 3-baseline/28-track data, each crate processes either the odd or even numbered tracks from one of the three baselines. The floating modules may be assigned arbitrarily for either redundancy checking or for special processing. In the 4-station/14-track case, each crate is assigned to one of the six baselines and may process all even or all odd tracks, which is compatible with the standard 14-track mode of data recording. The floating modules may again be assigned as desired.
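
The two crate assignments can be enumerated with a short sketch. The station labels and the function name are hypothetical; the sketch only counts crates, treating each crate as handling 14 tracks of one baseline with its 14 fixed modules.

```python
from itertools import combinations

CRATES_AVAILABLE = 6

def crate_plan(stations, tracks):
    """Return one (baseline, track subset) entry per crate.

    A 28-track baseline needs two crates (odd and even tracks); a 14-track
    baseline needs one. Raises ValueError if more than six crates are needed.
    """
    if tracks == 28:
        subsets = ("odd tracks", "even tracks")
    elif tracks == 14:
        subsets = ("recorded 14 tracks",)
    else:
        raise ValueError("only 14- and 28-track modes are sketched here")
    plan = [(pair, subset)
            for pair in combinations(stations, 2)
            for subset in subsets]
    if len(plan) > CRATES_AVAILABLE:
        raise ValueError(f"{len(plan)} crates needed, {CRATES_AVAILABLE} available")
    return plan

if __name__ == "__main__":
    for crate, entry in enumerate(crate_plan("ABC", 28), start=1):
        print("crate", crate, entry)
    for crate, entry in enumerate(crate_plan("ABCD", 14), start=1):
        print("crate", crate, entry)
```

Both supported cases fill exactly six crates, while four stations at 28 tracks would need twelve.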

Due to the modularity of the correlator system, it is easily expandable in several ways. With the addition of an identical 6-crate correlator system, 4-station/28-track data may be processed in a single pass. These additional 6 crates may be handled either by the existing host computer or by an additional host computer, if necessary. Because the data interface to the tape transports is very simple, expansion of the system to a large number of stations is a straightforward matter. Correlator modules and host computers may be added to bring the system to any capability desired. Connecting a group of HP1000 computer systems into a multicomputer net for handling a large correlator system is straightforward and cost effective. High-speed data devices such as discs are shared among CPUs, so that all data from a particular processing pass will end up properly concatenated, ready for further processing.

SPECTRAL-LINE PROCESSING CAPABILITY

Although the Mark III processor design is primarily oriented towards the processing of continuum data, provisions have also been made for processing of spectral-line data where many lags are desired. This is done by staggering the a priori delays between modules to process many lags. Using 90 correlator modules to process a single-track, single-baseline spectral-line observation, 60 modules may be used to do cross-correlation (480 complex delays) and 15 modules each for X and Y autocorrelation (240 real delays). Similarly, a single-track 3-baseline observation may be processed using 24 modules on each baseline for cross-correlation processing (192 complex delays) and 6 modules for autocorrelation processing (96 real delays).
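
The delay counts follow from 8 complex lags or 16 real lags per module, and a small sketch makes the module budget explicit. The function name is hypothetical, and the 3-baseline case assumes 6 autocorrelation modules per station, an interpretation chosen because it accounts for all 90 modules; 192 complex delays correspond to 24 cross-correlation modules per baseline.

```python
COMPLEX_LAGS_PER_MODULE = 8    # normal cross-correlation mode
REAL_LAGS_PER_MODULE = 16      # auto-correlation mode

def lag_budget(n_baselines, n_stations, cross_per_baseline, auto_per_station):
    """Total modules used and delays obtained per baseline and per station."""
    return {
        "modules": n_baselines * cross_per_baseline + n_stations * auto_per_station,
        "complex_delays": cross_per_baseline * COMPLEX_LAGS_PER_MODULE,
        "real_delays": auto_per_station * REAL_LAGS_PER_MODULE,
    }

if __name__ == "__main__":
    print(lag_budget(1, 2, 60, 15))   # single baseline
    print(lag_budget(3, 3, 24, 6))    # three baselines (ASSUMED 6 per station)
```

Both configurations use all 90 modules under these assumptions.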

CORRELATOR COMPUTER SUPPORT

The Mark III correlator system is supported by an HP1000 computer system utilizing a Model 2117F CPU. The HP1000 is a 16-bit minicomputer system of modest size, presently incorporating 256 kbytes of semi-conductor memory and 135 Mbytes of high-speed disc storage. All computations and control for the Mark III correlator system are done using this computer. The interface to the correlator hardware is provided by a commercially-available CAMAC branch-driver. Interface to the tape transports is through a standard ASCII RS-232 interface to ASCII Transceivers in the tape transports. No special-purpose or in-house-designed interfaces have been used in the correlator system.

The HP1000 computer system is supported by a standard RTE-IV system supplied by Hewlett-Packard (HP). This system supports a full multiprogramming environment, allowing many concurrent processing tasks. Full 64-bit precision arithmetic is supported in hardware for all critical calculations, and the use of a microprogrammed FFT in fringe-search operations makes processing speed quite high (competitive with a Digital Equipment Corporation VAX for FFTs and common scientific functions).

The computer system is expandable in a modular fashion to meet the needs of a growing processor system. Multiple CPU's may be connected in a variety of configurations, sharing peripherals as desired. Such easy modular expandability is in keeping with the modular expandability of the processor itself.

CORRELATOR SOFTWARE SUPPORT

The correlator support software is also highly modular. Figure 5 shows a rough outline of data flow through the processor system. The correlator itself is supported by a program called COREL which is really a battery of approximately 10 interacting programs each controlling certain aspects of the processing. Setup for processing a particular observation is done completely through simple ASCII data files which may be generated manually by using the on-line editor or through software which generates the setup files automatically according to an actual observing schedule. All a priori calculations for processor operation are done in real time while the correlator is operating. An on-line graphics display provides for real-time monitoring and control of the processing operation. All raw correlator output data is output to disc to await fringe-search processing.

Fringe-search processing is done by program FRNGE, which collects all of the data from a particular observation, one baseline at a time, and does processing to estimate correlation amplitude, group delay, phase-delay rate, fringe phase, etc. The results of the fringe search are again stored to disc to await possible editing with program EDIT. EDIT allows viewing of the FRNGE output as a trouble-shooting aid; after processing by EDIT to remove bad data, etc., FRNGE may be run again. Finally, the edited (if necessary) output of FRNGE is transferred into the DATA BASE to await further processing for geodetic, astrometric, and astronomical parameters. It is important to emphasize that all of these programs may be run in a hands-on interactive mode when necessary or desirable to process troublesome data or do special processing in an efficient manner.

All data is archived to permanent magnetic tape libraries at two stages in the correlation processing. The raw output of COREL, containing all raw correlation coefficients and a prioris, is archived so that this data may be retrieved and FRNGE re-run if necessary or desirable. Also, all FRNGE output is archived in a form which EDIT may retrieve and review. This output includes correlation amplitude and phase on short time scales (typically 1 to 2 seconds) for each observation.

During normal geodetic correlation processing of 3-station/28-track data, the computer is occupied 25 to 50 percent of the time with real-time processor-related tasks. The remaining time may be used in any manner desired without interfering with the correlation processing. Normally this "leftover" time is used for fringe-searching on a run which has just completed correlation.