# Use of Commercial FPGA-Based Evaluation Boards for Single-Event Testing of DDR2 and DDR3 SDRAMs

R. L. Ladbury, M. D. Berg, E. P. Wilcox, K. A. LaBel, H. S. Kim, A. M. Phan, and C. M. Seidleck

**Abstract**— We investigate the use of commercial FPGA based evaluation boards for radiation testing DDR2 and DDR3 SDRAMs. We evaluate the resulting data quality and the tradeoffs involved in the use of these boards.

*Index Terms*—probabilistic risk assessment, radiation effects, reliability estimation, quality assurance, and radiation hardness assurance.

## I. INTRODUCTION

Single-event testing of double-data-rate (DDR) Synchronous Dynamic Random Access Memories (SDRAMs) poses many logistical and technical challenges. Because DDR SDRAMs are commercial parts in demand for commercial electronics, even obtaining single memory chips is difficult. The chips are packaged in flip-chip ball grid arrays (FBGA), which preclude front-side irradiation and require thinning for the beam to reach the sensitive volume from the backside. The stringent timing demands of these devices complicate board/tester layout, as the signal traces must be routed so that all signals meet timing requirements. The high operating speeds, high density and high susceptibility to multiple error modes further complicate everything from tester design to data analysis. Moreover, all these challenges are expected to worsen for future generations. Each new DDR generation may require a new tester incorporating a state-of-the-art (SOTA) field programmable gate array (FPGA) to test these parts at speed. Use of commercial evaluation boards for Single-Event Effects (SEE) testing poses interesting trade-offs for some of these issues. Such boards typically interface to commercial memory modules, which are widely available and inexpensive. Evaluation board layout is optimized to ensure proper signal timing, and the controller, typically a commercial FPGA, is chosen to match the memory data rate. On the other

Manuscript received August 1, 2013. This work was supported in part by the NASA Electronic Parts and Packaging program, the Defense Threat Reduction Agency, and BAE Systems in Manassas, VA.

R. L. Ladbury is with the NASA Goddard Space Flight Center, Greenbelt, MD 20771, USA (phone: 301-286-1030; fax: 301-286-4699; e-mail: Raymond.L.Ladbury@nasa.gov).

K. A. LaBel is with the NASA Goddard Space Flight Center, Greenbelt, MD 20771, USA.

M. D. Berg, E. P. Wilcox, H. S. Kim, A. M. Phan and C. M. Seidleck are with MEI Technologies, 7404 Executive PL, Suite 500, Seabrook, MD 20706.

hand, preparing devices on a memory module for access to test ions is more complicated than preparing an individual chip. The intellectual property (IP) designed for the FPGA is often not optimized for SEE testing, a demanding application in which data are accessed on every clock cycle. Finally, the use of memory modules complicates the task of controlling power to the device under test (DUT). Thus, current limiting will be ineffective at circumventing single-event latchup (SEL), and if single-event functional interrupts (SEFI) require power cycling for recovery, the entire board must be power cycled, necessitating a time-consuming reload of configuration data to the FPGA and re-initialization of the tester and DUT. In this manuscript we discuss the use of commercial FPGA-based evaluation boards to test DDR2/3 memories, paying particular attention to the negotiation of the above-mentioned trade-offs.

## II. TEST DEVICES AND EVALUATION BOARDS

We tested DDR3 M471B5773DH0-CH9 Dual In-line Memory Modules (DIMMs), each with eight 2-Gb FBGAs (K4B2G0846-HCH9 [1]). The tester was a Xilinx EK-V7-VC707-CES-G (Virtex-7 based) evaluation board [2] (see Fig. 1). The DDR2 test parts were 512 MB and 1 GB DDR2 200-pin DIMMs, M470T2863FB3-CE6 and M470T6464FBS-CE6, each with 4 or 8 1-Gb K4T1G084QF-BCE7 die [3]. We used a Xilinx HW-V5-ML506-UNI-G (Virtex-5 based) evaluation board as the tester for the DDR2 DIMMs [4]. We chose these DIMMs because they contained the DDR2 and DDR3 die of interest to us. Figs. 1 and 2 show examples of the commercial evaluation testers used in this work.



Fig. 1: DDR3 evaluation board at the Texas A&M University test site, showing the thinned DDR3 Device under Test (DUT) and the Virtex-7 FPGA (under the fan) which controls the DDR3.

To be published in the Institute of Electrical and Electronics Engineers (IEEE) Transaction on Nuclear Science (TNS) Dec. 2013 and on https://nepp.nasa.gov.



Fig. 2: The Xilinx Virtex-5 tester has parts mounted on both sides of the board. Most of the parts are mounted on the (a)-side with the Virtex-5. The DIMMs are mounted on the (b)-side with few parts to obstruct ion beams incident at oblique angles.

One FBGA on each DIMM was thinned to between 120 and 200 $\mu$m and the DIMM was mounted on the evaluation board. The tester (evaluation board + DIMM) was controlled by means of a computer via a National Instruments LabVIEW software interface, which controlled the tester power. As mentioned above, power to the DIMM is supplied via the evaluation board, so there is no independent control of DUT power in the event of an overcurrent due to SEL or loss of control due to a SEFI. Thus, when a power cycle is required, the tester design must be reloaded into the configuration memory, and the DUT must be reprogrammed.

Thinning of parts was carried out using an Ultra-Tec precision milling machine. This operation posed significant challenges due to the fragility of the FBGAs mounted on the DIMMs. The yield was less than 50% for the DDR2 devices. The three parts that survived thinning had thicknesses of ~200, ~190 and ~140 $\mu$m. The yield for DDR3 devices was slightly improved (~50%), but again, the thicknesses feasible were limited, especially if polishing of the die was needed, as for two-photon absorption (TPA) laser SEE testing. Thicknesses of DDR3 devices varied from 120 $\mu$m (no polishing) to ~200 $\mu$m (polished). These thicknesses were adequate for the 25 MeV/amu beam tune at the Texas A&M University Cyclotron Facility (TAMU), as well as for the two-photon laser system at the Naval Research Laboratory (NRL). Thickness varied by less than 10 $\mu$m over the die surface.

## III. FROM EVALUATION BOARD TO TESTER

Although the SDRAM evaluation boards are designed to operate DDR2/3s, they are not optimized for SEE testing. Probably the most significant drawback of using the evaluation boards is the inability to control power to the DUT. This would be a serious drawback for potentially SEL-susceptible parts. However, recent test results [5, 6, 7] have not shown SEL susceptibility in DDR2 and DDR3 SDRAMs. Moreover, although recovery from such an event requires cycling power to the entire evaluation board and then reloading the test program, this can be accomplished in a matter of minutes. Finally, we decided that SEL susceptibility would disqualify a part from consideration, so it would not require a full characterization.

Significant modification to the evaluation board IP was also required. This posed considerable challenges, as the need for the controller to operate independently (without a processor) and to sample data on every clock cycle required substantial redesign, and the timing of the IP proved to be fragile. Even the language of the IP could be an issue. The DDR2 board was designed using VHSIC Hardware Description Language (VHDL, where VHSIC = very-high-speed integrated circuits), while the DDR3 board was designed with Verilog. Often, FPGA designers use only one language or the other.

The yield issues for FBGA thinning discussed above posed a final challenge. Here, we were helped by the availability of the 25 MeV/amu tune at TAMU, since the greater range of these ions allowed less aggressive thinning of the parts. This, and a thinning/polishing strategy that used lower pressure and higher bit speed, allowed us to take multiple parts to each test.

Despite these challenges, the evaluation boards were a much more economical and rapid solution than developing a new dedicated FPGA-based tester. Moreover, as attested by the volume of data gathered (>230 runs for both DDR2 and DDR3 tests in 24 hours), the evaluation boards proved to be reliable test platforms.

## IV. DDR2 TESTING

We tested the DDR2s at TAMU using the 25-MeV/amu tune. Table I shows the ions used during testing, including their energies, ranges in Si and Linear Energy Transfer (LET) as they exit the beam pipe. The TAMU beam monitoring software estimates the LET at the sensitive volume, taking into account materials transited by the ions. All five ions were used for the DDR2 test.

For each run, the thinned SDRAM was centered in the beam at the desired angle (tilt and roll), and the desired pattern was programmed into the memory and verified. Then the part was irradiated to the desired fluence or until it experienced a SEL, SEFI or other disruptive error. The errors were tallied during the irradiation for a dynamic test, and the errors were read at the end of the irradiation for a static test. Most of the testing was done dynamically, with a Counter pattern, where the memory contents were determined by the address in which they were stored. At the end of the test, the part functionality was verified, and run parameters and errors were recorded. Then the part was prepared for the next run. Testing was conducted at multiple angles of incidence when this was feasible (note: the thickness of some parts made testing beyond 45° to the normal impossible with Xe and Kr ions due to penetration range issues), and tilt and roll effects were compared for some ion/angle combinations.

Most runs ended with a SEFI or a large block error that overflowed the First-In-First-Out (FIFO) memory. If the part recovered on its own, the error was called a block error. Otherwise, it was tallied as a SEFI, and in almost all cases, a hard reset (resynching of clock and reinitialization of the part) was required for recovery. In several cases, errors were found to persist in the DUT after the part was reset and reprogrammed. Most of these errors were due to stuck bits. However, in some cases, the continuing errors were due to a persistent SEFI that could only be cleared by cycling power to the affected device (and therefore to the entire tester). The SEE cross sections are plotted vs. effective LET, whether they conform to conventional effective LET dependence or not.
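The Counter pattern described above lends itself to a simple self-checking readback, since the expected word can be regenerated from the address alone. The following is a minimal sketch under stated assumptions: the 32-bit word width, the function names `counter_word` and `compare_readback`, and the dictionary standing in for a memory readback are all hypothetical and do not describe the actual tester IP.

```python
def counter_word(address: int, width_bits: int = 32) -> int:
    """Expected data word for an address under a counter-style pattern.

    Assumption: the stored word is simply the address truncated to the
    word width. The real pattern generator may differ.
    """
    return address & ((1 << width_bits) - 1)


def compare_readback(readback: dict, width_bits: int = 32) -> list:
    """Return (address, xor_mask) for every word that differs from its expected value."""
    errors = []
    for address, observed in sorted(readback.items()):
        diff = observed ^ counter_word(address, width_bits)
        if diff:
            errors.append((address, diff))
    return errors


# Hypothetical readback of eight words with one flipped bit at address 5.
readback = {addr: counter_word(addr) for addr in range(8)}
readback[5] ^= 0b100
errors = compare_readback(readback)
```

Because each word is unique to its address, a pattern of this kind can also reveal addressing faults: a word that matches `counter_word(address + 1)` rather than `counter_word(address)` points to an address shift rather than a data upset.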

TABLE I: ION BEAMS USED FOR TESTING AT TAMU

| Ion    | Energy (MeV) | Range in Si ($\mu$m) | Incident LET (MeV·cm<sup>2</sup>/mg) |
|--------|--------------|----------------------|--------------------------------------|
| N-14   | 347          | 1009                 | 0.9                                  |
| Ne-22  | 545          | 799                  | 1.8                                  |
| Ar-40  | 991          | 493                  | 5.5                                  |
| Kr-84  | 2081         | 332                  | 19.8                                 |
| Xe-129 | 3197         | 286                  | 38.9                                 |
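The 45° limit noted for Xe and Kr can be illustrated with a purely geometric check against the ranges in Table I. The sketch below uses the conventional definition of effective LET (normal-incidence LET divided by the cosine of the angle of incidence) and compares the slant path through the thinned die with the tabulated range. It is a rough illustration only: it treats all remaining material as silicon, ignores air gaps and other materials transited by the ions, and ignores the change of LET along the track. The function names are hypothetical.

```python
import math

# Ranges in Si (micrometers) as listed in Table I.
RANGE_IN_SI_UM = {"N": 1009, "Ne": 799, "Ar": 493, "Kr": 332, "Xe": 286}


def effective_let(let_normal: float, angle_deg: float) -> float:
    """Effective LET (MeV*cm^2/mg) for an ion incident at angle_deg from the normal."""
    return let_normal / math.cos(math.radians(angle_deg))


def slant_path_um(thickness_um: float, angle_deg: float) -> float:
    """Path length through a die of the given thickness at angle_deg from the normal."""
    return thickness_um / math.cos(math.radians(angle_deg))


def range_margin_um(ion: str, thickness_um: float, angle_deg: float) -> float:
    """Tabulated range minus slant path; a negative value means the ion stops in the die."""
    return RANGE_IN_SI_UM[ion] - slant_path_um(thickness_um, angle_deg)


# A ~200 um part at 45 degrees leaves almost no margin for Xe.
xe_margin_45 = range_margin_um("Xe", 200.0, 45.0)
kr_margin_60 = range_margin_um("Kr", 200.0, 60.0)
```

For a ~200 $\mu$m part, the slant path at 45° is about 283 $\mu$m, essentially the full tabulated Xe range, and at 60° it exceeds the range of both Xe and Kr.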

## V. DDR3 LASER AND HEAVY-ION TESTING

We conducted initial SEE testing on the DDR3s at the TPA laser facility at NRL using a pulsed beam with a 1.26 $\mu$m wavelength. For this test, the DUT was imaged with near infrared (NIR) light, and the laser was directed to the portion of the die we wished to test. Fig. 3 shows an image of a portion of the die with both memory (large rectangles) and control logic on the bottom of the picture. The DUT was programmed, the laser fired, and the resulting behavior recorded. For some runs, we chose a region of the die and programmed the laser to fire at random points within this region. The random sampling provides a closer analog to heavy-ion testing and allows determination of the proportion of circuitry that exhibits errors. The most notable result of this test was evident at the test site: we were unable to induce upsets in the memory array portion of the circuit, even using high laser intensities. It is unclear whether this is attributable to the actual immunity of the memory cells, or whether the thickness of the die and possible small metal obstructions preclude placement of the laser beam spot in the sensitive volume of the DRAM cell. We also observed burst errors, block errors, and SEFIs. In most cases, a hard reset (resynchronizing clock plus reinitializing memory control logic) was required for recovery, although some errors required a power cycle, and a few recovered after a soft reset (reinitializing the chip with no resynchronization of the clock).

Heavy-ion testing at TAMU was carried out as for the DDR2s, although we made some test modifications based on our experience testing the DDR2s (e.g., testing with light ions first to minimize stuck bits due to multiple ion strikes). We performed the test in dynamic mode using a Counter pattern, which had yielded the most information during DDR2 postprocessing. Again, most runs ended in large block errors or SEFIs that required a hard reset for recovery. We observed some persistent SEFIs, where a power cycle was required for recovery. For ions that caused stuck bits, a rough tally was kept to monitor the health and performance of the DUT. Based on the experience with stuck bits in the DDR2, we began testing with Ne ions in order to better determine the LET onset for these errors. Since no errors were seen for Ne at normal incidence, we did not test with N ions. The DDR3s were significantly harder to all SEE than the DDR2s. Where possible (e.g., for Ne, Ar and Kr), runs were taken with ions incident obliquely on the die both for tilt and roll angles. We gathered over 230 data runs for the DDR3s.



Fig. 3: Infrared micrograph of memory arrays (large rectangles near the top of the image) and control logic (smaller features near the bottom) of the DDR3 SDRAM tested at the NRL laser laboratory.

## VI. DATA ANALYSIS

As expected for an SDRAM SEE test, data analysis was a complicated task. Although neither DDR2s nor DDR3s exhibited destructive failures, they did exhibit a full range of nondestructive SEE: Single-Event Upset (SEU), Multi-Bit Upset (MBU), Block/Burst errors, stuck bits and many SEFI modes. Even the DDR3s, which did not upset during laser testing, exhibited single-bit errors that are most easily understood as SEUs down to LET~3 MeV·cm<sup>2</sup>/mg.

These error modes had to be identified and isolated from each other to avoid contaminating error rate estimates. To do this, we had to define each error category:

- Stuck bit—a persistent single bit error fixed to the uncharged state that cannot be corrected even after a power cycle to the memory and so persists across at least two runs.
- SEU/MBU—a correctable single or multiple-bit flip.
- Block error—a series of errors in contiguous or related addresses (e.g., row or column error).
- Burst error—a rapid series of errors that may or may not occur at related addresses (identified by temporal proximity—tallied with block errors for convenience).
- SEFI—Complete or partial loss of functionality in the part due to an upset in the control logic or device timing that requires a reset of the DUT or a power cycle for recovery.

The analysis considered only the portions of the run prior to the occurrence of a SEFI; there could be at most one SEFI per data run. For this portion of each run, first the stuck bits were counted and removed. Given the definition of a stuck bit, this could only be done in post processing. Likewise we removed and tallied the block and burst errors. The remaining errors were tallied to determine SEU counts for each run. Tallies for each error mode were combined for runs carried out under similar conditions to minimize random errors in the cross section. The results for tilt and roll angles of equal magnitude were compared and found not to vary significantly, so these runs were also combined. The analysis resulted in cross section vs. LET curves for four different error modes for both DDR2s and DDR3s: SEU, block errors, SEFI and stuck bits. For stuck bits, the probability of multiple hits to a single bit was estimated based on the total fluence of ions that had been incident on the part.
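The post-processing order described above (stuck bits first, then block/burst errors, with the remainder tallied as SEUs) can be sketched as follows. This is a simplified illustration, not the analysis code used for the paper. It assumes that block/burst errors have already been identified by some other step, that a stuck bit can be recognized as a bit in error in two consecutive runs, and that the cross section is the combined count divided by the combined fluence. The `Run` container and function name are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    fluence: float                                 # ions/cm^2 accumulated before any SEFI
    error_bits: set = field(default_factory=set)   # (address, bit) pairs in error before the SEFI
    block_bits: set = field(default_factory=set)   # subset already attributed to block/burst errors
    sefi: bool = False                             # True if the run ended in a SEFI


def combined_cross_sections(runs: list) -> dict:
    """Combine consecutive runs taken under similar conditions into cross sections (cm^2).

    The final run is used only to confirm persistence of stuck bits,
    because this sketch needs a following run to make that decision.
    """
    known_stuck = set()
    totals = {"seu": 0, "stuck": 0, "sefi": 0}
    fluence = 0.0
    for run, next_run in zip(runs, runs[1:]):
        # A bit still in error in the next run is treated as stuck.
        persistent = run.error_bits & next_run.error_bits
        totals["stuck"] += len(persistent - known_stuck)
        # What remains after removing stuck bits and block/burst errors is tallied as SEU.
        totals["seu"] += len(run.error_bits - persistent - known_stuck - run.block_bits)
        totals["sefi"] += int(run.sefi)
        known_stuck |= persistent
        fluence += run.fluence
    return {mode: count / fluence for mode, count in totals.items()}


# Hypothetical example: bit (2, 3) persists across runs and is counted once as stuck.
runs = [
    Run(1e6, {(1, 0), (2, 3), (7, 1)}, set(), True),
    Run(1e6, {(2, 3), (9, 0)}, set(), False),
    Run(1e6, {(2, 3)}, set(), False),
]
sigma = combined_cross_sections(runs)
```

A limitation of this simplification is that a bit genuinely upset in two consecutive runs would be misclassified as stuck; at the fluences and bit counts involved such coincidences should be rare, but this is an assumption of the sketch.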

## VII. DDR2 RESULTS

Figs. 4-7 show the cross section vs. LET curves for SEU, block errors, SEFI and stuck bits. SEUs and SEFIs were seen for all test ions, including N at normal incidence. The best fit onset LET was 0.6 MeV·cm<sup>2</sup>/mg for SEUs, 0.9 MeV·cm<sup>2</sup>/mg for SEFIs, and ~1.6 MeV·cm<sup>2</sup>/mg for block errors. The limiting cross section for SEUs was ~20× that for block errors, which was in turn about 12.5× the SEFI cross section. Moreover, these ratios persist over most of the range where errors were seen. This means that most runs had fewer than 100 SEUs before they were ended by a SEFI or large block error. No MBUs were observed. In Fig. 7, it is likely that most if not all of the stuck bits seen at low LET arise from bits that had been struck by Xe or Kr ions in earlier runs. As such, the fit represents a worst case, and possibly an overly pessimistic one. Most stuck bits annealed within a few hours. However, some were still present two weeks later when the shipping containers arrived back at NASA Goddard. Also, the cross section curves for block errors and SEFIs seem to scale as expected with effective LET, while that for SEUs does not. The performance of these parts was consistent with reference [6].
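The link between the cross-section ratios and the number of SEUs per run can be made explicit with a rough estimate. As an assumption (not a result from the data), treat SEUs and run-ending events as independent Poisson processes with rates proportional to their cross sections, and let $f$ be the hypothetical fraction of block errors large enough to end a run. The quoted ratios give

$$\sigma_{\mathrm{SEU}} \approx 20\,\sigma_{\mathrm{block}}, \qquad \sigma_{\mathrm{block}} \approx 12.5\,\sigma_{\mathrm{SEFI}} \quad\Rightarrow\quad \sigma_{\mathrm{SEU}} \approx 250\,\sigma_{\mathrm{SEFI}},$$

and the expected number of SEUs before the first run-ending event is

$$\langle N_{\mathrm{SEU}} \rangle \approx \frac{\sigma_{\mathrm{SEU}}}{\sigma_{\mathrm{SEFI}} + f\,\sigma_{\mathrm{block}}} = \frac{250}{1 + 12.5\,f}.$$

Under these assumptions the estimate ranges from about 250 SEUs per run if only SEFIs end runs ($f = 0$) to about 18 if every block error does ($f = 1$), and it drops below 100 once $f$ exceeds roughly 0.12.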



Fig. 4: SEU cross section vs. Effective LET for Samsung DDR2. Cross sections for ions incident off normal to the device (open symbols) do not scale as would be expected if effective LET held, especially at low LET.



Fig. 5: Block error cross section vs. Effective LET for Samsung DDR2.







Fig. 7: Stuck bit cross section vs. Effective LET for Samsung DDR2. See note in § VII regarding the multiple DDR2 hits.

## VIII. DDR3 LASER AND HEAVY-ION RESULTS

Figs. 8-11 show SEE cross section vs. LET curves for the Samsung DDR3s. The first thing one notices about the SEE performance of the DDR3 devices is that they were significantly harder than their DDR2 counterparts. No errors of any type were seen for Ne ions at normal incidence (LET=2.1 MeV·cm<sup>2</sup>/mg). Limiting cross sections are also roughly an order of magnitude lower. Moreover, the SEU cross section vs. LET curve seems to scale roughly as expected with effective LET, in contrast to the DDR2s above. As with the DDR2s, there were no MBUs.

Many of the SEFIs seen in the DDR3s were also of a different character, exhibiting a shift where the observed data corresponded to the expected data for the next address—perhaps indicating errors in counters or circuit timing. However, probably the most notable feature of the data presented here has to do with stuck bits. While SEU, block error and SEFI behavior were all similar for both DDR3s tested, the stuck bit cross section for DUT1 was ~30× higher than that for DUT3, and DUT1 saw errors for Kr ions as well as Xe ions. Again, although most stuck bits annealed within a matter of hours, some were still present when the parts arrived back from the test.







Fig. 9: Block error cross section vs. Effective LET for Samsung DDR3.





## IX. DISCUSSION

The results presented above are consistent with recent trends in DDR2 and DDR3 SEE performance. Neither the DDR2 nor the DDR3 was susceptible to destructive SEE. SEU rates remain manageable, and since column, row and block sizes scale with the memory size, the proportion of errors due to block errors continues to grow.

In the absence of destructive SEE susceptibility, SEFIs are the SEE mode of most concern, especially when they require a power cycle for recovery. Both the DDR2 and DDR3 exhibited such SEFIs, albeit at a low rate. Lack of statistics precludes estimating the rate for such error modes, but about 2% of SEFIs observed over all ions required a power cycle for recovery for both DDR2 and DDR3.

Stuck bits continue to be problematic both for testing and for operation in space radiation environments. For the DDR2, we saw stuck bits even down to the lowest test LETs. However, the low-LET runs were carried out with parts that had already received significant fluences of Kr and Xe ions (~2×10<sup>7</sup> ions/cm<sup>2</sup>). Thus, the low-LET stuck bits could be caused by multiple ions (heavy and light) striking the same bit. The cumulative fluences are sufficiently high that this interpretation makes sense, and it also explains the difference between the stuck bit results shown here and those of reference [6], where the onset LET for stuck bits was 22 MeV·cm<sup>2</sup>/mg. The stuck bit results for the DDR3 devices were also notable, mainly for the discrepancy between the susceptibility of the two DUTs. Prior irradiation history cannot explain the difference, and there appears to be a significant part-to-part variation. More thorough understanding of this variation is warranted.
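The plausibility of multiple strikes to the same bit at these fluences can be gauged with a Poisson estimate. The sketch below is illustrative only: the per-bit sensitive area of 0.1 $\mu$m<sup>2</sup> is a hypothetical placeholder, not a measured value for these parts, and the function names are assumptions. The fluence and the 1-Gb die size are taken from the text.

```python
import math


def mean_hits(fluence_per_cm2: float, area_um2: float) -> float:
    """Mean number of ion strikes on one bit (1 um^2 = 1e-8 cm^2)."""
    return fluence_per_cm2 * area_um2 * 1e-8


def prob_at_least(k: int, mu: float) -> float:
    """Poisson probability of k or more strikes given a mean of mu."""
    return 1.0 - sum(math.exp(-mu) * mu**i / math.factorial(i) for i in range(k))


def expected_multi_hit_bits(n_bits: int, fluence_per_cm2: float,
                            area_um2: float, k: int = 2) -> float:
    """Expected number of bits struck k or more times on a die with n_bits bits."""
    return n_bits * prob_at_least(k, mean_hits(fluence_per_cm2, area_um2))


ASSUMED_AREA_UM2 = 0.1   # hypothetical per-bit sensitive area, for illustration only
FLUENCE = 2e7            # ions/cm^2, the cumulative Kr and Xe fluence quoted above
N_BITS = 2**30           # 1-Gb DDR2 die

mu = mean_hits(FLUENCE, ASSUMED_AREA_UM2)
p_double = prob_at_least(2, mu)
n_double = expected_multi_hit_bits(N_BITS, FLUENCE, ASSUMED_AREA_UM2)
```

With the assumed area, the probability of a double strike on any given bit is small, but multiplied over a gigabit array it still corresponds to a large number of doubly struck bits, which is in line with the multiple-hit interpretation. The result scales roughly with the square of the assumed area, so the numbers should be read as order-of-magnitude at best.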

Finally, the fact that SEUs were not observed during laser testing coupled with the fact that the SEU cross section scales as expected with effective LET for DDR3s, but does not scale with effective LET for DDR2s, suggests different mechanisms for the single-bit errors in the DDR2s and DDR3s.

The sheer volume of data gathered for both DDR2 and DDR3 devices attests to the reliability and performance of the evaluation boards as SEE testers, once suitable modifications had been made to their IP. As expected, no SEL or other high-current anomalies were seen, so the inability to control power directly to the DUT posed no obstacle to gathering data. The strategy proved especially useful for testing DDR3s at speed (666.5 MHz input frequency and 1333 MHz data rate) without spending significant resources to design and build a tester capable of such data rates. This suggests that the technique could be very valuable as a first look when a new DDR generation or speed becomes available. The use of DIMMs as test parts has also proven feasible. Although initial attempts to thin a single FBGA on the DIMMs resulted in low yields due to the fragility of the DRAM die, reduced pressure and higher bit speed resulted in improved yield for the DDR3 DIMMs. Again, due to their easier availability, the use of DIMMs may be well-suited to a first look to compare SEE susceptibilities across multiple candidate parts, especially since an evaluation board should be able to accommodate any DIMM of the same specification, regardless of the vendor of the FBGAs on the DIMM.

## X. CONCLUSIONS AND RECOMMENDATIONS

We carried out SEE testing of DDR2 and DDR3 SDRAMs using DIMMs as test parts and commercial FPGA-based evaluation boards as SEE testers. Although the IP for the evaluation boards required significant modification, the resulting testers performed reliably throughout the test campaigns, allowing us to amass large SEE datasets for both the DDR2 and DDR3 SDRAMs. The resulting data showed that both memories were immune to destructive SEE-induced failures. In addition, Samsung DDR3 SDRAMs seem to be harder to single-event effects than their DDR2 counterparts, both in terms of onset LET and limiting cross section for SEUs, block errors and SEFIs. The nature of SEUs in the DDR3s seems to be quite different from that in DDR2 devices. Stuck-bit susceptibility continues to be a wild card in SDRAMs and deserves further investigation, both to better determine onset LET for the DDR2 and to better understand the part-to-part variation in stuck-bit susceptibility in DDR3s. We anticipate that the evaluation board/DIMM strategy will prove capable of carrying out such studies, and hope that it will also be helpful for investigating SEE susceptibility in future generations of DDR SDRAM technologies.

## REFERENCES

- [1] 2Gb D-die DDR3 SDRAM, http://www.samsung.com/global/business/semiconductor/file/2011/product/2011/9/2/724100ds\_ddr3\_2gb\_d-die\_based\_vlp\_rdimm\_rev13.pdf, Rev. 1.13, May 2011.
- [2] http://www.xilinx.com/products/boards-and-kits/EK-V7-VC707-G.htm, 1/7/2013.
- [3] 1Gb F-die DDR2 SDRAM, http://www.samsung.com/global/business/semiconductor/file/2011/product/2011/7/18/663530ds\_k4t1gxx4qf\_rev12.pdf, Rev. 1.11, Sep. 2011.
- [4] http://www.xilinx.com/products/boards-and-kits/HW-V5-ML506-UNI-G.htm; 1/7/2013.
- [5] R. Koga et al., "Sensitivity of 2 Gb DDR2 SDRAMs to Protons and Heavy Ions," 2010 IEEE Radiation Effects Data Workshop (REDW), p. 6, Jul. 2010.
- [6] K. Ryu et al., "Heavy-Ion Radiation Characteristics of DDR2 Synchronous Dynamic Random Access Memory Fabricated in 56 nm Technology", J. Astron. Space Sci., vol. 29, no. 3, pp. 315-320, 2012.
- [7] M. Hermann, et al., "Heavy ion SEE test of 2 Gbit DDR3 SDRAM," 2011 European Conf. on Radiation and Its Effects on Components and Systems (RADECS), pp. 934-937, Sep. 2011.