


## Different Aspects of Mitigation: Things to Think about during Presentation

- Detection:
  - Watchdog (state or logic monitoring)
  - Checking ...Decoding
  - Action
- Masking:
  - Not letting an error propagate to other logic
  - Redundancy or checking
  - Turn off faulty path
- Correction:
  - Error state (memory) is changed
  - Need feedback

To be presented by Melanie Berg at the Revolutionary Electronics in Space (ReSpace) / Military and Aerospace Programmable Logic Devices (MAPLD) 2011 Conference, Albuquerque, NM, August 22-25, 2011, and to be published on nepp.nasa.gov website.







## Agenda

- Section I: Single Event Effects (SEEs) in Digital Logic
- Section II: Application of the NASA Goddard Radiation Effects and Analysis Group (REAG) FPGA SEU Model
- Section III: Reducing System Error: Common Mitigation Techniques

## Break

- Section IV: When Your Mitigation Fails
- Section V: Xilinx V4 and Mitigation
- Section VI: Fail-Safe Strategies

## Agenda (First Half)

- Section I: SEEs in Digital Logic
- Section II: Application of the NASA Goddard REAG FPGA SEU Model
- Section III: Reducing System Error: Common Mitigation Techniques















## NASA Goddard REAG FPGA SEU Model: Top Down

The model has 3 major components:

$P(fs)_{error} \propto P_{Configuration} + P(fs)_{functionalLogic} + P_{SEFI}$













































## Functional Data Path SEU Cross Sections and DFF Effects (Capturing StartPoint SEUs)

$P(fs)_{error} \propto P_{Configuration} + P(fs)_{functionalLogic} + P_{SEFI}$

Slide labels: $P(fs)_{functionalLogic}$, $P(fs)_{SET \rightarrow SEU}$







## Generation $P_{DFFSEU}$ versus Capture $P(fs)_{DFFSEU \rightarrow SEU}$

| $P_{DFFSEU}$ | $P(fs)_{DFFSEU \rightarrow SEU}$ |
|--------------|----------------------------------|
| Probability a StartPoint DFF becomes upset | Probability that the StartPoint upset is captured by the endpoint DFF |
| Occurs at some point in time within a clock period | Occurs at a clock edge (capture) |
| Not frequency dependent | Frequency dependent |













## How DFF or Combinatorial Logic Dominance Affects $\sigma_{SEU}$

|  | $P(fs)_{DFFSEU \rightarrow SEU}$ | $P(fs)_{SET \rightarrow SEU}$ |
|--|----------------------------------|-------------------------------|
| Logic | DFF capture | Combinatorial SET capture |
| Capture percentage of clock period | $(1 - \frac{\tau_{dly}}{\tau_{clk}}) = (1 - \tau_{dly} fs)$ | $\frac{\tau_{width}}{\tau_{clk}} = \tau_{width} fs$ |
| Frequency dependency | Increasing frequency decreases $\sigma_{SEU}$ | Increasing frequency increases $\sigma_{SEU}$ |
| Combinatorial logic effects | Increasing combinatorial logic increases $\tau_{dly}$ and decreases $\sigma_{SEU}$ | Increasing combinatorial logic increases $P_{gen}$ and increases $\sigma_{SEU}$ |
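
The opposite frequency trends of the two capture terms can be seen in a short worked example. The values $\tau_{dly} = 5$ ns and $\tau_{width} = 0.5$ ns are hypothetical, chosen only to make the arithmetic easy; they are not measurements from this presentation.

$$
\begin{aligned}
fs = 50\ \text{MHz}: \quad & 1 - \tau_{dly} fs = 1 - (5\ \text{ns})(50\ \text{MHz}) = 0.75, & \tau_{width} fs &= (0.5\ \text{ns})(50\ \text{MHz}) = 0.025 \\
fs = 100\ \text{MHz}: \quad & 1 - \tau_{dly} fs = 1 - (5\ \text{ns})(100\ \text{MHz}) = 0.50, & \tau_{width} fs &= (0.5\ \text{ns})(100\ \text{MHz}) = 0.05
\end{aligned}
$$

Doubling the frequency shrinks the DFF capture fraction from 0.75 to 0.50 while doubling the SET capture fraction from 0.025 to 0.05, which is the behavior summarized in the "Frequency dependency" row.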

















$MajorityVoter = I1 \wedge I2 + I0 \wedge I2 + I0 \wedge I1$

| I0 | I1 | I2 | Majority Voter |
|----|----|----|----------------|
| 0  | 0  | 0  | 0              |
| 0  | 0  | 1  | 0              |
| 0  | 1  | 0  | 0              |
| 0  | 1  | 1  | 1              |
| 1  | 0  | 0  | 0              |
| 1  | 0  | 1  | 1              |
| 1  | 1  | 0  | 1              |
| 1  | 1  | 1  | 1              |
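
The voter equation can be checked against its truth table with a few lines of Python. This is a minimal software sketch of the Boolean function only, not an FPGA implementation; the function names `majority_voter`, `truth_table`, and `masks_single_fault` are illustrative and do not come from the presentation.

```python
from itertools import product


def majority_voter(i0, i1, i2):
    """Return the majority of three bits: (I1 AND I2) OR (I0 AND I2) OR (I0 AND I1)."""
    return (i1 & i2) | (i0 & i2) | (i0 & i1)


def truth_table():
    """List (I0, I1, I2, output) for all eight input combinations, in binary order."""
    return [(i0, i1, i2, majority_voter(i0, i1, i2))
            for i0, i1, i2 in product((0, 1), repeat=3)]


def masks_single_fault():
    """Check that flipping any one of three identical inputs never changes the output."""
    for good in (0, 1):
        for faulty in range(3):
            inputs = [good, good, good]
            inputs[faulty] ^= 1  # inject a single upset
            if majority_voter(*inputs) != good:
                return False
    return True


if __name__ == "__main__":
    for row in truth_table():
        print(*row)
    print("single fault masked:", masks_single_fault())
```

The second check illustrates why the voter is useful for masking: any single corrupted input is outvoted by the other two.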











## No-TMR vs. LTMR: Combinatorial Logic Effects

|  | No-TMR ProASIC3 | LTMR ProASIC3 |
|--|-----------------|---------------|
| Significant circuit type | StartPoint DFF (sequential): SEU capture | Combinatorial: SET capture |
| Significant $P$ model component | $P_{DFFSEU}(1-\tau_{dly}fs)$ | $P_{gen} P_{prop} \tau_{width} fs$ |
| Error type | One-sided function | Two-sided function |
| $\sigma_{SEU}$ WSR8 vs. $\sigma_{SEU}$ WSR0 | $\sigma_{SEU} WSR_8 < \sigma_{SEU} WSR_0$ | $\sigma_{SEU} WSR_8 > \sigma_{SEU} WSR_0$ |
| Relative $\sigma_{SEU}$ reasoning | WSR8 has more combinatorial logic and more $\tau_{dly}$ between DFFs | WSR8 has more combinatorial logic and more opportunity for SET generation |
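
The reversal between the two columns can be sketched numerically. The snippet below evaluates only the dominant term named in each column, using made-up, unitless parameter values for WSR0 and WSR8. Every number here is an assumption for illustration, not test data, and the names `no_tmr_term`, `ltmr_term`, and `compare` are hypothetical.

```python
def no_tmr_term(p_dffseu, tau_dly, fs):
    """Dominant No-TMR term from the table: P_DFFSEU * (1 - tau_dly * fs)."""
    return p_dffseu * (1.0 - tau_dly * fs)


def ltmr_term(p_gen, p_prop, tau_width, fs):
    """Dominant LTMR term from the table: P_gen * P_prop * tau_width * fs."""
    return p_gen * p_prop * tau_width * fs


# Hypothetical parameters: WSR8 is given more path delay and a larger SET
# generation probability than WSR0 because it has more combinatorial logic.
FS = 50e6                                  # clock frequency in Hz (assumed)
P_DFFSEU = 1.0                             # arbitrary units (assumed)
P_PROP = 0.5                               # assumed propagation probability
TAU_WIDTH = 0.5e-9                         # assumed SET pulse width in seconds
WSR0 = {"tau_dly": 1e-9, "p_gen": 1.0}     # assumed values
WSR8 = {"tau_dly": 5e-9, "p_gen": 8.0}     # assumed values


def compare(fs=FS):
    """Return (WSR0, WSR8) values of the dominant term for each design style."""
    return {
        "no_tmr": (no_tmr_term(P_DFFSEU, WSR0["tau_dly"], fs),
                   no_tmr_term(P_DFFSEU, WSR8["tau_dly"], fs)),
        "ltmr": (ltmr_term(WSR0["p_gen"], P_PROP, TAU_WIDTH, fs),
                 ltmr_term(WSR8["p_gen"], P_PROP, TAU_WIDTH, fs)),
    }


if __name__ == "__main__":
    for style, (wsr0, wsr8) in compare().items():
        print(f"{style}: WSR0={wsr0:.4g} WSR8={wsr8:.4g}")
```

With these assumed inputs the No-TMR term is smaller for WSR8 than for WSR0, while the LTMR term is larger for WSR8, matching the inequalities in the table.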











































|  | Probability | Error Rate | LEO (upsets/device-day) | GEO (upsets/device-day) |
|--|-------------|------------|-------------------------|-------------------------|
| Configuration memory: XQR4VSX55 | $P_{Configuration}$ | $\frac{dE_{configuration}}{dt}$ | 7.43 | 4.2 |
| Combined SEFIs per device | $P_{SEFI}$ | $\frac{dE_{SEFI}}{dt}$ | $7.5 \times 10^{-5}$ | $2.7 \times 10^{-5}$ |
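
To get a feel for the scale of these rates, they can be converted into an average time between events. The sketch below assumes a constant average rate, so the mean interval is simply the reciprocal of the rate; that assumption and the helper names are not from the presentation. The rate values are copied from the table.

```python
# Rates copied from the table, in upsets per device-day.
RATES = {
    "configuration": {"LEO": 7.43, "GEO": 4.2},
    "SEFI": {"LEO": 7.5e-5, "GEO": 2.7e-5},
}


def mean_days_between_upsets(rate_per_device_day):
    """Average interval between events, assuming a constant average rate."""
    if rate_per_device_day <= 0:
        raise ValueError("rate must be positive")
    return 1.0 / rate_per_device_day


def expected_upsets(rate_per_device_day, days):
    """Expected number of events over a period of the given length in days."""
    return rate_per_device_day * days


if __name__ == "__main__":
    for name, orbits in RATES.items():
        for orbit, rate in orbits.items():
            interval = mean_days_between_upsets(rate)
            yearly = expected_upsets(rate, 365)
            print(f"{name:13s} {orbit}: {interval:.4g} days between events, "
                  f"{yearly:.4g} events in 365 days")
```

Under that assumption, the LEO configuration rate works out to roughly one upset every 3.2 hours, while the combined SEFI rate in LEO corresponds to an average interval of about 13,300 days per device.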









## How Safe is Your Design?

- Do you understand the specifics of the SEU error modes?
- Are there lock-up conditions in your design?
- Does your strategy protect the entire critical path?
- Is the synthesized design fail-safe?
- Did you mitigate where you expected to mitigate?
- Can your watchdog catch failures?
- Will your recovery scheme work?


What are the limitations of your verification strategy?

The list goes on. Based on the error signatures of the target FPGA, the designer must keep all of these points in mind at every stage of the design.


## Conclusion

- Understand the device's error signatures and upset rates before mitigation is implemented.
- Not all designs are critical; some may not need mitigation.
- Be aware of when correction is necessary:
  - Make sure you are correcting your state.
  - Masking without correction can let errors accumulate and eventually break the mitigation.
- Detection circuits don't generally have redundancy and can be susceptible; make sure they are not making your design more susceptible (e.g., state machines).
- Perform proper trade studies to determine the type of mitigation necessary to meet requirements:
  - Upset rates
  - Area + power
  - Complexity: completion and verification within the time specified
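
The point about masking without correction can be illustrated with a toy model of a triplicated register. This is a simplified sketch under stated assumptions: one stored bit that should stay at 1, at most one upset per cycle, and correction modeled as writing the voted value back into all three copies. The function names and the upset sequence format are hypothetical.

```python
def vote(a, b, c):
    """Majority of three bits."""
    return (a & b) | (a & c) | (b & c)


def run_tmr(upsets, correct):
    """Simulate three copies of a stored bit that should remain 1.

    upsets:  one entry per cycle, either None (no upset) or the index (0..2)
             of the copy that is flipped in that cycle.
    correct: if True, the voted value is fed back into all copies each cycle;
             if False, the voter only masks and upset copies stay upset.
    Returns True if the voted output was correct on every cycle.
    """
    copies = [1, 1, 1]
    for hit in upsets:
        if hit is not None:
            copies[hit] ^= 1          # inject an upset into one copy
        out = vote(*copies)
        if out != 1:
            return False              # the error escaped the voter
        if correct:
            copies = [out, out, out]  # feedback: rewrite every copy
    return True


if __name__ == "__main__":
    sequence = [0, None, 1]           # two upsets in different copies, cycles apart
    print("masking only:        ", run_tmr(sequence, correct=False))
    print("masking + correction:", run_tmr(sequence, correct=True))
```

In this model the masking-only register survives the first upset but fails on the second, because the first upset was never repaired. With feedback, each upset is cleared before the next one arrives.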
