

## Analysis for Approximate Computing Systems

A. Vallero, A. Savino, G. Politano, **S. Di Carlo**, A. Chatzidimitriou, S. Tselonis, M. Kaliorakis, D. Gizopoulos, M. Riera, R. Canal, A. Gonzalez, M. Kooli, A. Bosio, G. Di Natale

2nd Workshop on Approximate Computing (WAPCO 2016), in conjunction with HiPEAC 2016, Prague, 18-20 January 2016



## OUTLINE

- MOTIVATIONS
- SYSTEM RELIABILITY ANALYSIS
- EXPERIMENTS
- CONCLUSIONS






## **MOTIVATIONS** - CROSS LAYER RELIABILITY

How do we manage reliability of digital systems today?

Error management solutions at all design layers are feasible: **technology**, **hardware**, **software**, etc.

**What's the best combination?**





#### **MOTIVATIONS** - RELIABLE VS APPROXIMATE COMPUTING





## **MOTIVATIONS** - OBJECTIVE OF THE WORK

How far can Approximate Computing Systems afford to reduce margins and redundancy?

- Low margins and low redundancy mean higher raw error rate

- Some applications can tolerate inaccurate results
- Errors are often masked by several layers of hardware and software



**We provide tools to evaluate system reliability early in the design cycle.**






## SYSTEM RELIABILITY ANALYSIS

## **Component-based reliability model**

Reliability is estimated using the parameters of individual **components** (e.g., FIT, size, complexity) and their interconnections (the **system architecture**).

## **Hierarchical**

Hierarchical analysis to manage complexity.

## **Statistical reasoning**

Enables statistical reasoning on system-level reliability.

## SYSTEM RELIABILITY ANALYSIS

- Our model exploits Bayesian Networks (BNs) as a statistical foundation for full system reliability estimation.
- Why?
  - Efficient calculation scheme,
  - Intuitive representation of all system components,
  - Capability of fitting on field data,
  - Compact representation and decision support



## SYSTEM RELIABILITY ANALYSIS

## **QUALITATIVE MODEL**



Models the architecture of the system:

- Nodes correspond to components,
- Arcs define temporal or physical relations among components

## **QUANTITATIVE MODEL**



Models state probabilities as a set of Conditional Probability Tables (CPT).
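The split between the two models can be illustrated with a minimal sketch. It assumes a three-node chain with hypothetical node names (`technology`, `hardware`, `software`) and made-up probabilities; it is not the authors' actual network or tool. The qualitative model is a map from each node to its parents, and the quantitative model is one CPT per node.

```python
from itertools import product

# Qualitative model: each node lists its parents (arcs point parent -> child).
# Node names are hypothetical placeholders, not the authors' actual model.
PARENTS = {
    "technology": [],
    "hardware": ["technology"],
    "software": ["hardware"],
}

# Quantitative model: one CPT per node, mapping a tuple of parent states
# to P(node is faulty). All probabilities are made-up illustrative values.
CPT = {
    "technology": {(): 0.01},
    "hardware": {(False,): 0.0, (True,): 0.4},
    "software": {(False,): 0.0, (True,): 0.25},
}


def joint_probability(state):
    """Probability of one full assignment {node: bool} via the chain rule."""
    prob = 1.0
    for node, parents in PARENTS.items():
        key = tuple(state[p] for p in parents)
        p_faulty = CPT[node][key]
        prob *= p_faulty if state[node] else 1.0 - p_faulty
    return prob


def total_probability():
    """Sum the joint over every assignment; a consistent model sums to 1."""
    nodes = list(PARENTS)
    return sum(
        joint_probability(dict(zip(nodes, values)))
        for values in product([False, True], repeat=len(nodes))
    )
```

Keeping structure and numbers in separate tables mirrors the idea that the architecture can be fixed first and the CPTs filled in later by characterization tools.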



#### SYSTEM RELIABILITY ANALYSIS - QUALITATIVE MODEL



#### CLERECO FP7 Collaboration Project - http://www.clereco.eu



## SYSTEM RELIABILITY ANALYSIS - QUANTITATIVE MODEL

- Building the quantitative model can be both difficult and time-consuming
- It is typically assigned to a group of specialists who need to collect information and organize it according to the model
- We provide an ecosystem of tools able to compute CPTs for major classes of software and hardware modules





#### SYSTEM RELIABILITY ANALYSIS - QUANTITATIVE MODEL





#### SYSTEM RELIABILITY ANALYSIS - REASONING



**Predictive reasoning**

Reasoning from information about causes (i.e., raw technology failure rates) to new beliefs about their effects (i.e., system failures), following the forward direction of the network arcs.
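As a rough sketch of forward (cause to effect) reasoning, the snippet below applies the law of total probability along a chain of arcs. The function name `propagate` and all probability values are hypothetical and chosen only for illustration; a real network with several parents per node needs full BN inference rather than this single-arc shortcut.

```python
def propagate(p_cause, p_effect_given_cause, p_effect_given_no_cause=0.0):
    """Law of total probability for one arc: P(effect) from P(cause)."""
    return (
        p_effect_given_cause * p_cause
        + p_effect_given_no_cause * (1.0 - p_cause)
    )


# Hypothetical values, not measurements from the experiments.
p_raw_fault = 0.01                            # raw fault at the technology layer
p_hw_error = propagate(p_raw_fault, 0.4)      # fault not masked by the hardware
p_sys_failure = propagate(p_hw_error, 0.25)   # error not masked by the software
```

Each call moves one step along an arc, so the belief about a system failure is obtained from the raw failure rate without observing anything downstream.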






#### SYSTEM RELIABILITY ANALYSIS - REASONING



**Diagnostic reasoning**

Reasoning from symptoms to causes: when we observe a failure in the system, we can update our belief about the contribution of each node (hardware or software component) to that failure.
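Backward reasoning amounts to applying Bayes' rule with the failure as evidence. The sketch below assumes two independent components feeding one failure node; the component names are borrowed from the hardware list in this deck, but the priors and the CPT are invented for illustration and the enumeration is not the authors' inference engine.

```python
from itertools import product

# Hypothetical prior fault probabilities for two components.
PRIOR = {"register_file": 0.02, "l2_cache": 0.05}

# Hypothetical CPT: P(system failure | register_file faulty, l2_cache faulty).
P_FAILURE = {
    (False, False): 0.0,
    (True, False): 0.5,
    (False, True): 0.1,
    (True, True): 0.55,
}


def posterior_given_failure():
    """Return P(component faulty | system failure) for each component."""
    names = list(PRIOR)
    evidence = 0.0                       # P(system failure)
    blame = {name: 0.0 for name in names}
    for states in product([False, True], repeat=len(names)):
        # Joint probability of this component state and a system failure.
        weight = P_FAILURE[states]
        for name, faulty in zip(names, states):
            weight *= PRIOR[name] if faulty else 1.0 - PRIOR[name]
        evidence += weight
        for name, faulty in zip(names, states):
            if faulty:
                blame[name] += weight
    return {name: blame[name] / evidence for name in names}
```

With these made-up numbers the register file ends up with the larger posterior even though its prior is smaller, because its faults are far less likely to be masked.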






#### EXPERIMENTAL SETUP

- Technology Domain
  - 22nm Bulk Planar (single-bit-flip FIT rate of 194.7E-7 for 6T SRAM cells under typical conditions: 1 V, 50°C)
- Hardware Domain
  - x86 out-of-order CPU and ARM Cortex A15 out-of-order CPU
    - Register file (256 registers of 64 bits each, 2KB), L1 Instruction Cache (2KB), L1 Data Cache (32KB), L2 Cache (1MB), Load/Store Queue (128B)
  - ECC-protected DRAM
- Software Domain
  - Linux operating system executing one of the following MiBench programs: (1) susan smooth, (2) susan edges, (3) susan corners, (4) qsort, (5) string search, (6) sha, (7) jpeg decode, (8) jpeg encode, (9) aes decode, (10) fft
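To give a feel for the raw error rates involved, the following sketch scales the per-bit FIT rate by the size of each hardware array listed above. It assumes that "194,7E-7" is written with a decimal comma (that is, 194.7E-7 FIT per bit) and that raw FIT grows linearly with the number of bits. This is the unmasked rate only, before any hardware or software masking, and not the system-level FIT estimated by the framework.

```python
# Assumption: "194,7E-7" on the slide uses a decimal comma.
FIT_PER_BIT = 194.7e-7

# Array sizes in bytes, as listed in the experimental setup.
COMPONENT_BYTES = {
    "register_file": 2 * 1024,
    "l1_instruction_cache": 2 * 1024,
    "l1_data_cache": 32 * 1024,
    "l2_cache": 1024 * 1024,
    "load_store_queue": 128,
}


def raw_fit(size_bytes, fit_per_bit=FIT_PER_BIT):
    """Raw (unmasked) FIT of a storage array: number of bits times per-bit FIT."""
    return size_bytes * 8 * fit_per_bit


RAW_FIT = {name: raw_fit(size) for name, size in COMPONENT_BYTES.items()}

if __name__ == "__main__":
    for name, fit in RAW_FIT.items():
        print(f"{name}: {fit:.4g} FIT (raw)")
```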



#### EXPERIMENTAL RESULTS - SETUP



Component characterization was performed using statistical fault sampling according to [Leveugle et al. DATE 09], with a 3% error margin and a 99% confidence level.
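The number of injections implied by these settings can be estimated with the finite-population sample-size formula commonly associated with statistical fault injection, n = N / (1 + e²(N-1) / (t²p(1-p))). The sketch below is an illustration under assumptions: the population size of 10^9 candidate faults is hypothetical, p = 0.5 is the usual worst-case choice, and 2.576 is the two-sided normal quantile for a 99% confidence level.

```python
import math


def fault_sample_size(population, error_margin, cut_off, p=0.5):
    """Number of faults to inject for a finite fault population.

    population   -- total number of possible faults (N)
    error_margin -- acceptable error on the estimate (e), e.g. 0.03
    cut_off      -- normal quantile for the confidence level (t)
    p            -- assumed failure proportion; 0.5 is the worst case
    """
    n = population / (
        1.0 + error_margin**2 * (population - 1) / (cut_off**2 * p * (1.0 - p))
    )
    return math.ceil(n)


# Hypothetical population of one billion candidate (bit, cycle) faults.
samples = fault_sample_size(population=10**9, error_margin=0.03, cut_off=2.576)
```

For large populations the result stays below two thousand injections per component, which is why sampling is so much cheaper than exhaustive injection.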



#### **EXPERIMENTAL RESULTS** - ACCURACY



FIT estimation for the 10 selected benchmarks running on the x86-based architecture.



#### **EXPERIMENTAL RESULTS** - ACCURACY



FIT estimation for the 10 selected benchmarks running on the ARM A15 architecture.




#### **EXPERIMENTAL RESULTS** – SIMULATION TIME



Performance comparisons (hours of simulation).



#### **EXPERIMENTAL RESULTS** – DIAGNOSTIC REASONING



Example of backward reasoning for the x86 fft configuration.



#### **EXPERIMENTAL RESULTS** – DESIGN EXPLORATION

| Variant | ReverseBits | fft_float | L2 |
|---------|-------------|-----------|----|
| v1      | U           | U         | U  |
| v2      | FT          | U         | U  |
| v3      | U           | FT        | U  |
| v4      | U           | U         | FT |
| v5      | U           | FT        | FT |
| v6      | FT          | FT        | FT |



Example of design exploration and optimization



#### **EXPERIMENTAL RESULTS** – COMPARISON WITH ACE ANALYSIS

[Figure: bar chart comparing FIT-FI, FIT-BN, and FIT-ACE; vertical axis from 0.00E+000 to 6.00E-002.]

Comparison with AVF computed through ACE analysis for the string search benchmark.

AVF computed based on data extrapolated from [George et al., DSN'10].



## OUTLINE

- MOTIVATIONS
- SYSTEM RELIABILITY ANALYSIS
  - TECHNOLOGY/ENVIRONMENT ANALYZER
  - HARDWARE ANALYZER
  - SOFTWARE ANALYZER
  - STATISTICAL REASONING
- EXPERIMENTS
- CONCLUSIONS



## CONCLUSIONS

- We presented a full framework for system-level reliability analysis
  - Enables analysis early in the design cycle to support design exploration
  - Provides a full ecosystem of tools to help designers build the reliability model
  - Provides very accurate results with reduced computation time


