Dark Memory and Accelerator-Rich System Optimization in the Dark Silicon Era release_s5gixmwzdrh7nhjwmfyp22qywq

by Ardavan Pedram, Stephen Richardson, Sameh Galal, Shahar Kvatinsky, and Mark A. Horowitz

Released as a article .

2016  

Abstract

The key challenge to improving performance in the age of Dark Silicon is how to leverage transistors when they cannot all be used at the same time. In modern SOCs, these transistors are often used to create specialized accelerators which improve energy efficiency for some applications by 10-1000X. While this might seem like the magic bullet we need, for most CPU applications more energy is dissipated in the memory system than in the processor: these large gains in efficiency are only possible if the DRAM and memory hierarchy are mostly idle. We refer to this desirable state as Dark Memory, and it only occurs for applications with an extreme form of locality. To show our findings, we introduce Pareto curves in the energy/op and mm^2/(ops/s) metric space for compute units, accelerators, and on-chip memory/interconnect. These Pareto curves allow us to solve the power, performance, area constrained optimization problem to determine which accelerators should be used, and how to set their design parameters to optimize the system. This analysis shows that memory accesses create a floor to the achievable energy-per-op. Thus high performance requires Dark Memory, which in turn requires co-design of the algorithm for parallelism and locality, with the hardware.
In text/plain format

Archived Files and Locations

application/pdf  1.3 MB
file_yq4qxmg26nhp3jn6nsqoidja2m
arxiv.org (repository)
web.archive.org (webarchive)
Read Archived PDF
Preserved and Accessible
Type  article
Stage   submitted
Date   2016-04-24
Version   v2
Language   en ?
arXiv  1602.04183v2
Work Entity
access all versions, variants, and formats of this works (eg, pre-prints)
Catalog Record
Revision: 023e1d65-8ada-418a-8e35-10a01075d672
API URL: JSON