Split Computing and Early Exiting for Deep Learning Applications: Survey and Research Challenges
by
Yoshitomo Matsubara, Marco Levorato, Francesco Restuccia
2021
Abstract
Mobile devices such as smartphones and autonomous vehicles increasingly rely
on deep neural networks (DNNs) to execute complex inference tasks such as image
classification and speech recognition, among others. However, continuously
executing the entire DNN on the mobile device can quickly deplete its battery.
Although task offloading to edge servers may decrease the mobile device's
computational burden, erratic patterns in channel quality, network and edge
server load can lead to a significant delay in task execution.
Recently, approaches based on split computing (SC) have been proposed, where the
DNN is split into a head and a tail model, executed respectively on the mobile
device and on the edge server. Ultimately, this may reduce bandwidth usage as
well as energy consumption. Another approach, called early exiting (EE), trains
models to present multiple "exits" earlier in the architecture, each providing
increasingly higher target accuracy. Therefore, the trade-off between accuracy
and delay can be tuned according to the current conditions or application
demands. In this paper, we provide a comprehensive survey of the state of the
art in SC and EE strategies, by presenting a comparison of the most relevant
approaches. We conclude the paper by providing a set of compelling research
challenges.
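The two strategies surveyed above can be illustrated with a minimal sketch. The code below is a hypothetical toy example, not taken from the paper: a tiny randomly initialized network is split into a head (run on the mobile device) and a tail (which would be offloaded to the edge server), with one early-exit classifier attached to the head. The confidence threshold, layer sizes, and all function names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy network, split into a head and a tail (assumed sizes).
W_head = rng.normal(size=(8, 4))   # head: compresses an 8-dim input to 4-dim features
W_exit = rng.normal(size=(4, 3))   # early-exit classifier attached to the head
W_tail = rng.normal(size=(4, 3))   # tail classifier (would run on the edge server)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def infer(x, threshold=0.8):
    """Split-computing inference with one early exit.

    The head runs on the mobile device. If the early exit is confident
    enough, the result is returned locally and no offloading happens.
    Otherwise only the compact 4-dim feature vector (not the raw input)
    would be transmitted to the tail on the edge server, which is the
    source of SC's bandwidth savings.
    """
    features = np.maximum(x @ W_head, 0.0)   # head: linear layer + ReLU
    p_early = softmax(features @ W_exit)     # early-exit prediction
    if p_early.max() >= threshold:
        return int(p_early.argmax()), "early exit (on device)"
    p_final = softmax(features @ W_tail)     # tail: executed remotely
    return int(p_final.argmax()), "full model (offloaded)"

label, path = infer(rng.normal(size=8))
```

Lowering `threshold` makes more inputs exit early, trading accuracy for reduced delay and energy, which is exactly the tunable trade-off the abstract attributes to EE.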
Archived Files and Locations
application/pdf (4.7 MB): arxiv.org (repository), web.archive.org (webarchive), arXiv:2103.04505v2