Split Computing and Early Exiting for Deep Learning Applications: Survey and Research Challenges

by Yoshitomo Matsubara, Marco Levorato, Francesco Restuccia

Released as an article.

2021  

Abstract

Mobile devices such as smartphones and autonomous vehicles increasingly rely on deep neural networks (DNNs) to execute complex inference tasks such as image classification and speech recognition, among others. However, continuously executing the entire DNN on the mobile device can quickly deplete its battery. Although task offloading to edge servers may decrease the mobile device's computational burden, erratic patterns in channel quality, network load, and edge server load can lead to a significant delay in task execution. Recently, approaches based on split computing (SC) have been proposed, where the DNN is split into a head and a tail model, executed respectively on the mobile device and on the edge server. Ultimately, this may reduce bandwidth usage as well as energy consumption. Another approach, called early exiting (EE), trains models to present multiple "exits" earlier in the architecture, each providing increasingly higher target accuracy. Therefore, the trade-off between accuracy and delay can be tuned according to the current conditions or application demands. In this paper, we provide a comprehensive survey of the state of the art in SC and EE strategies, by presenting a comparison of the most relevant approaches. We conclude the paper by providing a set of compelling research challenges.
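Since the abstract describes both techniques concretely, a short sketch may help fix the ideas. The following is a minimal PyTorch illustration of split computing and early exiting, not code from the surveyed paper: the class names, layer sizes, and the 0.9 confidence threshold are all assumptions made for this example.

    # Minimal sketch (illustrative only) of split computing (SC) and early
    # exiting (EE). All names and hyperparameters here are hypothetical.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class HeadWithExit(nn.Module):
        """Head model: runs on the mobile device. Produces (a) a compact
        feature tensor to offload to the edge server and (b) a cheap
        early-exit prediction."""
        def __init__(self, num_classes=10):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            )
            # Early-exit branch: a lightweight classifier on head features.
            self.exit_head = nn.Linear(32, num_classes)

        def forward(self, x):
            z = self.features(x)                    # compact representation
            pooled = F.adaptive_avg_pool2d(z, 1).flatten(1)
            early_logits = self.exit_head(pooled)   # early-exit prediction
            return z, early_logits

    class Tail(nn.Module):
        """Tail model: runs on the edge server on the received features."""
        def __init__(self, num_classes=10):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            )
            self.classifier = nn.Linear(64, num_classes)

        def forward(self, z):
            z = self.features(z)
            return self.classifier(F.adaptive_avg_pool2d(z, 1).flatten(1))

    # On-device inference: take the early exit when the head is confident
    # enough; otherwise offload the (smaller-than-input) features.
    head, tail = HeadWithExit(), Tail()
    x = torch.randn(1, 3, 224, 224)
    z, early_logits = head(x)
    confidence = F.softmax(early_logits, dim=1).max().item()
    if confidence >= 0.9:                       # hypothetical threshold
        prediction = early_logits.argmax(1)     # EE: no offloading needed
    else:
        prediction = tail(z).argmax(1)          # SC: tail runs on the server

The trade-off the abstract highlights is visible here: the head's output tensor is smaller than the raw input, so offloading it instead of the image can save bandwidth, while a confident early exit avoids the network round trip entirely at the cost of some accuracy.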

Archived Files and Locations

application/pdf  4.7 MB
arxiv.org (repository)
web.archive.org (webarchive)
Type      article
Stage     submitted
Date      2021-09-11
Version   v2
Language  en
arXiv     2103.04505v2