Hurry-up: Scaling Web Search on Big/Little Multi-core Architectures

by Rajiv Nishtala and Xavier Martorell (Norwegian University of Science and Technology; Barcelona Supercomputing Center)

Released as an article.

2019  

Abstract

Heterogeneous multi-core systems such as big/little architectures have been introduced as an attractive server design option with the potential to improve performance under power constraints in data centres. Because big high-performance cores and little power-efficient cores share the workload on the same system, thread mapping and scheduling become much more challenging. This is particularly hard given the different trade-offs that heterogeneous cores impose on the quality of service (expressed as tail latency) experienced by user-facing applications such as Web Search. In this work, we present Hurry-up, a runtime thread mapping solution designed to select individual requests to run on the most appropriate heterogeneous cores to improve tail latency. Hurry-up accelerates compute-intensive requests on big cores while letting less intensive threads execute on little cores. We implement and deploy Hurry-up on a real 64-bit big/little architecture (ARM Juno) and show that, compared to a conservative policy on Linux, Hurry-up reduces the server tail latency by 39.5% (mean).
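The mapping idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how a request is judged compute-intensive, so the elapsed-time threshold below is an assumption, and the core ids, the `Request` type, and the function names are hypothetical placeholders. The sketch also shows one common way to compute a tail latency (nearest-rank percentile).

```python
import math
from dataclasses import dataclass

# Hypothetical core ids; real numbering depends on the platform.
BIG_CORES = [0, 1]
LITTLE_CORES = [2, 3, 4, 5]


@dataclass
class Request:
    request_id: int
    elapsed_ms: float  # time the request has been running so far


def map_requests(requests, threshold_ms=50.0):
    """Assign each request to a core id.

    Assumption: a request that has already run for at least
    threshold_ms is treated as compute-intensive and goes to a big
    core; everything else goes to a little core. Cores within each
    pool are picked round-robin.
    """
    next_slot = {"big": 0, "little": 0}
    mapping = {}
    for req in requests:
        kind = "big" if req.elapsed_ms >= threshold_ms else "little"
        pool = BIG_CORES if kind == "big" else LITTLE_CORES
        mapping[req.request_id] = pool[next_slot[kind] % len(pool)]
        next_slot[kind] += 1
    return mapping


def tail_latency(latencies_ms, percentile=95):
    """Nearest-rank percentile of a non-empty list of latencies."""
    ordered = sorted(latencies_ms)
    rank = math.ceil(percentile / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


requests = [Request(1, 10.0), Request(2, 80.0), Request(3, 120.0), Request(4, 5.0)]
mapping = map_requests(requests)
p95 = tail_latency([10, 20, 30, 40, 100])
```

On a Linux system, a mapping like this could be applied by pinning each request's thread to the chosen core, for example with `os.sched_setaffinity`; that step is left out so the sketch stays platform-independent.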

Archived Files and Locations

application/pdf  561.3 kB
arxiv.org (repository)
web.archive.org (webarchive)
Type  article
Stage   submitted
Date   2019-12-20
Version   v1
Language   en
arXiv  1912.09844v1