Hurry-up: Scaling Web Search on Big/Little Multi-core Architectures
by Rajiv Nishtala and Xavier Martorell
(Norwegian University of Science and Technology, Barcelona Supercomputing Center)
2019
Abstract
Heterogeneous multi-core systems such as big/little architectures have been
introduced as an attractive server design option with the potential to improve
performance under power constraints in data centres. Because big,
high-performing cores and little, power-efficient cores share the workload on
the same system, thread mapping and scheduling become much more challenging.
This is particularly hard given the different trade-offs that heterogeneous
cores impose on the quality of service (expressed as tail latency) experienced
by user-facing applications, such as
Web Search.
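To make the tail-latency metric concrete, the following minimal sketch computes a high percentile of per-request latencies using the nearest-rank method. The function name, the sample values, and the choice of the 95th percentile are illustrative assumptions; the abstract does not state which percentile is used.

```python
import math


def tail_latency(latencies_ms, percentile=95.0):
    """Nearest-rank percentile of a list of request latencies (assumed metric)."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    # 1-based rank of the smallest sample that covers `percentile` percent of requests.
    rank = math.ceil(percentile / 100.0 * len(ordered))
    return ordered[max(rank, 1) - 1]


# Hypothetical samples: nine fast requests and one slow, compute-intensive one.
samples = [12, 15, 11, 14, 13, 16, 12, 15, 14, 250]
median = tail_latency(samples, 50)  # typical request
p95 = tail_latency(samples, 95)     # dominated by the single slow request
```

A single slow request barely moves the median but determines the tail, which is why accelerating the few long-running requests can matter more than speeding up the average one.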
In this work, we present Hurry-up, a runtime thread mapping solution designed
to select individual requests to run on the most appropriate heterogeneous
cores to improve tail latency. Hurry-up accelerates compute-intensive requests
on big cores, while letting less intensive threads execute on little cores.
We implement and deploy Hurry-up on a real 64-bit big/little architecture (ARM
Juno), and show that, compared to a conservative policy on Linux, Hurry-up
reduces server tail latency by 39.5% on average.
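The mapping idea described above can be sketched as a simple policy function. This is not the authors' implementation: the abstract does not say how compute-intensive requests are detected, so the elapsed-time threshold, the `Request` type, the `map_requests` function, and the core numbering below are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Request:
    req_id: int
    elapsed_ms: float  # time the request has been running so far (assumed signal)


def map_requests(requests, big_cores, little_cores, threshold_ms):
    """Return {req_id: core_id} under an assumed threshold heuristic.

    Requests running for at least `threshold_ms` are treated as
    compute-intensive and get a big core while one is free; all other
    requests are spread round-robin over the little cores.
    """
    if not little_cores:
        raise ValueError("at least one little core is required")
    free_big = list(big_cores)
    mapping = {}
    next_little = 0
    # Consider the longest-running requests first so they claim big cores.
    for req in sorted(requests, key=lambda r: r.elapsed_ms, reverse=True):
        if req.elapsed_ms >= threshold_ms and free_big:
            mapping[req.req_id] = free_big.pop(0)
        else:
            mapping[req.req_id] = little_cores[next_little % len(little_cores)]
            next_little += 1
    return mapping


# Hypothetical layout: cores 0-1 are big, cores 2-5 are little.
requests = [Request(1, 5.0), Request(2, 120.0), Request(3, 80.0), Request(4, 300.0)]
mapping = map_requests(requests, big_cores=[0, 1], little_cores=[2, 3, 4, 5],
                       threshold_ms=50.0)
```

Here requests 4 and 2 take the two big cores; request 3 also exceeds the threshold but falls back to a little core because no big core is free. Applying such a mapping on Linux would typically go through CPU affinity (for example `os.sched_setaffinity`), which is left out of this sketch.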
arXiv: 1912.09844v1