Concurrent CPU-GPU Task Programming using Modern C++

by Tsung-Wei Huang, Yibo Lin

Released as an article.

2022  

Abstract

In this paper, we introduce Heteroflow, a new C++ library that helps developers quickly write parallel CPU-GPU programs using task dependency graphs. Heteroflow leverages the power of modern C++ and task-based approaches to enable efficient implementations of heterogeneous decomposition strategies. Our new CPU-GPU programming model allows users to express a problem in a way that adapts to effective separation of concerns and expertise encapsulation. Compared with existing libraries, Heteroflow is more cost-efficient in performance scaling, programming productivity, and solution generality. We have evaluated Heteroflow on two real applications in VLSI design automation and demonstrated performance scalability across different numbers of CPUs and GPUs and different problem sizes. In one example of VLSI timing analysis with million-scale tasking, Heteroflow achieved a 7.7x runtime speed-up (99 vs. 13 minutes) over a baseline on a machine with 40 CPU cores and 4 GPUs.
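To make the idea of a task dependency graph concrete, the following is a minimal sketch in standard C++ only. It is not Heteroflow's API: the names `TaskGraph`, `add_task`, `precede`, `run`, and `demo_order` are hypothetical and chosen for illustration. It also runs tasks sequentially on the host in a dependency-respecting order, whereas a real CPU-GPU runtime would presumably dispatch ready tasks to worker threads and devices.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical illustration type; NOT part of Heteroflow's API.
class TaskGraph {
 public:
  // Adds a task and returns its index in the graph.
  std::size_t add_task(std::string name, std::function<void()> work) {
    tasks_.push_back({std::move(name), std::move(work), {}, 0});
    return tasks_.size() - 1;
  }

  // Declares that task `before` must finish before task `after` starts.
  void precede(std::size_t before, std::size_t after) {
    assert(before < tasks_.size() && after < tasks_.size());
    tasks_[before].successors.push_back(after);
    ++tasks_[after].num_dependencies;
  }

  // Runs every task once in topological order (Kahn's algorithm) and
  // returns the task names in the order they were executed.
  std::vector<std::string> run() const {
    std::vector<std::size_t> pending(tasks_.size());
    std::queue<std::size_t> ready;
    for (std::size_t i = 0; i < tasks_.size(); ++i) {
      pending[i] = tasks_[i].num_dependencies;
      if (pending[i] == 0) ready.push(i);
    }
    std::vector<std::string> order;
    while (!ready.empty()) {
      const std::size_t id = ready.front();
      ready.pop();
      tasks_[id].work();
      order.push_back(tasks_[id].name);
      // Finishing a task may make its successors ready.
      for (std::size_t next : tasks_[id].successors) {
        if (--pending[next] == 0) ready.push(next);
      }
    }
    if (order.size() != tasks_.size()) {
      throw std::runtime_error("task graph contains a cycle");
    }
    return order;
  }

 private:
  struct Task {
    std::string name;
    std::function<void()> work;
    std::vector<std::size_t> successors;
    std::size_t num_dependencies;
  };
  std::vector<Task> tasks_;
};

// Two independent "load" tasks feed a stand-in "kernel" task, followed by
// a final "save" task. All four run on the host in this sketch.
std::vector<std::string> demo_order(int& result) {
  TaskGraph graph;
  int a = 0, b = 0;
  const auto load_a = graph.add_task("load_a", [&] { a = 2; });
  const auto load_b = graph.add_task("load_b", [&] { b = 3; });
  const auto kernel = graph.add_task("kernel", [&] { result = a * b; });
  const auto save = graph.add_task("save", [&] { result += 1; });
  graph.precede(load_a, kernel);
  graph.precede(load_b, kernel);
  graph.precede(kernel, save);
  return graph.run();
}
```

Calling `demo_order` from a `main` function runs the two load tasks first, then `kernel`, then `save`. The two load tasks do not depend on each other, so a parallel scheduler would be free to run them concurrently; that independence is the kind of parallelism a task graph exposes.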

Type  article
Stage   submitted
Date   2022-03-16
Version   v1
Language   en
arXiv  2203.08395v1