Coffea – Columnar Object Framework For Effective Analysis
by
Nicholas Smith, Lindsey Gray, Matteo Cremonesi, Bo Jayatilaka, Oliver Gutsche, Allison Hall, Kevin Pedro, Maria Acosta, Andrew Melo, Stefano Belforte, Jim Pivarski
2020
Abstract
The coffea framework provides a new approach to High-Energy Physics analysis,
via columnar operations, that improves time-to-insight, scalability,
portability, and reproducibility of analysis. It is implemented with the Python
programming language, the scientific Python package ecosystem, and commodity
big data technologies. To achieve this suite of improvements across many use
cases, coffea takes a factorized approach, separating the analysis
implementation and data delivery scheme. All analysis operations are
implemented using the NumPy or awkward-array packages which are wrapped to
yield user code whose purpose is quickly intuited. Various data delivery
schemes are wrapped into a common front-end that accepts user inputs and code
and returns user-defined outputs. We discuss our experience implementing
analyses of CMS data with the coffea framework, along with the user experience
and future directions.
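
The factorized approach described in the abstract can be illustrated with a minimal sketch in plain NumPy. This is not the coffea API: the names `process`, `run_local`, `muon_pt`, and `muon_eta`, the selection cuts, and the randomly generated data are all hypothetical, chosen only to show an analysis function written as columnar operations and kept separate from the code that delivers data to it.

```python
import numpy as np


def process(columns):
    """Hypothetical analysis step: whole-array (columnar) selection, no event loop."""
    pt = columns["muon_pt"]
    eta = columns["muon_eta"]
    # One boolean mask over all entries replaces a per-event if-statement.
    mask = (pt > 20.0) & (np.abs(eta) < 2.4)
    counts, _edges = np.histogram(pt[mask], bins=5, range=(0.0, 100.0))
    return {"n_selected": int(mask.sum()), "pt_hist": counts}


def run_local(chunks, processor):
    """Hypothetical delivery scheme: feed chunks to the processor and add up outputs."""
    total = None
    for chunk in chunks:
        out = processor(chunk)
        if total is None:
            total = out
        else:
            total = {key: total[key] + out[key] for key in total}
    return total


# Toy input standing in for chunks of event data (assumed, not real CMS data).
rng = np.random.default_rng(0)
chunks = [
    {
        "muon_pt": rng.exponential(30.0, 1000),
        "muon_eta": rng.uniform(-3.0, 3.0, 1000),
    }
    for _ in range(3)
]

result = run_local(chunks, process)
print(result["n_selected"], result["pt_hist"])
```

Because `process` only sees a dictionary of arrays and returns outputs that can be added together, `run_local` could in principle be swapped for a different delivery scheme without changing the analysis code, which is the separation the abstract describes.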
arXiv: 2008.12712v1