SkyQuery: An Implementation of a Parallel Probabilistic Join Engine for
Cross-Identification of Multiple Astronomical Databases
by
László Dobos, Tamás Budavári, Nolan Li, Alexander S. Szalay, István Csabai
2012
Abstract
Multi-wavelength astronomical studies require cross-identification of
detections of the same celestial objects in multiple catalogs based on
spherical coordinates and other properties. Because of the large data volumes
and spherical geometry, the symmetric N-way association of astronomical
detections is a computationally intensive problem, even when sophisticated
indexing schemes are used to exclude obviously false candidates. Legacy
astronomical catalogs already contain detections of more than a hundred million
objects while the ongoing and future surveys will produce catalogs of billions
of objects with multiple detections of each at different times. The varying
statistical error of position measurements, moving and extended objects, and
other physical properties make it necessary to perform the cross-identification
using a mathematically correct, proper Bayesian probabilistic algorithm,
capable of including various priors. One-time, pairwise cross-identification
of these large catalogs is not sufficient for many astronomical scenarios.
Consequently, a novel system is necessary that can cross-identify multiple
catalogs on demand, efficiently and reliably. In this paper, we present our
solution based on a cluster of commodity servers and ordinary relational
databases. The cross-identification problems are formulated in a language based
on SQL, but extended with special clauses. These special queries are
partitioned spatially by coordinate ranges and compiled into a complex workflow
of ordinary SQL queries. Workflows are then executed in a parallel framework
using a cluster of servers hosting identical mirrors of the same data sets.
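
The partitioning idea described in the abstract can be illustrated with a minimal sketch. This is not the SkyQuery implementation: the system described above compiles extended SQL into workflows and uses a Bayesian probabilistic test, whereas this sketch is plain Python and swaps in a simple fixed-radius angular cut. All function names, the tuple layout (id, RA, Dec in degrees), and the choice of declination as the partitioning coordinate are assumptions made for illustration.

```python
import math

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle distance in degrees between two (RA, Dec) points (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))

def dec_partitions(n_parts, dec_min=-90.0, dec_max=90.0):
    """Split the declination range into n_parts equal-width bands (hypothetical scheme)."""
    step = (dec_max - dec_min) / n_parts
    return [(dec_min + i * step, dec_min + (i + 1) * step) for i in range(n_parts)]

def crossmatch_partition(cat_a, cat_b, lo, hi, radius_deg, include_hi=False):
    """Match cat_a objects inside one band against cat_b objects in the band plus a buffer.

    The buffer of width radius_deg keeps pairs that straddle a band edge.
    Each cat_a object belongs to exactly one band, so no pair is reported twice.
    """
    a_part = [a for a in cat_a if lo <= a[2] < hi or (include_hi and a[2] == hi)]
    b_part = [b for b in cat_b if lo - radius_deg <= b[2] <= hi + radius_deg]
    return [(a[0], b[0]) for a in a_part for b in b_part
            if angular_separation_deg(a[1], a[2], b[1], b[2]) <= radius_deg]

def crossmatch(cat_a, cat_b, radius_deg, n_parts=4):
    """Run the per-band matches one after another; each band is an independent unit of work."""
    bands = dec_partitions(n_parts)
    matches = []
    for i, (lo, hi) in enumerate(bands):
        last = i == len(bands) - 1  # the top band also includes Dec = +90 exactly
        matches.extend(crossmatch_partition(cat_a, cat_b, lo, hi, radius_deg, last))
    return matches

# Toy catalogs: (id, RA, Dec) in degrees. The a1/b1 pair straddles the band edge at Dec = 45.
cat_a = [("a1", 10.0, 44.9999), ("a2", 200.0, -30.0)]
cat_b = [("b1", 10.0, 45.0001), ("b2", 200.0, -30.0002), ("b3", 50.0, 0.0)]
result = crossmatch(cat_a, cat_b, radius_deg=2.0 / 3600.0)
```

Because the bands are independent, each call to `crossmatch_partition` could in principle be dispatched to a different server holding a mirror of the data, which is the kind of parallelism the abstract refers to.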
Archived Files and Locations
application/pdf 203.3 kB
file_j5k2c26lfrh2dc3o2xbirm5sfm
arxiv.org (repository), web.archive.org (webarchive)
1206.5021v1