Contact

Nullmeier Markus
Position:
German Astrophysical Virtual Observatory (GAVO), University of Heidelberg, Heidelberg, Germany
Address
Germany

Miscellaneous Information

Miscellaneous Information

Abstract Reference: 30842
Identifier: O8.2
Presentation: Oral communication
Key Theme: 1 Reduction and Analysis Algorithms for Large Databases and Vice-versa 

Using PostgreSQL for efficient handling of spatial data

Authors:
Nullmeier Markus, Bartunov Oleg, Karpov Sergey, Korotkov Alexander, Ivanov Dmitry, 

The production of contemporary observational and simulation data in astronomy typically results in very large databases. Two of the related fundamental challenges in handling these are fast spatial queries on the celestial sphere, and, for projects such as progressive surveys, high rates of observation updates. A standard choice for a database management system in this context is one of the relational type, such as the open source product PostgreSQL. For the tasks mentioned above, we report a comparative efficiency study of a variety of spatial indexing methods for PostgreSQL.

We used the principal benchmark operations of radial query (cone search) and distance-based join (cross match), as well as insertion of single points with online index updates. All tests were run on synthetic data in order to analyze both run time and number of data block accesses as a function of database size. The spatial indexing methods we investigated were three-dimensional R-trees (pgSphere, PostGIS), Quad Tree Cubes (Q3C), Healpix (H3C), and Hierarchical Triangular Mesh (HTM).

The results show that, with the latest version of the pgSphere software, R-trees perform on par with the other methods, which are based on storing index numbers of sky regions in standard database (B-tree) indexes. This is due to a novel R-tree node splitting algorithm, and, for calculating cross matches, a method of hierarchical descent for the two respective indices. The development of these efficient algorithms was in particular enabled by specific extensibility features of PostgreSQL. The latter now include a device to automatically optimize regular SQL queries for spatial joins to invoke specialized cross match routines without any user intervention, which is available for pgSphere.

In summary, PostgreSQL proved to be an appropriate tool for handling astronomical databases from small to very large, especially in terms of usability and of absolute performance.