Landry Walter
IRSA - NASA/IPAC Infrared Science Archive
United States

Key Theme: 4 Long-term Management of Data Archives

Instantaneous Archives

The NASA/IPAC Infrared Science Archive (IRSA) is one of the largest and busiest astronomy archives in the world. In the past, our main emphasis was on making new data and new capabilities available. With the widespread implementation of Virtual Observatory protocols, a number of useful tools can now quickly and easily run insightful, sophisticated queries against archives around the world. These queries, if not handled quickly, can easily overwhelm the site and interfere with other users. In addition, reducing latency below the threshold of human perception enables more interactive and exploratory science.

In this talk, I will discuss our multi-pronged efforts to improve performance at all levels. These include:

1) Upgrading network hardware and links.

2) Fine-tuning the indexing and partitioning strategies for our traditional databases.

3) Benchmarking various spatial indexing schemes (HTM, Q3C, h3c, PostGIS).

4) Rearchitecting our query pipeline to eliminate process, filesystem, and database connection overheads.
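
As a rough illustration of the connection overhead mentioned in item 4, the sketch below keeps a small set of database connections open and reuses them across queries, rather than opening a new one per request. It is not IRSA's implementation: the `ConnectionPool` class and `run_queries` helper are hypothetical names, and Python's built-in `sqlite3` stands in for a production database purely so the example is self-contained.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Hypothetical fixed-size pool: connections are opened once, then reused."""

    def __init__(self, database: str, size: int = 4):
        self._idle = queue.Queue(maxsize=size)
        self.opened = 0  # total number of connections ever created
        for _ in range(size):
            # check_same_thread=False lets a pooled connection be used by any thread
            self._idle.put(sqlite3.connect(database, check_same_thread=False))
            self.opened += 1

    @contextmanager
    def connection(self):
        conn = self._idle.get()   # blocks while every connection is busy
        try:
            yield conn
        finally:
            self._idle.put(conn)  # return it to the pool instead of closing it

    def close(self):
        while not self._idle.empty():
            self._idle.get_nowait().close()


def run_queries(pool: ConnectionPool, n: int) -> list:
    """Run n trivial queries, borrowing a pooled connection for each one."""
    results = []
    for i in range(n):
        with pool.connection() as conn:
            results.append(conn.execute("SELECT ? * 2", (i,)).fetchone()[0])
    return results


if __name__ == "__main__":
    pool = ConnectionPool(":memory:", size=2)
    print(run_queries(pool, 5), "using", pool.opened, "connections")
    pool.close()
```

In this sketch the cost of establishing a connection is paid `size` times in total, however many queries are served; the same idea extends to long-lived worker processes that avoid per-request process startup.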

Taken together, these changes have delivered order-of-magnitude improvements in latency and throughput. I will also discuss how distributed and in-memory databases could be used to improve performance further.