Škoda Petr
Astronomical Institute of the Czech Academy of Sciences, Ondrejov
Czech Republic

Miscellaneous Information

Abstract Reference: 30801
Identifier: P1.28
Presentation: Poster presentation
Key Theme: 1 Reduction and Analysis Algorithms for Large Databases and Vice-versa

Using Machine Learning for Identification of Artifacts and Interesting Celestial Objects in LAMOST Spectral Survey

Škoda Petr, Shakurova Ksenia, Palička Andrej

The LAMOST DR1 survey contains about two million spectra labelled by its pipeline as stellar objects of common spectral classes. There are, however, many spectra corrupted in some way by instrumental and processing artifacts, which may mimic the spectral properties of interesting celestial objects, namely the emission lines of Be stars and quasars. We have tested several clustering methods as well as outlier analysis on a sample of one hundred thousand spectra using Spark scripts running on a Hadoop cluster consisting of twenty-four sixteen-core nodes. The experiment was motivated by an attempt to find rare objects with interesting spectra as the outliers most dissimilar from all common spectra. The result of this time-consuming procedure is a list of several hundred candidates in which various artifacts are prominent, but which also includes tens of very interesting emission-line spectra requiring further detailed examination. Many of them may be quasars or even blazars, as well as previously unknown Be stars. It is worth mentioning that most of the work benefitted considerably from Virtual Observatory technologies.
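
The idea of ranking spectra by their dissimilarity from all common spectra can be illustrated with a minimal sketch. It is not the Spark pipeline used in this work: the choice of a k-nearest-neighbour distance score, the function names `knn_outlier_scores` and `make_toy_sample`, the synthetic H-alpha-like profiles, and all parameter values are assumptions made purely for illustration.

```python
import numpy as np


def knn_outlier_scores(spectra, k=5):
    """Mean Euclidean distance from each spectrum to its k nearest neighbours."""
    # Scale each spectrum by its median so that overall brightness does not dominate.
    x = spectra / np.median(spectra, axis=1, keepdims=True)
    # Pairwise squared distances via the Gram-matrix identity.
    sq = np.sum(x * x, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    np.maximum(d2, 0.0, out=d2)      # clip tiny negative rounding errors
    np.fill_diagonal(d2, np.inf)     # a spectrum is not its own neighbour
    nearest = np.sort(d2, axis=1)[:, :k]
    return np.sqrt(nearest).mean(axis=1)


def make_toy_sample(n_common=200, n_emission=3, n_pix=300, seed=0):
    """Synthetic sample: common absorption-line spectra plus a few emission-line ones."""
    rng = np.random.default_rng(seed)
    wave = np.linspace(6400.0, 6700.0, n_pix)                # wavelength grid (Angstrom)
    line = np.exp(-0.5 * ((wave - 6563.0) / 4.0) ** 2)       # Gaussian line profile
    n = n_common + n_emission
    flux = 1.0 - 0.4 * line + rng.normal(0.0, 0.02, size=(n, n_pix))
    flux[n_common:] += 1.4 * line                            # turn the last rows into emission
    flux *= rng.uniform(0.5, 2.0, size=(n, 1))               # arbitrary brightness scaling
    return wave, flux


wave, flux = make_toy_sample()
scores = knn_outlier_scores(flux, k=5)
candidates = np.argsort(scores)[::-1][:3]   # indices of the three most dissimilar spectra
print(sorted(candidates.tolist()))
```

In this toy sample the injected emission-line spectra receive the highest scores because k is larger than the number of rare objects, so each of them must count common spectra among its neighbours. A full pairwise distance matrix is practical only for small samples; at the scale of a hundred thousand spectra a distributed or approximate neighbour search would be needed.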