Final published version
Licence: CC BY: Creative Commons Attribution 4.0 International License
Research output: Contribution to Journal/Magazine › Journal article › peer-review
Research output: Contribution to Journal/Magazine › Journal article › peer-review
}
TY - JOUR
T1 - Transfer learning for galaxy feature detection
T2 - Finding giant star-forming clumps in low-redshift galaxies using Faster Region-based Convolutional Neural Network
AU - Popp, Jürgen J
AU - Dickinson, Hugh
AU - Serjeant, Stephen
AU - Walmsley, Mike
AU - Adams, Dominic
AU - Fortson, Lucy
AU - Mantha, Kameswara
AU - Mehta, Vihang
AU - Dawson, James M
AU - Kruk, Sandor
AU - Simmons, Brooke
PY - 2024/12/31
Y1 - 2024/12/31
N2 - Giant star-forming clumps (GSFCs) are areas of intensive star-formation that are commonly observed in high-redshift (z ≳ 1) galaxies but their formation and role in galaxy evolution remain unclear. Observations of low-redshift clumpy galaxy analogues are rare but the availability of wide-field galaxy survey data makes the detection of large clumpy galaxy samples much more feasible. Deep Learning (DL), and in particular Convolutional Neural Networks (CNNs), have been successfully applied to image classification tasks in astrophysical data analysis. However, one application of DL that remains relatively unexplored is that of automatically identifying and localizing specific objects or features in astrophysical imaging data. In this paper, we demonstrate the use of DL-based object detection models to localize GSFCs in astrophysical imaging data. We apply the Faster Region-based Convolutional Neural Network object detection framework (FRCNN) to identify GSFCs in low-redshift (z ≲ 0.3) galaxies. Unlike other studies, we train different FRCNN models on observational data that was collected by the Sloan Digital Sky Survey and labelled by volunteers from the citizen science project ‘Galaxy Zoo: Clump Scout’. The FRCNN model relies on a CNN component as a ‘backbone’ feature extractor. We show that CNNs, that have been pre-trained for image classification using astrophysical images, outperform those that have been pre-trained on terrestrial images. In particular, we compare a domain-specific CNN – ‘Zoobot’ – with a generic classification backbone and find that Zoobot achieves higher detection performance. Our final model is capable of producing GSFC detections with a completeness and purity of ≥0.8 while only being trained on ∼5000 galaxy images.
AB - Giant star-forming clumps (GSFCs) are areas of intensive star-formation that are commonly observed in high-redshift (z ≳ 1) galaxies but their formation and role in galaxy evolution remain unclear. Observations of low-redshift clumpy galaxy analogues are rare but the availability of wide-field galaxy survey data makes the detection of large clumpy galaxy samples much more feasible. Deep Learning (DL), and in particular Convolutional Neural Networks (CNNs), have been successfully applied to image classification tasks in astrophysical data analysis. However, one application of DL that remains relatively unexplored is that of automatically identifying and localizing specific objects or features in astrophysical imaging data. In this paper, we demonstrate the use of DL-based object detection models to localize GSFCs in astrophysical imaging data. We apply the Faster Region-based Convolutional Neural Network object detection framework (FRCNN) to identify GSFCs in low-redshift (z ≲ 0.3) galaxies. Unlike other studies, we train different FRCNN models on observational data that was collected by the Sloan Digital Sky Survey and labelled by volunteers from the citizen science project ‘Galaxy Zoo: Clump Scout’. The FRCNN model relies on a CNN component as a ‘backbone’ feature extractor. We show that CNNs, that have been pre-trained for image classification using astrophysical images, outperform those that have been pre-trained on terrestrial images. In particular, we compare a domain-specific CNN – ‘Zoobot’ – with a generic classification backbone and find that Zoobot achieves higher detection performance. Our final model is capable of producing GSFC detections with a completeness and purity of ≥0.8 while only being trained on ∼5000 galaxy images.
KW - Object Detection
KW - Deep Learning
KW - Machine Learning
KW - Transfer Learning
KW - Galaxies: Structure
KW - Data Methods
U2 - 10.1093/rasti/rzae013
DO - 10.1093/rasti/rzae013
M3 - Journal article
VL - 3
SP - 174
EP - 197
JO - RAS Techniques and Instruments
JF - RAS Techniques and Instruments
SN - 2752-8200
IS - 1
ER -