Research output: Thesis › Doctoral Thesis
}
TY - BOOK
T1 - Synthetic Data for Machine Learning
AU - Kerim, Abdulrahman
PY - 2024
Y1 - 2024
N2 - Supervised machine learning methods require large-scale training datasets to converge. Collecting and annotating training data is expensive, time-consuming, error-prone, and not always practical. Usually, synthetic data is used as a feasible data source to increase the amount of training data. However, directly using synthetic data may harm the model’s performance or may not be as effective as it could be. This thesis addresses the challenges of generating large-scale synthetic data, improving domain adaptation in semantic segmentation, advancing video stabilization in adverse conditions, and conducting a rigorous assessment of synthetic data usability in classification tasks. By contributing novel solutions to these multifaceted problems, this work bolsters the field of computer vision, offering strong foundations for a broad range of applications that use synthetic data for computer vision tasks. In this thesis, we divide the study into three main problems: (i) Tackle the problem of generating diverse and photorealistic synthetic data; (ii) Explore synthetic-aware computer vision solutions for semantic segmentation and video stabilization; (iii) Assess the usability of synthetically generated data for different computer vision tasks. We developed a new synthetic data generator called Silver. Photorealism, diversity, scalability, and full 3D virtual world generation at run-time are the key aspects of this generator. Photorealism was approached by utilizing the state-of-the-art High Definition Render Pipeline (HDRP) of the Unity game engine. In parallel, the Procedural Content Generation (PCG) concept was employed to create a full 3D virtual world at run-time, while the scalability (expansion and adaptability) of the system was attained through the modular approach we followed as we built the system from scratch. Silver can be used to provide clean, unbiased, and large-scale training and testing data for various computer vision tasks. Regarding synthetic-aware computer vision models, we developed a novel architecture specifically designed to use synthetic training data for semantic segmentation domain adaptation. We propose a simple yet powerful addition to DeepLabV3+ by using weather and time-of-day supervisors trained with multitask learning, making it both weather- and nighttime-aware, which improves its mIoU accuracy under adverse conditions while maintaining adequate performance under standard conditions. Similarly, we also proposed a synthetic-aware adverse weather video stabilization algorithm that dispenses with real data for training, relying solely on synthetic data. Our approach leverages specially generated synthetic data to avoid the feature extraction issues faced by current methods. To achieve this, we leveraged our novel data generator to produce the required training data with an automatic ground-truth extraction procedure. We also propose a new dataset called VSAC105Real and compare our method to five recent video stabilization algorithms using two benchmarks. Our method generalizes well on real-world videos across all weather conditions and does not require large-scale synthetic training data. Finally, we assess the usability of the generated synthetic data. We propose a novel usability metric that disentangles photorealism from diversity. This new metric is a simple yet effective way to rank synthetic images. The quantitative results show that we can achieve similar or better results by training on 50% less synthetic data. Additionally, we qualitatively assess the impact of photorealism and evaluate many architectures on different datasets for that aim.
AB - Supervised machine learning methods require large-scale training datasets to converge. Collecting and annotating training data is expensive, time-consuming, error-prone, and not always practical. Usually, synthetic data is used as a feasible data source to increase the amount of training data. However, directly using synthetic data may harm the model’s performance or may not be as effective as it could be. This thesis addresses the challenges of generating large-scale synthetic data, improving domain adaptation in semantic segmentation, advancing video stabilization in adverse conditions, and conducting a rigorous assessment of synthetic data usability in classification tasks. By contributing novel solutions to these multifaceted problems, this work bolsters the field of computer vision, offering strong foundations for a broad range of applications that use synthetic data for computer vision tasks. In this thesis, we divide the study into three main problems: (i) Tackle the problem of generating diverse and photorealistic synthetic data; (ii) Explore synthetic-aware computer vision solutions for semantic segmentation and video stabilization; (iii) Assess the usability of synthetically generated data for different computer vision tasks. We developed a new synthetic data generator called Silver. Photorealism, diversity, scalability, and full 3D virtual world generation at run-time are the key aspects of this generator. Photorealism was approached by utilizing the state-of-the-art High Definition Render Pipeline (HDRP) of the Unity game engine. In parallel, the Procedural Content Generation (PCG) concept was employed to create a full 3D virtual world at run-time, while the scalability (expansion and adaptability) of the system was attained through the modular approach we followed as we built the system from scratch. Silver can be used to provide clean, unbiased, and large-scale training and testing data for various computer vision tasks. Regarding synthetic-aware computer vision models, we developed a novel architecture specifically designed to use synthetic training data for semantic segmentation domain adaptation. We propose a simple yet powerful addition to DeepLabV3+ by using weather and time-of-day supervisors trained with multitask learning, making it both weather- and nighttime-aware, which improves its mIoU accuracy under adverse conditions while maintaining adequate performance under standard conditions. Similarly, we also proposed a synthetic-aware adverse weather video stabilization algorithm that dispenses with real data for training, relying solely on synthetic data. Our approach leverages specially generated synthetic data to avoid the feature extraction issues faced by current methods. To achieve this, we leveraged our novel data generator to produce the required training data with an automatic ground-truth extraction procedure. We also propose a new dataset called VSAC105Real and compare our method to five recent video stabilization algorithms using two benchmarks. Our method generalizes well on real-world videos across all weather conditions and does not require large-scale synthetic training data. Finally, we assess the usability of the generated synthetic data. We propose a novel usability metric that disentangles photorealism from diversity. This new metric is a simple yet effective way to rank synthetic images. The quantitative results show that we can achieve similar or better results by training on 50% less synthetic data. Additionally, we qualitatively assess the impact of photorealism and evaluate many architectures on different datasets for that aim.
KW - synthetic data
KW - computer vision
KW - semantic segmentation
KW - video stabilization
U2 - 10.17635/lancaster/thesis/2432
DO - 10.17635/lancaster/thesis/2432
M3 - Doctoral Thesis
PB - Lancaster University
ER -