Atmospheric science is the study of a large, complex system which is becoming increasingly
important to understand. There are many climate models which aim to contribute to that
understanding by computational simulation of the atmosphere. Generating these models,
and confirming the accuracy of their outputs, requires the collection of large amounts of data.
These data are typically gathered during campaigns lasting a few weeks, during which various
sources of measurements are used. Some are ground-based, others are airborne sondes, but one
of the primary sources is the measurement instruments on board aircraft. Flight planning
for the numerous sorties is based on pre-determined goals, subject to unpredictable influences
such as weather patterns, and on the results of limited analyses of data from previous
sorties. There is little scope for adjusting flight parameters during a sortie in response to the
data received, owing to the large volumes of data and the difficulty of processing them online.
The introduction of unmanned aircraft with extended flight durations also requires a team
of mission scientists, with the added complication of disseminating observations between
shifts.
Earth’s atmosphere is a non-linear system, whereas the data gathered are sampled at
discrete temporal and spatial intervals, introducing a source of variance. Clustering data
provides a convenient way of grouping similar data while also acknowledging that, for each
discrete sample, a minor shift in time and/or space could produce a range of values which
lie within its cluster region. This thesis puts forward a set of requirements to enable the
presentation of cluster analyses to the mission scientist in a convenient and functional manner.
This will enable in-flight decision making as well as rapid feedback for future flight planning.
Current state-of-the-art clustering algorithms are analysed, and none is found to satisfy all
of the proposed requirements. New clustering algorithms are therefore developed to achieve these
goals. These novel clustering algorithms are brought together, along with other visualization
techniques, into a software package which is used to demonstrate how the analyses can
provide information to mission scientists in flight. The ability to carry out offline analyses on
historical data, whether to reproduce the online analyses of the current sortie, or to provide
comparative analyses from previous missions, is also demonstrated. Methods for offline
analyses of historical data prior to continuing the analyses in an online manner are also
considered.
The original contributions of this thesis are the development of five new clustering
algorithms which address key challenges: speed and accuracy for typical hyper-elliptical
offline clustering; speed and accuracy for offline clustering of arbitrarily shaped clusters;
online dynamic and evolving clustering for arbitrarily shaped clusters; transitions between
offline and online techniques; and the application of these techniques to atmospheric science
data analysis.