Statistical extreme value models allow estimation of the frequency, magnitude and spatio-temporal extent of extreme temperature events in the presence of climate change. Unfortunately, the assumptions of many standard methods are not valid for complex environmental data sets, and a realistic statistical model requires scientific context to be incorporated appropriately. We examine two case studies in which the application of routine extreme value methods results in inappropriate models and inaccurate predictions. In the first scenario, record-breaking temperatures experienced in the US in the summer of 2021 are found to exceed the maximum feasible temperature predicted from a standard extreme value analysis of pre-2021 data. Incorporating random effects into the standard methods accounts for additional variability in the model parameters, reflecting shifts in unobserved climatic drivers and permitting greater accuracy in return period prediction. The second scenario examines ice surface temperatures in Greenland. The temperature distribution is found to have a poorly defined upper tail, with a spike in observations just below 0°C and an unexpectedly large number of measurements above this value. A Gaussian mixture model fitted to the full range of measurements improves both fit and predictive ability in the upper tail when compared with traditional extreme value methods.
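To illustrate what a "maximum feasible temperature" means in a standard analysis, the following minimal sketch assumes the usual generalised extreme value (GEV) parameterisation with location mu, scale sigma and shape xi. When xi is negative the fitted distribution has a finite upper endpoint, mu - sigma/xi, and no observation above it is possible under the model. The parameter values and function names are hypothetical, chosen only for illustration, and are not estimates from either case study.

```python
import math


def gev_return_level(mu, sigma, xi, T):
    """Level exceeded on average once every T blocks (e.g. years) under a GEV."""
    y = -math.log(1.0 - 1.0 / T)
    if abs(xi) < 1e-12:
        # Gumbel limit (xi -> 0)
        return mu - sigma * math.log(y)
    return mu + (sigma / xi) * (y ** (-xi) - 1.0)


def gev_upper_endpoint(mu, sigma, xi):
    """Finite upper bound of the GEV when xi < 0, otherwise unbounded."""
    if xi < 0:
        return mu - sigma / xi
    return math.inf


# Hypothetical parameters for annual maximum temperature (degrees C).
mu, sigma, xi = 35.0, 2.0, -0.25

endpoint = gev_upper_endpoint(mu, sigma, xi)
levels = {T: gev_return_level(mu, sigma, xi, T) for T in (10, 100, 1000, 10**6)}

print(f"upper endpoint: {endpoint:.2f}")
for T, z in levels.items():
    print(f"{T:>8}-year return level: {z:.2f}")
```

Under these assumed values the return levels rise towards the endpoint but never reach it, so any later observation above the endpoint has probability zero under the fitted model. This is the kind of conflict described for the 2021 temperatures, and it is one motivation for letting the parameters vary, for example through random effects.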