


What does the Strange Stories test measure?: Developmental and within-test variation

Research output: Contribution to Journal/Magazine, Journal article, peer-reviewed

Article number: 101289
Journal publication date: 31/01/2023
Journal: Cognitive Development
Number of pages: 12
Publication status: Published
Early online date: 10/01/2023
Original language: English


Happé’s (1994) Strange Stories have been widely used to assess advanced theory of mind understanding in several clinical populations, but recent analyses have cast doubt on the links between the test and other related measures of this skill.

This study tested 210 Pakistani and 46 British children to assess the developmental trajectory of test performance across a 6-year age span, and to explore differences between and within the four most commonly used sub-tests (Misunderstanding, Persuasion, White Lies and Double Bluff).

There were significant developmental differences in children’s overall understanding of the Stories, as well as differences between not only the four sub-tests but also individual questions purporting to assess the same construct. Partial correlations controlling for age (in months) and SES were inconsistent between stories assessing the same construct (e.g. the Double Bluff stories). Factor analysis revealed two factors, and for two sub-tests (Double Bluff and Misunderstanding) each story loaded onto a separate factor, contradicting the assumption that the Strange Stories assess the same underlying ability. Moreover, GLMM analyses showed that the model with two main effects (age and SES) fitted best, with age emerging as a major predictor. Post hoc analyses showed that performance on White Lies (used as a baseline) was higher than on Persuasion and Double Bluff. Similar, but not identical, patterns were found in a comparison between six- and eight-year-olds in the two cultures, with children in the UK outperforming those in Pakistan.

The results suggest that the test is less homogeneous than has been assumed. Relationships with other measures and diagnoses might only apply to subsets of the questions. The need for standardization is clear.