Corpus Framework Analysis: Integrating computational linguistics, corpus linguistics, and clinical psychology to analyse online posts on personal recovery in bipolar disorder

Activity: Talk or presentation types: Oral presentation


The concept of personal recovery, ‘a way of living a satisfying, hopeful life even with the limitations caused by the illness’ (Anthony, 1993), is of particular value in bipolar disorder, where symptoms often persist despite adequate treatment, but the topic has been under-researched. A recent systematic review defined the first conceptual framework for personal recovery in bipolar disorder, POETIC (Purpose & meaning, Optimism & hope, Empowerment, Tensions, Identity, Connectedness) (Jagfeld, Lobban, Marshall, et al., 2021). So far, personal recovery has only been studied in researcher-constructed environments (interviews, focus groups). Peer online support forum posts can serve as a complementary source of non-reactive data to study health beliefs and experiences.
By integrating corpus and computational linguistics and health research methods, this study analysed a corpus of public bipolar support forum posts from the discussion platform Reddit in relation to the lived experience of personal recovery. As people talk about a wide variety of topics on Reddit, selecting what is relevant presents a challenge in working with non-reactive data and led to our innovative corpus construction process. Starting from a 1B word dataset of Reddit posts by people with a self-reported bipolar disorder diagnosis (Jagfeld, Lobban, Rayson, et al., 2021), a series of automatic filtering steps involving computational linguistic methods and manual coding resulted in the 1.3M word Personal recovery in bipolar disorder (PR-BD) corpus of personal recovery-relevant posts.
To analyse the PR-BD corpus, 130 key lemmas, identified by comparing it with a 5M word reference corpus of posts not relevant to personal recovery, were coded into the POETIC framework via concordance analysis using #LancsBox 6.0. This constitutes a novel integration of corpus and computational linguistics and deductive framework analysis, which we have named corpus framework analysis (CFA). CFA results show that three POETIC domains featured most in discussions on Reddit: Purpose & meaning (particularly reproductive decision-making and work), Connectedness (romantic relationships and social support), and Empowerment (self-management and personal responsibility). Overall, CFA confirmed that the POETIC framework also usefully captured personal experiences shared online. Moreover, it successfully highlighted which issues people focused on more in online support forums than in existing evidence from interviews and focus groups.
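As a rough illustration of how key lemmas can be identified by comparing a target corpus with a reference corpus, the sketch below ranks lemmas by Dunning's log-likelihood. The abstract does not say which keyness statistic or cut-offs were used, so the choice of log-likelihood, the function names (`log_likelihood`, `key_lemmas`), and the toy data are assumptions made purely for illustration and do not reproduce the study's #LancsBox procedure.

```python
import math
from collections import Counter

def log_likelihood(a, b, c, d):
    """Log-likelihood (G2) keyness score.

    a: lemma frequency in the target corpus, c: target corpus size
    b: lemma frequency in the reference corpus, d: reference corpus size
    """
    # Expected frequencies if the lemma were equally common in both corpora
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll

def key_lemmas(target, reference, top_n=10):
    """Return lemmas overused in the target corpus, ranked by keyness.

    target, reference: lists of lemmas (assumed to be already lemmatised).
    """
    t, r = Counter(target), Counter(reference)
    nt, nr = sum(t.values()), sum(r.values())
    scored = []
    for lemma, a in t.items():
        b = r.get(lemma, 0)
        # Keep only lemmas that are relatively more frequent in the target
        if a / nt > b / nr:
            scored.append((lemma, log_likelihood(a, b, nt, nr)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]

# Toy example with hypothetical data, not drawn from the PR-BD corpus
toy_target = ["hope", "hope", "work", "the"]
toy_reference = ["the", "the", "the", "work"]
toy_result = key_lemmas(toy_target, toy_reference)
```

In a workflow of this kind, the top-ranked lemmas would then be inspected in their concordance lines and assigned manually to framework domains; the scoring step only proposes candidates for that qualitative coding.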
This study is the first to analyse non-reactive data on personal recovery in bipolar disorder. By indicating the key areas that people focus on in personal recovery when posting freely, and the language they use, it provides helpful starting points for therapists to collaboratively consider these issues with service users in clinical settings, such as recovery-oriented cognitive behavioural therapy. CFA is a promising new method that combines corpus linguistics and qualitative framework analysis and may well be useful in addressing other health and sociological research questions.

Anthony, W. A. (1993). Recovery from mental illness: the guiding vision of the mental health system in the 1990s. Psychosocial Rehabilitation Journal, 16(4), 11–23.
Jagfeld, G., Lobban, F., Marshall, P., & Jones, S. H. (2021). Personal recovery in bipolar disorder: Systematic review and “best fit” framework synthesis of qualitative evidence – a POETIC adaptation of CHIME. Journal of Affective Disorders, 292, 375–385. https://doi.org/10.1016/j.jad.2021.05.051
Jagfeld, G., Lobban, F., Rayson, P., & Jones, S. H. (2021). Understanding who uses Reddit: Profiling individuals with a self-reported bipolar disorder diagnosis. Proceedings of the Seventh Workshop on Computational Linguistics and Clinical Psychology: Improving Access at NAACL 2021.

Event (Conference)

Title: CADS Conf 2022 - 6th Corpora & Discourse International Conference
Degree of recognition: International event