Constructing Corpora from Images and Text: An introduction to Visual Constituent Analysis

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN: Chapter

Publication date: 30/11/2020
Host publication: Corpus Approaches to Social Media
Editors: Sofia Rüdiger, Daria Dayter
Publisher: John Benjamins
Number of pages: 26
ISBN (electronic): 9789027260499
ISBN (print): 9789027207944
Original language: English

Publication series

Name: Studies in Corpus Linguistics
ISSN (electronic): 1388-0373


Visual analysis represents a significant oversight in the corpus literature, and possibly one that may lead to unintended omissions, particularly when analysing social media. In this chapter we introduce Visual Constituent Analysis (VCA), a method of multimodal corpus construction that allows researchers to construct and analyse visual aspects of online media in large-scale corpora. The chapter addresses the shortcomings of a purely textual approach to discourse analysis when dealing with social media texts and offers a solution using computer ‘Vision’-based image annotation (in our case Google Cloud Vision). Finally, we demonstrate how our approach can be used to analyse a sample of 150,000 micro-blog posts from Twitter and show the difference in level of user interaction with combined image/texts over language-only social media texts.