Compiling Comparable Multimodal Corpora of Tourism Discourse

This paper describes work in progress on the design of two comparable multimodal corpora of written tourism discourse about London and Moscow. For the purposes of the current project, multimodality is defined as a combination of several discourse modes, including the verbal and the visual. The paper aims to make a methodological contribution by providing a detailed description of the process and challenges of compiling multimodal corpora. Building the corpora is an essential precondition for a multimodal corpus approach, which makes it possible to analyse a range of texts, to consider not only language but also images and layout, to search the data for patterns, to identify the multimodal features of each set of texts and to compare these features across the two corpora. After introducing the project and its research questions, the paper outlines the principles of data selection. It then describes the planned structure of the corpora and the data sources. The paper goes on to describe the pilot corpora that have been constructed, along with some technical aspects of corpus building, the problems that arose and possible solutions. To conclude, I highlight the limitations of the article and its implications.