Accepted author manuscript, 32.2 MB, PDF document
Available under license: CC BY-ND: Creative Commons Attribution-NoDerivatives 4.0 International License
Final published version
Licence: CC BY-ND: Creative Commons Attribution-NoDerivatives 4.0 International License
Research output: Contribution to Journal/Magazine › Journal article › peer-review
}
TY - JOUR
T1 - Developing geographically oriented NLP approaches to sixteenth–century historical documents
T2 - digging into early colonial Mexico
AU - Jiménez Badillo, Diego
AU - Murrieta-Flores, Patricia
AU - Martins, Bruno
AU - Gregory, Ian
AU - Favila-Vázquez, Mariana
AU - Liceras-Garrido, Raquel
PY - 2020/12/31
Y1 - 2020/12/31
N2 - This article introduces an ongoing Digital Humanities project aimed at leveraging the benefits of Natural Language Processing, Corpus Linguistics, Machine Learning, and Spatial Analysis for advancing the computational analysis of vast historical corpora. As a case study, the project concentrates on the Relaciones Geográficas de la Nueva España (1577–1585), one of the key corpora for understanding the early colonial period of Mexico. Using a computer–assisted methodology called Geographical Text Analysis (GTA), the project offers automatic means for parsing historical texts and the markup of words referring both to place names (toponyms) and analytical concepts that are then linked to their geographic locations. Adding geospatial intelligence to the parsing of texts allows exploring hidden geographies and narratives in the historic corpus. The article provides a general overview of the corpus, describes the GTA methodology step by step, and reports on the progress achieved so far.
AB - This article introduces an ongoing Digital Humanities project aimed at leveraging the benefits of Natural Language Processing, Corpus Linguistics, Machine Learning, and Spatial Analysis for advancing the computational analysis of vast historical corpora. As a case study, the project concentrates on the Relaciones Geográficas de la Nueva España (1577–1585), one of the key corpora for understanding the early colonial period of Mexico. Using a computer–assisted methodology called Geographical Text Analysis (GTA), the project offers automatic means for parsing historical texts and the markup of words referring both to place names (toponyms) and analytical concepts that are then linked to their geographic locations. Adding geospatial intelligence to the parsing of texts allows exploring hidden geographies and narratives in the historic corpus. The article provides a general overview of the corpus, describes the GTA methodology step by step, and reports on the progress achieved so far.
KW - Digital Humanities
KW - Natural Language Processing
KW - Machine Learning
KW - Sixteenth-century
KW - Text Analysis
KW - Mexico
KW - Spatial Humanities
KW - Gazetteer
M3 - Journal article
VL - 14
JO - Digital Humanities Quarterly
JF - Digital Humanities Quarterly
SN - 1938-4122
IS - 4
ER -