Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review
}
TY - GEN
T1 - Dealing with heterogeneous big data when geoparsing historical corpora
AU - Rupp, C. J.
AU - Rayson, Paul
AU - Gregory, Ian
AU - Hardie, Andrew
AU - Joulain, Amelia
AU - Hartmann, Daniel
PY - 2014
Y1 - 2014
N2 - It has long been known that ‘variety’ is one of the key challenges and opportunities of big data. This is especially true when we consider the variety of content in historical corpora resulting from large-scale digitisation activities. Collections such as Early English Books Online (EEBO) and the British Library 19th Century Newspapers are extremely large and heterogeneous data sources containing a variety of content in terms of time, location, topic, style and quality. The range of geographical locations referenced in these corpora poses a difficult challenge for state-of-the-art geoparsing tools. In the context of our work on Spatial Humanities analyses, we present our solution for dealing with the variety and scale of these corpora.
AB - It has long been known that ‘variety’ is one of the key challenges and opportunities of big data. This is especially true when we consider the variety of content in historical corpora resulting from large-scale digitisation activities. Collections such as Early English Books Online (EEBO) and the British Library 19th Century Newspapers are extremely large and heterogeneous data sources containing a variety of content in terms of time, location, topic, style and quality. The range of geographical locations referenced in these corpora poses a difficult challenge for state-of-the-art geoparsing tools. In the context of our work on Spatial Humanities analyses, we present our solution for dealing with the variety and scale of these corpora.
U2 - 10.1109/BigData.2014.7004457
DO - 10.1109/BigData.2014.7004457
M3 - Conference contribution/Paper
SN - 9781479956654
SP - 80
EP - 83
BT - Proceedings of the 2014 IEEE International Conference on Big Data
PB - IEEE
ER -