Dealing with heterogeneous big data when geoparsing historical corpora

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN | Conference contribution/Paper | peer-review

Published

Standard

Dealing with heterogeneous big data when geoparsing historical corpora. / Rupp, C. J.; Rayson, Paul; Gregory, Ian et al.
Proceedings of the 2014 IEEE International Conference on Big Data. IEEE, 2014. p. 80-83.

Vancouver

Rupp CJ, Rayson P, Gregory I, Hardie A, Joulain A, Hartmann D. Dealing with heterogeneous big data when geoparsing historical corpora. In Proceedings of the 2014 IEEE International Conference on Big Data. IEEE. 2014. p. 80-83. doi: 10.1109/BigData.2014.7004457

Author

Rupp, C. J. ; Rayson, Paul ; Gregory, Ian et al. / Dealing with heterogeneous big data when geoparsing historical corpora. Proceedings of the 2014 IEEE International Conference on Big Data. IEEE, 2014. pp. 80-83

Bibtex

@inproceedings{b444226ff4664994b7de22eef585e333,
title = "Dealing with heterogeneous big data when geoparsing historical corpora",
abstract = "It has long been known that {\textquoteleft}variety{\textquoteright} is one of the key challenges and opportunities of big data. This is especially true when we consider the variety of content in historical corpora resulting from large-scale digitisation activities. Collections such as Early English Books Online (EEBO) and the British Library 19th Century Newspapers are extremely large and heterogeneous data sources containing a variety of content in terms of time, location, topic, style and quality. The range of geographical locations referenced in these corpora poses a difficult challenge for state of the art geoparsing tools. In the context of our work on Spatial Humanities analyses, we present our solution for dealing with the variety and scale of these corpora.",
author = "Rupp, {C. J.} and Paul Rayson and Ian Gregory and Andrew Hardie and Amelia Joulain and Daniel Hartmann",
year = "2014",
doi = "10.1109/BigData.2014.7004457",
language = "English",
isbn = "9781479956654",
pages = "80--83",
booktitle = "Proceedings of the 2014 IEEE International Conference on Big Data",
publisher = "IEEE",
}

RIS

TY - GEN
T1 - Dealing with heterogeneous big data when geoparsing historical corpora
AU - Rupp, C. J.
AU - Rayson, Paul
AU - Gregory, Ian
AU - Hardie, Andrew
AU - Joulain, Amelia
AU - Hartmann, Daniel
PY - 2014
Y1 - 2014

AB - It has long been known that ‘variety’ is one of the key challenges and opportunities of big data. This is especially true when we consider the variety of content in historical corpora resulting from large-scale digitisation activities. Collections such as Early English Books Online (EEBO) and the British Library 19th Century Newspapers are extremely large and heterogeneous data sources containing a variety of content in terms of time, location, topic, style and quality. The range of geographical locations referenced in these corpora poses a difficult challenge for state of the art geoparsing tools. In the context of our work on Spatial Humanities analyses, we present our solution for dealing with the variety and scale of these corpora.

U2 - 10.1109/BigData.2014.7004457
DO - 10.1109/BigData.2014.7004457
M3 - Conference contribution/Paper
SN - 9781479956654
SP - 80
EP - 83
BT - Proceedings of the 2014 IEEE International Conference on Big Data
PB - IEEE
ER -