lexiDB - Research Portal | Lancaster University

Electronic data

lexidb-scalable-corpus
Rights statement: ©2016 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Accepted author manuscript, 147 KB, PDF document
Available under license: CC BY-NC: Creative Commons Attribution-NonCommercial 4.0 International License

Text available via DOI:

https://doi.org/10.1109/BigData.2016.7841062
Final published version

lexiDB: a scalable corpus database management system

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Published

Publication date	5 December 2016
Host publication	2016 IEEE International Conference on Big Data (Big Data)
Publisher	IEEE
Pages	3880-3884
Number of pages	5
ISBN (print)	9781467390064
Original language	English

Abstract

lexiDB is a scalable corpus database management system designed to fulfill corpus linguistics retrieval queries on multi-billion-word, multiply-annotated corpora. It is based on a distributed architecture that allows the system to scale out to support ever larger text collections. This paper presents an overview of the architecture behind lexiDB as well as a demonstration of its functionality. We present lexiDB's performance metrics on AWS (Amazon Web Services) infrastructure with two billion-word corpora tagged for part of speech and semantics: Historical Hansard and EEBO (Early English Books Online).
