Electronic data

  • s10579-021-09574-0

    Final published version, 2.14 MB, PDF document

    Available under license: CC BY: Creative Commons Attribution 4.0 International License

Links

Text available via DOI: https://doi.org/10.1007/s10579-021-09574-0

The ParlaMint corpora of parliamentary proceedings

Research output: Contribution to Journal/Magazine > Journal article > peer-review

Published

Standard

The ParlaMint corpora of parliamentary proceedings. / Erjavec, Tomaž; Ogrodniczuk, Maciej; Osenova, Petya et al.
In: Language Resources and Evaluation, Vol. 57, No. 1, 31.03.2023, p. 415-448.

Harvard

Erjavec, T, Ogrodniczuk, M, Osenova, P, Ljubešić, N, Simov, K, Pančur, A, Rudolf, M, Kopp, M, Barkarson, S, Steingrímsson, S, Çöltekin, Ç, de Does, J, Depuydt, K, Agnoloni, T, Venturi, G, Pérez, MC, de Macedo, LD, Navarretta, C, Luxardo, G, Coole, M, Rayson, P, Morkevičius, V, Krilavičius, T, Darǵis, R, Ring, O, van Heusden, R, Marx, M & Fišer, D 2023, 'The ParlaMint corpora of parliamentary proceedings', Language Resources and Evaluation, vol. 57, no. 1, pp. 415-448. https://doi.org/10.1007/s10579-021-09574-0

APA

Erjavec, T., Ogrodniczuk, M., Osenova, P., Ljubešić, N., Simov, K., Pančur, A., Rudolf, M., Kopp, M., Barkarson, S., Steingrímsson, S., Çöltekin, Ç., de Does, J., Depuydt, K., Agnoloni, T., Venturi, G., Pérez, M. C., de Macedo, L. D., Navarretta, C., Luxardo, G., ... Fišer, D. (2023). The ParlaMint corpora of parliamentary proceedings. Language Resources and Evaluation, 57(1), 415-448. https://doi.org/10.1007/s10579-021-09574-0

Vancouver

Erjavec T, Ogrodniczuk M, Osenova P, Ljubešić N, Simov K, Pančur A et al. The ParlaMint corpora of parliamentary proceedings. Language Resources and Evaluation. 2023 Mar 31;57(1):415-448. Epub 2022 Feb 2. doi: 10.1007/s10579-021-09574-0

Author

Erjavec, Tomaž; Ogrodniczuk, Maciej; Osenova, Petya et al. / The ParlaMint corpora of parliamentary proceedings. In: Language Resources and Evaluation. 2023; Vol. 57, No. 1. pp. 415-448.

Bibtex

@article{3dd8417c716741e9a048454a56391640,
title = "The ParlaMint corpora of parliamentary proceedings",
abstract = "This paper presents the ParlaMint corpora containing transcriptions of the sessions of the 17 European national parliaments with half a billion words. The corpora are uniformly encoded, contain rich meta-data about 11 thousand speakers, and are linguistically annotated following the Universal Dependencies formalism and with named entities. Samples of the corpora and conversion scripts are available from the project{\textquoteright}s GitHub repository, and the complete corpora are openly available via the CLARIN.SI repository for download, as well as through the NoSketch Engine and KonText concordancers and the Parlameter interface for on-line exploration and analysis.",
keywords = "Parliamentary proceedings, Comparable corpora, TEI",
author = "Toma{\v z} Erjavec and Maciej Ogrodniczuk and Petya Osenova and Nikola Ljube{\v s}i{\'c} and Kiril Simov and Andrej Pan{\v c}ur and Micha{\l} Rudolf and Maty{\'a}{\v s} Kopp and Starka{\dh}ur Barkarson and Stein{\th}{\'o}r Steingr{\'i}msson and {\c C}a{\u g}r{\i} {\c C}{\"o}ltekin and {de Does}, Jesse and Katrien Depuydt and Tommaso Agnoloni and Giulia Venturi and P{\'e}rez, {Mar{\'i}a Calzada} and {de Macedo}, {Luciana D.} and Costanza Navarretta and Giancarlo Luxardo and Matthew Coole and Paul Rayson and Vaidas Morkevi{\v c}ius and Tomas Krilavi{\v c}ius and Roberts Dar{\'g}is and Orsolya Ring and {van Heusden}, Ruben and Maarten Marx and Darja Fi{\v s}er",
year = "2023",
month = mar,
day = "31",
doi = "10.1007/s10579-021-09574-0",
language = "English",
volume = "57",
pages = "415--448",
journal = "Language Resources and Evaluation",
issn = "1574-0218",
publisher = "Springer Netherlands",
number = "1",

}

RIS

TY - JOUR

T1 - The ParlaMint corpora of parliamentary proceedings

AU - Erjavec, Tomaž

AU - Ogrodniczuk, Maciej

AU - Osenova, Petya

AU - Ljubešić, Nikola

AU - Simov, Kiril

AU - Pančur, Andrej

AU - Rudolf, Michał

AU - Kopp, Matyáš

AU - Barkarson, Starkaður

AU - Steingrímsson, Steinþór

AU - Çöltekin, Çağrı

AU - de Does, Jesse

AU - Depuydt, Katrien

AU - Agnoloni, Tommaso

AU - Venturi, Giulia

AU - Pérez, María Calzada

AU - de Macedo, Luciana D.

AU - Navarretta, Costanza

AU - Luxardo, Giancarlo

AU - Coole, Matthew

AU - Rayson, Paul

AU - Morkevičius, Vaidas

AU - Krilavičius, Tomas

AU - Darǵis, Roberts

AU - Ring, Orsolya

AU - van Heusden, Ruben

AU - Marx, Maarten

AU - Fišer, Darja

PY - 2023/3/31

Y1 - 2023/3/31

N2 - This paper presents the ParlaMint corpora containing transcriptions of the sessions of the 17 European national parliaments with half a billion words. The corpora are uniformly encoded, contain rich meta-data about 11 thousand speakers, and are linguistically annotated following the Universal Dependencies formalism and with named entities. Samples of the corpora and conversion scripts are available from the project’s GitHub repository, and the complete corpora are openly available via the CLARIN.SI repository for download, as well as through the NoSketch Engine and KonText concordancers and the Parlameter interface for on-line exploration and analysis.

AB - This paper presents the ParlaMint corpora containing transcriptions of the sessions of the 17 European national parliaments with half a billion words. The corpora are uniformly encoded, contain rich meta-data about 11 thousand speakers, and are linguistically annotated following the Universal Dependencies formalism and with named entities. Samples of the corpora and conversion scripts are available from the project’s GitHub repository, and the complete corpora are openly available via the CLARIN.SI repository for download, as well as through the NoSketch Engine and KonText concordancers and the Parlameter interface for on-line exploration and analysis.

KW - Parliamentary proceedings

KW - Comparable corpora

KW - TEI

U2 - 10.1007/s10579-021-09574-0

DO - 10.1007/s10579-021-09574-0

M3 - Journal article

C2 - 35125984

VL - 57

SP - 415

EP - 448

JO - Language Resources and Evaluation

JF - Language Resources and Evaluation

SN - 1574-0218

IS - 1

ER -