
Electronic data


    Rights statement: The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-50496-4_21

    Accepted author manuscript, 640 KB, PDF document

    Available under license: CC BY-NC: Creative Commons Attribution-NonCommercial 4.0 International License


Improving first order temporal fact extraction with unreliable data

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Published
Publication date: 2/12/2016
Host publication: Natural Language Understanding and Intelligent Applications: 5th CCF Conference on Natural Language Processing and Chinese Computing, NLPCC 2016, and 24th International Conference on Computer Processing of Oriental Languages, ICCPOL 2016, Kunming, China, December 2–6, 2016, Proceedings
Editors: Chin-Yew Lin, Nianwen Xue, Dongyan Zhao, Xuanjing Huang, Yansong Feng
Place of publication: Cham
Publisher: Springer
Pages: 251-262
Number of pages: 12
ISBN (electronic): 9783319504964
ISBN (print): 9783319504957
Original language: English

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer
Volume: 10102
ISSN (print): 0302-9743

Abstract

In this paper, we deal with the task of extracting first order temporal facts from free text. This task is a subtask of relation extraction, and it aims at extracting relations between entities and time expressions. Currently, the field of relation extraction focuses mainly on extracting relations between entities. However, we observe that the multi-granular nature of time expressions lets us divide a dataset constructed by distant supervision into reliable and less reliable subsets, which in turn helps improve the extraction of relations between entities and time.

We accordingly contribute the first dataset focusing on first order temporal fact extraction with distant supervision. To fully utilize both the reliable and the less reliable data, we propose curriculum learning to rearrange the training procedure, label dropout to make the model more conservative about less reliable data, and instance attention to help the model distinguish important instances from unimportant ones. Experiments show that these methods let the model outperform both a model trained purely on the reliable subset and a model trained on all subsets mixed together.
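The split-and-curriculum idea in the abstract can be sketched roughly as follows. This is an illustrative assumption, not the paper's implementation: the granularity heuristic (counting date components of a time expression), the field names, and the dropout probability are all made up for the sketch. The intuition is that a fine-grained time expression such as "2016-12-02" produces fewer spurious distant-supervision matches than a coarse one such as "2016", so finer-grained examples are treated as the reliable subset and are seen first, while labels of less reliable examples are occasionally masked.

```python
import random

def granularity(time_expr: str) -> int:
    """Count date components: '2016-12-02' -> 3, '2016' -> 1.
    Finer-grained time expressions are assumed more reliable."""
    return len(time_expr.split("-"))

def split_by_granularity(examples, threshold=3):
    """Partition distantly supervised (sentence, entity, time, label)
    examples into reliable and less reliable subsets."""
    reliable = [ex for ex in examples if granularity(ex["time"]) >= threshold]
    less_reliable = [ex for ex in examples if granularity(ex["time"]) < threshold]
    return reliable, less_reliable

def curriculum_with_label_dropout(reliable, less_reliable, drop_p=0.3, seed=0):
    """Curriculum: reliable examples come first; less reliable ones follow,
    with their labels masked (dropped) with probability drop_p so the model
    stays conservative about noisy supervision."""
    rng = random.Random(seed)
    schedule = list(reliable)
    for ex in less_reliable:
        ex = dict(ex)  # copy so the original example is untouched
        if rng.random() < drop_p:
            ex["label"] = None  # masked: excluded from the training loss
        schedule.append(ex)
    return schedule
```

A trainer would then iterate over the returned schedule in order, skipping examples whose label is `None`; instance attention (weighting instances by importance) would sit on top of this and is not sketched here.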
