
Electronic data


    Rights statement: The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-50496-4_21

    Accepted author manuscript, 640 KB, PDF document

    Available under license: CC BY-NC: Creative Commons Attribution-NonCommercial 4.0 International License


Improving first order temporal fact extraction with unreliable data

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Published
Publication date: 2/12/2016
Host publication: Natural Language Understanding and Intelligent Applications: 5th CCF Conference on Natural Language Processing and Chinese Computing, NLPCC 2016, and 24th International Conference on Computer Processing of Oriental Languages, ICCPOL 2016, Kunming, China, December 2–6, 2016, Proceedings
Editors: Chin-Yew Lin, Nianwen Xue, Dongyan Zhao, Xuanjing Huang, Yansong Feng
Place of publication: Cham
Publisher: Springer
Pages: 251-262
Number of pages: 12
ISBN (electronic): 9783319504964
ISBN (print): 9783319504957
Original language: English

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer
Volume: 10102
ISSN (print): 0302-9743

Abstract

In this paper, we deal with the task of extracting first order temporal facts from free text. This task is a subtask of relation extraction, and it aims at extracting relations between entities and time expressions. Currently, the field of relation extraction focuses mainly on extracting relations between entities. However, we observe that the multi-granular nature of time expressions lets us divide a dataset constructed by distant supervision into reliable and less reliable subsets, which in turn helps improve the extraction of relations between entities and time.

We accordingly contribute the first dataset focusing on first order temporal fact extraction with distant supervision. To fully utilize both the reliable and the less reliable data, we propose curriculum learning to rearrange the training procedure, label dropout to make the model more conservative about less reliable data, and instance attention to help the model distinguish important instances from unimportant ones. Experiments show that these methods let the model outperform both a model trained purely on the reliable subset and a model trained on all subsets mixed together.
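The split-and-curriculum idea in the abstract can be sketched roughly as follows. This is an illustrative assumption, not the paper's implementation: the granularity heuristic (counting date components of a time expression), the field names, and the dropout probability are all made up for the sketch. The intuition is that a fine-grained time expression such as "2016-12-02" produces fewer spurious distant-supervision matches than a coarse one such as "2016", so finer-grained examples are treated as the reliable subset and are seen first, while labels of less reliable examples are occasionally masked.

```python
import random

def granularity(time_expr: str) -> int:
    """Count date components: '2016-12-02' -> 3, '2016' -> 1.
    Finer-grained time expressions are assumed more reliable."""
    return len(time_expr.split("-"))

def split_by_granularity(examples, threshold=3):
    """Partition distantly supervised (sentence, entity, time, label)
    examples into reliable and less reliable subsets."""
    reliable = [ex for ex in examples if granularity(ex["time"]) >= threshold]
    less_reliable = [ex for ex in examples if granularity(ex["time"]) < threshold]
    return reliable, less_reliable

def curriculum_with_label_dropout(reliable, less_reliable, drop_p=0.3, seed=0):
    """Curriculum: reliable examples come first; less reliable ones follow,
    with their labels masked (dropped) with probability drop_p so the model
    stays conservative about noisy supervision."""
    rng = random.Random(seed)
    schedule = list(reliable)
    for ex in less_reliable:
        ex = dict(ex)  # copy so the original example is untouched
        if rng.random() < drop_p:
            ex["label"] = None  # masked: excluded from the training loss
        schedule.append(ex)
    return schedule
```

A trainer would then iterate over the returned schedule in order, skipping examples whose label is `None`; instance attention (weighting instances by importance) would sit on top of this and is not sketched here.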
