Rights statement: This research was funded, in whole or in part, by the Wellcome Trust, 204475/Z/16/Z. A CC BY or equivalent licence is applied to the AAM arising from this submission, in accordance with the grant’s open access conditions. This is a pre-copy-editing, author-produced PDF of an article accepted for publication in International Journal of Lexicography following peer review. The definitive publisher-authenticated version Sheryl Prentice, Paul Rayson, Jo Knight, Mahmoud El-Haj, Solly Elstein, A Domain Based Approach to Semantic Lexicon Expansion, International Journal of Lexicography, Volume 35, Issue 3, September 2022, Pages 364–377, https://doi.org/10.1093/ijl/ecab028 is available online at: https://academic.oup.com/ijl/article-abstract/35/3/364/6449418
Accepted author manuscript, 252 KB, PDF document
Available under license: CC BY: Creative Commons Attribution 4.0 International License
Research output: Contribution to Journal/Magazine › Journal article › peer-review
}
TY - JOUR
T1 - A Domain Based Approach to Semantic Lexicon Expansion
AU - Prentice, Sheryl
AU - Rayson, Paul
AU - Knight, Jo
AU - El-Haj, Mahmoud
AU - Elstein, Solly
N1 - This is a pre-copy-editing, author-produced PDF of an article accepted for publication in International Journal of Lexicography following peer review. The definitive publisher-authenticated version Sheryl Prentice, Paul Rayson, Jo Knight, Mahmoud El-Haj, Solly Elstein, A Domain Based Approach to Semantic Lexicon Expansion, International Journal of Lexicography, Volume 35, Issue 3, September 2022, Pages 364–377, https://doi.org/10.1093/ijl/ecab028 is available online at: https://academic.oup.com/ijl/article-abstract/35/3/364/6449418
PY - 2022/9/30
Y1 - 2022/9/30
N2 - Current approaches to the expansion of semantic lexicons for corpus annotation are somewhat ad hoc in nature and do not generally offer a systematic means of identifying areas for development within one’s lexicon. The present paper sets forward a domain based approach to semantic lexicon expansion, targeting UCREL’s Semantic Analysis System (USAS). First, an updated version of the lexicon is compared to representative corpora to ascertain areas of underrepresentation in a novel method which we call K-FLUX analysis. Second, an example set of underrepresented types are targeted for development using domain specific corpora. Collectively, the results show that some corpora are more successful than others in supplementing the existing USAS lexicon. The paper discusses the various factors that should be borne in mind when utilising the proposed method before concluding with how findings might inform future developments of the lexicon, and crucially, the semantic system on which it is based.
KW - lexicon
KW - expansion
KW - semantic
KW - tagging
KW - domain-specific
U2 - 10.1093/ijl/ecab028
DO - 10.1093/ijl/ecab028
M3 - Journal article
VL - 35
SP - 364
EP - 377
JO - International Journal of Lexicography
JF - International Journal of Lexicography
IS - 3
ER -