Learning to rank under multinomial logit choice

School Of Mathematical Sciences

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Published

Standard

Learning to rank under multinomial logit choice. / Grant, J.A.; Leslie, D.S.
In: Journal of Machine Learning Research, Vol. 24, No. 260, 30.10.2023, p. 1-49.

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Harvard

Grant, JA & Leslie, DS 2023, 'Learning to rank under multinomial logit choice', Journal of Machine Learning Research, vol. 24, no. 260, pp. 1-49. <https://jmlr.org/papers/v24/20-981.html>

APA

Grant, J. A., & Leslie, D. S. (2023). Learning to rank under multinomial logit choice. Journal of Machine Learning Research, 24(260), 1-49. https://jmlr.org/papers/v24/20-981.html

Vancouver

Grant JA , Leslie DS. Learning to rank under multinomial logit choice. Journal of Machine Learning Research. 2023 Oct 30;24(260):1-49.

Author

Grant, J.A. ; Leslie, D.S. / Learning to rank under multinomial logit choice. In: Journal of Machine Learning Research. 2023 ; Vol. 24, No. 260. pp. 1-49.

Bibtex

@article{b3f2b12519e84fc0960b405e743bec53,

title = "Learning to rank under multinomial logit choice",

abstract = "Learning the optimal ordering of content is an important challenge in website design. The learning to rank (LTR) framework models this problem as a sequential problem of selecting lists of content and observing where users decide to click. Most previous work on LTR assumes that the user considers each item in the list in isolation, and makes binary choices to click or not on each. We introduce a multinomial logit (MNL) choice model to the LTR framework, which captures the behaviour of users who consider the ordered list of items as a whole and make a single choice among all the items and a no-click option. Under the MNL model, the user favours items which are either inherently more attractive, or placed in a preferable position within the list. We propose upper confidence bound (UCB) algorithms to minimise regret in two settings - where the position dependent parameters are known, and unknown. We present theoretical analysis leading to an Ω(JT−−−√) lower bound for the problem, an O~(JT−−−√) upper bound on regret of the UCB algorithm in the known-parameter setting, and an O~(K2JT−−−√) upper bound on regret, the first, in the more challenging unknown-position-parameter setting. Our analyses are based on tight new concentration results for Geometric random variables, and novel functional inequalities for maximum likelihood estimators computed on discrete data",

author = "J.A. Grant and D.S. Leslie",

year = "2023",

month = oct,

day = "30",

language = "English",

volume = "24",

pages = "1--49",

journal = "Journal of Machine Learning Research",

issn = "1532-4435",

publisher = "Microtome Publishing",

number = "260",

}

RIS

TY - JOUR

T1 - Learning to rank under multinomial logit choice

AU - Grant, J.A.

AU - Leslie, D.S.

PY - 2023/10/30

Y1 - 2023/10/30

N2 - Learning the optimal ordering of content is an important challenge in website design. The learning to rank (LTR) framework models this problem as a sequential problem of selecting lists of content and observing where users decide to click. Most previous work on LTR assumes that the user considers each item in the list in isolation, and makes binary choices to click or not on each. We introduce a multinomial logit (MNL) choice model to the LTR framework, which captures the behaviour of users who consider the ordered list of items as a whole and make a single choice among all the items and a no-click option. Under the MNL model, the user favours items which are either inherently more attractive, or placed in a preferable position within the list. We propose upper confidence bound (UCB) algorithms to minimise regret in two settings - where the position dependent parameters are known, and unknown. We present theoretical analysis leading to an Ω(JT−−−√) lower bound for the problem, an O~(JT−−−√) upper bound on regret of the UCB algorithm in the known-parameter setting, and an O~(K2JT−−−√) upper bound on regret, the first, in the more challenging unknown-position-parameter setting. Our analyses are based on tight new concentration results for Geometric random variables, and novel functional inequalities for maximum likelihood estimators computed on discrete data

AB - Learning the optimal ordering of content is an important challenge in website design. The learning to rank (LTR) framework models this problem as a sequential problem of selecting lists of content and observing where users decide to click. Most previous work on LTR assumes that the user considers each item in the list in isolation, and makes binary choices to click or not on each. We introduce a multinomial logit (MNL) choice model to the LTR framework, which captures the behaviour of users who consider the ordered list of items as a whole and make a single choice among all the items and a no-click option. Under the MNL model, the user favours items which are either inherently more attractive, or placed in a preferable position within the list. We propose upper confidence bound (UCB) algorithms to minimise regret in two settings - where the position dependent parameters are known, and unknown. We present theoretical analysis leading to an Ω(JT−−−√) lower bound for the problem, an O~(JT−−−√) upper bound on regret of the UCB algorithm in the known-parameter setting, and an O~(K2JT−−−√) upper bound on regret, the first, in the more challenging unknown-position-parameter setting. Our analyses are based on tight new concentration results for Geometric random variables, and novel functional inequalities for maximum likelihood estimators computed on discrete data

M3 - Journal article

VL - 24

SP - 1

EP - 49

JO - Journal of Machine Learning Research

JF - Journal of Machine Learning Research

SN - 1532-4435

IS - 260

ER -

Research

Links