Dynamic slate recommendation with gated recurrent units and Thompson sampling

School of Mathematical Sciences

Text available via DOI:

https://doi.org/10.1007/s10618-022-00849-w
Final published version
Available under license: CC BY: Creative Commons Attribution 4.0 International License

Keywords

Bayesian Deep Learning, Multi-Armed Bandits, Recommender Systems, Recurrent Neural Network

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Published

Standard

Dynamic slate recommendation with gated recurrent units and Thompson sampling. / Eide, Simen; Leslie, David S.; Frigessi, Arnoldo.
In: Data Mining and Knowledge Discovery, Vol. 36, No. 5, 5, 30.09.2022, p. 1756-1786.

Harvard

Eide, S, Leslie, DS & Frigessi, A 2022, 'Dynamic slate recommendation with gated recurrent units and Thompson sampling', Data Mining and Knowledge Discovery, vol. 36, no. 5, 5, pp. 1756-1786. https://doi.org/10.1007/s10618-022-00849-w

APA

Eide, S., Leslie, D. S., & Frigessi, A. (2022). Dynamic slate recommendation with gated recurrent units and Thompson sampling. Data Mining and Knowledge Discovery, 36(5), 1756-1786. Article 5. https://doi.org/10.1007/s10618-022-00849-w

Vancouver

Eide S, Leslie DS, Frigessi A. Dynamic slate recommendation with gated recurrent units and Thompson sampling. Data Mining and Knowledge Discovery. 2022 Sept 30;36(5):1756-1786. 5. Epub 2022 Jul 19. doi: 10.1007/s10618-022-00849-w

Author

Eide, Simen ; Leslie, David S. ; Frigessi, Arnoldo. / Dynamic slate recommendation with gated recurrent units and Thompson sampling. In: Data Mining and Knowledge Discovery. 2022 ; Vol. 36, No. 5. pp. 1756-1786.

Bibtex

@article{c202845e5edf4f55af555481d582d022,

title = "Dynamic slate recommendation with gated recurrent units and Thompson sampling",

abstract = "We consider the problem of recommending relevant content to users of an internet platform in the form of lists of items, called slates. We introduce a variational Bayesian Recurrent Neural Net recommender system that acts on time series of interactions between the internet platform and the user, and which scales to real world industrial situations. The recommender system is tested both online on real users, and on an offline dataset collected from a Norwegian web-based marketplace, FINN.no, that is made public for research. This is one of the first publicly available datasets which includes all the slates that are presented to users as well as which items (if any) in the slates were clicked on. Such a data set allows us to move beyond the common assumption that implicitly assumes that users are considering all possible items at each interaction. Instead we build our likelihood using the items that are actually in the slate, and evaluate the strengths and weaknesses of both approaches theoretically and in experiments. We also introduce a hierarchical prior for the item parameters based on group memberships. Both item parameters and user preferences are learned probabilistically. Furthermore, we combine our model with bandit strategies to ensure learning, and introduce {\textquoteleft}in-slate Thompson sampling{\textquoteright} which makes use of the slates to maximise explorative opportunities. We show experimentally that explorative recommender strategies perform on par or above their greedy counterparts. Even without making use of exploration to learn more effectively, click rates increase simply because of improved diversity in the recommended slates.",

keywords = "Bayesian Deep Learning, Multi-Armed Bandits, Recommender Systems, Recurrent Neural Network",

author = "Eide, Simen and Leslie, David S. and Frigessi, Arnoldo",

year = "2022",

month = sep,

day = "30",

doi = "10.1007/s10618-022-00849-w",

language = "English",

volume = "36",

pages = "1756--1786",

journal = "Data Mining and Knowledge Discovery",

issn = "1384-5810",

publisher = "Springer New York LLC",

number = "5",

}

RIS

TY - JOUR

T1 - Dynamic slate recommendation with gated recurrent units and Thompson sampling

AU - Eide, Simen

AU - Leslie, David S.

AU - Frigessi, Arnoldo

PY - 2022/9/30

Y1 - 2022/9/30

N2 - We consider the problem of recommending relevant content to users of an internet platform in the form of lists of items, called slates. We introduce a variational Bayesian Recurrent Neural Net recommender system that acts on time series of interactions between the internet platform and the user, and which scales to real world industrial situations. The recommender system is tested both online on real users, and on an offline dataset collected from a Norwegian web-based marketplace, FINN.no, that is made public for research. This is one of the first publicly available datasets which includes all the slates that are presented to users as well as which items (if any) in the slates were clicked on. Such a data set allows us to move beyond the common assumption that implicitly assumes that users are considering all possible items at each interaction. Instead we build our likelihood using the items that are actually in the slate, and evaluate the strengths and weaknesses of both approaches theoretically and in experiments. We also introduce a hierarchical prior for the item parameters based on group memberships. Both item parameters and user preferences are learned probabilistically. Furthermore, we combine our model with bandit strategies to ensure learning, and introduce ‘in-slate Thompson sampling’ which makes use of the slates to maximise explorative opportunities. We show experimentally that explorative recommender strategies perform on par or above their greedy counterparts. Even without making use of exploration to learn more effectively, click rates increase simply because of improved diversity in the recommended slates.

KW - Bayesian Deep Learning

KW - Multi-Armed Bandits

KW - Recommender Systems

KW - Recurrent Neural Network

U2 - 10.1007/s10618-022-00849-w

DO - 10.1007/s10618-022-00849-w

M3 - Journal article

AN - SCOPUS:85134486208

VL - 36

SP - 1756

EP - 1786

JO - Data Mining and Knowledge Discovery

JF - Data Mining and Knowledge Discovery

SN - 1384-5810

IS - 5

M1 - 5

ER -
