‘Until it bores me’: Learning Progress Maximization as the Reward Mechanism to solve the Exploration-Exploitation Dilemma in Infants

Research output: Contribution to conference - Without ISBN/ISSN › Poster › peer-review

Published

Standard

‘Until it bores me’: Learning Progress Maximization as the Reward Mechanism to solve the Exploration-Exploitation Dilemma in Infants. / Altmann, E. C.; Bazhydai, Marina; Westermann, Gert.

2021. Poster session presented at Development in Motion Conference 2021.

Bibtex

@conference{fc37f16593bc440abe634f74c8141544,
title = "{\textquoteleft}Until it bores me{\textquoteright}: Learning Progress Maximization as the Reward Mechanism to solve the Exploration-Exploitation Dilemma in Infants",
abstract = "Infants explore the world to learn about it based on their intrinsically motivated curiosity. However, the mechanisms underlying such exploratory behavior are largely unknown. We propose a new theory in which active learners explore randomly until encountering a familiar entity (e.g. a second stimulus from a previously encountered category) because here, learning is suddenly maximized. Such a category will then be exploited as long as the learning progress is above an individually varying {\textquoteleft}boredom threshold{\textquoteright}. Above this threshold, learning is rewarding – positively reinforcing exploitation. Below this threshold, the learning progress is too small to be rewarding, and they will return to random exploration. The threshold itself can be lowered through inhibition, allowing sustained attention despite smaller learning progress. Here, we will first test this theory in a gaze-contingent eye-tracking task: 10-month-old infants will be introduced to two novel stimulus categories with 30 exemplars each (Fribbles, TarrLab). Two identical “houses” will be presented on a computer screen, and a new exemplar from either category will be revealed when the infant fixates on the corresponding house. This design will enable us to distinguish between exploration – switching from one category to the other – and exploitation – consecutively triggering exemplars from the same category. In follow-on studies we will test older children as well as adults, who will be able to trigger exemplar presentations via key presses. Across age groups, we will measure the number, speed, and sequence of trigger-events, as well as the switches between categories. We hypothesize that if a category was triggered twice it is more likely to be triggered again; the first two triggers establish familiarity and allow for learning which will be rewarding, reinforcing further exploitation.
While the length of {\textquoteleft}exploitation-runs{\textquoteright} may differ between participants (representing varying boredom thresholds), constant switching between categories is unlikely as it inhibits maximized learning.",
author = "Altmann, {E. C.} and Marina Bazhydai and Gert Westermann",
year = "2021",
month = jun,
day = "24",
language = "English",
note = "Development in Motion Conference 2021: Presented by the Marie Curie MOTION network, DevMoCon; Conference date: 22-06-2021 through 24-06-2021",
url = "https://www.devmocon2021.com/",

}

RIS

TY - CONF

T1 - ‘Until it bores me’: Learning Progress Maximization as the Reward Mechanism to solve the Exploration-Exploitation Dilemma in Infants

AU - Altmann, E. C.

AU - Bazhydai, Marina

AU - Westermann, Gert

PY - 2021/6/24

Y1 - 2021/6/24

N2 - Infants explore the world to learn about it based on their intrinsically motivated curiosity. However, the mechanisms underlying such exploratory behavior are largely unknown. We propose a new theory in which active learners explore randomly until encountering a familiar entity (e.g. a second stimulus from a previously encountered category) because here, learning is suddenly maximized. Such a category will then be exploited as long as the learning progress is above an individually varying ‘boredom threshold’. Above this threshold, learning is rewarding – positively reinforcing exploitation. Below this threshold, the learning progress is too small to be rewarding, and they will return to random exploration. The threshold itself can be lowered through inhibition, allowing sustained attention despite smaller learning progress. Here, we will first test this theory in a gaze-contingent eye-tracking task: 10-month-old infants will be introduced to two novel stimulus categories with 30 exemplars each (Fribbles, TarrLab). Two identical “houses” will be presented on a computer screen, and a new exemplar from either category will be revealed when the infant fixates on the corresponding house. This design will enable us to distinguish between exploration – switching from one category to the other – and exploitation – consecutively triggering exemplars from the same category. In follow-on studies we will test older children as well as adults, who will be able to trigger exemplar presentations via key presses. Across age groups, we will measure the number, speed, and sequence of trigger-events, as well as the switches between categories. We hypothesize that if a category was triggered twice it is more likely to be triggered again; the first two triggers establish familiarity and allow for learning which will be rewarding, reinforcing further exploitation.
While the length of ‘exploitation-runs’ may differ between participants (representing varying boredom thresholds), constant switching between categories is unlikely as it inhibits maximized learning.

AB - Infants explore the world to learn about it based on their intrinsically motivated curiosity. However, the mechanisms underlying such exploratory behavior are largely unknown. We propose a new theory in which active learners explore randomly until encountering a familiar entity (e.g. a second stimulus from a previously encountered category) because here, learning is suddenly maximized. Such a category will then be exploited as long as the learning progress is above an individually varying ‘boredom threshold’. Above this threshold, learning is rewarding – positively reinforcing exploitation. Below this threshold, the learning progress is too small to be rewarding, and they will return to random exploration. The threshold itself can be lowered through inhibition, allowing sustained attention despite smaller learning progress. Here, we will first test this theory in a gaze-contingent eye-tracking task: 10-month-old infants will be introduced to two novel stimulus categories with 30 exemplars each (Fribbles, TarrLab). Two identical “houses” will be presented on a computer screen, and a new exemplar from either category will be revealed when the infant fixates on the corresponding house. This design will enable us to distinguish between exploration – switching from one category to the other – and exploitation – consecutively triggering exemplars from the same category. In follow-on studies we will test older children as well as adults, who will be able to trigger exemplar presentations via key presses. Across age groups, we will measure the number, speed, and sequence of trigger-events, as well as the switches between categories. We hypothesize that if a category was triggered twice it is more likely to be triggered again; the first two triggers establish familiarity and allow for learning which will be rewarding, reinforcing further exploitation.
While the length of ‘exploitation-runs’ may differ between participants (representing varying boredom thresholds), constant switching between categories is unlikely as it inhibits maximized learning.

M3 - Poster

T2 - Development in Motion Conference 2021

Y2 - 22 June 2021 through 24 June 2021

ER -