
Smiling in the Face and Voice of Avatars and Robots: Evidence for a 'Smiling McGurk Effect'.

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Published

Standard

Smiling in the Face and Voice of Avatars and Robots: Evidence for a 'Smiling McGurk Effect'. / Torre, Ilaria; Holk, Simon; Yadollahi, Elmira et al.
In: IEEE Transactions on Affective Computing, Vol. 15, No. 2, 31.05.2024, p. 393-404.

Harvard

Torre, I, Holk, S, Yadollahi, E, Leite, I, McDonnell, R & Harte, N 2024, 'Smiling in the Face and Voice of Avatars and Robots: Evidence for a 'Smiling McGurk Effect'.', IEEE Transactions on Affective Computing, vol. 15, no. 2, pp. 393-404. https://doi.org/10.1109/TAFFC.2022.3213269

APA

Torre, I., Holk, S., Yadollahi, E., Leite, I., McDonnell, R., & Harte, N. (2024). Smiling in the Face and Voice of Avatars and Robots: Evidence for a 'Smiling McGurk Effect'. IEEE Transactions on Affective Computing, 15(2), 393-404. https://doi.org/10.1109/TAFFC.2022.3213269

Vancouver

Torre I, Holk S, Yadollahi E, Leite I, McDonnell R, Harte N. Smiling in the Face and Voice of Avatars and Robots: Evidence for a 'Smiling McGurk Effect'. IEEE Transactions on Affective Computing. 2024 May 31;15(2):393-404. Epub 2022 Oct 10. doi: 10.1109/TAFFC.2022.3213269

Author

Torre, Ilaria; Holk, Simon; Yadollahi, Elmira et al. / Smiling in the Face and Voice of Avatars and Robots: Evidence for a 'Smiling McGurk Effect'. In: IEEE Transactions on Affective Computing. 2024; Vol. 15, No. 2. pp. 393-404.

Bibtex

@article{d7b8a0cba74a4b7f92fdd5fbc1476335,
title = "Smiling in the Face and Voice of Avatars and Robots: Evidence for a 'Smiling McGurk Effect'.",
abstract = "Multisensory integration influences emotional perception, as the McGurk effect demonstrates for the communication between humans. Human physiology implicitly links the production of visual features with other modes like the audio channel: Face muscles responsible for a smiling face also stretch the vocal cords that result in a characteristic smiling voice. For artificial agents capable of multimodal expression, this linkage is modeled explicitly. In our studies, we observe the influence of visual and audio channels on the perception of the agents{\textquoteright} emotional expression. We created videos of virtual characters and social robots either with matching or mismatching emotional expressions in the audio and visual channels. In two online studies, we measured the agents{\textquoteright} perceived valence and arousal. Our results consistently lend support to the {\textquoteleft}emotional McGurk effect{\textquoteright} hypothesis, according to which face transmits valence information, and voice transmits arousal. When dealing with dynamic virtual characters, visual information is enough to convey both valence and arousal, and thus audio expressivity need not be congruent. When dealing with robots with fixed facial expressions, however, both visual and audio information need to be present to convey the intended expression.",
author = "Ilaria Torre and Simon Holk and Elmira Yadollahi and Iolanda Leite and Rachel McDonnell and Naomi Harte",
year = "2024",
month = may,
day = "31",
doi = "10.1109/TAFFC.2022.3213269",
language = "English",
volume = "15",
pages = "393--404",
journal = "IEEE Transactions on Affective Computing",
number = "2",

}
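
To cite this record from a LaTeX manuscript, the BibTeX entry above can be used as-is. A minimal sketch follows, assuming the entry has been saved to a file named references.bib (a hypothetical filename) and that the standard IEEEtran bibliography style is available:

\documentclass{article}
\begin{document}
% Cite using the key taken from the BibTeX entry above
Torre et al.~\cite{d7b8a0cba74a4b7f92fdd5fbc1476335} report evidence
for a `Smiling McGurk Effect'.
\bibliographystyle{IEEEtran} % assumption: any standard bibliography style would also work
\bibliography{references}    % assumption: entry saved in references.bib
\end{document}

Compiling with pdflatex, then bibtex, then pdflatex twice resolves the citation.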

RIS

TY - JOUR

T1 - Smiling in the Face and Voice of Avatars and Robots

T2 - Evidence for a 'Smiling McGurk Effect'.

AU - Torre, Ilaria

AU - Holk, Simon

AU - Yadollahi, Elmira

AU - Leite, Iolanda

AU - McDonnell, Rachel

AU - Harte, Naomi

PY - 2024/5/31

Y1 - 2024/5/31

N2 - Multisensory integration influences emotional perception, as the McGurk effect demonstrates for the communication between humans. Human physiology implicitly links the production of visual features with other modes like the audio channel: Face muscles responsible for a smiling face also stretch the vocal cords that result in a characteristic smiling voice. For artificial agents capable of multimodal expression, this linkage is modeled explicitly. In our studies, we observe the influence of visual and audio channels on the perception of the agents’ emotional expression. We created videos of virtual characters and social robots either with matching or mismatching emotional expressions in the audio and visual channels. In two online studies, we measured the agents’ perceived valence and arousal. Our results consistently lend support to the ‘emotional McGurk effect’ hypothesis, according to which face transmits valence information, and voice transmits arousal. When dealing with dynamic virtual characters, visual information is enough to convey both valence and arousal, and thus audio expressivity need not be congruent. When dealing with robots with fixed facial expressions, however, both visual and audio information need to be present to convey the intended expression.

AB - Multisensory integration influences emotional perception, as the McGurk effect demonstrates for the communication between humans. Human physiology implicitly links the production of visual features with other modes like the audio channel: Face muscles responsible for a smiling face also stretch the vocal cords that result in a characteristic smiling voice. For artificial agents capable of multimodal expression, this linkage is modeled explicitly. In our studies, we observe the influence of visual and audio channels on the perception of the agents’ emotional expression. We created videos of virtual characters and social robots either with matching or mismatching emotional expressions in the audio and visual channels. In two online studies, we measured the agents’ perceived valence and arousal. Our results consistently lend support to the ‘emotional McGurk effect’ hypothesis, according to which face transmits valence information, and voice transmits arousal. When dealing with dynamic virtual characters, visual information is enough to convey both valence and arousal, and thus audio expressivity need not be congruent. When dealing with robots with fixed facial expressions, however, both visual and audio information need to be present to convey the intended expression.

U2 - 10.1109/TAFFC.2022.3213269

DO - 10.1109/TAFFC.2022.3213269

M3 - Journal article

VL - 15

SP - 393

EP - 404

JO - IEEE Transactions on Affective Computing

JF - IEEE Transactions on Affective Computing

IS - 2

ER -