I am Alexa, your virtual tutor!: The effects of Amazon Alexa’s text-to-speech voice enthusiasm in a multimedia learning environment

Citation

Liew, Tze Wei and Tan, Su-Mae and Pang, Wei Ming and Khan, Mohammad Tariqul Islam and Kew, Si Na (2023) I am Alexa, your virtual tutor!: The effects of Amazon Alexa’s text-to-speech voice enthusiasm in a multimedia learning environment. Education and Information Technologies, 28 (2). pp. 1455-1489. ISSN 1360-2357

Abstract

Modern text-to-speech voices can convey social cues ideal for narrating multimedia learning materials. Amazon Alexa has a unique feature among modern text-to-speech vocalizers as she can infuse enthusiasm cues into her synthetic voice. In this first study examining modern text-to-speech voice enthusiasm effects in a multimedia learning environment, a between-subjects online experiment was conducted in which learners from a large Asian university (n = 244) listened to one of four Alexa voices: (1) neutral voice, (2) low-enthusiastic voice, (3) medium-enthusiastic voice, or (4) high-enthusiastic voice, narrating a multimedia lesson on distributed denial-of-service attacks. While Alexa’s enthusiastic voices did not enhance persona ratings compared to Alexa’s neutral voice, learners could infer more enthusiasm expressed by Alexa’s medium- and high-enthusiastic voices than Alexa’s neutral voice. Regarding cognitive load, Alexa’s low- and high-enthusiastic voices decreased intrinsic and extraneous cognitive load ratings compared to Alexa’s neutral voice. While Alexa’s enthusiastic voices did not impact affective-motivational ratings differently from Alexa’s neutral voice, learners reported a significant increase in positive emotions from their baseline positive emotions after listening to Alexa’s medium-enthusiastic voice. Finally, Alexa’s enthusiastic voices did not enhance learning performance on immediate retention and transfer tests compared to Alexa’s neutral voice. This study demonstrates that enthusiasm in a modern text-to-speech voice can positively affect learners’ emotions and cognitive load during multimedia learning. Theoretical and practical implications are discussed through the lens of the Cognitive Affective Model of E-learning, Integrated-Cognitive Affective Model of Learning with Multimedia, and Cognitive Load Theory. We further outline this study’s limitations and recommendations for extending and widening the text-to-speech voice emotions research.

Item Type: Article
Uncontrolled Keywords: Multimedia learning, text-to-speech voice, artificial intelligence
Subjects: Q Science > Q Science (General) > Q300-390 Cybernetics
Divisions: Faculty of Business (FOB); Faculty of Information Science and Technology (FIST)
Depositing User: Ms Nurul Iqtiani Ahmad
Date Deposited: 14 Sep 2022 01:06
Last Modified: 06 Apr 2023 07:43
URI: http://shdl.mmu.edu.my/id/eprint/10430
