Modern text-to-speech voices can convey social cues ideal for narrating multimedia learning materials. Amazon Alexa has a unique feature among modern text-to-speech vocalizers as she can infuse enthusiasm cues into her synthetic voice. In this first study examining modern text-to-speech voice enthusiasm effects in a multimedia learning environment, a between-subjects online experiment was conducted where learners from a large Asian university (n = 244) listened to either Alexa’s: (1) neutral voice, (2) low-enthusiastic voice, (3) medium-enthusiastic voice, or (4) high-enthusiastic voice, narrating a multimedia lesson on distributed denial-of-service attack. While Alexa’s enthusiastic voices did not enhance persona ratings compared to Alexa’s neutral voice, learners could infer more enthusiasm expressed by Alexa’s medium- and high-enthusiastic voices than Alexa’s neutral voice. Regarding cognitive load, Alexa’s low- and high-enthusiastic voices decreased intrinsic and extraneous cognitive load ratings compared to Alexa’s neutral voice. While Alexa’s enthusiastic voices did not impact affective-motivational ratings differently from Alexa’s neutral voice, learners reported a significant increase of positive emotions from their baseline positive emotions after listening to Alexa’s medium-enthusiastic voice. Finally, Alexa’s enthusiastic voices did not enhance the learning performance on immediate retention and transfer tests compared to Alexa’s neutral voice. This study demonstrates that the enthusiasm of a modern text-to-speech voice can positively affect learners’ emotions and cognitive load during multimedia learning. Theoretical and practical implications are discussed through the lens of the Cognitive Affective Model of E-learning, Integrated-Cognitive Affective Model of Learning with Multimedia, and Cognitive Load Theory. We further outline this study’s limitations and recommendations for extending and widening the text-to-speech voice emotions research.

Two iconic, albeit contrasting, artificially intelligent voices were conceived and depicted in two acclaimed sci-fi movies, capturing the attention and enriching the imagination of audiences: HAL 9000 in the film "2001: A Space Odyssey" by Stanley Kubrick and Arthur C. Clarke, and Samantha in the movie "Her." Although HAL 9000’s voice is soft, calm, and conversational, its mechanical and emotionless tone induces disquiet and distrust in people. On the other hand, Samantha’s voice exudes emotional cues and expressiveness that mimic a natural human voice, such that the film’s protagonist expresses amazement: "You seem like a person - but you’re just a voice in a computer," and eventually falls in love with the artificial entity. Of course, artificial voices do not exist merely in science fiction but are pervasive in today’s digital society, although this technology is still evolving. Nass and Brave’s (2005) influential book "Wired for speech: How voice activates and advances the human–computer relationship" provides compelling data and discourse asserting that artificial voices automatically evoke a wide array of social responses from listeners, thereby underscoring the importance of designing computer voices to optimize usability and engagement. Within the educational context, text-to-speech vocalizers have generated artificial voices to narrate instructional content in multimedia learning environments.