Purpose

To unravel the drivers of service consumers’ parasocial relationships with artificial intelligence-enabled voice assistants (VAs), this study examines how VA frequency- and time-related paralinguistic features affect parasocial attraction of VAs. The authors zoom in on the interrelations between consumers’ social perceptions by exploring how parasocial attraction drives perceived anthropomorphism and trust in VAs.

Design/methodology/approach

In an online experiment, a VA displayed high or low voice intonation and high or low speech rate. Self-reported data of 580 Prolific participants regarding their perceptions of parasocial attraction, anthropomorphism and trust were collected and subjected to partial least squares path modeling.

Findings

The results show a moderating role of VA speech rate on the effect of voice intonation on parasocial attraction, such that voice intonation increases VA parasocial attraction when speech rate is high. In turn, parasocial attraction drives trust in a VA, both directly and indirectly via perceived anthropomorphism.

Research limitations/implications

The study outcomes can help designers and service managers design and infuse VAs in service frontlines within smart service systems, in ways that promise to enhance customer experiences and make services more inclusive.

Originality/value

By addressing the interplay between VAs’ frequency- and time-related paralinguistic features, this study offers new insights into the effects on consumers’ parasocial relationships with VAs and subsequent social perceptions. Such insights can benefit continued research into smart service systems.

Propelled by recent breakthroughs in natural language processing and speech recognition, voice assistants (VAs) powered by artificial intelligence (AI) have burst onto the scene, especially in smart service systems, and have transformed how consumers and employees experience every service encounter (Beverungen et al., 2019; Grewal et al., 2022). Voice-driven interactions enable seamless, hands-free control of smart devices, reflecting the extraordinary promise of such intuitive interfaces (de Barcelos Silva et al., 2020). An estimated 8.4 billion VA devices are in use today, double the number in 2020 (Laricchia, 2024). These transformations also extend beyond in-home smart speakers. Smartphones represent the primary gateways for voice interactions, followed by smart speakers, smart TVs and cars. In turn, the economic impacts of VAs are substantial, with a 2024 global market valued at approximately $50 billion and expected to reach almost $148 billion by 2030 (Ispiryan, 2025).

In response, service firms seek more ways to integrate VAs into their service provision, such as to facilitate ordering at Amazon (Amazon, 2024a) or book travel via Tripadvisor (Tripadvisor, 2024). In addition, ChatGPT incorporates voice features (OpenAI, 2023) and YouTube is rolling out voice replies, such that creators can reply to comments with audio clips (Hutchinson, 2025). In their introduction of such novel uses, leading VA providers, such as Amazon Alexa and Apple Siri, have started offering the option to modify a VA’s voice features, such as speech rate (Amazon, 2024b; Apple, 2024). From an inclusivity standpoint, expanding personalization of such accessibility features (Mende et al., 2024; Stead et al., 2022) could make these services more inclusive for vulnerable consumers. Indeed, elderly, disabled or low (digitally) literate service consumers could greatly benefit from using VAs to simplify overwhelming portals and processes, by guiding them or performing certain tasks directly (Abdolrahmani et al., 2018; Zoorob et al., 2022). Thus, exploring other voice features could inform the expansion of VA functionalities to make services more accessible for vulnerable groups.

Recent research also underscores the need to study paralinguistic features, such as communication style and speech rate, to strengthen relational outcomes with VAs (van Pinxteren et al., 2020). A primary and distinctive aspect of VAs is their voice, as the main mode of communication in consumer interactions (Grewal et al., 2022; Klaus and Zaichkowsky, 2020). Voice carries paralinguistic characteristics, such as vocal intonation and speech rate, and these prominent vocal cues shape people’s cognitive and behavioral responses (Baird et al., 2017; de Waele et al., 2019; Rodero, 2015). That is, both frequency- and time-related paralinguistic features (Hildebrand et al., 2020) substantively affect people’s social perception of an interlocutor, irrespective of whether that entity is human (Anikin et al., 2018; Imhof, 2010; Smith et al., 1975), a physical robot (Niculescu et al., 2013) or a VA (Efthymiou and Hildebrand, 2020; Wei et al., 2023). Different vocal features interact in this process (Black, 1961; Guyer et al., 2019), though precisely how this interplay unfolds in the context of human-to-VA interactions in smart service systems has yet to be studied.

For example, a critical question for service scholarship is whether and how this interplay of vocal features might drive trust in smart service systems [1]. Prior research predicts how this long-term, relational outcome (Guenzi et al., 2016) might stem from perceptions of convenience (Malodia et al., 2023), warmth or competence (Dandotiya et al., 2024), but Mele and Russo-Spena (2024) also suggest accounting for relationships and interconnectedness to understand consumer interaction with VAs. When people develop pseudo-social relationships with VAs (Guha et al., 2023; Hernandez-Ortega et al., 2022), they often perceive the non-human entity as human-like (Hernandez-Ortega and Ferreira, 2021; Whang and Im, 2021). In turn, such relationships tend to be associated with higher levels of satisfaction and usage (Han and Yang, 2018) and more positive attitudes and behaviors toward both the non-human entity and the company deploying it (Chung and Cho, 2017; Marinova et al., 2017; McLean et al., 2021; Park and Lennon, 2004). However, we lack clear insights into the role of VA parasocial attraction [2], a key manifestation in parasocial relationships (Ashe and McCutcheon, 2001; Kang et al., 2024; Marikyan et al., 2022).

Therefore, we investigate specifically if and how frequency-related and time-related paralinguistic features, in the form of voice intonation and speech rate, shape consumers' parasocial attraction toward VAs. Such parasocial attraction might arguably evoke social perceptions of anthropomorphism [3], which then may foster consumer trust. We apply social agency theory as the primary theoretical lens in our attempts to identify the mechanisms by which vocal cues contribute to the formation of human-like perceptions of VAs and thereby enhance users’ willingness to trust these artificial entities. Taken together, this study makes several important theoretical and managerial contributions.

First, by drawing on social agency theory – which stipulates that when humans communicate with computers, social conversation schemas can be activated by social cues embedded in computer-generated communication (Atkinson et al., 2005; Mayer et al., 2003) – we identify parasocial attraction as an important driver of VA trust in smart service systems, which has relevant implications for relationship dynamics and interconnectedness research.

Second, we move beyond functional design aspects (Blut et al., 2021) and shed new light on the interplay of a VA’s frequency- and time-related voice features. As we establish, users are drawn to VAs whose characteristics align with the personal ideals they aspire to embody (Klohnen and Luo, 2003; Wetzel and Insko, 1982); that is, our findings emphasize the importance of this type of perceived alignment. Furthermore, we specify how various paralinguistic cues interact to create a vocal balance (Patterson, 1973; Rodero et al., 2022). Examining how these cues work together, rather than in isolation, uncovers deeper insights into human-technology interaction dynamics within smart service environments.

Third, as a contribution to human–computer interaction research, we also shift the focus from the most appropriate designs for anthropomorphic computer features toward the psychological mechanisms that determine users’ perceptions of those features. Rather than depicting how vocal cues elicit anthropomorphism and preference, we investigate the role of anthropomorphism, as a psychological construct involving the attribution of human-like qualities to non-human entities (Becker-Olsen and Hill, 2006), in prompting trust during human–computer interactions. Fourth, the research insights offer concrete guidelines for managers of firms that design VAs and service managers interested in deploying VAs or optimizing their use in (inclusive) smart service systems.

Organizational frontlines represent the boundaries between the service organization, its customers and other stakeholders (Singh et al., 2017) and facilitate service encounters in which consumers interact with a concrete service interface, which integrates service elements such as human actors, the physical environment, service processes and technology (Larivière et al., 2017; Patrício et al., 2011). The increasing prevalence of the latter element creates frontlines infused with smart technologies (de Keyser et al., 2019; Schultz and Gorlas, 2023; van Doorn et al., 2017), which can provide important benefits to both service providers (e.g. controlling and optimizing product operations, generating novel data streams) and service consumers (e.g. creating value-in-use from using smart technologies; Beverungen et al., 2019; Hottat et al., 2023). At minimum, these service frontlines serve as a resource enabling value creation in the exchange process between service consumers and providers (Akaka and Vargo, 2014). More advanced technologies, however, can even function as autonomous actors in the value creation process (de Keyser et al., 2019). For example, a VA is capable of booking an appointment or virtual care consultation without external intervention.

Existing smart service system configurations are mainly characterized by connectivity, which links actors in the service system (Henkens et al., 2021); automation, which enables smart technologies to take over tasks from human actors (Mele et al., 2022); and/or dynamic learning and adapting abilities (Mele and Russo-Spena, 2024). As smart service systems reshape traditional service communication (Guha et al., 2023; Mahr and Huh, 2022) across the entire customer journey (Gonçalves et al., 2020; Grewal et al., 2022), VAs come to the forefront, promising benefits but also prompting significant levels of distrust, particularly in contexts that feature sensitive information (Huang et al., 2024; Malodia et al., 2023).

In the shift from human-to-human service encounters to human-to-VA encounters (Larivière et al., 2017), the importance of communicative behaviors is evident (van Pinxteren et al., 2020). Human–computer interaction studies postulate that computers can function as social actors and human users assign human traits to computers (Nass et al., 1994). Therefore, fundamental social principles, as rooted in social psychology, can arise in these service interactions as well (Nass et al., 1995). Accordingly, we adopt social agency theory (Atkinson et al., 2005; Mayer et al., 2003) as the theoretical foundation for this research; it predicts that social conversation schemas get activated by social cues, including those embedded in computer-generated communication. In particular, the voices provided by a VA are rich in social cues (Edwards et al., 2019; Huang et al., 2024), so human users can interpret an interaction with a VA as a social conversation with another social agent. This interpretation, in turn, triggers social rules and associated human-to-human communication schemas, which users apply to form social perceptions of VAs. As complements to social agency theory, we derive insights from communication, social psychology and human–computer interaction literature in the next sections, to establish our conceptual model (Figure 1), define a VA’s frequency- and time-related paralinguistic features and predict their effects on consumers’ parasocial attraction and ultimately trust and related social perception outcomes.

Figure 1
A conceptual model links VA paralinguistic features to parasocial attraction, anthropomorphism and trust. The diagram shows two section headings at the top, “VA paralinguistic features” on the left and “Consumers’ social perceptions” on the right, with the right section enclosed by a dashed rectangular boundary. Under “VA paralinguistic features”, a rectangular box labeled “Intonation” connects with a straight rightward arrow to an oval labeled “Parasocial attraction” inside the dashed boundary. A rectangular box labeled “Speech rate” is positioned above this arrow and links downward with a vertical arrow to the horizontal arrow between “Intonation” and “Parasocial attraction”. Inside the dashed boundary, the oval labeled “Parasocial attraction” connects with a diagonal upward rightward arrow to an oval labeled “Anthropomorphism”, which connects with a diagonal downward rightward arrow to an oval labeled “Trust”. The oval labeled “Parasocial attraction” also connects with a rightward arrow directly to the oval labeled “Trust”.

Conceptual framework. Source: Authors’ own work


Any voice, whether human or synthesized, carries paralinguistic features to which a listener can ascribe attributes and which thus guide the listener’s subsequent thoughts, attitudes and behaviors (Baird et al., 2017). These non-linguistic voice characteristics (Abercrombie, 1968; Crystal, 1974) drive social perceptions (Anikin and Persson, 2017; Apple et al., 1979; Guyer et al., 2019; Scherer et al., 1973), even in verbal human-VA communication (e.g. Cohn et al., 2021; Moussalli and Cardoso, 2020). A common categorization of vocal paralinguistic features includes four broad categories: frequency (i.e. intonation), time (i.e. speech rate), amplitude (i.e. loudness of speech) and spectral (i.e. voice instability) features (Hildebrand et al., 2020; Schuller et al., 2013). Research into paralinguistics asserts that frequency and time dimensions are most effective for communicating emotional meaning (e.g. Scherer, 1974). Similarly, when users form perceptions of a VA, we anticipate that its frequency- and time-related vocal features are more relevant than amplitude or spectral features. Users have direct control over a VA’s amplitude (e.g. Amazon, 2024c), and due to their inherent lack of biological mechanisms (Teixeira et al., 2013) and preprogrammed nature (Guha et al., 2023), VAs produce consistent and stable vocal output. In other words, voice instability is not present in VA communication. Furthermore, prior research on parasocial attraction (Avelino et al., 2020; Chen and Park, 2021; Mariani et al., 2023) has outlined its similarities with interpersonal social attraction (Cialdini, 2009). The effects of vocal features on (social) attractiveness are thus well established across communication (e.g. Burgoon et al., 1990), social psychology (e.g. Robbins and DeNisi, 1994) and human–computer interaction (e.g. Bartneck et al., 2009a; Wagner et al., 2019) literature domains.

VA voice intonation

Voice intonation, or variation of pitch (Hildebrand et al., 2020), drives social perceptions of (social) attraction (Kühne et al., 2020; Niculescu et al., 2013). Average pitch values are around 210 Hz and 120 Hz for female and male voices, respectively (Niculescu et al., 2013). These auditory social cues (Feine et al., 2019) can signal meaning (Bevacqua et al., 2010) and activate socially acquired vocal expression schemata that help listeners infer underlying emotions (Patel and Scherer, 2013). In human-to-human communication schemas, lower pitch variation implies a lack of emotion. Expressing emotion through intonation constitutes an important voice function, such that people tend to favor greater VA pitch variation (Kühne et al., 2020). In turn, we predict that voice intonation should exert a positive effect on VA parasocial attraction in human-to-VA service interactions within a smart service system.

VA speech rate

Speech rate is determined by the number of words a speaker uses in a given timeframe (Hildebrand et al., 2020); it generally equals 150–200 words per minute in normal speech (Ketrow, 1990). It likely predicts users’ social perceptions of VAs (Cohn et al., 2021; Guha et al., 2023), especially their (social) attractiveness (Xie et al., 2023). Speech rate is another auditory social cue (Feine et al., 2019) that should activate human-to-human communication schemas, which users then apply to decode the meanings conveyed by the VA (Bevacqua et al., 2010; Patel and Scherer, 2013). A slower speech tempo tends to evoke negative perceptions, by signaling boredom, sadness and disgust; a high tempo implies enthusiasm and happiness (Scherer et al., 1973). Such perceptions then inform the (social) attractiveness of the speaker (Street Jr. et al., 1983), including VAs (Choi et al., 2020; Dowding et al., 2024). Thus, we predict that speech rate positively influences the parasocial attractiveness of a VA in smart service encounters.

Interplay of VA voice intonation and speech rate

The separate effects of voice intonation and speech rate are relevant, yet communication entails more than one cue at a time (e.g. Black, 1961; Bond and Feldstein, 1982). Relatively less research considers the interplay of different paralinguistic features (Rodero et al., 2022), though a few studies indicate that voice intonation and speech rate exhibit synergistic effects, such that their combined effect exceeds the sum of their separate effects (de Waele et al., 2019; Guyer et al., 2019; Rodero, 2015). For example, a more dynamic intonation or faster speech rate could attract more attention and improve perceptions in isolation (Gnisci and Pace, 2014; Jackob et al., 2011; Rodero, 2020), but their complex interaction suggests that an optimal combination is needed to enhance parasocial attraction (e.g. Rodero et al., 2022), especially with VAs (e.g. van Pinxteren et al., 2020). According to communication (Megehee et al., 2003) and social psychology (Chattopadhyay et al., 2003; Moore et al., 1986) research, a rapid speech rate tends to decrease the effect of verbal content but increase the effects of peripheral cues, such as pitch. Vocal profiles that score high on expressivity, versus apathy or monotony, also positively shape social perceptions of speakers (Guyer et al., 2019; Rodero, 2020). Because voice intonation and speech rate effectively express emotional meaning (Scherer, 1974), high levels of both could result in an expressive balance (Patterson, 1973; Rodero et al., 2022); these coordinated paralinguistic cues might achieve what we refer to as a “high expressivity” balance.

In addition, people are attracted to others who resemble their ideal self-identity, which comprises traits that they aspire to acquire, improve or express (Klohnen and Luo, 2003; Wetzel and Insko, 1982). If a user proudly possesses a high expressivity vocal profile, a VA that exhibits similar features provides reinforcing input; for users who lack but wish they had this profile, such a VA might attract increased attention, by exhibiting dissimilarity to aspects they do not like about themselves (Klohnen and Mendelsohn, 1998). Therefore, the effect of VA voice intonation on parasocial attraction in smart service encounters should be moderated by speech rate, and this interplay should exhibit coordinated effects:

H1.

Speech rate moderates the positive impact of voice intonation on consumers’ parasocial attraction to a VA, such that the impact of voice intonation is greater when the speech rate is high.
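
As an illustration only, the moderation in H1 can be written as a regression-style equation. This is a simplified sketch, not the authors’ partial least squares specification. The symbols are assumptions introduced here: PA denotes parasocial attraction, INT and SR are dummy variables (0 = low, 1 = high) for voice intonation and speech rate, and the β terms are generic coefficients.

```latex
\[
\mathrm{PA}_i = \beta_0 + \beta_1\,\mathrm{INT}_i + \beta_2\,\mathrm{SR}_i
              + \beta_3\,(\mathrm{INT}_i \times \mathrm{SR}_i) + \varepsilon_i
\]
% Effect of intonation at low speech rate (SR = 0):  \beta_1
% Effect of intonation at high speech rate (SR = 1): \beta_1 + \beta_3
% H1 corresponds to \beta_3 > 0.
```

Under this coding, H1 amounts to a positive interaction coefficient: the effect of intonation on parasocial attraction is larger when the speech rate is high than when it is low.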

Social perceptions of VAs, in which consumers attribute human characteristics to the computer interface (Lea and Spears, 1992), can take many forms. Human–computer interaction literature often focuses on perceptions related to parasocial attraction, anthropomorphism and trust (Bartneck et al., 2009b; Braun et al., 2019; Kühne et al., 2020; Lawson-Guidigbe et al., 2023; Salem et al., 2013; Wagner et al., 2019), though generally without considering their interrelations or the context of human-to-VA service interactions within smart service systems. We thus turn to social psychology literature to inform such explorations (Epley et al., 2007; Waytz et al., 2014), including the predicted role of parasocial attraction in driving social perceptions. Social cognition research highlights the importance of social factors for informing social perceptions of others (e.g. Rutherford and Kuhlmeier, 2013), including non-human interlocutors (Pan et al., 2018). When people experience illusory reciprocal human interaction with non-human entities, the parasocial interaction is controlled by the person who imagines it (Horton and Wohl, 1956). The interactions can even grow into artificial friendships, or parasocial relationships, including with VAs (Stern et al., 2007; Whang and Im, 2021), which then might drive key outcomes (Gkinko and Elbanna, 2023; Han and Yang, 2018).

Trust

Reflecting our focus on parasocial relationships, we prioritize trust as a core relational outcome (Blut et al., 2021; Guenzi et al., 2016). First, a lack of trust constitutes a persistent barrier to VA adoption in smart service systems, especially those involving sensitive data (Huang et al., 2024; Malodia et al., 2023), but its presence can encourage consumers to use VAs in the first place (Jain et al., 2022; Moussawi et al., 2021; Siau and Wang, 2018). Second, trust reflects the long-term nature of ongoing relationships (Doney and Cannon, 1997). Third, it is an established proxy for social proximity (Akhavan and Mariotti, 2023), in that it enables people to form imaginary parasocial relationships with non-human entities, which are perceived as real relationships and instrumental to their subjective social experience (Yuan et al., 2016). By interacting with a VA, consumers can build trust (Huang et al., 2024), beyond traditional conceptualizations of technology trust (Chi et al., 2021). The human-like interaction and intelligence provided by VAs broaden the scope of trust to include not only functional (i.e. fulfilling consumer needs) but also social aspects (i.e. following norms related to integrity and benevolence) (Connelly et al., 2018; Hu and Lu, 2021; Huang et al., 2024; Elkins and Derrick, 2013).

The notion that (para)social attraction promotes trust is well-documented in various contexts, including entrepreneurial networks (Ferguson et al., 2016), service employees (Kim, 2019) and VAs (Chen and Park, 2021; Siddike and Kohda, 2019). Compared to other forms of attraction (e.g. task or physical), social attraction offers the strongest predictor of trust in AI-powered entities (Qin et al., 2023). In line with the notion, derived from social agency theory, that VAs can function as a social partner (Mayer et al., 2003), we propose interpreting interactions of users and VAs according to a relationship-building perspective (Huang et al., 2024). If social agency can be established by high expressivity VA vocal profiles and users treat the VA as a social actor, it should activate human-to-human communication schemas (Atkinson et al., 2005), including the emotion-as-information schema (Ashtar et al., 2024; Huang et al., 2024), according to which positive attitudes resulting from social attraction to an interlocutor arouse positive beliefs, including trust (Edwards and Cable, 2009). Social attraction also shapes perceptions of communication quality (Beattie et al., 2020; Nasirian et al., 2017), which fosters a trusting sense that the VA understands consumers’ needs (Chen and Park, 2021). Therefore, we propose the following hypothesis:

H2.

Consumers’ parasocial attraction to a VA has a positive effect on perceived trust in a VA.

Anthropomorphism

Many human–computer interaction studies examine the (social) attractiveness of more or less anthropomorphic computer design features (e.g. Qiu and Benbasat, 2009; Roesler et al., 2021). Rather than investigating how combinations of vocal cues influence anthropomorphism, we explore a potential mediating role of perceived anthropomorphism in the predicted effect of parasocial attraction on perceived trust, by integrating several research perspectives. First, people interact differently with others who share similar characteristics, so parasocial attraction might increase users’ tendencies to anthropomorphize non-human interlocutors (Epley et al., 2007; Tajfel and Turner, 1986). Second, humans possess basic social needs and have an inherent drive to form social connections. If they can fulfill such needs through parasocial attraction to a VA, they likely apply social scripts that make such interactions more familiar (Blut et al., 2021), and in doing so, they become more likely to anthropomorphize the VA (Chen and Park, 2021; Edwards et al., 2019; Waytz and Epley, 2012; Whang and Im, 2021). Third, humans’ innate motivation to understand their environment may encourage heightened parasocial attraction to drive VA anthropomorphism (Chen and Park, 2021; Waytz et al., 2010), in that it serves as a compensatory mechanism for enhanced understanding. In line with social agency theory (Mayer et al., 2003; Atkinson et al., 2005), these predictions suggest that the human schemas (Han and Yang, 2018; Xu, 2020) triggered in verbal interactions with VAs result in a positive effect of consumers’ parasocial attraction on their perceived anthropomorphism of a VA.

When consumers perceive a non-human entity as human-like, it becomes more similar to themselves (Go and Sundar, 2019), reducing their social anxiety and evoking a sense of communicating with a real human instead (Hernandez-Ortega and Ferreira, 2021; Yuan et al., 2022). Across various research contexts, pertaining to AI in general (Glikson and Woolley, 2020; Kaplan et al., 2023; Pentina et al., 2023; Troshani et al., 2021), robots (Blut et al., 2021; Natarajan and Gombolay, 2020; van Pinxteren et al., 2019; Wünderlich et al., 2024) and VAs (Chen and Park, 2021; Liu et al., 2024; Rheu et al., 2021; Weitz et al., 2021), anthropomorphism has been shown to drive perceived trust. Anthropomorphizing VAs makes the interactions more personal and engaging (Epley et al., 2008), offers a sense of control over the environment by making the interlocutor appear more predictable (White, 1959) and adheres to social norm expectations (de Visser et al., 2016). Identifying these entities as social actors (Nass et al., 1994) triggers positive impressions (Aggarwal and McGill, 2007; van Doorn et al., 2017) that directly feed into the knowledge and evidence consumers use to determine whether the VA can be relied on to deliver what it promises (Komiak and Benbasat, 2006; Nasirian et al., 2017). Such expectations tend to encourage greater trust, including in VAs in smart service encounters (Huang et al., 2024; Malodia et al., 2023):

H3.

Consumers’ perceived anthropomorphism of a VA mediates the effect of parasocial attraction on perceived trust, such that (a) consumers’ parasocial attraction has a positive effect on perceived anthropomorphism, and (b) their perceived anthropomorphism has a positive effect on consumers’ perceived trust in the VA.
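
For readers who want the structure of H2 and H3 in symbols, the following is a generic mediation sketch rather than the authors’ estimated model. The notation is assumed here for illustration: PA is parasocial attraction, ANT is perceived anthropomorphism, TR is trust, and a, b and c′ are generic path coefficients.

```latex
\begin{align*}
\mathrm{ANT}_i &= i_1 + a\,\mathrm{PA}_i + e_{1i} \\
\mathrm{TR}_i  &= i_2 + c'\,\mathrm{PA}_i + b\,\mathrm{ANT}_i + e_{2i} \\[4pt]
\text{indirect effect} &= a\,b, \qquad
\text{total effect} = c' + a\,b
\end{align*}
% H2:  c' > 0  (direct path from parasocial attraction to trust)
% H3a: a  > 0  (parasocial attraction -> anthropomorphism)
% H3b: b  > 0  (anthropomorphism -> trust)
```

In this notation, support for H2 together with H3a and H3b would indicate that parasocial attraction relates to trust both directly (c′) and indirectly through anthropomorphism (a·b).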

To achieve the research objectives, we conducted an online experimental study, seeking to assess how VA paralinguistic time- and frequency-related features (voice intonation and speech rate) affect consumers’ parasocial attraction, along with subsequent social perception outcomes.

The conceptual model was tested with a 2 (low vs. high voice intonation) × 2 (low vs. high speech rate) between-subjects factorial experimental design. In line with existing service research (e.g. Larivière et al., 2024; Leiño Calleja et al., 2023), we recruited participants via Prolific. A total of 586 North American adults were paid US$9.37/hour for their participation and randomly assigned to one of the four conditions. We excluded one participant who failed the attention check, one participant who did not meet the minimum required level of English listening skills and four participants who completed the survey within 120 s, which we considered too fast given that the average reading speed is around 175 words per minute (Ketrow, 1990) and the voice clip alone lasted 32–35 s [4]. These exclusions left a final sample of 580 participants (53.1% female; 50.3% ≥ 40 years; Mage = 42.20 years; 98.8% native or advanced English listening skills; 88.1% prior experience with VAs).

Participants started by completing an attention check to ensure adequate functionality of sound transmission, in which they had to indicate the word that was spoken in an audio fragment. After being informed that they were about to listen to a voice clip of a VA, participants were exposed to the focal audio fragment. The fragment features “Allison”, a fictional female-voiced VA, as most VAs on the market are configured with a female voice by default (Blakemore, 2024). To reinforce the notion that participants were listening to a computer-generated voice and not a human voice, “Allison” explains that she is a VA and briefly discusses the capabilities and potential of VAs. This content was selected as it reflects informational interactions common in everyday use of VAs. Participants could listen to the fragment as many times as needed. Next, participants completed a set of questions measuring parasocial attraction, perceived anthropomorphism, trust and controls related to AI anxiety, AI privacy concerns, tech-savviness, gender, age, education, level of English listening skills and prior experience with VAs.

The AI anxiety, AI privacy concerns and tech-savviness control measures reflect stable predispositions and traits that could alter the primary mechanisms. First, AI anxiety, defined as fear or unease toward AI technologies/systems (Wang and Wang, 2022), could serve as an inhibiting factor for individuals in terms of engaging with AI systems (Wang et al., 2024). Second, “the perceived threat to an individual’s privacy due to the increased level of information that technology gathers on individuals beyond the individual’s knowledge and sometimes control” (McLean and Osei-Frimpong, 2019, p. 30), referred to as AI privacy concerns, was included as this predisposition has been shown to blur attitudes toward AI systems (Feng et al., 2017). Third, tech-savviness, i.e. an individual’s familiarity and affinity with technology (Ng, 2012), is controlled for as cues are used differently by experts than novices to form perceptions (Guha et al., 2023).

The 32–35 s audio fragment – depending on the experimental condition – was created using IBM Watson’s text-to-speech converter (https://www.ibm.com/products/text-to-speech), which in turn served as input for our experimental stimulus. The speech rate and voice intonation of the audio fragment were manipulated using the software package “Praat”, version 6.1.40 (https://www.fon.hum.uva.nl/praat/). The specific parameters per condition are outlined in Table 1.

Table 1

Parameters audio manipulations

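| Parameter | LI-LS | LI-HS | HI-LS | HI-HS |
| --- | --- | --- | --- | --- |
| Duration (in s) | 35.69 | 32.19 | 35.69 | 32.19 |
| Words per minute | 161.39 | 178.93 | 161.39 | 178.93 |
| f0 mean (Hz) | 161.27 | 161.23 | 165.39 | 165.56 |
| f0 standard deviation (Hz) | 17.20 | 17.23 | 32.6 | 32.57 |
| f0 relative standard deviation | 10.66% | 10.69% | 19.71% | 19.67% |
| f0 range (Hz) | 120.4 | 141.8 | 184.7 | 200.3 |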

Note(s): HS/LS = high/low speed; HI/LI = high/low intonation; f0 = fundamental frequency of voice

Source(s): Authors’ own work

The variability of pitch (i.e. voice intonation) was manipulated using the coefficient of variation (CV), or relative standard deviation (RSD), defined as the ratio of the standard deviation to the mean (Morgan and Rastatter, 1986). In our study, the RSDs of the low and high intonation conditions differ by approximately 9 percentage points, in line with the eeriness boundaries of speech rate. Figure 2 depicts the f0 (i.e. fundamental voice frequency) variability for the low intonation conditions and the high intonation conditions.
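The RSD definition can be checked against the f0 values reported in Table 1. The following is a minimal illustrative sketch, not the Praat procedure used in the study; the names `F0_STATS`, `REPORTED_RSD` and `relative_sd` are hypothetical, and small deviations from the reported percentages are to be expected because the table inputs are already rounded.

```python
# f0 mean and standard deviation (Hz) per condition, copied from Table 1.
F0_STATS = {
    "LI-LS": (161.27, 17.20),
    "LI-HS": (161.23, 17.23),
    "HI-LS": (165.39, 32.6),
    "HI-HS": (165.56, 32.57),
}

# Relative standard deviations (in %) as reported in Table 1.
REPORTED_RSD = {"LI-LS": 10.66, "LI-HS": 10.69, "HI-LS": 19.71, "HI-HS": 19.67}


def relative_sd(mean_hz: float, sd_hz: float) -> float:
    """Coefficient of variation in percent: 100 * SD / mean."""
    return 100.0 * sd_hz / mean_hz


rsd = {cond: relative_sd(mean, sd) for cond, (mean, sd) in F0_STATS.items()}

# Average RSD per intonation level and the gap between them (percentage points).
low_rsd = (rsd["LI-LS"] + rsd["LI-HS"]) / 2
high_rsd = (rsd["HI-LS"] + rsd["HI-HS"]) / 2
rsd_gap = high_rsd - low_rsd

for cond, value in rsd.items():
    print(f"{cond}: computed {value:.2f}%, reported {REPORTED_RSD[cond]:.2f}%")
print(f"High minus low intonation RSD: {rsd_gap:.2f} percentage points")
```

The gap is expressed in percentage points (a difference between two percentages), which is the reading consistent with the values in Table 1.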

Figure 2
A pair of line charts showing pitch in hertz over time in seconds. The image shows two side-by-side line charts. In both charts, the vertical axis is labeled “Pitch (Hertz)” and ranges from 75 to 300, and the horizontal axis is labeled “Time (seconds)”. In the left chart, the horizontal axis ranges from 0 to 32.19, and a jagged line extends across the full time range with pitch values ranging approximately from 75 to 218 on the vertical axis. In the right chart, the horizontal axis ranges from 0 to 35.7, and a jagged line extends across the full time range with pitch values ranging approximately from 100.742 to 283.792 on the vertical axis. Note: All numerical data values are approximated.

Low voice intonation versus high voice intonation: f0 variability. Source: Authors’ own work


The relative difference between the low and high speech rate conditions, and thus in duration, was approximately 10%, with 161.39 and 178.93 words per minute, respectively [5]. These rates are perceptibly distinct without causing the excessive eeriness that arises when approaching extreme values below 150 or above 200 words per minute (Ketrow, 1990; Street Jr. et al., 1982).
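The "approximately 10%" figure can be reproduced from the durations and speech rates in Table 1. This is a minimal sketch for illustration only; the variable names and the eeriness bounds `LOWER_BOUND_WPM` and `UPPER_BOUND_WPM` are hypothetical labels for the values quoted in the text.

```python
# Duration (s) and speech rate (words per minute) per speech rate level, from Table 1.
LOW_SPEED = {"duration_s": 35.69, "wpm": 161.39}
HIGH_SPEED = {"duration_s": 32.19, "wpm": 178.93}

# Extreme values quoted in the text as the region where eeriness becomes excessive.
LOWER_BOUND_WPM = 150
UPPER_BOUND_WPM = 200

# Relative increase in speech rate and relative decrease in duration.
rate_increase = HIGH_SPEED["wpm"] / LOW_SPEED["wpm"] - 1
duration_decrease = 1 - HIGH_SPEED["duration_s"] / LOW_SPEED["duration_s"]

# Both manipulated rates should sit inside the eeriness window.
within_window = all(
    LOWER_BOUND_WPM < cond["wpm"] < UPPER_BOUND_WPM
    for cond in (LOW_SPEED, HIGH_SPEED)
)

print(f"Speech rate increase: {rate_increase:.1%}")
print(f"Duration decrease: {duration_decrease:.1%}")
print(f"Both rates within the {LOWER_BOUND_WPM}-{UPPER_BOUND_WPM} wpm window: {within_window}")
```

The two percentages differ slightly because one is computed relative to the slower rate and the other relative to the longer duration.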

To confirm the effectiveness of the speech rate and voice intonation manipulations within the boundaries of eeriness, a pretest was conducted with 40 North American adults, recruited from MTurk and randomly assigned to either the high speed/high intonation or the low speed/low intonation condition. Participants were instructed to rate the speech rate and the voice intonation on a 10-point semantic differential scale ranging from “very slow” to “very fast”, and “very monotone” to “very energetic”, respectively. The results of an independent samples t-test showed that participants reported significantly higher perceived intonation in the high intonation condition (Mintonationhigh = 4.85, SD = 2.25) compared to the low intonation condition (Mintonationlow = 2.55, SD = 1.28; t (38) = 3.97; p < 0.001). They also noted significantly faster speech rate perceptions in the high (Mspeechratehigh = 5.75, SD = 1.65) versus low (Mspeechratelow = 4.50, SD = 1.32; t (38) = 2.65; p = 0.012) speech rate condition. Thus, the manipulations appear to work as intended [6].
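The reported pretest t-values can be recomputed from the summary statistics alone. The sketch below uses a standard pooled-variance independent samples t-test; the equal split of 20 participants per condition is an assumption (the text reports only 40 participants in total and 38 degrees of freedom), and the function name `pooled_t` is hypothetical.

```python
import math


def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Independent samples t-test (pooled variance) from summary statistics.

    Returns the t statistic and the degrees of freedom.
    """
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_error, df


# ASSUMPTION: the 40 pretest participants were split evenly across two conditions.
N_PER_GROUP = 20

# Perceived intonation: high versus low intonation condition.
t_intonation, df_intonation = pooled_t(4.85, 2.25, N_PER_GROUP, 2.55, 1.28, N_PER_GROUP)

# Perceived speech rate: high versus low speech rate condition.
t_speech_rate, df_speech_rate = pooled_t(5.75, 1.65, N_PER_GROUP, 4.50, 1.32, N_PER_GROUP)

print(f"Intonation: t({df_intonation}) = {t_intonation:.2f}")
print(f"Speech rate: t({df_speech_rate}) = {t_speech_rate:.2f}")
```

Under the equal-split assumption, both statistics round to the values reported above.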

In the main experiment, measurement instruments from extant literature were employed to measure the constructs in our conceptual model (see Table 2). In particular, the measures for parasocial attraction (three items) were adopted from McLean and Osei-Frimpong (2019), perceived anthropomorphism (four items) was based on Bartneck et al. (2009a), perceived trust (twelve items) was taken from Elkins and Derrick (2013), whereas the control measures on AI anxiety (three items), AI privacy concerns (four items) and tech-savviness (two items) were respectively based on work from Wang and Wang (2022), and adopted from McLean and Osei-Frimpong (2019) and Ng (2012). Parasocial attraction and the control variables AI anxiety, AI privacy concerns and tech-savviness were measured using seven-point Likert scales, ranging from “strongly disagree” to “strongly agree”. The remaining constructs relied on five-point semantic differential scales. As noted, we also measured gender, age, education, level of English listening skills and prior experience with VAs.

In line with existing service research (e.g. Choi et al., 2024; Fritze et al., 2020), the proposed conceptual model is assessed with partial least squares structural equation modeling (PLS-SEM), an iterative combination of principal component analysis and ordinary least squares path analysis (Chin, 1998), using the software package SmartPLS 4.0 (Ringle et al., 2024). To generate robust standard errors and t-statistics, the bootstrapping procedure used 10,000 resamples (Hair et al., 2016).

To evaluate the measurement model, we examine its internal reliability, convergent and discriminant validity (see Tables 2 and 3; Hair et al., 2016). First, the composite reliability values for all multi-item constructs – including control variables – ranged from 0.94 to 0.96, exceeding the recommended threshold value of 0.70 (Hair et al., 2011). Second, in support of acceptable convergent validity, all average variance extracted (AVE) values exceed 0.50 (Fornell and Larcker, 1981). Third, discriminant validity was established as the square root of the AVE exceeds the inter-construct correlations for all multi-item constructs (Fornell and Larcker, 1981). In addition, the highest heterotrait-monotrait (HTMT) value is 0.752, which is below the suggested threshold of 0.85 (Henseler et al., 2015; Voorhees et al., 2016).
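For readers who want to trace the reliability and convergent validity figures, composite reliability (CR) and AVE can be computed directly from standardized factor loadings. The sketch below uses the conventional formulas and the loadings reported in Table 2 for two constructs; it is an illustration rather than the SmartPLS output itself, and the function and variable names are hypothetical.

```python
def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """CR: (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1 - l ** 2 for l in loadings)
    return squared_sum / (squared_sum + error_variance)


# Standardized loadings as reported in Table 2.
PARASOCIAL_ATTRACTION = [0.932, 0.937, 0.955]
TECH_SAVVINESS = [0.936, 0.943]

for name, loadings in [
    ("Parasocial attraction", PARASOCIAL_ATTRACTION),
    ("Tech savviness", TECH_SAVVINESS),
]:
    cr = composite_reliability(loadings)
    ave = average_variance_extracted(loadings)
    # Thresholds cited in the text: CR above 0.70, AVE above 0.50.
    print(f"{name}: CR = {cr:.3f}, AVE = {ave:.3f}, "
          f"passes thresholds: {cr > 0.70 and ave > 0.50}")
```

Because the published loadings are rounded to three decimals, recomputed values for other constructs may differ from Table 2 in the last digit.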

Table 2

Factor loadings, composite reliability and average variance extracted of the constructs and their items

| Components and manifest variables | Loading (t-value) |
| --- | --- |
| Parasocial attraction | CR: 0.959, AVE: 0.886 |
| I think Allison could be a friend of mine | 0.932 (113.66)*** |
| I had a good time with Allison | 0.937 (148.66)*** |
| I would like to spend more time with Allison | 0.955 (186.07)*** |
| Anthropomorphism | CR: 0.945, AVE: 0.812 |
| Please rate your impression of Allison: Fake – Natural | 0.900 (99.95)*** |
| Please rate your impression of Allison: Machinelike – Humanlike | 0.906 (89.19)*** |
| Please rate your impression of Allison: Unconscious – Conscious | 0.869 (67.15)*** |
| Please rate your impression of Allison: Artificial – Lifelike | 0.929 (120.49)*** |
| Trust | CR: 0.951, AVE: 0.620 |
| Please rate your impression of Allison: Undependable – Dependable | 0.783 (32.00)*** |
| Please rate your impression of Allison: Dishonest – Honest | 0.774 (39.43)*** |
| Please rate your impression of Allison: Unreliable – Reliable | 0.824 (51.31)*** |
| Please rate your impression of Allison: Unknowledgeable – Knowledgeable | 0.802 (42.22)*** |
| Please rate your impression of Allison: Unqualified – Qualified | 0.829 (49.75)*** |
| Please rate your impression of Allison: Unskilled – Skilled | 0.795 (38.21)*** |
| Please rate your impression of Allison: Uninformed – Informed | 0.808 (44.01)*** |
| Please rate your impression of Allison: Incompetent – Competent | 0.822 (51.40)*** |
| Please rate your impression of Allison: Unfriendly – Friendly | 0.686 (31.91)*** |
| Please rate your impression of Allison: Uncheerful – Cheerful | 0.767 (49.17)*** |
| Please rate your impression of Allison: Unkind – Kind | 0.773 (47.10)*** |
| Please rate your impression of Allison: Unpleasant – Pleasant | 0.774 (48.68)*** |
| AI anxiety | CR: 0.941, AVE: 0.842 |
| I find AI techniques/products (e.g. voice-controlled intelligent personal assistants) scary | 0.951 (22.19)*** |
| I find AI techniques/products (e.g. voice-controlled intelligent personal assistants) intimidating | 0.856 (16.10)*** |
| I do not know why, but AI techniques/products (e.g. voice-controlled intelligent personal assistants) scare me | 0.944 (24.68)*** |
| AI privacy concerns | CR: 0.943, AVE: 0.804 |
| I have my doubts about the confidentiality of my interactions with voice-controlled intelligent personal assistants | 0.884 (76.94)*** |
| I am concerned to perform a financial transaction via voice-controlled intelligent personal assistants | 0.852 (51.13)*** |
| I am concerned that my personal details stored with voice-controlled intelligent personal assistants could be stolen | 0.922 (86.76)*** |
| I am concerned that voice-controlled intelligent personal assistants collect too much information about me | 0.927 (113.48)*** |
| Tech savviness | CR: 0.938, AVE: 0.883 |
| I am constantly being sought after by people for advice on new digital technology | 0.936 (88.00)*** |
| I am typically one of the first to use new digital technology when it appears | 0.943 (92.10)*** |

Note(s): CR: composite reliability; AVE: average variance extracted; ***denotes p < 0.001

Source(s): Authors’ own work

Table 3

Correlations and square root of the average variance extracted

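| Multi-item construct | 1 | 2 | 3 | 4 | 5 | 6 |
| --- | --- | --- | --- | --- | --- | --- |
| 1. Parasocial attraction | 0.941 | | | | | |
| 2. Anthropomorphism | 0.700 | 0.901 | | | | |
| 3. Trust | 0.626 | 0.631 | 0.787 | | | |
| 4. AI anxiety | −0.074 | −0.021 | −0.168 | 0.918 | | |
| 5. AI privacy concerns | −0.360 | −0.390 | −0.336 | 0.347 | 0.897 | |
| 6. Tech savviness | 0.293 | 0.170 | 0.135 | −0.154 | −0.151 | 0.939 |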

Note(s): Values down the diagonal are the square roots of the AVE; all others are correlation coefficients

Source(s): Authors’ own work
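The Fornell-Larcker criterion applied to Table 3 can be expressed as a simple check: for every pair of constructs, the absolute correlation must be smaller than the square root of the AVE of both constructs. The following minimal sketch encodes the lower triangle of Table 3; the names `LOWER_TRIANGLE` and `fornell_larcker_violations` are hypothetical.

```python
CONSTRUCTS = [
    "Parasocial attraction",
    "Anthropomorphism",
    "Trust",
    "AI anxiety",
    "AI privacy concerns",
    "Tech savviness",
]

# Lower triangle of Table 3; the last value in each row is the square root of the AVE.
LOWER_TRIANGLE = [
    [0.941],
    [0.700, 0.901],
    [0.626, 0.631, 0.787],
    [-0.074, -0.021, -0.168, 0.918],
    [-0.360, -0.390, -0.336, 0.347, 0.897],
    [0.293, 0.170, 0.135, -0.154, -0.151, 0.939],
]


def fornell_larcker_violations(lower):
    """Return construct index pairs whose correlation is not below both sqrt(AVE) values."""
    violations = []
    for i, row in enumerate(lower):
        for j in range(i):
            correlation = abs(row[j])
            if correlation >= lower[i][i] or correlation >= lower[j][j]:
                violations.append((i, j))
    return violations


violations = fornell_larcker_violations(LOWER_TRIANGLE)
if violations:
    for i, j in violations:
        print(f"Violation: {CONSTRUCTS[i]} vs {CONSTRUCTS[j]}")
else:
    print("Fornell-Larcker criterion satisfied for all construct pairs")
```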

Prior to evaluating the structural model and the hypothesized paths, we assess the overall fit of the model. As illustrated in Figure 3, the R² values for all inner latent constructs range from 0.248 to 0.593, representing medium to large values (Chin, 1998). That is, the R² values are 0.248, 0.507 and 0.593 for parasocial attraction, perceived trust and anthropomorphism, respectively.

Figure 3
A path model links VA paralinguistic features with parasocial attraction, anthropomorphism, and trust. The diagram shows two section headings at the top labeled “VA paralinguistic features” on the left and “Consumers’ social perceptions” on the right, with the right section enclosed by a dashed rectangular boundary. Under “VA paralinguistic features”, a rectangular box labeled “Intonation” is positioned on the left and connects with a straight rightward arrow to an oval labeled “Parasocial attraction” inside the dashed boundary. A rectangular box labeled “Speech rate” is positioned above the arrow and links downward with a vertical arrow to the horizontal arrow between “Intonation” and “Parasocial attraction”, and this vertical arrow is labeled “0.295 asterisk”. Inside the dashed boundary under “Consumers’ social perceptions”, the oval labeled “Parasocial attraction” shows the label “R-squared equals 0.248” above it and connects with a diagonal upward rightward arrow labeled “0.597 triple asterisk” to an oval labeled “Anthropomorphism”, which shows the label “R-squared equals 0.593” above it. The oval labeled “Anthropomorphism” connects with a diagonal downward rightward arrow labeled “0.324 triple asterisk” to an oval labeled “Trust”, which shows the label “R-squared equals 0.507” above it. The oval labeled “Parasocial attraction” also connects with a straight rightward arrow to the oval labeled “Trust”, and this arrow is labeled “0.368 triple asterisk”.

Structural model results. Notes: ***denotes p < 0.001, **denotes p < 0.01, *denotes p < 0.05. Source: Authors’ own work


Testing the proposed hypotheses [7], the results indicate that H2, H3a and H3b are statistically significant at p < 0.001, and H1 at p < 0.05. Specifically, VA voice intonation (β = 0.039, p = 0.708) and speech rate (β = −0.085, p = 0.396) do not drive parasocial attraction of a VA separately – rather these effects are qualified by a significant two-way interaction effect (β = 0.295, p = 0.042) [8]. As illustrated in Figure 4, parasocial attraction is virtually equal for a VA with high voice intonation (M = 2.84) and with low voice intonation (M = 2.77), when speech rate is low. Conversely, when speech rate is high, VA parasocial attraction is higher when voice intonation is high (M = 3.20) than for VAs with low voice intonation (M = 2.67). Taken together, these results provide support for H1.
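The pattern behind the interaction can be made concrete by computing the simple effects of voice intonation at each speech rate level from the four cell means reported above. This is an illustrative sketch on the raw scale; it does not reproduce the standardized PLS coefficient (β = 0.295), and the names used are hypothetical.

```python
# Cell means of parasocial attraction, keyed by (voice intonation, speech rate).
CELL_MEANS = {
    ("low", "low"): 2.77,
    ("high", "low"): 2.84,
    ("low", "high"): 2.67,
    ("high", "high"): 3.20,
}


def simple_effect_of_intonation(speech_rate: str) -> float:
    """Difference between high and low intonation at a given speech rate."""
    return CELL_MEANS[("high", speech_rate)] - CELL_MEANS[("low", speech_rate)]


effect_at_low_rate = simple_effect_of_intonation("low")
effect_at_high_rate = simple_effect_of_intonation("high")

# Interaction contrast: how much the intonation effect changes with speech rate.
interaction_contrast = effect_at_high_rate - effect_at_low_rate

print(f"Intonation effect at low speech rate:  {effect_at_low_rate:.2f}")
print(f"Intonation effect at high speech rate: {effect_at_high_rate:.2f}")
print(f"Difference in differences:             {interaction_contrast:.2f}")
```

The intonation effect is close to zero at a low speech rate and clearly positive at a high speech rate, which is the moderation pattern described in the text.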

Figure 4
A bar chart showing parasocial attraction to a VA by voice intonation and speech rate. The bar chart shows the vertical axis labeled “Parasocial attraction to a VA”, ranging from 0 to 4 in increments of 1 unit, and the horizontal axis labeled “Voice intonation” with two categories from left to right, “Low” and “High”. Each voice intonation category includes two vertical bars with error bars representing speech rate conditions. A legend on the right labels “Speech rate” with light bars representing “Low” and dark bars representing “High”. Under the “Low” voice intonation category, the bar for low speech rate shows a value of 2.77 with upper and lower error bars, and the bar for high speech rate shows a value of 2.67 with upper and lower error bars. Under the “High” voice intonation category, the bar for low speech rate shows a value of 2.84 with upper and lower error bars, and the bar for high speech rate shows a value of 3.20 with upper and lower error bars. A horizontal bracket labeled “0.002 double asterisk” spans across three bars, covering the high speech rate bars under both “Low” and “High” voice intonation and the low speech rate bar under “High” voice intonation. Another horizontal bracket labeled “0.037 asterisk” spans only the two bars under the “High” voice intonation category, covering the low speech rate bar and the high speech rate bar.

Two-way interaction effect of VA voice intonation and speech rate on parasocial attraction. Notes: 95% confidence interval error bars; ***denotes p < 0.001, **denotes p < 0.01, *denotes p < 0.05; only significant differences are highlighted. Source: Authors’ own work


Furthermore, parasocial attraction of a VA exerts a positive effect on perceived trust (β = 0.368, p < 0.001), in support of H2 [9]. Beyond this direct effect, heightened parasocial attraction drives trust indirectly (indirect effect: β = 0.194, p < 0.001), where parasocial attraction positively influences perceived anthropomorphism (β = 0.597, p < 0.001) [10] and perceived anthropomorphism influences trust (β = 0.324, p < 0.001), in line with H3a and H3b.
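The mediation logic can be illustrated with the product-of-coefficients approach: the indirect effect is the product of the two constituent paths, and the total effect is the sum of the direct and indirect effects. The sketch below uses the rounded coefficients reported above, so the product differs marginally from the reported indirect effect of 0.194, which was presumably computed from unrounded estimates; the variable names are hypothetical.

```python
# Standardized path coefficients as reported in the text and Figure 3.
ATTRACTION_TO_ANTHROPOMORPHISM = 0.597  # H3a path
ANTHROPOMORPHISM_TO_TRUST = 0.324       # H3b path
ATTRACTION_TO_TRUST = 0.368             # H2, direct effect

# Product-of-coefficients estimate of the indirect effect.
indirect_effect = ATTRACTION_TO_ANTHROPOMORPHISM * ANTHROPOMORPHISM_TO_TRUST

# Total effect of parasocial attraction on trust.
total_effect = ATTRACTION_TO_TRUST + indirect_effect

# Share of the total effect that runs through perceived anthropomorphism.
mediated_share = indirect_effect / total_effect

print(f"Indirect effect: {indirect_effect:.3f}")
print(f"Total effect:    {total_effect:.3f}")
print(f"Mediated share:  {mediated_share:.1%}")
```

Because the direct effect remains significant alongside the indirect effect, the pattern corresponds to partial rather than full mediation.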

Using social agency theory as a dominant theoretical lens, the reported experimental study examined how VAs’ frequency and time-related features shape consumers’ parasocial attraction to a VA, along with subsequent social perception outcomes. The empirical evidence presented in Figures 3 and 4 highlights several key findings.

The results provide empirical support for a moderating role of VA speech rate on the effect of voice intonation on consumers’ parasocial attraction to a VA. When the speech rate is low, parasocial attraction is roughly equal for low and high voice intonation; when it is high, however, parasocial attraction is higher for a VA with high voice intonation. In addition, it was found that consumers’ parasocial attraction directly drives trust in verbal interactions with a VA. Moreover, we found that perceived anthropomorphism mediates this effect, such that parasocial attraction to a VA positively influences perceived anthropomorphism, which in turn enhances perceived trust in a VA.

The current study makes several theoretical contributions. First, by applying social agency theory to clarify the role of parasocial attraction in consumers’ trust formation in smart service encounters, this study moves beyond previous research that prioritizes perceptions of convenience (Malodia et al., 2023), warmth and competence (Dandotiya et al., 2024) as drivers of trust, by devoting particular attention to key manifestations of parasocial relationships (Aw et al., 2022; Kang et al., 2024). In so doing, we respond to Mele and Russo-Spena’s (2024) assertion that adopting a relational ontology approach to smart service systems can offer a more comprehensive perspective that accounts for the significance of relationships and interconnectedness. In detail, we examine and establish how consumers’ parasocial attraction to a VA affects their trust formation, thereby offering new insights into the drivers of trust in voice-based smart service systems.

Second, our reliance on social agency theory as theoretical foundation also underpins the contributions we offer regarding how a VA’s voice-related features interact to affect parasocial attraction. Rather than functional design aspects, like reliability and responsiveness (Dandotiya et al., 2024), our findings instead reiterate the importance of behavioral design aspects (Blut et al., 2021) for enhancing the overall customer experience in smart service systems. In accordance with predictions that the interplay of verbal behaviors determines the effectiveness of VAs’ communicative behaviors in this context (van Pinxteren et al., 2020), and particularly the interplay of frequency- and time-related paralinguistic features (de Waele et al., 2019; Rodero et al., 2022), we empirically demonstrate how VA voice intonation and speech rate – key vocal behavior characteristics – shape consumers’ parasocial attraction to a VA. Such novel insights into the complex nature and interplay of paralinguistic features (de Waele et al., 2019; Guyer et al., 2019; Rodero, 2015; van Pinxteren et al., 2020; Wetzels et al., 2023) help deepen our understanding of the relationship dynamics among VA voice features and parasocial attraction in smart service systems.

Third, this study captures how consumers form trust in smart service systems, namely, through perceived anthropomorphism of a VA. Many factors can influence trust in VAs, such as warmth, competence (Dandotiya et al., 2024) and status seeking (Malodia et al., 2023), but the proposed and empirically supported mediation model, which incorporates anthropomorphism, offers additional insights into how perceptions of anthropomorphism that arise in smart service encounters influence trust in smart service systems.

Successfully infusing VAs into smart service systems at service frontlines promises notable benefits for both service consumers and providers (Beverungen et al., 2019; de Keyser et al., 2019; Mahr and Huh, 2022). In particular, parasocial relationships with VAs (Marinova et al., 2017; Mele and Russo-Spena, 2024) can encourage users’ trust, even within service contexts where trust tends to be hard to establish. Concretely, VA designers and service managers can use the findings of this study to enhance their smart service interactions and unlock mutual stakeholder benefits.

Our findings challenge current practices that tend to focus on a subset of vocal cues, in isolation. For example, Amazon’s Alexa offers users the option to modify speech rate, along with an adaptive listening feature. That is, it provides consumers the option to take more time before the VA responds (Amazon, 2024b). Similarly, Apple’s Siri allows users to adjust the VA’s speaking rate and pause time (Apple, 2024). However, to facilitate parasocial human–VA relationships, designers and service managers must account for the interplay between individual vocal cues in VA communicative behaviors. Enabling service consumers to modify multiple VA vocal cues (Cheng, 2023), such as its voice intonation and speech rate, could make VAs more socially attractive. To help achieve a high expressivity balance, service consumers could be presented with vertically stacked sliders, in which, by default, but not limited to, adjusting one slider automatically moves the other slider proportionally. Designs of this nature could nudge consumers to maintain the balance.

The heightened parasocial attraction that likely results from allowing service consumers to tailor VAs within smart service systems, in turn, should carve new pathways for developing parasocial relationships with a VA, by facilitating human-like perceptions and the formation of trust. Vulnerable consumers could especially benefit from such targeted adjustments and expanded functionalities. These inclusive design principles could turn commercial VAs into assistive technologies that could effectively complement an individual’s skills (Masina et al., 2020), without fears of feeling exposed or losing autonomy and dignity associated with the typical assistive technology (Yusif et al., 2016). For instance, navigating portals for healthcare services, filing applications for government aid or managing finances can be overwhelming for elderly, disabled or low (digitally) literate service consumers (Abdolrahmani et al., 2018; Zoorob et al., 2022). Paradoxically, such groups could benefit most from using smart service systems. Reducing distrust by allowing these consumers to adjust a VA’s voice features to their liking enables such smart service systems to simplify processes and guide vulnerable consumers step-by-step or perform (part of) these tasks directly. Unlocking a smart service system’s unique potential for personalization of accessibility features (Mende et al., 2024; Stead et al., 2022) offers clear paths forward to make services more inclusive for vulnerable consumers.

Although the present study offers insights into the role of VAs’ paralinguistic features in shaping consumers’ parasocial relationships with VAs and subsequent social perceptions, it also has some limitations. First, despite the carefully designed experiment in this study, and Prolific’s superior data quality in comparison with university subject pools (Peer et al., 2017) and other online crowdsourcing platforms (Douglas et al., 2023), service consumers are inherently more immersed in real-life service interactions (Leiño Calleja et al., 2023). Continued service research could, therefore, gather field data instead to further corroborate and extend the findings presented herein.

Second, while the formation of perceptions based purely on voice is driven mainly by paralinguistic features, and less so by linguistic content (Baird et al., 2017), it is possible that the linguistic content of the audio fragment (i.e. capabilities and potential of VAs) used in the present study has exerted some influence on the formation of social perceptions of the VA. Future research could use more neutral linguistic content that still reinforces that participants are listening to a computer-generated voice, while limiting potential effects of discussing judgmental VA-related topics.

Third, this paper examines the interplay between VA voice intonation and speech rate. Building further on extant literature in communication (e.g. Burgoon et al., 1990), social psychology (e.g. Robbins and DeNisi, 1994) and human–computer interaction (e.g. Bartneck et al., 2009a; Wagner et al., 2019), future studies could further explore relevant non-vocal characteristics related to the service consumer (e.g. consumer gender; Chang et al., 2018), the context (e.g. type of service interaction or touch point; Hottat et al., 2023) and/or verbal cues (e.g. communication strategy; de Waele et al., 2019) in this particular setting.

Fourth, based on existing human–computer interaction research, this paper zooms in on consumers’ social perception outcomes related to anthropomorphism and VA trust, along with their interrelations. Using our findings as a blueprint, continued research could apply a similar logic and explore additional social perceptions of VAs, e.g. perceived agency, rapport, (emotional) intelligence, animacy, social presence, sociability (Appel et al., 2012; Blut et al., 2021; Gao et al., 2010). These concepts are well-documented in human–computer interaction literature and drawing on fundamental service and marketing principles could help to uncover their interrelationships. Such efforts would be particularly beneficial when moving beyond one-sided communication settings, that is, we encourage future research to further explore the paralinguistic properties of two-sided human-VA interactions.

Fifth, in a related extension, whereas the present study’s main focus is on the social perception outcomes of parasocial attraction to a VA, researchers in this field might explore other parasocial outcomes (e.g. parasocial interaction, parasocial attachment, parasocial identification; Giles, 2002; Rubin et al., 1987; Shan et al., 2020; Stever, 2017), or even performance-related outcomes (e.g. service performance, financial performance; Henkel et al., 2020; Marti et al., 2024) of these constructs. Such research efforts would deepen our understanding of the antecedents and outcomes of parasocial VA mechanisms and of the underlying communicative processes in smart service systems at technology-infused service frontlines. These insights are critical for enhancing service encounters and experiences, across the customer journey, for both consumers and employees in this context.

Sixth, consumers increasingly engage with smart services for highly sensitive tasks, like checking financial account balances (Rao, 2017), conducting monetary transactions (Rao, 2016), applying for official government documents such as passports (Parsons, 2019), and managing their personal health care appointments, laboratory results, or virtual consultations (One Medical, 2023). Despite the technological sophistication and convenience of these services, they continue to evoke substantial consumer distrust (Klaus and Zaichkowsky, 2020) and skepticism, often rooted in concerns about data privacy, security breaches, algorithmic opacity and perceived impersonality. Considering the nuanced nature of consumer trust and its unique drivers across service contexts, we call for empirical research that investigates task sensitivity as a potential moderating variable.

1.

That is, an individual’s assessment of how much an interlocutor can be trusted (Elkins and Derrick, 2013) and a crucial prerequisite to sustain an interpersonal relationship (O'Connor and Barclay, 2017), as well as parasocial relationships (Hudders and Lou, 2023).

2.

Defined as the ability of an individual or entity to stimulate social interaction (Preece, 2001), and in the context of VAs as the extent to which individuals perceive a VA as a socially attractive communication partner (Lee et al., 2006).

3.

Referring to the act of attributing human-like characteristics to non-human entities such as a computer, or robot (Becker-Olsen and Hill, 2006; Epley et al., 2007), which represents a cornerstone in connecting emotionally with an interlocutor (Blut et al., 2021) and facilitates social comparisons (Festinger, 1954). These processes are critical for developing and maintaining parasocial relationships with non-human entities (Giles, 2002; Klimmt et al., 2013).

4.

As participants were given the option to listen to the fragment as many times as needed, we only excluded participants based on the lower bound of completion duration.

5.

Categorizing speech rate as low or high remains a subjective process, with diverging categorizations across contexts (e.g. Pimsleur et al., 1977; Rodero, 2020; Tauroza and Allison, 1990). In response, we rely on the outer boundaries of causing excessive eeriness (Ketrow, 1990) instead and vary speech rates within this window.

6.

In line with the pre-test, the result of the manipulation checks in the main study demonstrate that participants reported significantly higher perceived intonation in the high intonation conditions (Mintonationhigh = 4.36, SD = 2.39) in comparison to the low intonation conditions (Mintonationlow = 3.15, SD = 2.16; t (578) = 6.37; p < 0.001). In addition, significantly higher levels of perceived speech rate were found in the high speech rate conditions (Mspeechratehigh = 5.66, SD = 1.50) compared to the low speech rate conditions (Mspeechratelow = 4.71, SD = 1.47; t (578) = 7.70; p < 0.001).

7.

Additional analyses revealed that, except for the main effect of VA voice intonation on perceived anthropomorphism (β = 0.262, p = 0.001), no statistically significant effects of VA voice intonation and/or speech rate on perceived anthropomorphism and trust were found.

8.

Of the included controls, AI anxiety (β = 0.081, p = 0.043), AI privacy concerns (β = −0.351, p < 0.001), education (β = 0.184, p = 0.018; baseline: secondary education), level of English listening skills (β = −0.368, p = 0.002; baseline: non-native) and tech-savviness (β = 0.270, p < 0.001) reached statistical significance. For education, the category “None of the above” (N = 7), and for gender, the categories “Non-binary” (N = 6) and “Prefer not to say” (N = 5) were excluded pairwise due to low sample sizes.

9.

From the set of included control variables, the effects of AI anxiety (β = −0.130, p = 0.001) and gender (β = 0.184, p = 0.002; baseline: male) were statistically significant.

10.

Among the included controls, AI privacy concerns (β = −0.221, p < 0.001), age (β = 0.203, p < 0.001) and gender (β = 0.130, p = 0.022; baseline: male) were found to be statistically significant.
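
As a rough consistency check on the manipulation-check statistics in note 6, the t-values can be recomputed from the reported means and standard deviations. The sketch below assumes a pooled-variance independent-samples t-test and an even split of the 580 participants (290 per level); neither the test variant nor the group sizes are stated in the notes, and the function name is purely illustrative.

```python
import math


def t_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Pooled-variance t statistic and degrees of freedom from summary statistics."""
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_error, df


# Assumption: 580 participants split evenly across the two levels of each factor.
N_PER_LEVEL = 290

# Means and SDs as reported in note 6.
t_intonation, df_intonation = t_from_summary(4.36, 2.39, N_PER_LEVEL, 3.15, 2.16, N_PER_LEVEL)
t_speech_rate, df_speech_rate = t_from_summary(5.66, 1.50, N_PER_LEVEL, 4.71, 1.47, N_PER_LEVEL)

print(f"intonation:  t({df_intonation}) = {t_intonation:.2f}")
print(f"speech rate: t({df_speech_rate}) = {t_speech_rate:.2f}")
```

Under these assumptions the degrees of freedom equal 578, as reported, and the recomputed statistics land close to the reported values of 6.37 and 7.70; small deviations would be expected if the actual cell sizes were unequal.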

Abdolrahmani, A., Kuber, R. and Branham, S.M. (2018), “‘Siri talks at you’: an empirical investigation of voice-activated personal assistant (VAPA) usage by individuals who are blind”, Proceedings of the 20th International ACM SIGACCESS Conference on Computers and Accessibility, pp. 249-258.
Abercrombie, D. (1968), “Paralanguage”, British Journal of Disorders of Communication, Vol. 3 No. 1, pp. 55-59.
Aggarwal, P. and McGill, A.L. (2007), “Is that car smiling at me? Schema congruity as a basis for evaluating anthropomorphized products”, Journal of Consumer Research, Vol. 34 No. 4, pp. 468-479.
Akaka, M.A. and Vargo, S.L. (2014), “Technology as an operant resource in service (eco)systems”, Information Systems and E-Business Management, Vol. 12 No. 3, pp. 367-384.
Akhavan, M. and Mariotti, I. (2023), “Coworking spaces and well-being: an empirical investigation of coworkers in Italy”, Journal of Urban Technology, Vol. 30 No. 1, pp. 95-109.
Amazon (2024a), “How to shop on Amazon with Alexa”, available at: https://www.amazon.com/b?ie=UTF8&node=21341308011 (accessed 23 December 2024).
Amazon (2024b), “Alexa, speak slower”, available at: https://www.amazon.com/b?ie=UTF8&node=21213729011 (accessed 12 January 2024).
Amazon (2024c), “How to adjust the volume of your Echo device”, available at: https://www.amazon.com/b?ie=UTF8&node=21341304011 (accessed 11 December 2024).
Anikin, A. and Persson, T. (2017), “Nonlinguistic vocalizations from online amateur videos for emotion research: a validated corpus”, Behavior Research Methods, Vol. 49 No. 2, pp. 758-771.
Anikin, A., Bååth, R. and Persson, T. (2018), “Human non-linguistic vocal repertoire: call types and their meaning”, Journal of Nonverbal Behavior, Vol. 42 No. 1, pp. 53-80.
Appel, J., von der Pütten, A., Krämer, N.C. and Gratch, J. (2012), “Does humanity matter? Analyzing the importance of social cues and perceived agency of a computer system for the emergence of social reactions during human-computer interaction”, Advances in Human-Computer Interaction, Vol. 2012, pp. 1-10.
Apple (2024), “How to change Siri volume and speaking rate”, available at: https://support.apple.com/en-sa/105092 (accessed 12 January 2024).
Apple, W., Streeter, L.A. and Krauss, R.M. (1979), “Effects of pitch and speech rate on personal attributions”, Journal of Personality and Social Psychology, Vol. 37 No. 5, pp. 715-727.
Ashe, D.D. and McCutcheon, L.E. (2001), “Shyness, loneliness, and attitude toward celebrities”, Current Research in Social Psychology, Vol. 6 No. 9, pp. 124-133.
Ashtar, S., Yom-Tov, G.B., Rafaeli, A. and Wirtz, J. (2024), “Affect-as-information: customer and employee affective displays as expeditious predictors of customer satisfaction”, Journal of Service Research, Vol. 27 No. 4, pp. 525-542.
Atkinson, R.K., Mayer, R.E. and Merrill, M.M. (2005), “Fostering social agency in multimedia learning: examining the impact of an animated agent's voice”, Contemporary Educational Psychology, Vol. 30 No. 1, pp. 117-139.
Avelino, J., Gonçalves, A., Ventura, R., Garcia-Marques, L. and Bernardino, A. (2020), “Collecting social signals in constructive and destructive events during human-robot collaborative tasks”, Companion of the 2020 ACM/IEEE International Conference on Human-Robot Interaction, pp. 107-109.
Aw, E.C.X., Tan, G.W.H., Cham, T.H., Raman, R. and Ooi, K.B. (2022), “Alexa, what’s on my shopping list? Transforming customer experience with digital voice assistants”, Technological Forecasting and Social Change, Vol. 180, pp. 1-13, 121711.
Baird, A., Jørgensen, S.H., Parada-Cabaleiro, E., Hantke, S., Cummins, N. and Schuller, B. (2017), “Perception of paralinguistic traits in synthesized voices”, Proceedings of the 12th International Audio Mostly Conference on Augmented and Participatory Sound and Music Experiences, pp. 1-5.
Bartneck, C., Kulić, D., Croft, E. and Zoghbi, S. (2009a), “Measurement instruments for the anthropomorphism, animacy, likeability, perceived intelligence, and perceived safety of robots”, International Journal of Social Robotics, Vol. 1, pp. 71-81.
Bartneck, C., Kanda, T., Mubin, O. and Al Mahmud, A. (2009b), “Does the design of a robot influence its animacy and perceived intelligence?”, International Journal of Social Robotics, Vol. 1 No. 2, pp. 195-204.
Beattie, A., Edwards, A.P. and Edwards, C. (2020), “A bot and a smile: interpersonal impressions of chatbots and humans using emoji in computer-mediated communication”, in Nah, S., McNealy, J.E., Kim, J.H. and Joo, J. (Eds), Communicating Artificial Intelligence (AI), Routledge, New York, NY, pp. 41-59.
Becker-Olsen, K.L. and Hill, R.P. (2006), “The impact of sponsor fit on brand equity: the case of nonprofit service providers”, Journal of Service Research, Vol. 9 No. 1, pp. 73-83.
Bevacqua, E., Pammi, S., Hyniewska, S.J., Schröder, M. and Pelachaud, C. (2010), “Multimodal backchannels for embodied conversational agents”, in Allbeck, J., Badler, N., Bickmore, T., Pelachaud, C. and Safonova, A. (Eds), Intelligent Virtual Agents: 10th International Conference, Springer, Berlin, Germany, pp. 194-200.
Beverungen, D., Müller, O., Matzner, M., Mendling, J. and Vom Brocke, J. (2019), “Conceptualizing smart service systems”, Electronic Markets, Vol. 29 No. 1, pp. 7-18.
Black, J.W. (1961), “Relationships among fundamental frequency, vocal sound pressure, and rate of speaking”, Language and Speech, Vol. 4 No. 4, pp. 196-199.
Blakemore, E. (2024), “Why do so many virtual assistants have female voices?”, available at: https://www.nationalgeographic.com/science/article/female-voice-assistants-siri-alexa-woman (accessed 10 December 2024).
Blut, M., Wang, C., Wünderlich, N.V. and Brock, C. (2021), “Understanding anthropomorphism in service provision: a meta-analysis of physical robots, chatbots, and other AI”, Journal of the Academy of Marketing Science, Vol. 49 No. 4, pp. 632-658.
Bond, R.N. and Feldstein, S. (1982), “Acoustical correlates of the perception of speech rate: an experimental investigation”, Journal of Psycholinguistic Research, Vol. 11 No. 6, pp. 539-557.
Braun, M., Mainz, A., Chadowitz, R., Pfleging, B. and Alt, F. (2019), “At your service: designing voice assistant personalities to improve automotive user interfaces”, Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, pp. 1-11.
Burgoon, J.K., Birk, T. and Pfau, M. (1990), “Nonverbal behaviors, persuasion, and credibility”, Human Communication Research, Vol. 17 No. 1, pp. 140-169.
Chang, R.C.S., Lu, H.P. and Yang, P. (2018), “Stereotypes or golden rules? Exploring likable voice traits of social robots as active aging companions for tech-savvy baby boomers in Taiwan”, Computers in Human Behavior, Vol. 84, pp. 194-210.
Chattopadhyay, A., Dahl, D.W., Ritchie, R.J.B. and Shahin, K.N. (2003), “Hearing voices: the impact of announcer speech characteristics on consumer response to broadcast advertising”, Journal of Consumer Psychology, Vol. 13 No. 3, pp. 198-204.
Chen, Q.Q. and Park, H.J. (2021), “How anthropomorphism affects trust in intelligent personal assistants”, Industrial Management and Data Systems, Vol. 121 No. 12, pp. 2722-2737.
Cheng, A.N. (2023), “Utilizing different voice value to understand voice assistant users' enjoyment”, in Stephanidis, C., Antona, M., Ntoa, S. and Salvendy, G. (Eds), HCI International, Communications in Computer and Information Science, Springer Nature, Switzerland, pp. 553-560.
Chi, O.H., Jia, S., Li, Y. and Gursoy, D. (2021), “Developing a formative scale to measure consumers’ trust toward interaction with artificially intelligent (AI) social robots in service delivery”, Computers in Human Behavior, Vol. 118, pp. 1-17, 106700.
Chin, W.W. (1998), “The partial least squares approach to structural equation modeling”, Modern Methods for Business Research, Vol. 295 No. 2, pp. 295-336.
Choi, D., Kwak, D., Cho, M. and Lee, S. (2020), “Nobody speaks that fast! An empirical study of speech rate in conversational agents for people with vision impairments”, Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, pp. 1-13.
Choi, L., Kim, M. and Kim, S. (2024), “The role of employee empathy in forming brand love: customer delight and gratitude as mediators and power distance belief as a moderator”, Journal of Service Management, Vol. 35 No. 3, pp. 381-407.
Chung, S. and Cho, H. (2017), “Fostering parasocial relationships with celebrities on social media: implications for celebrity endorsement”, Psychology and Marketing, Vol. 34 No. 4, pp. 481-495.
Cialdini, R.B. (2009), Influence: Science and Practice, Pearson Education, Boston, MA.
Cohn, M., Liang, K.H., Sarian, M., Zellou, G. and Yu, Z. (2021), “Speech rate adjustments in conversations with an Amazon Alexa socialbot”, Frontiers in Communication, Vol. 6, pp. 1-8, 671429.
Connelly, B.L., Crook, T.R., Combs, J.G., Ketchen, D.J., Jr and Aguinis, H. (2018), “Competence- and integrity-based trust in interorganizational relationships: which matters more?”, Journal of Management, Vol. 44 No. 3, pp. 919-945.
Crystal, D. (1974), “Paralinguistics”, in Sebeok, T. (Ed.), Current Trends in Linguistics, Mouton, The Hague, The Netherlands, pp. 265-295.
Dandotiya, G., Gahlot Sarkar, J. and Sarkar, A. (2024), “Comprehending roles of virtual service assistant's warmth and competence for service co-creation versus service recovery”, Journal of Services Marketing, Vol. 38 No. 7, pp. 925-940.
de Barcelos Silva, A., Gomes, M.M., da Costa, C.A., da Rosa Righi, R., Barbosa, J.L.V., Pessin, G., de Doncker, G. and Federizzi, G. (2020), “Intelligent personal assistants: a systematic literature review”, Expert Systems with Applications, Vol. 147, pp. 1-14, 113193.
de Keyser, A., Köcher, S., Alkire, L., Verbeeck, C. and Kandampully, J. (2019), “Frontline service technology infusion: conceptual archetypes and future research directions”, Journal of Service Management, Vol. 30 No. 1, pp. 156-183.
de Visser, E.J., Monfort, S.S., McKendrick, R., Smith, M.A., McKnight, P.E., Krueger, F. and Parasuraman, R. (2016), “Almost human: anthropomorphism increases trust resilience in cognitive agents”, Journal of Experimental Psychology: Applied, Vol. 22 No. 3, pp. 331-349.
de Waele, A., Claeys, A.S. and Cauberghe, V. (2019), “The organizational voice: the importance of voice pitch and speech rate in organizational crisis communication”, Communication Research, Vol. 46 No. 7, pp. 1026-1049.
Doney, P.M. and Cannon, J.P. (1997), “An examination of the nature of trust in buyer–seller relationships”, Journal of Marketing, Vol. 61 No. 2, pp. 35-51.
Douglas, B.D., Ewell, P.J. and Brauer, M. (2023), “Data quality in online human-subjects research: comparisons between MTurk, Prolific, CloudResearch, Qualtrics, and SONA”, PLoS One, Vol. 18 No. 3, pp. 1-17, e0279720.
Dowding, S., Gutwin, C. and Cockburn, A. (2024), “User speech rates and preferences for system speech rates”, International Journal of Human-Computer Studies, Vol. 184, pp. 1-12, 103222.
Edwards, J.R. and Cable, D.M. (2009), “The value of value congruence”, Journal of Applied Psychology, Vol. 94 No. 3, pp. 654-677.
Edwards, C., Edwards, A., Stoll, B., Lin, X. and Massey, N. (2019), “Evaluations of an artificial intelligence instructor's voice: social identity theory in human-robot interactions”, Computers in Human Behavior, Vol. 90, pp. 357-362.
Efthymiou, F. and Hildebrand, C. (2020), “Morphing vulnerable machines: paralinguistic cues in digital voice assistants shape perceptions of physicality, vulnerability and trust”, Proceedings of the European Marketing Academy, pp. 1-4.
Elkins, A.C. and Derrick, D.C. (2013), “The sound of trust: voice as a measurement of trust during interactions with embodied conversational agents”, Group Decision and Negotiation, Vol. 22 No. 5, pp. 897-913.
Epley, N., Waytz, A. and Cacioppo, J.T. (2007), “On seeing human: a three-factor theory of anthropomorphism”, Psychological Review, Vol. 114 No. 4, pp. 864-886.
Epley, N., Waytz, A., Akalis, S. and Cacioppo, J.T. (2008), “When we need a human: motivational determinants of anthropomorphism”, Social Cognition, Vol. 26 No. 2, pp. 143-155.
Feine, J., Gnewuch, U., Morana, S. and Maedche, A. (2019), “A taxonomy of social cues for conversational agents”, International Journal of Human-Computer Studies, Vol. 132, pp. 138-161.
Feng, H., Fawaz, K. and Shin, K.G. (2017), “Continuous authentication for voice assistants”, Proceedings of the 23rd Annual International Conference on Mobile Computing and Networking, pp. 343-355.
Ferguson, R., Schattke, K. and Paulin, M. (2016), “The social context for value co-creations in an entrepreneurial network: influence of interpersonal attraction, relational norms and partner trustworthiness”, International Journal of Entrepreneurial Behavior and Research, Vol. 22 No. 2, pp. 199-214.
Festinger, L. (1954), “A theory of social comparison processes”, Human Relations, Vol. 7 No. 2, pp. 117-140.
Fornell, C. and Larcker, D.F. (1981), “Evaluating structural equation models with unobservable variables and measurement error”, Journal of Marketing Research, Vol. 18 No. 1, pp. 39-50.
Fritze, M.P., Marchand, A., Eisingerich, A.B. and Benkenstein, M. (2020), “Access-based services as substitutes for material possessions: the role of psychological ownership”, Journal of Service Research, Vol. 23 No. 3, pp. 368-385.
Gao, Q., Dai, Y., Fan, Z. and Kang, R. (2010), “Understanding factors affecting perceived sociability of social software”, Computers in Human Behavior, Vol. 26 No. 6, pp. 1846-1861.
Giles, D.C. (2002), “Parasocial interaction: a review of the literature and a model for future research”, Media Psychology, Vol. 4 No. 3, pp. 279-305.
Gkinko, L. and Elbanna, A. (2023), “Designing trust: the formation of employees’ trust in conversational AI in the digital workplace”, Journal of Business Research, Vol. 158, pp. 1-10, 113707.
Glikson, E. and Woolley, A.W. (2020), “Human trust in artificial intelligence: review of empirical research”, The Academy of Management Annals, Vol. 14 No. 2, pp. 627-660.
Gnisci, A. and Pace, A. (2014), “The effects of hand gestures on psychosocial perception: a preliminary study”, Recent Advances of Neural Network Models and Applications: Proceedings of the 23rd Workshop of the Italian Neural Networks Society (SIREN), pp. 305-314.
Go, E. and Sundar, S.S. (2019), “Humanizing chatbots: the effects of visual, identity and conversational cues on humanness perceptions”, Computers in Human Behavior, Vol. 97, pp. 304-316.
Gonçalves, L., Patrício, L., Grenha Teixeira, J. and Wuenderlich, N.V. (2020), “Understanding the customer experience with smart services”, Journal of Service Management, Vol. 31 No. 4, pp. 723-744.
Grewal, D., Guha, A., Schweiger, E., Ludwig, S. and Wetzels, M. (2022), “How communications by AI-enabled voice assistants impact the customer journey”, Journal of Service Management, Vol. 33 Nos 4/5, pp. 705-720.
Guenzi, P., de Luca, L.M. and Spiro, R. (2016), “The combined effect of customer perceptions about a salesperson's adaptive selling and selling orientation on customer trust in the salesperson: a contingency perspective”, Journal of Business and Industrial Marketing, Vol. 31 No. 4, pp. 553-564.
Guha, A., Bressgott, T., Grewal, D., Mahr, D., Wetzels, M. and Schweiger, E. (2023), “How artificiality and intelligence affect voice assistant evaluations”, Journal of the Academy of Marketing Science, Vol. 51 No. 4, pp. 843-866.
Guyer, J.J., Fabrigar, L.R. and Vaughan-Johnston, T.I. (2019), “Speech rate, intonation, and pitch: investigating the bias and cue effects of vocal confidence on persuasion”, Personality and Social Psychology Bulletin, Vol. 45 No. 3, pp. 389-405.
Hair, J.F., Ringle, C.M. and Sarstedt, M. (2011), “PLS-SEM: indeed a silver bullet”, Journal of Marketing Theory and Practice, Vol. 19 No. 2, pp. 139-152.
Hair, J.F., Hult, G.T.M., Ringle, C. and Sarstedt, M. (2016), A Primer on Partial Least Squares Structural Equation Modeling (PLS-SEM), Sage Publications, Thousand Oaks.
Han, S. and Yang, H. (2018), “Understanding adoption of intelligent personal assistants: a parasocial relationship perspective”, Industrial Management and Data Systems, Vol. 118 No. 3, pp. 618-636.
Henkel, A.P., Bromuri, S., Iren, D. and Urovi, V. (2020), “Half human, half machine–augmenting service employees with AI for interpersonal emotion regulation”, Journal of Service Management, Vol. 31 No. 2, pp. 247-265.
Henkens, B., Verleye, K. and Lariviere, B. (2021), “The smarter, the better?! Customer well-being, engagement, and perceptions in smart service systems”, International Journal of Research in Marketing, Vol. 38 No. 2, pp. 425-447.
Henseler, J., Ringle, C.M. and Sarstedt, M. (2015), “A new criterion for assessing discriminant validity in variance-based structural equation modeling”, Journal of the Academy of Marketing Science, Vol. 43 No. 1, pp. 115-135.
Hernandez-Ortega, B., Aldas-Manzano, J. and Ferreira, I. (2022), “Relational cohesion between users and smart voice assistants”, Journal of Services Marketing, Vol. 36 No. 5, pp. 725-740.
Hernandez-Ortega, B. and Ferreira, I. (2021), “How smart experiences build service loyalty: the importance of consumer love for smart voice assistants”, Psychology and Marketing, Vol. 38 No. 7, pp. 1122-1139.
Hildebrand, C., Efthymiou, F., Busquet, F., Hampton, W.H., Hoffman, D.L. and Novak, T.P. (2020), “Voice analytics in business research: conceptual foundations, acoustic feature extraction, and applications”, Journal of Business Research, Vol. 121, pp. 364-374.
Horton, D. and Wohl, R.R. (1956), “Mass communication and para-social interaction: observations on intimacy at a distance”, Psychiatry, Vol. 19 No. 3, pp. 215-229.
Hottat, E., Leroi-Werelds, S. and Streukens, S. (2023), “To automate or not to automate? A contingency approach to service automation”, Journal of Service Management, Vol. 34 No. 4, pp. 696-724.
Hu, P., Lu, Y. and Gong, Y.Y. (2021), “Dual humanness and trust in conversational AI: a person-centered approach”, Computers in Human Behavior, Vol. 119, pp. 1-18, 106727.
Huang, R., Kim, M. and Lennon, S. (2024), “Voice-based personal assistant (VPA) trust: investigating competence and integrity”, Telematics and Informatics Reports, Vol. 14, pp. 1-12, 100140.
Hudders, L. and Lou, C. (2023), “The rosy world of influencer marketing? Its bright and dark sides, and future research recommendations”, International Journal of Advertising, Vol. 42 No. 1, pp. 151-161.
Hutchinson, A. (2025), “YouTube expands voice replies, adds shorts quiz sticker”, available at: https://www.socialmediatoday.com/news/youtube-adds-shorts-quiz-sticker-expands-audio-replies/748677/ (accessed 21 May 2025).
Imhof, M. (2010), “Listening to voices and judging people”, International Journal of Listening, Vol. 24 No. 1, pp. 19-33.
Ispiryan, D. (2025), “The rise of voice search: optimizing pay-per-click ads”, available at: https://www.forbes.com/councils/forbesagencycouncil/2025/04/10/the-rise-of-voice-search-optimizing-pay-per-click-ads/ (accessed 21 May 2025).
Jackob, N., Roessing, T. and Petersen, T. (2011), “The effects of verbal and nonverbal elements in persuasive communication: findings from two multi-method experiments”, Communications, Vol. 36 No. 2, pp. 245-271.
Jain, S., Basu, S., Dwivedi, Y.K. and Kaur, S. (2022), “Interactive voice assistants–does brand credibility assuage privacy risks?”, Journal of Business Research, Vol. 139, pp. 701-717.
Kang, W., Shao, B., Du, S., Chen, H. and Zhang, Y. (2024), “How to improve voice assistant evaluations: understanding the role of attachment with a socio-technical systems perspective”, Technological Forecasting and Social Change, Vol. 200, pp. 1-20, 123171.
Kaplan, A.D., Kessler, T.T., Brill, J.C. and Hancock, P.A. (2023), “Trust in artificial intelligence: meta-analytic findings”, Human Factors, Vol. 65 No. 2, pp. 337-359.
Ketrow, S.M. (1990), “Attributes of a telemarketer's voice and persuasiveness. A review and synthesis of the literature”, Journal of Direct Marketing, Vol. 4 No. 3, pp. 7-21.
Kim, Y.K. (2019), “The effects of attractiveness of service employee's on interpersonal trust, satisfaction and loyalty”, The Journal of Industrial Distribution and Business, Vol. 10 No. 10, pp. 23-34.
Klaus, P. and Zaichkowsky, J. (2020), “AI voice bots: a services marketing research agenda”, Journal of Services Marketing, Vol. 34 No. 3, pp. 389-398.
Klimmt, C., Hartmann, T. and Schramm, H. (2013), “Parasocial interactions and relationships”, in Psychology of Entertainment, pp. 291-313.
Klohnen, E.C. and Luo, S. (2003), “Interpersonal attraction and personality: what is attractive--self similarity, ideal similarity, complementarity or attachment security?”, Journal of Personality and Social Psychology, Vol. 85 No. 4, pp. 709-722.
Klohnen, E.C. and Mendelsohn, G.A. (1998), “Partner selection for personality characteristics: a couple-centered approach”, Personality and Social Psychology Bulletin, Vol. 24 No. 3, pp. 268-278.
Komiak, S.Y. and Benbasat, I. (2006), “The effects of personalization and familiarity on trust and adoption of recommendation agents”, MIS Quarterly, Vol. 30 No. 4, pp. 941-960.
Kühne, K., Fischer, M.H. and Zhou, Y. (2020), “The human takes it all: humanlike synthesized voices are perceived as less eerie and more likable. Evidence from a subjective ratings study”, Frontiers in Neurorobotics, Vol. 14, pp. 1-15, 593732.
Laricchia, F. (2024), “Number of digital voice assistants in use worldwide 2019-2024”, available at: https://www.statista.com/statistics/973815/worldwide-digital-voice-assistant-in-use/ (accessed 21 May 2025).
Larivière, B., Bowen, D., Andreassen, T.W., Kunz, W., Sirianni, N.J., Voss, C., Wünderlich, N.V. and de Keyser, A. (2017), “Service encounter 2.0: an investigation into the roles of technology, employees and customers”, Journal of Business Research, Vol. 79, pp. 238-246.
Larivière, B., Verleye, K., de Keyser, A., Koerten, K. and Schmidt, A.L. (2024), “The service robot customer experience (SR-CX): a matter of AI intelligences and customer service goals”, Journal of Service Research, Vol. 28 No. 1, pp. 35-56.
Lawson-Guidigbe, C., Louveton, N., Amokrane-Ferka, K., Le Blanc, B. and André, J.M. (2023), “Embodying a virtual agent in a self-driving car: a survey-based study on user perceptions of trust, likeability, and anthropomorphism”, International Journal of Mobile Human Computer Interaction, Vol. 15 No. 1, pp. 1-18.
Lea, M. and Spears, R. (1992), “Paralanguage and social perception in computer-mediated communication”, Journal of Organizational Computing and Electronic Commerce, Vol. 2 Nos 3-4, pp. 321-341.
Lee, K.M., Peng, W., Jin, S.A. and Yan, C. (2006), “Can robots manifest personality?: an empirical test of personality recognition, social responses, and social presence in human–robot interaction”, Journal of Communication, Vol. 56 No. 4, pp. 754-772.
Leiño Calleja, D., Schepers, J. and Nijssen, E.J. (2023), “Some agents are more similar than others: customer orientation of frontline robots and employees”, Journal of Service Management, Vol. 34 No. 6, pp. 27-49.
Liu, W., Jiang, M., Li, W. and Mou, J. (2024), “How does the anthropomorphism of AI chatbots facilitate users’ reuse intention in online health consultation services? The moderating role of disease severity”, Technological Forecasting and Social Change, Vol. 203, pp. 1-18, 123407.
Mahr, D. and Huh, J. (2022), “Technologies in service communication: looking forward”, Journal of Service Management, Vol. 33 Nos 4/5, pp. 648-656.
Malodia, S., Ferraris, A., Sakashita, M., Dhir, A. and Gavurova, B. (2023), “Can Alexa serve customers better? AI-driven voice assistant service interactions”, Journal of Services Marketing, Vol. 37 No. 1, pp. 25-39.
Mariani, M.M., Hashemi, N. and Wirtz, J. (2023), “Artificial intelligence empowered conversational agents: a systematic literature review and research agenda”, Journal of Business Research, Vol. 161, pp. 1-23, 113838.
Marikyan, D., Papagiannidis, S., Rana, O.F., Ranjan, R. and Morgan, G. (2022), “Alexa, let's talk about my productivity: the impact of digital assistants on work productivity”, Journal of Business Research, Vol. 142, pp. 572-584.
Marinova, D., de Ruyter, K., Huang, M.H., Meuter, M.L. and Challagalla, G. (2017), “Getting smart: learning from technology-empowered frontline interactions”, Journal of Service Research, Vol. 20 No. 1, pp. 29-42.
Marti, C.L., Liu, H., Kour, G., Bilgihan, A. and Xu, Y. (2024), “Leveraging artificial intelligence in firm-generated online customer communities: a framework and future research agenda”, Journal of Service Management, Vol. 35 No. 3, pp. 438-458.
Masina, F., Orso, V., Pluchino, P., Dainese, G., Volpato, S., Nelini, C., Mapelli, D., Spagnolli, A. and Gamberini, L. (2020), “Investigating the accessibility of voice assistants with impaired users: mixed methods study”, Journal of Medical Internet Research, Vol. 22 No. 9, pp. 1-12, e18431.
Mayer, R.E., Sobko, K. and Mautone, P.D. (2003), “Social cues in multimedia learning: role of speaker's voice”, Journal of Educational Psychology, Vol. 95 No. 2, pp. 419-425.
McLean, G. and Osei-Frimpong, K. (2019), “Hey Alexa… Examine the variables influencing the use of artificial intelligent in-home voice assistants”, Computers in Human Behavior, Vol. 99, pp. 28-37.
McLean, G., Osei-Frimpong, K. and Barhorst, J. (2021), “Alexa, do voice assistants influence consumer brand engagement?–Examining the role of AI powered voice assistants in influencing consumer brand engagement”, Journal of Business Research, Vol. 124, pp. 312-328.
Megehee, C.M., Dobie, K. and Grant, J. (2003), “Time versus pause manipulation in communications directed to the young adult population: does it matter?”, Journal of Advertising Research, Vol. 43 No. 3, pp. 281-292.
Mele, C. and Russo-Spena, T. (2024), “Agencement of onlife and phygital: smart tech–enabled value co-creation practices”, Journal of Service Management, Vol. 36 No. 2, pp. 217-240.
Mele, C., Russo Spena, T. and Kaartemo, V. (2022), “Smart technologies in service provision and experience”, in Edvardsson, B. and Tronvoll, B. (Eds), The Palgrave Handbook of Service Management, Palgrave Macmillan, London, UK, pp. 887-906.
Mende, M., Scott, M.L., Ubal, V.O., Hassler, C.M., Harmeling, C.M. and Palmatier, R.W. (2024), “Personalized communication as a platform for service inclusion? Initial insights into interpersonal and AI-based personalization for stigmatized consumers”, Journal of Service Research, Vol. 27 No. 1, pp. 28-48.
Moore, D.L., Hausknecht, D. and Thamodaran, K. (1986), “Time compression, response opportunity, and persuasion”, Journal of Consumer Research, Vol. 13 No. 1, pp. 85-99.
Morgan, E.E. and Rastatter, M. (1986), “Variability of voice fundamental frequency in elderly female speakers”, Perceptual and Motor Skills, Vol. 63 No. 1, pp. 215-218.
Moussalli, S. and Cardoso, W. (2020), “Intelligent personal assistants: can they understand and be understood by accented L2 learners?”, Computer Assisted Language Learning, Vol. 33 No. 8, pp. 865-890.
Moussawi, S., Koufaris, M. and Benbunan-Fich, R. (2021), “How perceptions of intelligence and anthropomorphism affect adoption of personal intelligent agents”, Electronic Markets, Vol. 31 No. 2, pp. 343-364.
Nasirian, F., Ahmadian, M. and Lee, O.K.D. (2017), “AI-based voice assistant systems: evaluating from the interaction and trust perspectives”, Proceedings of Americas Conference on Information Systems, pp. 1-10.
Nass, C., Steuer, J. and Tauber, E.R. (1994), “Computers are social actors”, Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 72-78.
Nass, C., Moon, Y., Fogg, B.J., Reeves, B. and Dryer, C. (1995), “Can computer personalities be human personalities?”, Conference Companion on Human Factors in Computing Systems, pp. 228-229.
Natarajan, M. and Gombolay, M. (2020), “Effects of anthropomorphism and accountability on trust in human robot interaction”, Proceedings of the 2020 ACM/IEEE International Conference on Human-Robot Interaction, pp. 33-42.
Ng, W. (2012), “Can we teach digital natives digital literacy?”, Computers and Education, Vol. 59 No. 3, pp. 1065-1078.
Niculescu, A., van Dijk, B., Nijholt, A., Li, H. and See, S.L. (2013), “Making social robots more attractive: the effects of voice pitch, humor and empathy”, International Journal of Social Robotics, Vol. 5 No. 2, pp. 171-191.
O'Connor, J. and Barclay, P. (2017), “The influence of voice pitch on perceptions of trustworthiness across social contexts”, Evolution and Human Behavior, Vol. 38 No. 4, pp. 506-512.
One Medical (2023), “One Medical joins Amazon to make it easier for people to get and stay healthier”, available at: https://www.onemedical.com/mediacenter/one-medical-joins-amazon/ (accessed 23 December 2024).
OpenAI (2023), “ChatGPT can now see, hear, and speak”, available at: https://openai.com/index/chatgpt-can-now-see-hear-and-speak/ (accessed 21 May 2025).
Pan, M.K., Croft, E.A. and Niemeyer, G. (2018), “Evaluating social perception of human-to-robot handovers using the robot social attributes scale (ROSAS)”, Proceedings of the 2018 ACM/IEEE International Conference on Human-Robot Interaction, pp. 443-451.
Park, J.H. and Lennon, S.J. (2004), “Television apparel shopping: impulse buying and parasocial interaction”, Clothing and Textiles Research Journal, Vol. 22 No. 3, pp. 135-144.
Parsons, J. (2019), “You can now apply for a passport by asking Alexa or Google”, available at: https://metro.co.uk/2019/04/20/you-can-now-apply-for-a-passport-by-asking-alexa-or-google-9269556/ (accessed 23 December 2024).
Patel, S. and Scherer, K.R. (2013), “Vocal behavior”, in Hall, J.A. and Knapp, M.L. (Eds), Nonverbal Communication, de Gruyter Mouton, Berlin, Germany, pp. 167-204.
Patrício, L., Fisk, R.P., Falcão e Cunha, J. and Constantine, L. (2011), “Multilevel service design: from customer value constellation to service experience blueprinting”, Journal of Service Research, Vol. 14 No. 2, pp. 180-200.
Patterson, M.L. (1973), “Compensation in nonverbal immediacy behaviors: a review”, Sociometry, Vol. 36 No. 2, pp. 237-252.
Peer, E., Brandimarte, L., Samat, S. and Acquisti, A. (2017), “Beyond the Turk: alternative platforms for crowdsourcing behavioral research”, Journal of Experimental Social Psychology, Vol. 70, pp. 153-163.
Pentina, I., Xie, T., Hancock, T. and Bailey, A. (2023), “Consumer–machine relationships in the age of artificial intelligence: systematic literature review and research directions”, Psychology and Marketing, Vol. 40 No. 8, pp. 1593-1614.
Pimsleur, P., Hancock, C. and Furey, P. (1977), “Speech rate and listening comprehension”, in Burt, M., Dulay, H. and Finocchiaro, M. (Eds), Viewpoints on English as a Second Language, Regents, New York.
Preece, J. (2001), “Sociability and usability in online communities: determining and measuring success”, Behaviour and Information Technology, Vol. 20 No. 5, pp. 347-356.
Qin, M., Li, S., Zhu, W. and Qiu, S. (2023), “Trust in service robot: the role of appearance anthropomorphism”, Current Issues in Tourism, Vol. 28, pp. 1-19.
Qiu, L. and Benbasat, I. (2009), “Evaluating anthropomorphic product recommendation agents: a social relationship perspective to designing information systems”, Journal of Management Information Systems, Vol. 25 No. 4, pp. 145-182.
Rao, L. (2016), “You can tell Apple's Siri to send money via PayPal”, available at: https://fortune.com/2016/11/10/siri-paypal/ (accessed 23 December 2024).
Rao, L. (2017), “American Express debuts its first Amazon Alexa skill”, available at: https://fortune.com/2017/05/11/american-express-alexa-skill/ (accessed 23 December 2024).
Rheu, M., Shin, J.Y., Peng, W. and Huh-Yoo, J. (2021), “Systematic review: trust-building factors and implications for conversational agent design”, International Journal of Human-Computer Interaction, Vol. 37 No. 1, pp. 81-96.
Ringle, C.M., Wende, S. and Becker, J.M. (2024), SmartPLS 4, SmartPLS, Boenningstedt, available at: https://www.smartpls.com/ (accessed 12 January 2024).
Robbins, T.L. and DeNisi, A.S. (1994), “A closer look at interpersonal affect as a distinct influence on cognitive processing in performance evaluations”, Journal of Applied Psychology, Vol. 79 No. 3, pp. 341-353.
Rodero, E. (2015), “The principle of distinctive and contrastive coherence of prosody in radio news: an analysis of perception and recognition”, Journal of Nonverbal Behavior, Vol. 39 No. 1, pp. 79-92.
Rodero, E. (2020), “Do your ads talk too fast to your audio audience?: how speech rates of audio commercials influence cognitive and physiological outcomes”, Journal of Advertising Research, Vol. 60 No. 3, pp. 337-349.
Rodero, E., Larrea, O., Rodríguez-de-Dios, I. and Lucas, I. (2022), “The expressive balance effect: perception and physiological responses of prosody and gestures”, Journal of Language and Social Psychology, Vol. 41 No. 6, pp. 659-684.
Roesler, E., Manzey, D. and Onnasch, L. (2021), “A meta-analysis on the effectiveness of anthropomorphism in human-robot interaction”, Science Robotics, Vol. 6 No. 58, pp. 1-10, eabj5425.
Rubin, R.B. and McHugh, M.P. (1987), “Development of parasocial interaction relationships”, Journal of Broadcasting and Electronic Media, Vol. 31 No. 3, pp. 279-292.
Rutherford, M.D. and Kuhlmeier, V.A. (2013), Social Perception: Detection and Interpretation of Animacy, Agency, and Intention, MIT Press.
Salem, M., Eyssel, F., Rohlfing, K., Kopp, S. and Joublin, F. (2013), “To err is human (-like): effects of robot gesture on perceived anthropomorphism and likability”, International Journal of Social Robotics, Vol. 5 No. 3, pp. 313-323.
Scherer, K.R. (1974), “Acoustic concomitants of emotional dimensions: judging affect from synthesized tone sequences”, in Weitz, S. (Ed.), Nonverbal Communication, Oxford University Press, New York, pp. 249-253.
Scherer, K.R., London, H. and Wolf, J.J. (1973), “The voice of confidence: paralinguistic cues and audience evaluation”, Journal of Research in Personality, Vol. 7 No. 1, pp. 31-44.
Schuller, B., Steidl, S., Batliner, A., Burkhardt, F., Devillers, L., Müller, C. and Narayanan, S. (2013), “Paralinguistics in speech and language—state-of-the-art and the challenge”, Computer Speech and Language, Vol. 27 No. 1, pp. 4-39.
Schultz, C.D. and Gorlas, B. (2023), “Magic mirror on the wall: cross-buying at the point of sale”, Electronic Commerce Research, Vol. 23 No. 3, pp. 1677-1700.
Shan, Y., Chen, K.J. and Lin, J.S. (2020), “When social media influencers endorse brands: the effects of self-influencer congruence, parasocial identification, and perceived endorser motive”, International Journal of Advertising, Vol. 39 No. 5, pp. 590-610.
Siau, K. and Wang, W. (2018), “Building trust in artificial intelligence, machine learning, and robotics”, Cutter Business Technology Journal, Vol. 31 No. 2, pp. 47-53.
Siddike, M.A. and Kohda, Y. (2019), “Trust in cognitive assistants: a theoretical framework”, International Journal of Applied Industrial Engineering, Vol. 6 No. 1, pp. 60-71.
Singh, J., Brady, M., Arnold, T. and Brown, T. (2017), “The emergent field of organizational frontlines”, Journal of Service Research, Vol. 20 No. 1, pp. 3-11.
Smith, B.L., Brown, B.L., Strong, W.J. and Rencher, A.C. (1975), “Effects of speech rate on personality perception”, Language and Speech, Vol. 18 No. 2, pp. 145-152.
Stead, S., Wetzels, R., Wetzels, M., Odekerken-Schröder, G. and Mahr, D. (2022), “Toward multisensory customer experiences: a cross-disciplinary bibliometric review and future research directions”, Journal of Service Research, Vol. 25 No. 3, pp. 440-459.
Stern, B.B., Russell, C.A. and Russell, D.W. (2007), “Hidden persuasions in soap operas: damaged heroines and negative consumer effects”, International Journal of Advertising, Vol. 26 No. 1, pp. 9-36.
Stever, G.S. (2017), “Evolutionary theory and reactions to mass media: understanding parasocial attachment”, Psychology of Popular Media Culture, Vol. 6 No. 2, pp. 95-102.
Street, Jr., R.L. and Brady, R.M. (1982), “Speech rate acceptance ranges as a function of evaluative domain, listener speech rate, and communication context”, Communication Monographs, Vol. 49 No. 4, pp. 290-308.
Street, Jr., R.L., Brady, R.M. and Putman, W.B. (1983), “The influence of speech rate stereotypes and rate similarity on listeners' evaluations of speakers”, Journal of Language and Social Psychology, Vol. 2 No. 1, pp. 37-56.
Tajfel, H. and Turner, J.C. (1986), “The social identity theory of intergroup behavior”, in Austin, W.G. and Worchel, S. (Eds), Psychology of Intergroup Relations, Nelson-Hall, Chicago, IL, USA, pp. 7-24.
Tauroza, S. and Allison, D. (1990), “Speech rates in British English”, Applied Linguistics, Vol. 11 No. 1, pp. 90-105.
Teixeira, J.P., Oliveira, C. and Lopes, C. (2013), “Vocal acoustic analysis–jitter, shimmer and HNR parameters”, Procedia Technology, Vol. 9, pp. 1112-1122.
Tripadvisor (2024), “Viator teams up with Amazon Alexa to bring 300,000+ memorable travel experiences to hotel guests”, available at: https://tripadvisor.mediaroom.com/2024-06-25-Viator-Teams-Up-With-Amazon-Alexa-to-Bring-300,000-Memorable-Travel-Experiences-to-Hotel-Guests (accessed 23 December 2024).
Troshani, I., Rao Hill, S., Sherman, C. and Arthur, D. (2021), “Do we trust in AI? Role of anthropomorphism and intelligence”, Journal of Computer Information Systems, Vol. 61 No. 5, pp. 481-491.
van Doorn, J., Mende, M., Noble, S.M., Hulland, J., Ostrom, A.L., Grewal, D. and Petersen, J.A. (2017), “Domo arigato Mr. Roboto: emergence of automated social presence in organizational frontlines and customers' service experiences”, Journal of Service Research, Vol. 20 No. 1, pp. 43-58.
van Pinxteren, M.M., Wetzels, R.W., Rüger, J., Pluymaekers, M. and Wetzels, M. (2019), “Trust in humanoid robots: implications for services marketing”, Journal of Services Marketing, Vol. 33 No. 4, pp. 507-518.
van Pinxteren, M.M., Pluymaekers, M. and Lemmink, J.G. (2020), “Human-like communication in conversational agents: a literature review and research agenda”, Journal of Service Management, Vol. 31 No. 2, pp. 203-225.
Voorhees, C.M., Brady, M.K., Calantone, R. and Ramirez, E. (2016), “Discriminant validity testing in marketing: an analysis, causes for concern, and proposed remedies”, Journal of the Academy of Marketing Science, Vol. 44 No. 1, pp. 119-134.
Wagner, K., Nimmermann, F. and Schramm-Klein, H. (2019), “Is it human? The role of anthropomorphism as a driver for the successful acceptance of digital voice assistants”, Proceedings of the 52nd Hawaii International Conference on System Sciences, pp. 1-10.
Wang, Y.-Y. and Wang, Y.-S. (2022), “Development and validation of an artificial intelligence anxiety scale: an initial application in predicting motivated learning behavior”, Interactive Learning Environments, Vol. 30 No. 4, pp. 619-634.
Wang, C., Li, X., Liang, Z., Sheng, Y., Zhao, Q. and Chen, S. (2024), “The roles of social perception and AI anxiety in individuals' attitudes toward ChatGPT in education”, International Journal of Human-Computer Interaction, Vol. 41 No. 9, pp. 1-18.
Waytz, A. and Epley, N. (2012), “Social connection enables dehumanization”, Journal of Experimental Social Psychology, Vol. 48 No. 1, pp. 70-76.
Waytz, A., Morewedge, C.K., Epley, N., Monteleone, G., Gao, J.H. and Cacioppo, J.T. (2010), “Making sense by making sentient: effectance motivation increases anthropomorphism”, Journal of Personality and Social Psychology, Vol. 99 No. 3, pp. 410-435.
Waytz, A., Heafner, J. and Epley, N. (2014), “The mind in the machine: anthropomorphism increases trust in an autonomous vehicle”, Journal of Experimental Social Psychology, Vol. 52, pp. 113-117.
Wei, C.Z., Kim, Y.H. and Kuzminykh, A. (2023), “The bot on speaking terms: the effects of conversation architecture on perceptions of conversational agents”, Proceedings of the 5th International Conference on Conversational User Interfaces, pp. 1-16.
Weitz, K., Schiller, D., Schlagowski, R., Huber, T. and André, E. (2021), “Let me explain!: exploring the potential of virtual agents in explainable AI interaction design”, Journal on Multimodal User Interfaces, Vol. 15 No. 2, pp. 87-98.
Wetzel, C.G. and Insko, C.A. (1982), “The similarity-attraction relationship: is there an ideal one?”, Journal of Experimental Social Psychology, Vol. 18 No. 3, pp. 253-276.
Wetzels, M., Grewal, D. and Wetzels, R. (2023), “A systematic and visual overview of 25 years of the Journal of Service Research: the journey continues”, Journal of Service Research, Vol. 26 No. 4, pp. 479-492.
Whang, C. and Im, H. (2021), “I like your suggestion! The role of humanlikeness and parasocial relationship on the website versus voice shopper's perception of recommendations”, Psychology and Marketing, Vol. 38 No. 4, pp. 581-595.
White, R.W. (1959), “Motivation reconsidered: the concept of competence”, Psychological Review, Vol. 66 No. 5, pp. 297-331.
Wünderlich, N.V., Blut, M. and Brock, C. (2024), “Enhancing corporate brands through service robots: the impact of anthropomorphic design metaphors on corporate brand perceptions”, Journal of Product Innovation Management, Vol. 41 No. 5, pp. 1022-1046.
Xie, Y., Qu, J., Zhang, Y., Zhou, R. and Chan, A.H.S. (2023), “Speaking, fast or slow: how conversational agents' rate of speech influences user experience”, Universal Access in the Information Society, Vol. 23 No. 4, pp. 1-10.
Xu, K. (2020), “Language, modality, and mobile media use experiences: social responses to smartphone cues in a task-oriented context”, Telematics and Informatics, Vol. 48, pp. 1-13, 101344.
Yuan, C.L., Kim, J. and Kim, S.J. (2016), “Parasocial relationship effects on customer equity in the social media context”, Journal of Business Research, Vol. 69 No. 9, pp. 3795-3803.
Yuan, C., Zhang, C. and Wang, S. (2022), “Social anxiety as a moderator in consumer willingness to accept AI assistants based on utilitarian and hedonic values”, Journal of Retailing and Consumer Services, Vol. 65, pp. 1-11, 102878.
Yusif, S., Soar, J. and Hafeez-Baig, A. (2016), “Older people, assistive technologies, and the barriers to adoption: a systematic review”, International Journal of Medical Informatics, Vol. 94, pp. 112-116.
Zoorob, D., Hasbini, Y., Chen, K., Wangia-Anderson, V., Moussa, H., Miller, B. and Brobst, D. (2022), “Ageism in healthcare technology: the older patients’ aspirations for improved online accessibility”, JAMIA Open, Vol. 5 No. 3, pp. 1-6, ooac061.
Published by Emerald Publishing Limited. This article is published under the Creative Commons Attribution (CC BY 4.0) licence. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and non-commercial purposes), subject to full attribution to the original publication and authors. The full terms of this licence may be seen at Link to the terms of the CC BY 4.0 licence.
