Both the literature on advice-taking from humans and the literature on advice-taking from algorithms suggest that providing advice rationales—information explaining how an adviser arrived at a recommendation—increases advice-taking. We examine how accountability, in the form of managers having to justify their judgments and decisions to superiors, moderates this effect, and whether the moderating effect depends on the adviser being a human or an algorithm.
We use an experiment in which we manipulate the type of adviser (human vs algorithm), the presence of an advice rationale (present vs absent), and accountability (low vs high), and measure the extent of advice-taking.
For human advisers, we find that receiving an advice rationale increases advice-taking more under high accountability than under low accountability. However, a significant three-way interaction shows that this pattern weakens when the adviser is an algorithm: the positive effect of an advice rationale on advice-taking from an algorithm is not moderated by accountability but is consistently strong.
Our results emphasize the importance of providing advice rationales, especially when managers are operating under high accountability. Firms may consider the cost–benefit trade-off of providing advice rationales in the case of human advisers. In settings of low accountability, the benefits of providing an immediate advice rationale for the advisee may not outweigh the costs for the adviser. In the case of algorithmic advisers, providing an advice rationale is essential, regardless of the accountability under which a potential advice-taker operates.
We advance both the literature on advice-taking from humans and the literature on advice-taking from algorithms by showing how the effects of advice rationales depend on the level of accountability, which is ubiquitous in the managerial context. Given that generating advice rationales can be costly and that accountability can have negative effects, these insights are also of major importance to practitioners who need to weigh the costs and benefits of organizational structures.
1. Introduction
In many managerial tasks, managers have to rely on advice as an input to their judgment and decision-making (Frishammar, 2003; Bailey et al., 2023; Criado-Perez et al., 2024). Traditionally, managers received advice from other humans, but nowadays advice increasingly originates from algorithms (Müller et al., 2018; Trunk et al., 2020; Fehrenbacher et al., 2023; Daschner and Obermaier, 2022; Ning et al., 2024). For example, firms have started to implement algorithmic advice solutions in forecasting and augmented management (KPMG, 2024; Mayer et al., 2023). Extant literature argues that this is due to the potential positive effects that integrating algorithmic advice may have on managerial judgment and decision-making (Blattberg and Hoch, 1990; Hoch and Schkade, 1996; Mishra et al., 2022; Biswas et al., 2023; Neiroukh et al., 2024; Oppioli et al., 2023).
One might argue that for fully rational individuals, only the quality of advice should affect advice-taking. However, research shows that various other factors matter when individuals decide whether or not to take advice from other individuals (e.g. Birnbaum and Stegner, 1979; Sniezek and Van Swol, 2001; Gino, 2008). Moreover, especially when advice originates from algorithms instead of humans, research shows that various (sometimes irrational) factors are at play (Filiz et al., 2021), a central issue being algorithm aversion (Baines et al., 2024), understood as an aversion to using advice because it comes from an algorithm. Some argue that algorithm aversion can be overcome by increasing transparency about the algorithm (Park and Yoon, 2024; Ning et al., 2024), and advice rationales—information explaining how an adviser arrived at a recommendation—are an extensively discussed means to that end (e.g. Alam and Mueller, 2021; Zhang et al., 2022). The role of such transparency is, however, not yet fully clear, and many organizational contingencies that may hamper or increase the relevance of advice rationales are yet to be explored (Adadi and Berrada, 2018; Schmidt et al., 2020; Shin, 2021; Ning et al., 2024). In this paper, we add to this ongoing discussion and compare advice-taking from human advisers with advice-taking from algorithmic advisers in order to gain insights into the relevance of advice rationales for the usage of advice. Specifically, we develop and test theory predicting that advice-taking from human and algorithmic advisers differs in the extent to which the effect of an advice rationale accompanying the advice is amplified by the level of accountability managers face.
In the broad literature on advice-taking from human advisers (e.g. Harvey and Fischer, 1997), it has been shown that the presence of an advice rationale can increase managers' perceptions of the adviser’s credibility and transparency, thereby enhancing advice-taking (e.g. Bonaccio and Dalal, 2006; Tzioti et al., 2014). The reason is that individuals often use advice rationales to infer advice quality and assess the adviser’s expertise (Yates et al., 1996; Yaniv, 2004). Quite similarly, existing studies on algorithmic advice have focused on factors like the perceived quality of the algorithm’s outputs (Alexander et al., 2018; Mahmud et al., 2022) and how it increases trust in the algorithm and actual advice-taking (Glikson and Williams Woolley, 2020; Schmidt et al., 2020). While there is a vast and nuanced literature on algorithm aversion and algorithm appreciation, a general finding is that providing an advice rationale with algorithmic advice (sometimes referred to as “explainable AI”) reduces the perception of the algorithm as a “black box”, and thereby increases advice-taking from an algorithm (e.g. Ye and Johnson, 1995; Blanco-Justicia et al., 2020; Baines et al., 2024).
How accountability interacts with the presence of an advice rationale to influence managers' advice-taking from both human and algorithmic advisers is an open question that has not been addressed so far. This question is important, as accountability in the form of having managers justify their judgments and decisions to superiors is ubiquitous in the managerial context (e.g. Libby et al., 2004; Maske et al., 2021). However, the magnitude of accountability differs within and between industries, firms and hierarchical levels (Andersen, 2000; Dane et al., 2012; Kerler et al., 2014; Butler and Ghosh, 2015). Prior research shows that accountability in general tends to alter how individuals react to inputs for their decision-making (e.g. Lerner and Tetlock, 1999; Fehrenbacher et al., 2020). As advice is a central mechanism by which information is transmitted in organizations, often forming the central input and decision basis for various judgments and decisions (Bonaccio and Dalal, 2006; Kämmer et al., 2023), understanding how advice-taking is affected by accountability is of major importance. As we will theorize below, accountability is likely to influence the extent to which managers value an advice rationale that comes with the advice.
We first argue that in the case of a human adviser, even when accountability is low and a manager is not formally required to justify a decision, the manager may still expect to be asked to justify a decision later. In addition to relying on his/her own reasoning, an advice rationale can help a manager justify why he/she relied (did not rely) on a piece of advice and made a judgment or decision in a certain way. Thus, receiving an advice rationale directly with the advice likely increases the extent to which the manager uses advice, even when accountability is low. When accountability is high, however, the advice rationale becomes even more important and its effect on usage of the advice becomes stronger. The reason is that under high accountability, it is highly probable that a manager will need to justify his/her decision. While the manager could approach the human adviser for an advice rationale later when needed, an immediate advice rationale is key in that case, as it minimizes both risk (e.g. that the adviser's rationale is of low quality and insufficient to help the manager justify his/her decision) and the effort the manager must expend to obtain the advice rationale. Thus, we predict that in the case of a human adviser, receiving an advice rationale immediately with the advice increases managers' advice-taking more when managers' accountability is high than when it is low.
We predict, however, that this mechanism applies differently to human than to algorithmic advisers. When the adviser is an algorithm, we expect that the positive effect of an advice rationale on advice-taking is less affected by the level of accountability. Since algorithms cannot provide on-demand advice rationales later on, managers always value an immediate rationale to assess the algorithm's credibility and reduce their perception of it being a "black box". Therefore, we predict that the extent to which accountability amplifies the positive effect of an advice rationale on managers' advice-taking is reduced for algorithmic advisers compared to human advisers. In other words, our predicted three-way interaction entails that in the case of algorithmic advice, an advice rationale substantially increases advice-taking even under low levels of accountability.
To test our hypotheses, we conduct an experimental study using a forecasting setting (e.g. Fildes et al., 2019; Aschauer et al., 2024). Participants assume the role of management accountants tasked with producing a forecast. Planning and forecasting are typical components of the jobs of lower-level and middle managers, to whom we aim to generalize our findings. To provide a clean test of our theory and simplify the task, we inform participants that they have no access to raw data and must rely on input from advisers. To this end, they receive advice from two sources. As a baseline (constant across conditions), there is a human adviser who has long performed similar tasks. Between subjects, we first vary whether the second adviser is a human or an algorithm. For the second adviser, we then manipulate (also between subjects) the presence of an advice rationale (present vs absent) in the form of the adviser explaining how the advice was generated (or not). Finally, we also vary accountability between subjects (low vs high) by informing participants that they are (not) being held accountable for their reported forecast. Our dependent variable is the degree of advice-taking from the second adviser.
Our results are consistent with our predictions. For human advisers, we find that receiving an advice rationale increases managers' advice-taking more under high accountability than under low accountability. However, a significant three-way interaction shows that this pattern weakens when the adviser is an algorithm: the positive effect of an advice rationale on advice-taking from an algorithm is not affected by accountability but is consistently high.
In light of the growing literature on advice-taking from both human and algorithmic sources (e.g. Kämmer et al., 2023; Mahmud et al., 2022), the findings of our paper have important implications for both research and practice. While prior research has extensively studied factors influencing advice-taking from humans (e.g. Bailey et al., 2023; Landis et al., 2022) and algorithms (e.g. Glikson and Williams Woolley, 2020; Mahmud et al., 2022), our study integrates these two distinct streams and examines the moderating role of accountability on the importance of advice rationales for managers' advice-taking. Our findings thereby first add to the stream of literature on managers' advice-taking from other humans, which already suggests that advice rationales increase advice-taking (e.g. Bonaccio and Dalal, 2006; Kämmer et al., 2023). We show that accountability amplifies this effect such that an advice rationale is especially important for human advice when the advice-taker operates under high accountability. Thus, taking into account that producing an advice rationale is costly for the adviser, advice-givers in practice should consider the level of accountability under which a potential advice-taker operates when weighing the costs and benefits of generating and providing an unsolicited advice rationale.
Second, our findings contribute to the literature on usage of algorithmic advice. It has been shown that advice rationales in the form of transparency and explanations (e.g. “explainable AI”) can increase trust and advice-taking from algorithms (e.g. Woodcock et al., 2021; Wang and Yin, 2021; Park and Yoon, 2024; Ning et al., 2024). Our research is, to the best of our knowledge, the first to investigate whether accountability moderates such effects. Accountability varies and is a crucial element of the organizational setting in which users of algorithms are embedded. By developing theory and providing evidence that managers’ advice-taking from algorithms is highly dependent on the presence of an advice rationale (regardless of the level of accountability), our findings suggest that organizations should always complement algorithmic advice with an advice rationale, as this is important in order to induce managers to use the advice, even when they operate under low accountability.
Finally, our study also contributes to the broader literature on accountability, which holds diverging views on its consequences (e.g. Aleksovska et al., 2019; Lerner and Tetlock, 1999). Importantly, this literature shows that accountability may also entail negative consequences such as reductions in organizational efficiency and effectiveness (Halachmi, 2002; Koppell, 2005). Our findings show that when organizations want to discourage managers from relying on opaque algorithmic advice, it may not be necessary to hold them accountable.
2. Hypotheses development
2.1 Background
The theoretical background of our study consists of extant research on advice-taking, covering both advice from humans and advice from algorithms. We approach these broad strands of literature with a focus on the factors relevant to our specific research question, in particular highlighting the mechanisms that influence advice-taking from these sources that may be affected by accountability. Importantly, when reviewing the literature, we note that both the literature on advice-taking from humans and from algorithms have often focused on advice-taking in estimation and forecasting settings such as the one we use to test our theory (e.g. Leong and Zaki, 2018; Mayer et al., 2023; Aschauer et al., 2024).
Advice-taking from humans is a well-researched domain (e.g. Bailey et al., 2023). This stream has investigated extensively how various organizational and contextual variables affect advice-taking, for example, how much effort it takes to solicit advice (Gino, 2008) or the social connection advice-seekers have to the advice-giver (Landis et al., 2022). Important to us is the solid finding in this literature that when the advice-giver is perceived as more credible, advice-taking increases (e.g. Birnbaum and Stegner, 1979; Bonaccio and Dalal, 2006). This literature further implies that individuals use advice rationales, i.e. information about how an adviser arrived at a conclusion, as a way to infer advice credibility and quality (Yates et al., 1996). Hence, it has been found that advice rationales increase advice-taking (e.g. Ribeiro et al., 2019; Tzioti et al., 2014). It is, however, not yet clear how such effects are affected by the level of accountability under which the individual receiving advice operates (Kämmer et al., 2023).
In general, accountability refers to “the implicit or explicit expectation that one may be called on to justify one’s beliefs, feelings, and actions to others” (Lerner and Tetlock, 1999, p. 255). In the business context, accountability varies substantially between companies and functions within companies (Andersen, 2000; Merchant and Otley, 2007; Dane et al., 2012; Graham et al., 2014). There is a vast literature on the diverse effects of holding decision-makers accountable in various settings (e.g. Pan and Patel, 2022; Aleksovska et al., 2019; Li et al., 2022; Pérez-Durán and Grimmelikhuijsen, 2024). While this literature is, to the best of our knowledge, largely silent on accountability’s effects on advice-taking from other individuals, a solid finding important to our research is that holding individuals accountable generally increases perceived justification pressure and thus leads to more effortful decision-making (e.g. Dalla Via et al., 2019; Fehrenbacher et al., 2020). In our hypothesis development below, we build on this finding.
In contrast to the literature on advice-taking from humans, the literature on advice-taking from algorithms focuses on slightly different factors. Early studies on advice from algorithmic or statistics-based sources (e.g. Meehl, 1954) and research on reliance on automation (e.g. Lee and See, 2004) already suggest that trust in and reliance on algorithms are highly conditional on a range of situational factors. Recently, the literature on individuals' reliance on algorithms has started to grow rapidly (Mayer et al., 2023; Oppioli et al., 2023; Baines et al., 2024). Still, there are open questions as to when algorithmic advice is used or ignored (e.g. Logg et al., 2019; Burton et al., 2020; Mahmud et al., 2022) and how usage of algorithmic advice can be increased, where appropriate. Importantly, recent studies on algorithm usage highlight that the situation (Saragih and Morrison, 2022) and the organizational context (Filiz et al., 2023) matter. Analogous to the literature on advice-taking from humans, one such organizational context factor that has not received major scholarly attention in that regard is accountability (Aleksovska et al., 2019).
Moreover, many existing studies on advice-taking from algorithms investigate issues like the algorithm's perceived performance (i.e. the perceived quality of its advice, e.g. Alexander et al., 2018; Mahmud et al., 2022) and trust in the algorithm (Glikson and Williams Woolley, 2020; Schmidt et al., 2020; Alam and Mueller, 2021). Quite common to this literature is a focus on users' endogenous perceptions of and attitudes toward the algorithm, e.g. showing that if the user trusts algorithms or perceives the algorithm to be performing well, the user will use the algorithm's advice (e.g. Daschner and Obermaier, 2022; Saragih and Morrison, 2022), rather than on factors that an organization can exogenously influence to affect advice-taking from the algorithm. Important to us is, however, that analogously to the findings on advice-taking from humans, this literature shows that individuals consider advice rationales in the form of explanations of how algorithmic advice came about. That is, the literature implies that individuals' advice-taking from algorithms increases when the process of producing the advice becomes more transparent and the algorithm appears less of a "black box" (Ye and Johnson, 1995; Du and Ruhe, 2009; Blanco-Justicia et al., 2020; Alam and Mueller, 2021; Woodcock et al., 2021; Wang and Yin, 2021). This finding is shown to be robust and directionally consistent, irrespective of individuals' perceptions of an algorithm's trustworthiness (and similar factors). However, as indicated above and further in line with the literature on advice-taking from humans, less is known about whether the effect of advice rationales on individuals' advice-taking from algorithms is affected by the level of accountability under which the advice-taker operates.
2.2 How accountability moderates the effect of an advice rationale on managers’ advice-taking from human advisers
We first consider the effects in case of a human adviser and develop theory to predict that receiving an advice rationale with the advice increases managers’ advice-taking more when accountability is high than when it is low.
As indicated above, prior work posits that managers are averse to opaque advice and finds that the presence of an advice rationale increases managers' advice-taking from human advisers (e.g. Tzioti et al., 2014). We expect to replicate these findings and commence our hypothesis development with this point. Specifically, we argue that an advice rationale is important to managers and increases their advice-taking even under low accountability. The reason is that while accountability varies in strength, in most organizational settings it is never fully absent. Stated differently, while accountability is a continuum in practice, there is likely always a certain level of accountability: managers must always expect to potentially be asked to justify a decision, even when such an accountability mechanism is not formally implemented. Hence, having an advice rationale is always beneficial for managers when considering taking advice, as it enables them to properly justify a decision should the need arise. This is because an advice rationale not only constitutes additional information on the given advice but can also be referred to when justifying why the advice was (not) taken and how a respective judgment or decision was made.
When accountability is high, having an advice rationale becomes more important when considering taking advice, as it becomes more likely (in some cases quasi-certain) that one is called to justify the decision. There can be considerable time in between the point in time when the manager receives advice, when she/he makes the decision based on the advice received, and when she/he has to justify the decision (Bussone et al., 2015). Of course, a manager can still approach a human adviser for an advice rationale at a later point in time and request an “on-demand advice rationale” (Yaniv, 2004; Tzioti et al., 2014). However, this is costly, as it requires effort (Gino, 2008). It is also somewhat risky, as the advice rationale may turn out to be weak or even insufficient to help the manager justify his/her decision. Thus, while an advice rationale from a human adviser can always be obtained on demand, even ex post, receiving the advice rationale immediately with the advice is beneficial and likely increases advice-taking, especially under high accountability, where it is more likely that the advice rationale might be helpful in order to justify one’s decision [1].
Additionally, when managers receive an advice rationale, they can better assess the adviser's expertise and credibility (Bonaccio and Dalal, 2006), which is especially important when managers are under high accountability and anticipate having to justify their decisions. A human adviser's advice rationale can also help build trust between the adviser and the manager (Meshi et al., 2012; Dalal and Bonaccio, 2010; Ribeiro et al., 2019), which is again especially important under high accountability. Taken together, the preceding discussion leads to our first formal prediction:
H1. For a human adviser, receiving an advice rationale with the advice increases managers' advice-taking more when managers' accountability is high than when it is low.
2.3 The case of an algorithmic adviser
We next develop theory to predict that when the adviser is an algorithm, the moderating effect of accountability on the positive impact of the advice rationale on advice-taking is reduced. Unlike human advisers, algorithmic advisers typically cannot provide a spontaneous, personalized advice rationale upon request, and this is seen as a major drawback for usage (Chen et al., 2024; Ning et al., 2024). If an advice rationale is not provided right away, managers have no means to obtain it when needed. That is, typically, in practice, an algorithm's functionality to provide rationales is either installed or not (Gohel et al., 2021). This inherent lack of explainability places algorithms at an informational disadvantage in terms of transparency compared to human advisers, who can readily provide an advice rationale upon request, including ex post (Gönül et al., 2012; Goodwin et al., 2013). Additionally, as algorithmic advice without an advice rationale is likely perceived as a "black box", there is less trust, making the advice rationale crucial for assessing the credibility and quality of the advice and eventual advice-taking (Park and Yoon, 2024; Ning et al., 2024).
With human advisers, managers can request an advice rationale when needed: under high accountability, they may value advice that is directly accompanied by a rationale, as argued above, while under low accountability, they might care less, as they can still obtain a rationale when it becomes necessary. With algorithmic advisers, however, the inability to obtain an on-demand advice rationale means that managers cannot rely on future opportunities to gather the necessary rationale. Stated differently, whenever managers believe there is even a slight possibility that they will need to justify a judgment or decision and that an advice rationale might be helpful for that purpose, they value it. As a result, the positive effect of an advice rationale on advice-taking is less affected by the level of accountability because managers always value an immediate advice rationale; that is, the positive effect of an advice rationale is consistently high. High accountability does not amplify this effect as much as with human advisers, and, conversely, low accountability does not substantially reduce the relevance of an advice rationale. We formally predict this as a three-way interaction in Hypothesis 2:
H2. The extent to which receiving an advice rationale increases managers' advice-taking more under high accountability than under low accountability is reduced when the adviser is an algorithm compared to when the adviser is human.
3. Method
3.1 Experimental design and task overview
We use an experimental design to test our predictions as it allows for a highly internally valid test of theory. Specifically, an experiment allows us to keep control over various factors that may vary and be confounded in practice, such as the content and structure of an advice rationale.
The setting of the experiment reflects the scenario outlined in the background section and represents a typical business task (Mayer et al., 2023; Fildes et al., 2019; Aschauer et al., 2024). Specifically, participants assume the role of a management accountant tasked with producing a forecast. A forecasting task has many features that make it well-suited to test our theory. First, forecast errors may entail severe consequences, enabling a strong manipulation of accountability. Second, as Aschauer et al. (2024, p. 187) note, "managers in these settings are often supported by human advisors or algorithmic decision aids", making our experimental scenario realistic and engaging. Third, as forecasting tasks are frequently used in the related literature on advice-taking (e.g. Leong and Zaki, 2018; Dietvorst et al., 2018; Logg et al., 2019; Jung and Seiter, 2021), choosing such a task makes results more comparable across studies. Finally, as we abstracted away from concrete numbers in our experimental task (as outlined below), we believe our task ultimately only requires that participants understand they are making a decision on behalf of others (and may be held accountable for it) while having to rely on information from someone else.
Participants in our scenario have no access to raw data but receive advice in the form of forecasts from two advisers, and they must decide how to use this advice when determining what forecast to report to their superior. Participants are free in how they use the input from the two advisers. We refrain from giving specific numbers (e.g. raw data or numbers in the advice) to avoid anchoring effects and confounding the advice sources with the relative magnitude of the advice (as we further outline below). Our dependent variable is the degree to which participants indicate they would rely on the second advice.
As a baseline, the first advice is from a human across all conditions. The second advice is either from a second human or from an algorithm. For the second advice (from the other human or the algorithm), participants either receive an advice rationale explaining how the advice came about or not. Finally, participants are either informed that they are held accountable or not. Overall, we use a 2 × 2 × 2 between-subjects design. We describe the specific manipulations in the following.
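For illustration only, the following minimal sketch shows how random assignment to the eight cells of such a 2 × 2 × 2 between-subjects design can be implemented; the condition labels are ours and purely illustrative, not the wording of our experimental materials.

```python
import itertools
import random

# Illustrative sketch of a 2 x 2 x 2 between-subjects design: each
# participant is randomly assigned to one of the eight cells formed by
# the three manipulated variables (labels are hypothetical).
ADVISER = ("human", "algorithm")       # type of second adviser
RATIONALE = ("absent", "present")      # advice rationale for the second adviser
ACCOUNTABILITY = ("low", "high")       # accountability manipulation

CELLS = list(itertools.product(ADVISER, RATIONALE, ACCOUNTABILITY))  # 8 cells

def assign_condition(rng: random.Random) -> dict:
    """Draw one of the eight experimental cells at random."""
    adviser, rationale, accountability = rng.choice(CELLS)
    return {"adviser": adviser, "rationale": rationale,
            "accountability": accountability}

print(assign_condition(random.Random(42)))
```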
3.2 Type of adviser manipulation
The first manipulated variable is the type of adviser, and we varied it on two levels (human vs algorithm) between subjects. Across conditions, participants were informed that two estimates are available as input to support their decision-making. The two inputs come from different sources, and participants have full discretion in how to use them (e.g. rely on only one piece of advice, rely on both equally, etc.). As a baseline, the first input is advice from a human forecaster ("Mr Smith"), who has been working in sales forecasting for a long time. This is constant across all conditions.
As a between-subjects manipulation, the second input is either from another human (“Mr Jones”), also from the sales department, who is screening the market for the specific product at hand, or from an algorithm, specifically, an Artificial Intelligence tool, which is screening the market for the specific product at hand. That is, everything was kept constant except that it was explicitly communicated that this second adviser is a human or an algorithm.
3.3 Advice rationale manipulation
The second manipulated variable is the advice rationale, and it is also varied on two levels (absent vs present) between-subjects. Specifically, for the manipulated adviser described above, i.e. the second human or the algorithm, an advice rationale was added (advice rationale present) or not (advice rationale absent), which explains how the adviser’s forecast came about.
The rationale starts with a general explanation, in the sense that an abstract forecasting model was outlined, elaborating how forecasts in general are affected by elements of the situation. This general explanation of how a forecast is produced was then supplemented by information on features of the current situation for which the advice was generated. For instance, it was stated that the adviser’s forecast takes into account cyclical effects or activities of the customers. For each feature, the impact on the advice was also pointed out.
Most importantly, the advice rationale manipulation was exactly the same, regardless of whether it described the advice of the second human or the algorithm. As an instruction check and to foster participants’ attention, participants in the advice rationale present condition were requested to summarize the advice rationale in their own words.
3.4 Accountability manipulation
Finally, the third manipulated variable is accountability, and it is also varied on two levels (low vs high) between subjects. In conceptual and empirical research on accountability, the concept refers to whether an actor bears accountability and can expect to be asked to justify his/her decisions (Weigold and Schlenker, 1991; Lerner and Tetlock, 1999; Rausch and Brauneis, 2015; Aleksovska et al., 2019). Accountability was thus manipulated in a straightforward manner and in line with existing research, following Johnson and Kaplan (1991): participants were either told that they are accountable to their superior for their report and need to justify their decision (high accountability) or not (low accountability). We thereby do not explicitly distinguish between outcome and process accountability (e.g. Dalla Via et al., 2019). Furthermore, in line with established research (Tetlock, 1985; Simonson and Nye, 1992; Harvey and Fischer, 1997), we strengthened this manipulation by emphasizing the relevance of the reported forecast when accountability was high. The manipulation was presented twice: first when the role description was presented, and a second time before the actual decision was made. Instruction checks assured participants' understanding.
3.5 Dependent variable
The dependent variable is participants' advice-taking from the second (human or algorithmic) adviser when deciding what to report to the superior. As already outlined, we refrained from providing specific numeric advice, among other reasons, to avoid confounding the adviser with the relative magnitude of the forecast. We therefore measure advice-taking on an 11-point Likert scale with the two sources of advice as endpoints. Recall that across conditions, the baseline input came from Mr Smith, who has been working in sales forecasting for a long time. Full reliance on the advice of Mr Smith was thus the lower endpoint of the scale. The other endpoint represented full reliance on the forecast of the other human adviser or the algorithm, depending on the type of adviser condition. Thus, higher values on the dependent variable represent more advice-taking from the adviser we used for manipulating our variable of interest (i.e. type of adviser). Note that as the baseline input (i.e. "Mr Smith") was held constant across conditions, our dependent variable is structurally equivalent to asking participants directly to what extent they would rely on the advice from the second adviser. At the same time, this operationalization of advice-taking avoids ceiling effects [2].
3.6 Participants and procedures
Participants were recruited via Amazon MTurk [3]. To ensure high-quality responses, we required participants to have a "master status", an approval rate of at least 95%, and to be located in the United States. Necessary technical steps were taken to ensure that each participant could only participate once. Overall, 248 persons participated; 46.3% identified as female. On average, participants were 44.8 years of age and had 22 years of working experience. This demographic information suggests that participants are suitable proxies for the lower-level and middle managers to whom we aim to generalize our findings. A randomization check, implemented by regressing membership in the experimental conditions on demographics using logit models, indicated no demographic differences between the experimental conditions (all p-values > 0.10).
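As an illustration, a randomization check of this kind can be implemented as sketched below; the data file and column names are hypothetical placeholders rather than our actual materials.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Sketch of a randomization check under hypothetical column names:
# regress membership in each manipulation (0/1 indicators) on
# demographics via logit models; p-values > 0.10 suggest no imbalance.
df = pd.read_csv("experiment_data.csv")  # hypothetical data file

for indicator in ["adviser", "rationale", "accountability"]:
    fit = smf.logit(f"{indicator} ~ age + female + work_experience",
                    data=df).fit(disp=0)
    print(indicator, fit.pvalues.round(3).to_dict())
```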
Participants were informed about the academic nature of the study and had to explicitly give their consent to participate. After completing the study, participants received a fixed compensation of 3 USD as a participation fee and were debriefed about the study’s purpose. The experiment reported in this paper received Institutional Review Board (IRB) approval for research ethics.
4. Results
4.1 Descriptive statistics
Recall that in H1, we predict that for a human adviser, receiving an advice rationale with the advice increases managers’ advice-taking more when managers’ accountability is high than when it is low. In H2, we further predict that the extent to which receiving an advice rationale increases managers’ advice-taking more under high accountability than under low accountability is reduced when the adviser is an algorithm compared to when the adviser is human.
Table 1 presents descriptives for the dependent variable, advice-taking, in the eight experimental cells that result from our 2 × 2 × 2 between-subjects experiment.
Table 1. Descriptive statistics for advice-taking by experimental condition
| Type of adviser | Advice rationale | Low accountability | High accountability |
|---|---|---|---|
| Human | Absent | Mean = 5.56; St.D = 0.97; N = 33 | Mean = 5.64; St.D = 0.99; N = 36 |
| Human | Present | Mean = 6.77; St.D = 2.13; N = 31 | Mean = 7.86; St.D = 1.96; N = 29 |
| Algorithm | Absent | Mean = 5.67; St.D = 1.43; N = 33 | Mean = 6.13; St.D = 1.70; N = 30 |
| Algorithm | Present | Mean = 7.52; St.D = 1.99; N = 27 | Mean = 7.76; St.D = 2.12; N = 29 |
Note(s): Every cell displays the mean, standard deviation and number of observations in the corresponding condition
Our dependent variable, advice-taking, is measured on an 11-point Likert scale; higher values on the variable represent more advice-taking from the human/algorithmic adviser (depending on the type of adviser condition)
Type of adviser is manipulated between subjects on two levels: human/algorithm. In the human adviser condition, the advice comes from a human adviser. In the algorithmic adviser condition, the advice comes from an artificial intelligence tool
Advice rationale is manipulated between subjects on two levels: absent/present. In the advice rationale present (absent) condition, the advice from the human/algorithmic adviser [depending on the type of adviser condition] is (not) accompanied by an advice rationale
Accountability is manipulated between subjects on two levels: low/high. In the accountability high (low) condition, participants are informed that they are (not) being held accountable for their reported forecast
Source(s): Authors’ own work
The means are also visualized in Figures 1 and 2. Figure 1 presents comparisons across the two levels of accountability and type of adviser. Panel A of Figure 2 shows the means for the human adviser condition and thus directly relates to H1. Panel B of Figure 2, in turn, shows the means for the algorithm adviser condition. Comparing the two panels of Figure 2 already indicates a different pattern of means regarding the moderating effect of accountability in the case of a human vs an algorithmic adviser. Table 1 and Panel A of Figure 2 indicate that in the case of a human adviser, when a manager has low accountability, the advice rationale increases advice-taking (5.56 vs 6.77). This positive effect of the advice rationale appears more pronounced under high accountability (5.64 vs 7.86). These descriptive statistics are consistent with H1.
[Figure 1. Visualization of accountability and type of adviser effects on advice-taking. Four bar charts of mean advice-taking (11-point scale; approximate values). Panel A: effects of accountability when the adviser is human (low ≈ 6.18; high ≈ 6.64). Panel B: effects of accountability when the adviser is an algorithm (low ≈ 6.50; high ≈ 6.95). Panel C: effects of a human vs algorithmic adviser when accountability is low (human ≈ 6.16; algorithm ≈ 6.50). Panel D: effects of a human vs algorithmic adviser when accountability is high (human ≈ 6.61; algorithm ≈ 6.94). Source: Authors' own work]
[Figure 2. Effects of the manipulations on advice-taking. Two line graphs of mean advice-taking (11-point scale; approximate values) by advice rationale (absent vs present) under low and high accountability. Panel A: human adviser condition (low accountability: 5.57 → 6.79; high accountability: 5.66 → 7.88). Panel B: algorithmic adviser condition (low accountability: 5.70 → 7.55; high accountability: 6.14 → 7.80). Source: Authors' own work]
4.2 Hypotheses testing
We formally test our hypotheses using a regression model. Table 2 shows the results of regressing advice-taking on the experimental manipulations. Given the differences in standard deviations observable in Table 1, we estimate the regression with robust standard errors [4]. As the hypotheses are directional, they are tested using one-tailed tests; for all other coefficients, we report two-tailed tests.
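The following sketch illustrates this estimation strategy; column names and 0/1 coding of the indicators are hypothetical assumptions, and the HC3 robust covariance estimator is assumed for illustration rather than reported from our analysis.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Sketch of the hypothesis tests under hypothetical column names: OLS of
# advice-taking on the three 0/1 indicators and all their interactions,
# with heteroskedasticity-robust standard errors (HC3 assumed here).
df = pd.read_csv("experiment_data.csv")  # hypothetical data file

model = smf.ols("advice_taking ~ rationale * accountability * adviser",
                data=df).fit(cov_type="HC3")

def one_tailed_p(term: str, predicted_sign: int) -> float:
    """Halve the two-tailed p-value when the coefficient has the predicted sign."""
    coef, p2 = model.params[term], model.pvalues[term]
    return p2 / 2 if coef * predicted_sign > 0 else 1 - p2 / 2

print("H1:", one_tailed_p("rationale:accountability", +1))          # predicted positive
print("H2:", one_tailed_p("rationale:accountability:adviser", -1))  # predicted negative
```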
Table 2. Regression analyses used to test our hypotheses
Panel A: All cases

| | Coeff. | p value |
|---|---|---|
| Advice rationale | 1.198 | 0.002 a |
| Accountability | 0.063 | 0.790 |
| Type of adviser | 0.091 | 0.763 |
| Advice rationale × Accountability | 1.025 | 0.039 a |
| Advice rationale × Type of adviser | 0.653 | 0.291 |
| Accountability × Type of adviser | 0.404 | 0.384 |
| Advice rationale × Accountability × Type of adviser | −1.251 | 0.081 a |
| Constant | 5.576 | 0.000 |
| N | 248 | |
| R² | 0.236 | |

Panel B: Human adviser subsample

| | Coeff. | p value |
|---|---|---|
| Advice rationale | 1.198 | 0.003 a |
| Accountability | 0.063 | 0.790 |
| Advice rationale × Accountability | 1.025 | 0.039 a |
| Constant | 5.576 | 0.000 |
| Simple effect: Advice rationale @ Accountability = 1 | 2.223 | 0.000 a |
| Simple effect: Accountability @ Advice rationale = 1 | 1.088 | 0.021 a |
| N | 129 | |
| R² | 0.263 | |

Panel C: Algorithm adviser subsample

| | Coeff. | p value |
|---|---|---|
| Advice rationale | 1.852 | 0.000 a |
| Accountability | 0.467 | 0.243 |
| Advice rationale × Accountability | −0.227 | 0.738 |
| Constant | 5.667 | 0.000 |
| Simple effect: Advice rationale @ Accountability = 1 | 1.625 | 0.001 a |
| Simple effect: Accountability @ Advice rationale = 1 | 0.240 | 0.662 |
| N | 119 | |
| R² | 0.202 | |
Note(s): Results are obtained using regression analysis with robust standard errors. p-values are two-tailed, except those used for hypothesis testing, which are one-tailed and marked with an a
The dependent variable, advice-taking, is measured on an 11-point Likert scale; higher values on the variable represent more advice-taking from the human/algorithmic adviser (depending on the type of adviser condition)
Type of adviser is an indicator variable equal to 1 (0) when the adviser was an algorithm (when the adviser was a human)
Advice rationale is an indicator variable equal to 1 (0) when an advice rationale was present (absent)
Accountability is an indicator variable equal to 1 (0) when accountability is high (low)
Source(s): Authors’ own work
Considering the hypotheses, we first turn to Panel A of Table 2, in which we regress advice-taking on an indicator variable type of adviser (equal to 1 (0) when the adviser is algorithmic (human)), an indicator variable advice rationale (equal to 1 (0) when an advice rationale is present (absent)), an indicator variable accountability (equal to 1 (0) when accountability is high (low)), all potential two-way interactions of the three variables, and the three-way interaction. We find a positive interaction between advice rationale and accountability (coefficient = 1.025, p = 0.039, one-tailed), indicating that for a human adviser, receiving an advice rationale with the advice (vs not receiving it) increases managers' advice-taking more when managers' accountability is high than when it is low. This finding supports H1. The three-way interaction is negative (coefficient = −1.251, p = 0.081, one-tailed), indicating that the extent to which receiving an advice rationale (vs not receiving it) increases managers' advice-taking more under high accountability than under low accountability is reduced when the adviser is an algorithm compared to when the adviser is human. This supports H2.
Panels B and C split the sample into the human adviser and algorithmic adviser subsamples and thus correspond to Panels A and B of Figure 2. Panel B shows that for the human adviser, an advice rationale increases advice-taking under low (coefficient = 1.198, p = 0.003, one-tailed) and under high accountability (coefficient = 2.223, p < 0.001, one-tailed). However, as already reported above, the increase is stronger under high accountability (coefficient = 1.025, p = 0.039, one-tailed).
Panel C shows that for the algorithmic adviser, an advice rationale also increases advice-taking under both low (coefficient = 1.852, p < 0.001, one-tailed) and high accountability (coefficient = 1.625, p = 0.001, one-tailed). The insignificant interaction term, however, indicates that the effect of an advice rationale on advice-taking is not amplified by accountability (coefficient = −0.227, p = 0.738, two-tailed).
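The subsample estimates and simple effects in Panels B and C can be computed along the following lines; this is again a sketch with hypothetical column names, assuming adviser is coded 0 = human and 1 = algorithm.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Sketch of the subsample analyses (Panels B and C): re-estimate the model
# within each adviser condition; the simple effect of a rationale under
# high accountability is the main effect plus the interaction term.
df = pd.read_csv("experiment_data.csv")  # hypothetical data file

for code, label in [(0, "human"), (1, "algorithm")]:
    sub = df[df["adviser"] == code]
    fit = smf.ols("advice_taking ~ rationale * accountability",
                  data=sub).fit(cov_type="HC3")
    simple = fit.params["rationale"] + fit.params["rationale:accountability"]
    print(label, "rationale @ high accountability:", round(simple, 3))
```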
4.3 Robustness checks
As outlined in the method section, our participants have substantial professional experience. To probe the robustness of our results, we first test whether they are robust to controlling for participants' professional experience. Including professional experience as a control variable in the regression used for hypothesis testing does not affect our results, while experience itself is insignificant (p = 0.286, two-tailed).
Moreover, we asked participants about their current organizational level. To test the robustness of our results, we drop participants who indicate that they work in a non-management position (note that this neither necessarily means that they had not worked in a higher position before nor that they cannot relate to the situation) and re-run the regression used for hypothesis testing on this subsample. We find that none of our inferences changes in direction or significance (H1: p = 0.002, one-tailed; H2: p = 0.085, one-tailed; R² = 0.309). This gives us strong confidence that our results are robust and generalizable.
Finally, as we used a forecasting setting in our experiment, we also asked participants in the post-experimental questionnaire to indicate their experience with forecasting. We find that our results are also robust to including forecast experience as a covariate, with the variable itself being insignificant (p = 0.793, two-tailed) [5].
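These covariate checks follow a common template; the sketch below, again under hypothetical naming assumptions, shows the general pattern of re-estimating the full model with each covariate added in turn.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Sketch of the covariate robustness checks under hypothetical column
# names: add each covariate to the full model and inspect its p-value
# while confirming the hypothesis-test coefficients are unchanged.
df = pd.read_csv("experiment_data.csv")  # hypothetical data file

for covariate in ["work_experience", "forecast_experience"]:
    fit = smf.ols(
        f"advice_taking ~ rationale * accountability * adviser + {covariate}",
        data=df).fit(cov_type="HC3")
    print(covariate, "p =", round(fit.pvalues[covariate], 3))
```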
4.4 Additional evidence
In this final section, we report some additional effects we observe in our experiment, although we did not develop formal predictions on these findings. As the experiment was not designed to specifically examine these effects, we caution against over-generalizing from these observations.
First, the insignificant coefficient for type of adviser in Panel A of Table 2 (coefficient = 0.091, p = 0.763, two-tailed) indicates that we do not find evidence of an unconditional algorithm aversion or algorithm appreciation in our experiment. Moreover, the insignificant effect of accountability in Panel B of Table 2 (coefficient = 0.063, p = 0.790, two-tailed) is hardly surprising, as it suggests that when both human advice sources come without an advice rationale, accountability does not increase advice-taking from the second advice source. The insignificant effect of accountability in Panel C of Table 2 (coefficient = 0.467, p = 0.243, two-tailed), however, is more interesting, as it suggests that accountability per se does not lead to algorithm aversion or appreciation; that is, when the advice of the baseline human adviser and the algorithmic adviser come without an advice rationale, accountability alone does not alter advice-taking. We will further elaborate on this finding in our discussion section.
5. Discussion
There is a vast literature on factors influencing advice-taking from human advisers (e.g. Bailey et al., 2023; Landis et al., 2022; Kämmer et al., 2023) and from algorithms (e.g. Glikson and Williams Woolley, 2020; Mahmud et al., 2022), and our paper bridges these distinct literatures by examining how accountability moderates the impact of advice rationales on managers’ advice-taking from human compared to algorithmic advisers. We find that while providing advice rationales generally increases advice-taking, the effect of accountability on this relationship differs depending on whether the adviser is human or algorithmic. Specifically, for human advisers, accountability amplifies the positive effect of an advice rationale on advice-taking. In contrast, for algorithmic advisers, the positive effect of an advice rationale on advice-taking is consistently high, regardless of the level of accountability. These findings provide nuanced insights into the interplay of adviser type, advice rationale and accountability in managerial decision-making contexts and thus make important contributions to multiple streams of literature.
5.1 Theoretical implications
First, our research contributes to the literature on advice-taking from human advisers by highlighting the amplifying effect of accountability on the relevance of an advice rationale. In most managerial contexts, users of advice are embedded in accountability relationships in which decisions have to be justified to varying degrees, and the level of required justification typically also depends on the organizational culture (Andersen, 2000; Dane et al., 2012; Graham et al., 2014). While prior studies indicate that advice rationales increase advice-taking (e.g. Bonaccio and Dalal, 2006; Tzioti et al., 2014), we extend this line of research by developing theory and providing evidence that this effect is more pronounced under high accountability. That is, by linking advice-taking to the potential justification pressure when making decisions on behalf of others, our findings advance theory by integrating mechanisms from accountability research (e.g. Lerner and Tetlock, 1999) into advice-taking contexts, where this linkage has been largely underexplored. We advance this line of research by showing that having an advice rationale delivered directly with the advice is of comparatively little importance when managers operate under low accountability, as they can potentially solicit a rationale should they need it. Under high accountability, however, the availability of an advice rationale is more important for advice-taking, as managers in such settings can quite certainly expect to have to justify their decision and the advice rationale may be helpful for that.
Second, we add to the growing body of research on algorithmic advice. There is a vast and growing literature on how to integrate algorithms into varying management functions and tasks (Hillebrand et al., 2025), be it in the form of using forecasts or classifications (Commerford et al., 2022, 2024), integrating algorithms into firms’ external relations, e.g. with customers (Haupt et al., 2025), or into internal relations dealing with employees (Revillod, 2024; Tandon et al., 2025). We focus on taking advice from algorithms, which has been shown to increase when the algorithmic advice becomes more transparent due to an advice rationale (Ning et al., 2024; Park and Yoon, 2024). Our findings suggest that the positive effect of an advice rationale on advice-taking from algorithms is consistently high, irrespective of the level of accountability. This contrasts with the human adviser scenario, where accountability moderates the effect. The inability of algorithms to provide on-demand advice rationales places them at an inherent disadvantage in terms of transparency (Gönül et al., 2012; Goodwin et al., 2013). Consequently, managers consistently value immediate rationales from algorithms that reduce the “black box” situation (Guidotti et al., 2019).
Our observation that we do not find any unconditional algorithm aversion or appreciation in our experiment, and especially that accountability per se does not lead to algorithm aversion or appreciation (i.e. we do not observe an accountability main effect), may also be of interest to this stream of literature (e.g. Logg and Schlund, 2024), although we caution against over-generalizing from such findings, as we did not hypothesize such a null effect and thus did not design our experiment to specifically test it. Post hoc, however, we conjecture that two countervailing forces might be at play in our setting that could explain the absence of an accountability main effect in the algorithmic adviser case. In the absence of a rationale for both the human and the algorithmic advice (i.e. the baseline advice from the human adviser and the second advice from the algorithm in our setting), one might expect managers to tend more toward the baseline human advice, as such an adviser might be asked for an advice rationale later. At the same time, however, such a decision would imply discarding the algorithmic advice to a greater extent; when there are two intransparent inputs, justifying the rejection of the algorithmic advice in favor of human advice may feel more difficult for managers. These countervailing motives might lead to an overall absence of an effect of accountability in that condition, although this reasoning is highly speculative.
Finally, our study contributes to the broader accountability literature by demonstrating that accountability does not universally alter reliance on different types of advisers (human vs algorithm) but specifically affects the weight managers place on an advice rationale from a human. This finding nuances the divergent views on the consequences of accountability (e.g. Lerner and Tetlock, 1999; Aleksovska et al., 2019) and suggests that accountability amplifies the need for justifiable and transparent inputs, especially from human advisers. However, our results also suggest that accountability alone does not lead to algorithm appreciation or aversion as outlined above.
5.2 Practical implications
For managerial practice, our results emphasize the importance of providing advice rationales, especially when managers operate under high accountability. This is particularly relevant in settings such as planning, budgeting or forecasting, where managers must often justify their decisions to superiors or external parties. Generally speaking, organizations should encourage human advisers to accompany their recommendations with clear rationales to enhance their credibility and facilitate decision-making. However, such rationales by human advisers are not of constant importance: they are crucial in environments where managers receiving advice must potentially justify their decisions to superiors. Given the costs that a human advice-giver needs to bear in order to provide an advice rationale, firms should also consider the cost–benefit trade-off of providing advice rationales from human advisers. In settings of low accountability, the benefits of providing an immediate advice rationale for the advisee may not outweigh the costs for the adviser. At the same time, it appears crucial that firms in such cases implement organizational structures that allow an advice rationale to be sought later, should it be necessary. As such structures also entail costs, a careful cost–benefit assessment becomes necessary.
Regarding algorithmic advisers, our findings suggest that providing an advice rationale is essential, regardless of the accountability under which a potential advice-taker operates. Since algorithms cannot generate explanations on demand, integrating explainable AI features becomes critical for increasing advice-taking from algorithms (Ye and Johnson, 1995; Blanco-Justicia et al., 2020). Organizations implementing algorithmic decision-support tools should ensure that these systems include transparent and understandable explanations of their outputs to mitigate the “black box” concern and promote reliance on their recommendations. Given that such functionalities are a one-time investment but essential for algorithmic advice-taking, the cost–benefit assessment appears more straightforward for firms that want to promote managers’ use of the algorithmic tools they implement.
Moreover, our study indicates that when organizations aim to promote the use of algorithmic advice, merely holding managers accountable may not suffice. Instead, providing immediate and comprehensible advice rationales is more effective in enhancing advice-taking from algorithms. In turn, our findings also imply that holding managers accountable does not deter them from relying on intransparent algorithmic advice (i.e. algorithmic advice that is not accompanied by an advice rationale). Given the potential negative effects that holding managers accountable may entail (e.g. Halachmi, 2002; Koppell, 2005), firms may thus want to consider refraining from holding managers accountable in some settings.
5.3 Limitations and future research
Like all research, our study has limitations, which must be considered when interpreting its results but which also provide opportunities for future research. First, in order to provide a highly internally valid test of our theory, we intentionally abstract away from various contextual variables that may also affect advice-taking, such as company culture and hierarchical structures, which might strengthen or weaken our results. Many parts of an organizational culture would likely be inherently confounded with accountability and preclude a clean manipulation. Furthermore, to the extent that our participants have differing backgrounds with regard to organizational culture, this adds individual-level noise to our statistical analysis and only works against us finding support for our predictions. Notwithstanding this, future research can build on our study by developing and testing theory on how advice-taking is affected by advice rationales under varying levels of such contextual variables. We would deem such efforts very valuable, especially considering that certain aspects of corporate cultures likely also affect the level of accountability an individual feels, irrespective of the formal accountability mechanisms at play. As substantial parts of research on advice-taking (from humans and algorithms) use experimental methods, future research might also collect other data, such as data from the field, to show that the observed effects persist beyond controlled experimental settings and are robust in real-world contexts.
Second, consistent with related research (e.g. Belkin and Kong, 2018; Tzini and Jain, 2018; Logg et al., 2019; Cooper, 2024), we recruited a sample of online workers for our experiment. The concept of a “management function” is inherently broad and varies across industries and countries. For example, in the United States (where we recruited our participants), many employees hold positions that involve at least some managerial elements, such as overseeing team performance or engaging in some planning. These responsibilities often fall under the umbrella of “management”, even if the individuals are not upper-level executives. MTurk participants often represent this diverse workforce. It is also especially these kinds of employees who pass on information and are held accountable by upper-level management, and thus they are the ones we aim to generalize to with our theory. However, it is ultimately an empirical question whether our findings generalize to upper-level executives’ behavior, and future research testing this would be very welcome.
Relatedly, we also note that individuals’ specific background and experience (e.g. in one type of industry) might affect the effects we examine. We believe that our use of a broad sample is not a threat to the generalizability of our results because the diverse background only adds more noise to the data and thus works against us finding support for our predictions. We also argue that using a sample of participants with diverse backgrounds to test a generic research question that does not focus on a specific industry or function is a sound approach. Nevertheless, it is ultimately an interesting empirical question how accountability moderates the effects of advice rationales on managers’ advice-taking from human and algorithmic advisers in various industries, which future research may investigate, for example using a field experiment with employees who have experience in a particular organization. Furthermore, although we argue that our experimental task only requires that participants understand they are making a decision on behalf of others (and may be held accountable for it) while having to rely on information from someone else, we acknowledge that we used a specific case of a management accountant working on a forecast, and it is thus possible that not all participants were fully able to envision the scenario. While this should also only add more noise to the data and thus work against us finding support for our predictions (and cannot explain the directional effects we predict and find), we urge future experimental research to test the robustness of our results using scenarios with different tasks and functions.
Finally, we note that – although we did not predict such an effect – we do not find an accountability main effect, as discussed above, which is interesting in itself. While beyond the scope of our study, future research that disentangles the two countervailing forces we argue could account for this finding (i.e. when having to choose whether to rely on advice without an advice rationale from a human or an algorithmic adviser, managers tend more towards the human advice because they might approach the human adviser for a rationale later, vs managers feeling that discarding algorithmic advice might be difficult to justify) would be highly interesting. Specifically, it would be interesting to see under what conditions either of these forces dominates. In further examining the divergent effects of accountability, such future research could also examine different forms of accountability, such as outcome- vs process accountability (e.g. Dalla Via et al., 2019), and whether certain types of advice rationale may make the adviser appear less credible or competent and may thus lead to lower advice-taking under higher levels of accountability.
Funding: Barbara E. Weißenberger and Peter Kotzian gratefully acknowledge the financial support from the Jürgen Manchot Foundation, Düsseldorf, within the Manchot Research Group “Decision-making with the Help of Artificial Intelligence”.
Notes
1. We note that there is tension in our argument. While we focus on the presence of an advice rationale per se, certain features of advice rationales in practice likely amplify or mitigate the moderating effect we predict. For example, advice rationales that seem weak or generic or that reflect negatively on the competence of the adviser might lead to lower usage of advice under high accountability. In contrast, some features may make an advice rationale so convincing that a manager feels unable not to rely on the advice, and thus the level of accountability may not matter in such cases. We leave these interesting boundary conditions of our theory for future research to investigate.
2. If at all, the alternative advice source (“Mr. Smith”) in our dependent variable could only explain potential main effects of our type of adviser treatment. However, we neither develop theory on such a comparison nor analyze our data that way. The alternative advice source cannot explain the directional interaction effects of advice rationale and accountability we predict and find. For the same reason, we argue that even if our dependent variable caused some social desirability bias (e.g. Maas and Shi, 2024), this cannot explain the directional interaction effects we predict and find.
3. The use of online samples is quite common in the literature on advice-taking from humans and algorithms (e.g. Belkin and Kong, 2018; Logg et al., 2019; Cooper, 2024; Tzini and Jain, 2018). Hence, using a similar participant pool makes our results more comparable to extant literature.
4. We generally chose a regression model over an ANOVA, as the coefficients directly relate to the simple effects and thus ease interpretation compared to an ANOVA. However, in the case of experimental data with categorical variables for manipulated factors, both ANOVA and regression ultimately lead to the same inferences (e.g. Field, 2013). This is also the case in our experiment: all our inferences are unchanged when we use ANOVA instead of regression (see the illustrative sketch after these notes).
5. We also test and find that our results are robust to controlling for the size of participants’ organizations (p = 0.985, two-tailed) and their risk attitude (p = 0.667, two-tailed).
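As a purely illustrative sketch of the regression–ANOVA equivalence mentioned in note 4: both sets of inferences are computed from the same fitted model, so they coincide. All data and variable names below are simulated placeholders, not our experimental data.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical two-factor data; any two-level factorial frame serves here.
rng = np.random.default_rng(2)
n = 400
df = pd.DataFrame({
    "rationale": rng.integers(0, 2, n),
    "accountability": rng.integers(0, 2, n),
})
df["advice_taking"] = (3 + 1.2 * df["rationale"]
                       + 1.0 * df["rationale"] * df["accountability"]
                       + rng.normal(0, 2, n))

fit = smf.ols("advice_taking ~ C(rationale) * C(accountability)",
              data=df).fit()
# The ANOVA table is derived from the very same fit; for the interaction
# term, the ANOVA F-statistic equals the squared regression t-statistic,
# so the inferences coincide.
print(fit.summary())
print(sm.stats.anova_lm(fit, typ=2))
```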
