Service innovation is a key source of competitiveness for service enterprises. With the emergence of crowdsourcing platforms, consumers are increasingly involved in the service innovation process. In this paper, the authors describe the crowdsourcing ideation website MyStarbucksIdea.com and examine the motivations behind customer-involved service innovation.
Using a rich data set obtained from the website MyStarbucksIdea.com, a dynamic structural model is proposed to illuminate the learning process of consumers.
The results indicate that individuals initially tend to underestimate the firm's costs of implementing their ideas and to overestimate the value of their ideas. By observing peer votes and feedback, individuals gradually learn the true value of their ideas as well as the cost structure of the firm. Overall, the authors find that the cumulative feedback rate and the average potential of ideas first increase and then decline.
First, previous research on crowdsourcing shows that the implementation rate of ideas is low and that the number of ideas declines over time, but few scholars have studied the causes behind these problems. Second, the data used in this paper are real and are now difficult to obtain; they provide strong empirical support for the proposed model. Third, combining the customer learning mechanism with heterogeneity theory to explain the decline in ideas and the low implementation rate on crowdsourcing platforms is relatively novel, and the results can provide a useful reference for the development of this industry.
1. Introduction
Service innovation is a source of business competitiveness. Recently, with the development of information technology, crowdsourcing has begun to gain popularity in various fields. Howe (2006) defined “crowdsourcing” as “the new pool of cheap labor: everyday people using their spare cycles to create content, solve problems, even do corporate R and D.” Crowdsourcing initiatives of this kind provide a platform on which everyone can post their own ideas, and these ideas usually arise from direct or indirect service experience. The customer group is therefore a rich source of preference information. A typical crowdsourcing platform allows customers to support or oppose others' ideas, so that a preliminary assessment of existing ideas can be obtained. Through such initiatives, the firm can acquire a great number of innovative and beneficial ideas. However, the debate about the real utility of crowdsourcing has never been settled. In fact, many crowdsourcing platforms are experiencing a decrease in the number of newly posted ideas, and the feedback rate (the percentage of ideas with official feedback among all existing ideas) remains low. Nevertheless, we have not found sufficiently systematic and in-depth research on these problems.
Most research on crowdsourcing focuses on crowdsourcing contests, in which customers post ideas to compete for an award (Archak and Sundararajan, 2009; DiPalantino and Vojnovic, 2009; Mo et al., 2011; Terwiesch and Xu, 2008). Unlike crowdsourcing contests, in a permanent and open idea solicitation such as MyStarbucksIdea.com, there is no competition among idea contributors; instead, they help each other evaluate ideas. Unfortunately, only a few studies have examined this type of crowdsourcing initiative. Through a reduced-form approach, Bayus (2013) finds that individual creativity is positively related to current effort but negatively related to previous success. Di Gangi et al. (2010) find that two factors determine whether an individual's idea will be adopted: the firm's ability to understand the technical requirements and its ability to respond to the community's concerns about ideas. Lu et al. (2011) find that in crowdsourcing ideation initiatives, complementarities and customer support give people opportunities to learn. This mechanism lets people learn about the problems that other customers have encountered and helps them come up with more ideas that are worth implementing. Yan et al. (2014) study the crowdsourcing platform IdeaStorm.com, which is affiliated with Dell. Their work pioneered the structural investigation of new product ideas and their development process on the basis of real crowdsourcing data.
2. Data collection and analysis
Our data are collected from the crowdsourcing website MyStarbucksIdea.com, which is affiliated with Starbucks. This online community was established in March 2008 and is dedicated to sharing and discussing ideas and to letting people see how Starbucks puts top ideas into action.
The structure of MyStarbucksIdea.com is simple but efficient. Anyone (not only Starbucks customers) can register at the website and become a member of the online community, and any member can then post ideas. Starbucks classifies all ideas into three categories: product ideas, experience ideas, and involvement ideas. Before posting an idea, a member must select the category to which it belongs. Once an idea is posted, other members can vote on it. A member who supports the idea can submit a positive vote, which adds 10 points to its score; a member who opposes it can submit a negative vote, which deducts 10 points. However, the website displays only the cumulative score; the separate numbers of positive and negative votes are not public. Registered members can also comment on an idea to explain their thoughts in detail.
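The scoring rule described above can be illustrated with a minimal Python sketch. The class and function names (`Idea`, `vote`, `net_votes`) are hypothetical and chosen only for illustration; they are not part of the website or of our data set. The sketch also shows why the public score reveals only the net number of votes, not the split between positive and negative votes.

```python
from dataclasses import dataclass

POINTS_PER_VOTE = 10  # from the text: each vote adds or deducts 10 points


@dataclass
class Idea:
    """Hypothetical record of a posted idea (illustrative only)."""
    title: str
    category: str   # "product", "experience", or "involvement"
    score: int = 0  # only this cumulative score is publicly visible

    def vote(self, positive: bool) -> None:
        """Apply one positive (+10) or negative (-10) vote."""
        self.score += POINTS_PER_VOTE if positive else -POINTS_PER_VOTE


def net_votes(score: int) -> int:
    """Net votes (positive minus negative) implied by a public score."""
    return score // POINTS_PER_VOTE


if __name__ == "__main__":
    idea = Idea(title="example idea", category="product")
    for positive in (True, True, True, False):
        idea.vote(positive)
    # Three positive votes and one negative vote give the same score as
    # two positive votes and no negative votes, so the split is not recoverable.
    print(idea.score, net_votes(idea.score))
```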
Typically, an idea's status passes through the following stages. Once an idea is posted, voting and commenting become available to the public. The review team screens ideas according to their scores and passes the promising ones to the decision-makers; at this point, the status of these ideas changes to “under review.” Once an idea has been reviewed, its status becomes “reviewed.” Next, ideas judged worth implementing are selected, and their status changes to “coming soon.” Finally, when an idea is fully implemented, its status changes to “launched.” In our paper, customers learn gradually from two information sources: the scores and the statuses of ideas.
MyStarbucksIdea.com offers a rich data set that contains detailed information about both ideas and members. We collected the public data on the website and divided them into two groups: idea profile data and member profile data. We acquired 96,793 records of member profile data and selected 21,305 individuals who posted two or more ideas (these individuals are called “selected members”). The observation period runs from January 2009 to December 2015. We found that the distribution of ideas contributed by selected members is similar to that of ideas contributed by all members, so the selected member group is representative.
In Figure 1, we present the relationship between the cumulative feedback rate of the three categories and time.
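To make the quantity plotted in Figure 1 concrete, the following Python sketch computes a cumulative feedback rate per period, using the definition given in the Introduction (the share of ideas with official feedback among all existing ideas). The function name and the input numbers are hypothetical; they are not taken from our data set.

```python
def cumulative_feedback_rate(posted_per_period, feedback_per_period):
    """Return, for each period, (ideas with official feedback so far) /
    (ideas posted so far). Both inputs are per-period counts."""
    rates = []
    total_posted = 0
    total_feedback = 0
    for posted, feedback in zip(posted_per_period, feedback_per_period):
        total_posted += posted
        total_feedback += feedback
        # Guard against division by zero before any idea has been posted.
        rates.append(total_feedback / total_posted if total_posted else 0.0)
    return rates


if __name__ == "__main__":
    # Made-up monthly counts for one category, for illustration only.
    posted = [10, 10, 20]
    feedback = [0, 2, 2]
    print(cumulative_feedback_rate(posted, feedback))
```

Computing the rate separately for each of the three categories gives one curve per category.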
3. Model construction
Taking Yan et al. (2014) as a reference, we modify their structural model to describe the decision-making process of customers and to explain the data-generating process. By explicitly modeling individuals' utility function, we can use the data to empirically recover the parameters of the analytical model.
In each time period of our model, every member decides whether to post an idea in a given category, and this decision is determined by the corresponding expected utility. Thus, we first explain the utility function.
Suppose that individuals are indexed by i, categories by j (j = 1, 2, 3), and time periods by t. The utility function consists of four factors. The first and second factors relate to benefit: if an idea is implemented, the contributor obtains a better service experience, a higher online reputation, or even job opportunities. We use the parameter r_i to represent the reputation gain. The third factor is the cost of posting an idea, which includes thinking of, articulating, and posting the idea; the total cost is denoted c_i. The fourth factor is discontent: if an idea is not accepted or receives no response, the contributor becomes discontented. In the model, whether individual i is discontented in period t is denoted D_it, a binary variable (D_it = 1 means that the person is discontented, while D_it = 0 means content). The degree of this discontent is measured by the parameter d_i.
Hence, the utility function is given by the following equation:
where represents the utility when individual i posts a category j idea in period t. The parameter measures the utility gain for individual i from posting a category j idea, and the error term captures the random shock to the decision in period t. Since is linearly correlated with , we cannot identify the two separately. Thus, we combine them in a new parameter , and we have .
When an individual posts an idea, he or she will hold a belief that the idea will be accepted based on the existing information, that is, the expectation of the utility function, which is represented as .
where denotes the probability of acceptance based on the online information.
Every member holds an expectation of the cost and value of their idea, and they update the expectation through the information from the website. Meanwhile, they will learn about the firm's cost structure and the true value of their idea.
Suppose that the cost of implementing an idea in category j follows a normal distribution . Furthermore, we assume that the firm knows the actual cost exactly, while members do not. Individuals' prior belief is that the average cost of implementing an idea in category j is , and it follows a normal distribution . If an idea is implemented, all individuals receive the same implementation signal, and voting on the implemented idea is closed.
We let represent the cost signal that all individuals receive, and let represent the mean value of all cost signals in the same category, which follows the distribution . The variance indicates the difference among specific cost signals. This means that the cost signals are unbiased but noisy.
In a given period, more than one idea may be implemented. If there are ideas in category j implemented in period t, the accumulated cost signal that an individual receives can be measured by , which is the average of , and it has the following distribution
We let denote one's belief about the cost of implementing an idea in category j at the beginning of period t. Based on cumulative information, the updating of proceeds as follows (DeGroot, 1970).
The prior in period is .
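The updating rule referenced above is the standard Bayesian update for a normal prior combined with normally distributed, unbiased signals of known variance. The Python sketch below shows this textbook update under stated assumptions; the function and parameter names are hypothetical, the numbers in the usage example are made up, and the sketch is not claimed to reproduce the exact equations of the model.

```python
def update_normal_belief(prior_mean, prior_var, signal_mean, signal_var, n_signals):
    """Normal-normal conjugate update.

    prior_mean, prior_var : current belief about the mean implementation cost
    signal_mean           : average of the cost signals observed this period
    signal_var            : variance of a single cost signal (assumed known)
    n_signals             : number of ideas implemented this period
    Returns (posterior_mean, posterior_var).
    """
    if n_signals == 0:
        # No idea was implemented, so the belief carries over unchanged.
        return prior_mean, prior_var
    prior_precision = 1.0 / prior_var
    # The average of n independent signals has variance signal_var / n.
    data_precision = n_signals / signal_var
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean
                            + data_precision * signal_mean)
    return post_mean, post_var


if __name__ == "__main__":
    # Illustrative numbers only: the belief moves toward the signal
    # and becomes more precise.
    mean, var = -2.0, 4.0
    mean, var = update_normal_belief(mean, var, signal_mean=-6.0,
                                     signal_var=4.0, n_signals=1)
    print(mean, var)
```

The posterior from one period serves as the prior for the next, so beliefs sharpen only in periods in which at least one idea is implemented.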
We compare the voting scores that an individual's ideas receive in different categories and find no significant difference. Additionally, we have verified that an individual's ability to come up with new ideas is not influenced by a learning curve effect. Therefore, we can assume that the value of an individual's ideas remains similar across categories and does not change over time. As soon as individual i enters the website, his or her prior belief about the value can be written as
Moreover, the voting score is a good measure of value. Suppose that the natural logarithm of the voting score of an idea is linearly related to and the value of the idea.
For individual i, the prior belief about the natural logarithm of the voting score can be expressed as
We use the parameter to represent the average value of all ideas submitted by individual i, and we use to denote the value of a specific idea raised by individual i in period t. Then, we have the following expressions
where denotes the deviations from the mean value, while the value of varies from individual to individual and changes over time.
Note that members will update their perception of their ideas' value through voting scores; suppose that the natural logarithm of the voting score that a specific idea receives is
where
Here, is the mean value of , and is the deviation from .
Similarly, we define and to represent one's belief of the value of an idea in category , as well as its voting score, at the beginning of period . Thus, the updating process of and follows the same rule as (Erdem et al., 2008):
Respectively, we have
In addition, the prior values in period are
The firm takes both the cost and the value of an idea into account when it decides which ideas to implement. Suppose that the firm accepts only ideas that bring a positive net profit. We let the parameter represent the net profit of implementing the th idea in category j during period t, and then we have
where denotes the true value, and denotes the actual cost.
As a result, the probability of implementing a specific idea can be written as
During the decision-making process, only the firm knows exactly about , while is available to both the firm and customers. From the perspective of the firm, and are known. However, for customers, is a random variable satisfying the following expressions
Thus, in terms of the community members, the probability of implementing an idea with the value can be expressed as
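The logic of this step can be sketched numerically. If the firm implements an idea only when its net profit is positive, and a member treats the implementation cost as a normally distributed random variable, then the perceived implementation probability is a normal cumulative distribution function evaluated at the standardized gap between value and expected cost. The Python sketch below assumes that cost is expressed as a positive quantity and that net profit equals value minus cost; the sign convention of the estimated parameters may differ, and all names are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def implementation_probability(value, cost_mean, cost_var):
    """Perceived P(value - cost > 0) when cost ~ N(cost_mean, cost_var).

    cost_var is the total variance the member attaches to the cost
    (an assumption of this sketch).
    """
    return normal_cdf((value - cost_mean) / math.sqrt(cost_var))


if __name__ == "__main__":
    # Illustrative numbers only. A member who underestimates the cost
    # perceives a higher chance of implementation for the same idea.
    print(implementation_probability(value=1.0, cost_mean=2.0, cost_var=1.0))
    print(implementation_probability(value=1.0, cost_mean=4.0, cost_var=1.0))
```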
Let represent the decision of the firm, with , meaning that the idea is implemented, and otherwise. Given , , and , the likelihood of implementation is
As is mentioned above, members will refer to their utility function while making decisions about whether to post an idea or not. Suppose that they make their decisions independently and without the influence of idea category. Besides, they know that the firm will take into consideration both the cost and the value. Then, the in Equation can be written as
where
Note that is the information that individuals learn about the cost and value through the two learning processes mentioned previously. This information contains the value of , , , and , and evolves as members update their belief of , , , and , over time.
We assume that the error term in Equation follows a type 1 extreme value distribution, which implies that the probability of individual i posting an idea in category j during period t takes the standard logit form. Furthermore, let indicate whether the idea is posted (if the idea is posted, then ; otherwise, ). Finally, the likelihood of posting the idea can be expressed as
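As an illustration of the standard logit form, the Python sketch below computes a posting probability from the deterministic part of the expected utility and evaluates the log-likelihood of a sequence of observed posting decisions. It assumes that the utility of not posting is normalized to zero; the function names and inputs are hypothetical and are not the estimation code used in this paper.

```python
import math


def posting_probability(expected_utility):
    """Binary logit probability of posting, computed in a numerically
    stable way for large positive or negative utilities."""
    if expected_utility >= 0:
        return 1.0 / (1.0 + math.exp(-expected_utility))
    z = math.exp(expected_utility)
    return z / (1.0 + z)


def log_likelihood(expected_utilities, posted):
    """Sum of log-probabilities of the observed decisions.

    expected_utilities : deterministic expected utility in each period
    posted             : 1 if an idea was posted in that period, else 0
    """
    total = 0.0
    for utility, decision in zip(expected_utilities, posted):
        p = posting_probability(utility)
        total += math.log(p) if decision else math.log(1.0 - p)
    return total


if __name__ == "__main__":
    # Illustrative values only.
    print(posting_probability(0.0))
    print(log_likelihood([0.5, -1.0, 2.0], [1, 0, 1]))
```

Maximizing such a log-likelihood over the model parameters, jointly with the implementation likelihood, is one common way to estimate a model of this kind.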
4. Model estimation and result analysis
4.1 Parameter estimates and analysis of cost and value
Taking both sample size and validity into consideration, we select as the model sample a group of individuals who proposed three or more ideas between January 2009 and February 2016. Compared with other members, the selected individuals tend to be more active and to learn faster from a variety of signals; therefore, they fit the model well (details are explained in the following sections). Our sample contains 371 members and 2,466 posted ideas. We estimate the model parameters in MATLAB, and the results are summarized in Table I and Table II. The values that we fix for some parameters that are constant across individuals (pooled parameters) are listed in Table I, while the estimates of the other pooled parameters are presented in Table II.
Comparing the estimates of and , we find that the cost for the firm to implement a category 3 idea (an involvement idea) is the largest, the cost of implementing a category 2 idea (an experience idea) is lower, while the cost of implementing a category 1 idea (a product idea) is the lowest. In addition, and are all smaller than in terms of absolute value, which means members tend to underestimate the cost that the firm incurs when implementing an idea.
In Table II, we see that is smaller than that of , and in terms of absolute value, indicating that the cost signal customers receive when an idea is implemented is fairly precise. Since the total number of implemented ideas in every month remains small, these signals can speed up the learning process of the firm's cost structure. Then, we list the estimates of the parameters that vary across individuals (individual-level parameters) in Table III.
In Figure 2, we plot histograms of the distribution of the individual-level parameters shown in Table III.
From Table III, we see that the average of , which represents the posterior of the value of an idea posted by an individual, is larger than the initial value . Additionally, the variance of is quite large, meaning that the posterior value of an idea differs significantly across individuals. This is related to personal posting behavior and to the votes that an idea receives. That is, a member who submits more ideas learns the value of his or her ideas faster. Meanwhile, the more votes an idea receives, the higher its measured value.
We also observe that the average of is much smaller than the absolute value of , and , that is, the individual mean value of ideas is smaller than the average cost of implementing an idea. This is equivalent to saying that the firm would suffer a loss by carrying out a typical idea, which agrees with the low feedback rate of ideas described before. The average and variance of , which denotes the variance of the value of an individual's ideas, are both quite large, meaning that the value of an individual's own ideas varies widely. Furthermore, we can see from the distribution of that individuals with roughly account for half of all selected members. A small value of implies a stable value of the ideas submitted by individual i. Moreover, good idea contributors usually post ideas of high value, while marginal idea contributors tend to submit ideas of low value. Thus, with a large average and variance means that the members' learning about the value of ideas is not efficient enough, so that the filtering process of idea contributors proceeds slowly.
4.2 Analysis of parameters in utility gain
To explore the relationship between the individual mean value of ideas and other individual-level parameters, we present scatter plots of and against in Figure 3 (a) and (b), respectively.
As shown in Figure 3 (a), most points cluster in the region of and , which means that the average value of ideas posted by the majority of the selected members is slightly larger than zero, and the variance is large as well. The results indicate that good idea contributors make up only a small proportion of the selected members, and the general value of ideas stays at a low level. Interestingly, individuals' ability to raise high-value ideas does not improve significantly through the learning process. Instead, their ability tends to remain steady after they realize the true value of their ideas. In Figure 3 (b), most points cluster in the region of and (which is smaller than the average of , −1.22). From the analysis above, we know that although most individuals raise ideas of normal value, they are not as sensitive to the firm's feedback time as those who usually post high-value or low-value ideas. Thus, if the firm wants to collect more high-value ideas, it should respond to high-value ideas more efficiently.
4.3 The filtering process of the crowdsourcing platform
Our estimates of the model parameters can explicitly represent the filtering process of idea contributors, especially marginal idea contributors, whose ideas are worse than the overall level. In the following analysis, we study the members who posted two or more ideas during the first and the last 20 months, respectively. Figure 4 (a) and (b) illustrate the distributions of average value in the two subgroups.
To examine the difference between new members and early members in the ability to post high-value ideas, Figure 5 presents the relationship between the time when an individual first posted an idea and the average value of his or her ideas. From the clustering of the data points, we observe a great difference in average value among individuals who posted their first idea during the first 40 months (we call them “early members”). These points are scattered across the coordinate plane, and several individuals have posted ideas better than the average level. In contrast, members who posted their first idea during the last 50 months (we call them “new members”) tend to submit ideas that are close to the overall mean value, and no obvious high-value proposer is found. In other words, good idea contributors who are also early members gradually become inactive, while new members cannot raise enough ideas that are as good as those of the previous contributors.
5. Conclusion
We modify an existing structural model to study customers' dynamic learning process using actual data from MyStarbucksIdea.com. We examine the efficiency of crowdsourcing initiatives in the context of customer-involved service innovation.
The results in our paper show that in the early stage of the website, members of the online community not only overestimate the value of their ideas, but also underestimate the cost for the firm to implement ideas. Therefore, members tend to be excessively optimistic and post a large number of new ideas with little value. Through the learning process, individuals gradually realize the true value of their ideas and learn about the firm's cost structure. For marginal idea contributors, the expected utility of posting new ideas starts to drop. As a result, customers' learning process plays the role of self-selection, which filters out marginal idea contributors and leads to a decrease in the number of ideas after the website reaches a stable stage.
However, when the website reaches the stable stage (between the 30th and the 40th month in our model), the mean value of ideas and the cumulative feedback rate start to fall. The reason is that only a few early members remain active; most individuals, including marginal contributors, gradually fade out. In addition, new members are not able to submit enough high-value ideas, so the gap left by good contributors is not filled.





