This study examines whether large language models (LLMs), such as ChatGPT, can enhance individuals' opportunity recognition (OR) capabilities and whether their impact varies across different user groups.
An experimental study was conducted with 63 participants from diverse educational backgrounds in Germany. Participants were divided into an experimental and a control group and tasked with identifying business opportunities. The experimental group used LLMs, while the control group completed tasks without artificial intelligence support. OR capability was assessed by expert evaluation of identified opportunities.
LLMs significantly enhanced OR capability, with greater benefits observed among participants with initially lower OR capabilities. However, proficiency in LLM utilization was not associated with significantly better outcomes.
The findings suggest that incorporating LLMs into entrepreneurial education and training programs could significantly enhance participants' ability to recognize and develop business opportunities, particularly for those with initially lower OR capabilities.
This study provides the first empirical evidence on the role of LLMs in the entrepreneurial OR process, highlighting their potential to democratize entrepreneurship by supporting individuals with varying skill levels.
1. Introduction
In the contemporary era, artificial intelligence (AI) has emerged as a transformative technology with the potential to revolutionize various sectors. AI, particularly large language models (LLMs) like ChatGPT, represents a significant leap in computational capabilities (Brown et al., 2020). These models can process vast amounts of data and generate human-like responses, supporting complex problem-solving and decision-making processes (Bommasani et al., 2021). AI therefore functions on two levels: as an integral part of innovative solutions and as a potentially transformative tool that enables the recognition of entrepreneurial opportunities, thus driving innovation and entrepreneurship from a new angle.
Individuals' opportunity recognition (OR) capabilities are regarded as central to entrepreneurship: prior research shows that individuals who are better at identifying potential business opportunities are more likely to start their own businesses, and that the entrepreneurial process always begins with the identification of a business idea (Shane and Venkataraman, 2000). By demonstrating that LLM utilization may increase OR capability in entrepreneurship (Kaplan and Haenlein, 2019), this study could have wide-ranging implications, from challenging traditional viewpoints on OR to paving the way for new approaches to integrating AI into entrepreneurship education and ultimately inspiring more inclusive forms of entrepreneurship.
Despite the theoretical and practical potential, empirical research examining the impact of LLMs on OR capability remains limited. This is in line with Elia et al. (2020), who highlight the scarce discussion in the literature on the real impact of digital technologies on the entrepreneurial process. Since the transformational potential of digital technologies is known (Giuggioli and Pellegrini, 2023; Salamon, 2020; Kraus et al., 2019) but their overall impact on entrepreneurship cannot yet be fully assessed, it is essential to build knowledge on both a theoretical and practical level. This is supported by Thanasi-Boçe and Hoxha (2024), who highlight the potential of LLMs to support students and aspiring entrepreneurs in areas such as opportunity identification but also emphasize the current lack of evidence-based validation. Accordingly, this study addresses this gap by providing the first empirical evidence on how LLMs influence a core entrepreneurial skill: OR.
Recognizing that OR is a multifaceted construct influenced by various factors, the authors build upon the framework proposed by Mary George et al. (2016), who identified six key pillars contributing to an individual's OR capability: prior knowledge, social capital, cognition, environmental conditions, entrepreneurial alertness and systematic search.
The study draws on a sample of 63 university students, a group that represents a critical demographic in entrepreneurship research due to their potential as future entrepreneurs (Basu and Virick, 2008). Furthermore, as digital natives, students are frequent users of LLMs, making them an ideal target group for studying the impact of LLMs in this specific context. Moreover, their similar stages of personal and professional development offer a more homogeneous level of knowledge and experiences, thereby reducing potential confounding factors and enhancing the internal validity of the study.
The experimental design assesses the overall effect of LLM usage on individuals' ability to recognize and develop potential business opportunities based on a given scenario. This approach captures the impact of LLM utilization on the OR process, providing valuable insights into how this technology may empower aspiring entrepreneurs.
2. Theoretical foundation
This section provides the theoretical foundation for this study by first outlining the evolution of digital technologies, AI and LLMs, followed by a review of OR in the entrepreneurial context, and concluding with the development of the hypotheses based on the intersection of these domains.
2.1 Digital technologies and the evolution of AI and LLMs
Digital technologies can significantly transform the way work is performed, impacting both intellectual professions (Salamon, 2020) and more traditional work environments. Their integration into daily workflows increases the speed and precision of execution (Trivelli et al., 2019) while influencing individual behavior and capabilities (Andriole, 2017; Anim-Yeboah et al., 2020). As a result, technologies can be viewed not only as an “input” factor (Giones and Brem, 2017; Davidson and Vaast, 2010), but also as an “enabling” factor (Sussan and Acs, 2017; Guthrie, 2014), driving new ways of working and altering the context in which individuals operate (Kraus et al., 2019). AI stands out as one of the most significant digital technologies transforming modern work practices, as it is assumed to have a more significant impact than both the industrial and digital revolutions combined (Giuggioli and Pellegrini, 2023).
AI is defined as a system's ability to interpret external data, to learn from it and to use those learnings to achieve specific goals and tasks through flexible adaptation (Kaplan and Haenlein, 2019). From a technological perspective, the architecture of AI systems is based on three fundamental layers (Taddy, 2018): domain structure, data generation and general-purpose machine learning (ML). The domain structure refers to the specialized knowledge needed for a problem and its context, while data generation pertains to the large datasets and the ongoing data creation required for training the AI and fueling the learning algorithms. The final piece of the puzzle is ML, which serves as the “core” of an AI system and is responsible for identifying patterns and making predictions from unstructured data. Based on these complex and flexible algorithms, tasks that traditionally required human cognition can now be automated (Duan et al., 2019; Nishant et al., 2020), broadening the scope for AI applications and commercial uses (Chalmers et al., 2021).
One of the latest developments in AI is the introduction of the LLM ChatGPT by OpenAI (AlAfnan et al., 2023). ChatGPT, fully termed as “Chat Generative Pre-trained Transformer,” is an AI model based on a highly effective neural network architecture named “Transformer architecture”, which is well known for its proficiency in text processing and generation (Teubner et al., 2023). To be precise, through text-based and chat-centric interactions (Rahman and Watanobe, 2023), the model allows users to receive coherent and meaningful answers based on the inputs, which in turn are made possible by its comprehensive corpus of textual data from diverse sources, such as books, articles and online content (Ausat et al., 2023; Rahman and Watanobe, 2023).
2.2 Opportunity recognition in entrepreneurship
To create business ideas that address a serious pain point by solving a significant problem or adding additional value for their customers, OR is seen as being highly important (Salimi, 2021). The ongoing academic debate surrounding OR centers on two theoretical perspectives: the discovery view and the creation view (Berglund et al., 2020; McBride and Wuebker, 2022). The discovery view posits that opportunities exist independently of individuals and are uncovered by perceptive entrepreneurs (Shane, 2000; Ramoglou and McMullen, 2024). According to this perspective, opportunities arise from external changes or imbalances in the environment, which can be recognized and exploited by individuals who are attuned to these shifts (Mary George et al., 2016). In contrast, the creation view suggests that entrepreneurs do not merely discover pre-existing opportunities but actively shape and construct them through their ventures and interactions (Sarasvathy, 2001; Alvarez and Barney, 2007).
Regardless of which theoretical perspective is adopted, researchers agree that OR is a fundamental component of entrepreneurship (Hanohov and Baldacchino, 2018). Prior research primarily focused on answering the questions of how, when and why some individuals can recognize opportunities, whereas others do not. As a result, a large number of studies attempt to unfold OR, identifying various influencing factors such as educational, professional and social contexts (Cooper and Park, 2008), entrepreneurship concepts and skills (Kourilsky and Walstad, 1998), specific market knowledge and experiential knowledge (Mejri and Umemoto, 2010), network ties (Hite, 2005), mentors (Ozgen and Baron, 2007), an individual's creativity (Baron and Tang, 2011) and mental connections (Grégoire et al., 2010). This broad array of influencing factors highlights the multifaceted nature of OR and underscores its importance as a central component of entrepreneurial success (Hanohov and Baldacchino, 2018).
Nevertheless, in a 2016 meta-analysis, Mary George et al. (2016) identified six factors as most prominent, namely prior knowledge, social capital, cognition, environmental conditions, entrepreneurial alertness and systematic search. Regarding prior knowledge, various studies indicate that personal and professional experience acts as a cognitive resource that enables entrepreneurs to identify and exploit opportunities (Audretsch, 2005; Vaghely and Julien, 2010). This knowledge can be categorized into understanding markets, ways to serve them and customer problems (Shane, 2000). It often needs to be combined with other sources, such as education and social connections, which assist people in their OR capability (Lettl et al., 2008; Van Gelderen, 2007). Prior knowledge thus serves as a crucial cognitive resource that enables entrepreneurs to synthesize diverse information for effective opportunity identification (Shane, 2000; Haynie et al., 2009). Social capital refers to access to critical information and resources (Ardichvili et al., 2003), which is enabled by diverse networks in different areas and places. In terms of OR, it is assumed that individuals benefit from their network as it allows them to recognize opportunities through pattern recognition and peripheral vision (Baron and Markman, 2000; Tang, 2010). Regarding cognition, Mary George et al. (2016) show that the main attributes discussed include creativity, self-efficacy, the propensity to assume risks, the need for achievement, the need for independence and locus of control. Studies have demonstrated that high levels of intellectual capital and creativity are essential for identifying an opportunity (Ardichvili et al., 2003; Nicolaou et al., 2009; Ramos-Rodriguez et al., 2010), as the entrepreneur's creative capacity is higher than that of the average population (Heinonen et al., 2011).
Environmental conditions, such as market trends, competition, regulatory frameworks and economic conditions, influence OR by supplying new information that supports the processes of opportunity discovery or creation (Schumpeter and Nichol, 1934). This indicates that the existence of entrepreneurial opportunities is contingent on the availability of information about societal resources (Singh et al., 2008). However, individuals often lack knowledge about these resources and how to use them efficiently for opportunity discovery and exploitation (Shane and Venkataraman, 2000), which may explain why some people always discover opportunities and others never do. Entrepreneurial alertness refers to the ability to identify opportunities even without actively searching for them (similar to the concept of serendipity; Dew, 2009) or by simply observing a phenomenon (Mary George et al., 2016), and is in turn also based on other factors such as prior knowledge and experiences, absorptive capacity and network ventures (Ardichvili et al., 2003). The last factor, systematic search, can be broadly characterized as the antithesis of OR through alertness, although some studies contest a clear juxtaposition (Murphy, 2011). By actively searching within a known information domain, individuals can obtain specific information to overcome a potential lack of prior knowledge (Mary George et al., 2016).
2.3 What is known about the impact of LLMs on entrepreneurship
From an entrepreneurial perspective, LLMs provide access to collective intelligence and knowledge, significantly reducing barriers between invention and new venture creation (Kelly, 2017). By changing how entrepreneurs interpret and respond to their environment, these technologies may enable the discovery of new opportunities and the development of more effective strategies to exploit them (Dellermann et al., 2020). Several emerging studies and reports have speculated about the potential of tools such as ChatGPT to enhance entrepreneurship, for example, by supporting opportunity identification, business simulation and business planning (Thanasi-Boçe and Hoxha, 2024). While such work underscores the promise of LLMs, it remains largely conceptual and offers little to no empirical evidence of their actual effectiveness. As noted by Thanasi-Boçe and Hoxha (2024) and Elia et al. (2020), there is a striking lack of research demonstrating whether LLMs can truly foster key entrepreneurial skills such as OR. This gap points to the need for empirical studies that move beyond assumptions and explore the tangible impact of LLMs on entrepreneurial outcomes.
2.4 Hypotheses development
While traditional influencing factors of OR are well studied (Mary George et al., 2016), the emergence of AI, specifically LLM, could point the way to a potential paradigm shift in recognizing entrepreneurial opportunities. Various authors have already suggested that digital technologies are generally transforming the nature and scope of entrepreneurial activity (Nambisan, 2017; Chalmers et al., 2021; Jami Pour et al., 2024). In fact, AI has been described as creating the most significant entrepreneurial opportunity in the history of civilization (Iansiti and Lakhani, 2020), as it enables entrepreneurs to identify new opportunities and introduce innovative products or services (Obschonka and Audretsch, 2020).
Several practical examples exist relating to how specialized AI solutions enhance specific systems and processes (Jan et al., 2023). Even though these examples demonstrate how specialized and setting-specific AIs can improve the recognition of opportunities within certain tailored environments, it cannot be assumed that LLMs, as universal, human-centric tools, can assist individuals' OR capabilities in the same way in a broader environment such as general entrepreneurial opportunities. To examine this question, the traditional pillars of OR provide a robust framework to discuss the potential impact of LLMs on this dynamic domain.
Starting with prior knowledge, which is considered a cornerstone of OR (Mary George et al., 2016), the introduction of LLM could provide a key advantage. This technology, based on big data and advanced algorithms (Taddy, 2018), could serve as an extension of users' personal and professional knowledge (Hmoud et al., 2024). In entrepreneurship, where understanding markets, customer needs and service delivery is essential (Shane, 2000), LLMs could act as a valuable tool to bridge knowledge gaps (Kernan Freire et al., 2023). An indication for this comes from a study by Dell’Acqua et al. (2023), which demonstrated that users significantly benefited from having AI augmentation, particularly those participants with below-average skills required for a task. Therefore, the authors argue that the utilization of LLM can enhance OR capabilities, especially for individuals with a potential lack of know-how and thereby reduce differences in performance.
Regarding social capital, traditionally cultivated through networks and interpersonal relationships (Ardichvili et al., 2003), LLMs may not substitute face-to-face human interactions. However, they might provide an alternative access point to valuable information and resources. Xie et al. (2021) suggest that especially bridging social capital (meaning loose connections across a cleavage into interdisciplinary backgrounds) has a positive effect on OR capability compared to bonding social capital (referring to strong ties). By providing insights that the user is unaware of and that might otherwise only be accessible through personal networks, LLMs could alleviate the limitations created by the absence of such networks, opening the door to new, unexpected business opportunities.
Furthermore, cognition is a critical component of OR. In a 2023 study by Cropley, which is based on ChatGPT's performance on the Divergent Association Task (DAT), the AI shows significant promise in creativity, outperforming a substantial portion of human respondents. While the mean DAT score of GPT3.5 surpasses 64.93% of human scores, GPT4 exceeds 82.54%, suggesting a level of creativity superior to most humans (Cropley, 2023). Although the author himself cautions against generalizing these results due to the limited sample size, the findings nonetheless offer compelling first evidence of ChatGPT's high cognitive and creative capabilities. Nevertheless, when interpreting the results of studies like this, it should be noted that the test itself, the algorithm of the score calculation, as well as sample solutions, could be part of the training data and, therefore, the performance could potentially be explained by factors other than creativity.
In terms of environmental conditions, the influence of LLM seems unlikely in the first instance, as these are external and (given) circumstances. However, the ability to identify these (environmental) opportunities depends not only on their existence, but on the awareness of the problem's existence. In turn, this awareness of specific opportunities depends on the different structures of the network in which individuals are embedded (Arenius and Clercq, 2005). While people are likely to see situations from their unique perspective (Malhotra, 2004), LLMs could expand these boundaries and reflect a different perspective through the heterogeneity of the collected data. This expansive view through the utilization of LLMs could help users to exit their own “bubble” and overcome information asymmetries and the bias of human emotions, two factors that create barriers to developing solutions for environmental sustainability (Cullen‐Knox et al., 2017). For example, LLMs could help individuals adopt the target group's perspective to reveal insights and solutions that might otherwise be overlooked from an outsider's viewpoint, thus improving the individual's perspective on environmental conditions.
Entrepreneurial alertness can be defined as the ability to scan and search for information, linking previously disconnected information and assessing the existence of profitable business opportunities (Tang et al., 2012). According to Kirzner (1979), who introduced the term to the entrepreneurial literature, it is a unique perceptual ability that can at least partly be understood as an intuitive skill (Mitchell et al., 2005). Accordingly, it can be argued that the potential influence of LLMs on entrepreneurial alertness is less significant. Rather, it raises the question of the extent to which this particular component of the ability to recognize opportunities recedes into the background if LLMs can already independently make sense of seemingly unrelated information and point out specific opportunities (Alshater, 2022), making this instinctive ability potentially and partly obsolete.
The impact of LLMs on systematic search is considered enormous, as they change not only the potential output but also the entire search process itself. Whereas users previously had to spend considerable time gathering relevant information, LLMs can now deliver relevant data automatically within a very short time (Alshater, 2022), allowing users to concentrate on analyzing and interpreting this information. Accordingly, LLMs not only seem to improve quality (Cropley, 2023) through the targeted question-and-answer principle but also significantly increase efficiency by reducing the time required (Teubner et al., 2023; Bin-Nashwan et al., 2023). Overall, LLMs could therefore increase the range, speed and accuracy of systematic searches and enable users to identify more options in less time.
In summary, the authors postulate that the integration of LLMs into the OR process has the potential to influence and, in some cases, completely change the traditional pillars of OR. The adaptation of LLM tools could therefore represent a significant improvement in the ability to identify and utilize business opportunities.
H1. The utilization of LLMs increases the OR capability of individuals.
Furthermore, based on the literature and the initial findings of Dell’Acqua et al. (2023), the authors assume that LLMs can bridge specific knowledge gaps in particular and thus have a compensatory effect: people with initially lower OR capabilities could benefit more from the support provided by LLMs than people with already high OR capabilities. Based on these considerations, the following hypothesis can be formulated:
H2. Individuals with below-average OR capability benefit more from the use of LLMs than individuals with above-average OR capability.
Nevertheless, it should be noted that LLM tools like ChatGPT still involve text-based and chat-centric interactions between humans and AI. This implies that the quality of the output depends on the human input and is, therefore, significantly influenced by the user's prompt engineering skills (Korzynski et al., 2023). While some studies suggest that, depending on the context and in less complex tasks, such as brainstorming, no significant effect of individuals' LLM skills on performance should be expected (Reynolds & McDonell), we argue that in our context these skills are likely to play a role. Consequently, the effective exploitation of LLM potential in OR is expected to be linked to the ability to use these tools proficiently, representing an additional influencing factor. Based on these considerations, the following hypothesis can be formulated:
H3. Individuals' LLM-skills increase their OR capabilities when using LLMs.
To sum up, Figure 1 illustrates the argumentation chain and the theoretical foundations supporting the hypotheses.
Figure 1. Theoretical foundation based on literature. Figure created by authors based on literature. The flowchart consists of three rectangles arranged from left to right. The first, titled “LLM Literature” and captioned “Main literature arguing for the impact of LLMs on OR pillars,” lists Freire et al. (2023), Taddy (2019), Xie et al. (2021), Cropley (2023), Cullen-Knox et al. (2017), Alshater (2022), Tang et al. (2012) and Teubner et al. (2023). Six arrows lead from it to the second rectangle, captioned “Influencing factors on OR - Pillars,” which contains six boxes labeled “Prior knowledge,” “Social capital,” “Cognition,” “Environmental conditions,” “Enterprise alertness” and “Systemic search.” Arrows from these boxes lead to the third rectangle, labeled “Opportunity recognition.” The second and third rectangles are jointly titled “George et al. (2016) based on evaluation of 180 articles.” A curly bracket below the three rectangles lists the hypotheses: H1: The utilization of LLM increases the OR capability of individuals. H2: Individuals with below-average OR capability benefit more from the use of LLMs than individuals with above-average OR capability. H3: Individuals’ LLM-skills increase their OR capabilities when using LLMs.
3. Method and materials
3.1 Data collection procedures
In the study, an experimental approach to measure the influence of LLM utilization on individuals' OR capabilities was employed. For this purpose, an online test was created, where participants were randomly divided into two groups on the first page by an automatic randomizer integrated into the survey tool, with the groups differing primarily in terms of the tools allowed.
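The random split described above can be sketched in a few lines. This is a minimal illustration only: the survey tool's actual randomizer is not described in the text, and the function name `assign_groups`, the participant IDs and the seed are hypothetical. The sketch assumes each participant is assigned independently, which is one reason group sizes can end up unequal.

```python
import random


def assign_groups(participant_ids, seed=None):
    """Assign each participant independently to group 'A' or 'B'.

    'A' stands for the experimental group (LLM allowed in round 2),
    'B' for the control group. Independent draws are an assumption;
    the survey tool's real procedure is not documented in the text.
    """
    rng = random.Random(seed)  # local generator, so results are reproducible
    return {pid: rng.choice(["A", "B"]) for pid in participant_ids}


# 63 hypothetical participant IDs, matching the reported sample size.
participants = [f"P{i:02d}" for i in range(1, 64)]
groups = assign_groups(participants, seed=42)
```

Fixing the seed makes the allocation reproducible for auditing, while leaving it unset gives a fresh allocation on every run.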
After participation, the anonymized ideas were then presented in random order to a panel of experts via an online survey. The experts evaluated the ideas individually in a blind review process based on a predefined assessment matrix. Their evaluations were used to determine each participant's final OR capability score with and without LLM utilization.
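The blind review and scoring step can be illustrated as follows. The sketch is an assumption-laden example rather than the authors' procedure: the criteria names (`novelty`, `feasibility`), the idea codes and the rule of averaging first across criteria and then across experts are all hypothetical, since the assessment matrix is not specified in this section.

```python
import random
from statistics import mean


def blind_review_order(ideas, seed=None):
    """Return anonymized ideas in random order plus a private lookup key.

    ideas: dict mapping participant ID -> idea text.
    Experts only see (idea_code, text); the key stays with the researchers.
    """
    rng = random.Random(seed)
    pids = list(ideas)
    rng.shuffle(pids)  # random presentation order
    key = {f"idea_{n:03d}": pid for n, pid in enumerate(pids, start=1)}
    items = [(code, ideas[pid]) for code, pid in key.items()]
    return items, key


def or_score(expert_ratings):
    """Average each expert's criteria ratings, then average across experts.

    expert_ratings: list with one dict per expert, criterion -> rating.
    The aggregation rule is an assumption made for illustration.
    """
    per_expert = [mean(list(r.values())) for r in expert_ratings]
    return mean(per_expert)


# Hypothetical ratings from two experts for a single idea.
ratings = [
    {"novelty": 4, "feasibility": 2},
    {"novelty": 5, "feasibility": 3},
]
score = or_score(ratings)
```

Keeping the lookup key separate from the review items is what makes the review blind: scores are re-attached to participants only after all evaluations are complete.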
3.2 Experimental design
After randomly assigning the participants to either Group A (experimental group) or Group B (control group), at a time point of t = 0, participants in both groups were asked to complete a survey (see Figure 2).
Figure 2. Experimental design. Figure created by authors. The participants (N = 63) are split into an “Experiment Group,” labeled Group A (N = 13), and a “Control Group,” labeled Group B (N = 20). Both groups pass through five stages: t = 0 (group assignment); t = 1 (Experiment 1, comprising Task 1 and Task 2); t = 2 (expert evaluation and self-evaluation of opportunity recognition capability); t = 3 (manipulation: usage of an LLM in the experiment group, no usage of an LLM in the control group, followed by Experiment 2, comprising Task 2 and Task 1); and t = 4 (expert evaluation and self-evaluation of opportunity recognition capability).
In t = 1, a scenario was presented to both groups in written form. Again, the participants were randomly assigned to start with task 1 or task 2. Based on this given scenario, the participants were tasked to identify one profitable business opportunity within a given time. The identified ideas from round 1 were evaluated by experts using a blind review process based on a predefined assessment matrix (t = 2). Additionally, participants were asked to evaluate their own identified ideas individually, using the same matrix as the experts.
In t = 3, the participants were asked to take part in the second round of the experiment, in which those who had received task 1 in round 1 now completed task 2, and those who had received task 2 now completed task 1. The difference between Groups A and B was that Group A was explicitly asked to use an LLM (ChatGPT 3.5) when completing the task in round 2, whereas LLM utilization was forbidden in Group B. In t = 4, the participants were again asked to self-assess their business idea. Furthermore, the experts evaluated the identified opportunities, delivering the individual OR capability score for round 2. By comparing the OR capability scores of Group A from rounds 1 and 2, conclusions can be drawn about the effect of LLMs on individuals' OR capability.
3.3 Experiment task
To conduct the experiment, two different scenarios based on the SDGs were formulated. A sustainable scenario was chosen for the experiment, as identifying profitable business opportunities in a sustainable setting seems to be particularly challenging. Only 26% of start-ups focus on sustainability, highlighting a specific bottleneck in the recognition capabilities of sustainable opportunities (Kollmann et al., 2022). Since sustainable businesses must meet economic, social and environmental criteria (Dentchev et al., 2016), this adds complexity to the OR process and makes potential differences in performance for the experiment more visible. When formulating the tasks, the wording of a study by Vandor and Franke (2016) was followed, which has a similar experimental setup but with different manipulators.
Task 1: We know that in modern societies, minorities often go unnoticed.
An entrepreneur wants to launch a new product/service in Germany that promotes inclusion. It should clearly stand out from existing offers and appeal to many customers. Please make a proposal for an innovative and realizable product or service with which the entrepreneur can make a profit.
Task 2: We know that modern cities produce many types of waste.
An entrepreneur wants to launch a new product/service in Germany based on the reuse of waste materials. It should clearly stand out from existing offers and appeal to many customers. Please make a proposal for an innovative and feasible product or service with which the entrepreneur can make a profit.
3.4 Measurement items
Apart from some questions regarding demographic background, all items were based on existing constructs from the literature and were measured on a 7-point Likert scale. To measure LLM skills, seven items from Strzelecki (2024) were used that assess “facilitating conditions”, “habits” and “behavior” in the context of ChatGPT use, such as “Using ChatGPT has become natural for me.” As a sustainable scenario was chosen for the experiment, individual Attitudes towards Sustainability (AtS) were also surveyed to check for potential biases that could affect performance within this specific scenario. To measure AtS, Biasutti and Frate's (2017) 15-item scale was used, which is aligned with the SDGs and covers economic, environmental and societal perspectives. For example, an economic item is, “Government economic policies should increase sustainable production even if it means spending more money.”
Instead of relying on self-assessment to measure OR, as commonly seen in the literature (Kuckertz et al., 2017), the authors implemented a task-based approach, as the effects of tools (here: LLMs) are only measurable in their application. The validity of a task-based measurement approach to measure OR capability is supported by Vandor and Franke (2016), who demonstrated that the quality of opportunities identified in an experimental setting also correlates with the quantity of opportunities individuals recognized in their general lives. To control for this relation in the experiment as well, the total number of ideas participants identified over their lifetime was assessed, using a single question: “How many business ideas have you identified in your life so far?”
The experts utilized four measurement items developed by Davidsson et al. (2021) to evaluate individuals' OR capabilities. One example of the items is: “Someone could turn this idea into a successful business.”
3.5 Sample
Before starting the main study, the authors first conducted a pre-test with 28 participants to check the general validity of the experiment. Then, graduate students from a wide range of educational institutions and study fields in Germany were asked to participate in the study. In this way, the authors aimed to provide a holistic view of the impact of LLMs on OR capabilities and reduce the risk of pre-selection (e.g. through technology-affine students). The authors examined the dataset for anomalies or missing values and had to remove five participants. For example, the data of one participant were excluded because he or she explicitly stated in the response section that he or she had already worked on the idea before. This suggested that the idea had not been generated or identified as part of the experiment. After cleaning, a total of 630 data points (ratings of business ideas) remained for analysis. These come from 63 participants, each of whom identified two business ideas based on two different scenarios, and each idea was evaluated by five independent experts (63 × 2 × 5 = 630). The research sample comprised participants with a diverse set of demographics (Table 1).
Demographics and descriptive statistics
| Variables | Categories | Total amount | Share [%] | Experimental group | Control group |
|---|---|---|---|---|---|
| Gender | Male | 41 | 65.1 | 27 (62.8%) | 14 (70%) |
| | Female | 21 | 33.3 | 15 (34.9%) | 6 (30%) |
| | Diverse | 1 | 1.6 | 1 (2.3%) | 0 |
| Age | Min | | | 21 | 20 |
| | Max | | | 40 | 36 |
| | Mean | | | 25 | 25.5 |
| Degree | Bachelor | 14 | 21.9 | 6 (14%) | 8 (40%) |
| | Master | 48 | 75 | 37 (86%) | 11 (55%) |
| | Diploma | 1 | 1.6 | 0 | 1 (5%) |
| Field of study | Natural Sciences | 2 | 3.1 | 0 | 2 (10%) |
| | Engineering | 18 | 28.1 | 12 (27.9%) | 6 (30%) |
| | Social Science | 2 | 3.1 | 2 (4.7%) | 0 |
| | Economics | 32 | 50 | 24 (55.8%) | 8 (40%) |
| | Others | 10 | 15.6 | 5 (11.6%) | 4 (20%) |
| Sample size | | 63 | | 43 | 20 |
In terms of demographic and educational background, the experimental and control groups show a broadly balanced distribution. Each group has a mix of genders, with males making up the majority of both groups (62.8% in the experimental group and 70% in the control group). The age range is very similar for both groups, with participants in the experimental group ranging from 21 to 40 years old (mean = 25), and in the control group from 20 to 36 years old (mean = 25.5). Economics and Engineering are the most common fields of study, comprising 50% and 28.1% of the total sample, respectively. This consistency in demographics supports the comparability of the two groups and the credibility of the study's subsequent analyses and conclusions.
As described above, a panel of experts was set up to evaluate the ideas. To do so, potential experts were screened by surveying 20 candidates, five of whom met the criteria and formed the expert panel. The selection criteria included a central focus on assessing business ideas as part of professional responsibilities, financial decision-making authority and self-identification as experts in business evaluation. Candidates rated their qualifications on a 7-point Likert scale, and only those scoring 5 or higher were included. All selected experts had at least five years of relevant experience; three had backgrounds in consulting, while two had expertise in IT and finance. The panel was predominantly male (four out of five), with three members aged 30–40 and two aged 40–50, resulting in a highly experienced and competent group.
4. Findings
4.1 Data validity
After confirming the internal consistency of the multi-item measures (LLM skills: Cronbach's α = 0.86; AtS: Cronbach's α = 0.82), the analysis tested whether the general assumption proposed by Vandor and Franke (2016), namely that the number of business ideas individuals identify over their lifetime correlates with their performance, also holds true in this sample. Confirming this assumption would provide fundamental support for the validity of the task-based measurement method. As shown in the correlation matrix (Table 2), this was indeed the case, with a strong positive correlation (r = 0.730, p < 0.001).
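The internal-consistency check can be illustrated with the standard formula for Cronbach's α, α = k/(k − 1) · (1 − Σ item variances / variance of the sum score). The following is a minimal sketch only: the function name `cronbach_alpha` and the ratings are hypothetical and are not the study data, and the software the authors actually used is not stated here.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of items, each a list of respondent scores."""
    k = len(item_scores)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Sample variance of each item across respondents
    item_variances = [variance(item) for item in item_scores]
    # Sum score per respondent (one entry per respondent)
    totals = [sum(responses) for responses in zip(*item_scores)]
    return (k / (k - 1)) * (1 - sum(item_variances) / variance(totals))


# Hypothetical 7-point Likert ratings: 3 items x 5 respondents (not study data)
demo_items = [
    [5, 6, 3, 7, 4],
    [4, 6, 2, 7, 5],
    [5, 7, 3, 6, 4],
]
alpha = cronbach_alpha(demo_items)
print(round(alpha, 3))
```

Values close to 1 indicate that the items vary together; the α values of 0.86 and 0.82 reported above are commonly read as good internal consistency.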
Correlation matrix
| Variable | Statistic | Age | Degree | Gender | Field of study | General ideas over lifetime | LLM_Skill | AtS_Score | ORC_EGCG_noGPT_SA | ORC_EG_withGPT_SA | ORC_EGCG_noGPT_Exp | ORC_EG_withGPT_Exp |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Age | Pearson correlation | – | | | | | | | | | | |
| Degree | Pearson correlation | 0.241 | – | | | | | | | | | |
| | Sig. (2-sided) | 0.145 | | | | | | | | | | |
| Gender | Pearson correlation | −0.285 | −0.315 | – | | | | | | | | |
| | Sig. (2-sided) | 0.083 | 0.054 | | | | | | | | | |
| Field of Study | Pearson correlation | −0.139 | 0.472** | 0.031 | – | | | | | | | |
| | Sig. (2-sided) | 0.404 | 0.003 | 0.854 | | | | | | | | |
| General Ideas over Lifetime | Pearson correlation | 0.029 | −0.416** | 0.013 | −0.274 | – | | | | | | |
| | Sig. (2-sided) | 0.864 | 0.009 | 0.940 | 0.096 | | | | | | | |
| LLM Skill | Pearson correlation | 0.059 | 0.197 | −0.085 | −0.057 | 0.215 | – | | | | | |
| | Sig. (2-sided) | 0.723 | 0.236 | 0.612 | 0.734 | 0.195 | | | | | | |
| AtS Score | Pearson correlation | 0.107 | −0.057 | 0.230 | −0.036 | −0.146 | −0.469** | – | | | | |
| | Sig. (2-sided) | 0.522 | 0.732 | 0.165 | 0.832 | 0.383 | 0.003 | | | | | |
| ORC_EGCG_noGPT_SA | Pearson correlation | 0.281 | −0.063 | −0.142 | −0.133 | 0.230 | 0.094 | 0.224 | – | | | |
| | Sig. (2-sided) | 0.088 | 0.706 | 0.396 | 0.425 | 0.164 | 0.575 | 0.177 | | | | |
| ORC_EG_withGPT_SA | Pearson correlation | −0.123 | 0.023 | 0.092 | 0.101 | 0.158 | 0.374* | −0.220 | 0.105 | – | | |
| | Sig. (2-sided) | 0.462 | 0.891 | 0.584 | 0.545 | 0.344 | 0.021 | 0.185 | 0.532 | | | |
| ORC_EGCG_noGPT_Exp | Pearson correlation | 0.052 | −0.348* | −0.128 | −0.271 | 0.730** | −0.021 | 0.088 | 0.061 | −0.096 | – | |
| | Sig. (2-sided) | 0.757 | 0.032 | 0.443 | 0.100 | <0.001 | 0.899 | 0.601 | 0.714 | 0.567 | | |
| ORC_EG_withGPT_Exp | Pearson correlation | −0.193 | 0.060 | 0.133 | 0.142 | 0.175 | 0.157 | −0.531** | −0.154 | 0.041 | 0.199 | – |
| | Sig. (2-sided) | 0.246 | 0.720 | 0.425 | 0.396 | 0.295 | 0.346 | <0.001 | 0.357 | 0.805 | 0.231 | |
Note(s): Measurement Items:
AtS Score: Attitude towards Sustainability
ORC_EGCG_noGPT_SA: Opportunity Recognition capability of Experimental Group and Control Group in task without LLM support, measured by Self Evaluation
ORC_EG_withGPT_SA: Opportunity Recognition capability of Experimental Group in task with LLM support, measured by Self Evaluation
ORC_EGCG_noGPT_Exp: Opportunity Recognition capability of Experimental Group and Control Group in task without LLM support, measured by Expert Evaluation
ORC_EG_withGPT_Exp: Opportunity Recognition capability of Experimental Group in task with LLM support, measured by Expert Evaluation
**. The correlation is significant at a level of 0.01 (2-sided)
*. The correlation is significant at a level of 0.05 (2-sided)
After confirming a balanced sample and the general validity of the task-based measurement approach, other possible sources of bias and the basic stability of the experimental setup had to be examined. Therefore, multiple tests were conducted:
First, an a priori power analysis using G*Power was conducted, which recommended a minimum of 35 participants (effect size = 0.8, alpha = 0.05, power = 0.95). With 43 participants in the experimental group, this threshold was exceeded, providing adequate power to test the hypotheses.
To ensure no significant differences between the experimental and control groups regarding their initial capability, two t-tests were conducted. The first t-test compared the performance of the experimental subgroup that initially completed task 1 without LLM utilization with the control subgroup that also completed task 1. The same test was done with task 2.
Next, it was tested whether general learning effects might occur between rounds 1 and 2, as this could weaken the ability to attribute any observed improvements to the use of LLMs. To exclude such potential learning effects, the control group was used for comparison (t-test). The authors specifically checked whether participants generally performed better in the second round than in the first.
Finally, because tasks 1 and 2 might differ fundamentally in difficulty, their order was varied randomly in the experimental setup. Additionally, a final t-test was conducted to examine whether task 1 and task 2 differed in difficulty. The results of task 1 completed without LLM utilization were compared with the results of task 2 completed without LLM utilization.
In all cases, the tests revealed no significant differences. Therefore, it was concluded that the groups were comparable in terms of initial capability, that no general learning effects occurred between the rounds, and that task 1 and task 2 were of similar difficulty.
4.2 Impact of LLMs on opportunity recognition capabilities
To test the main hypothesis on the effect of LLMs on OR capabilities, a paired t-test was conducted within the experimental group. The authors examined the extent to which individuals' OR capability score in the task without LLM utilization differed from their OR capability score when using LLMs. To reduce the risk of bias due to the students' self-perception, the experts' assessment of the OR capabilities was considered first. The results of the paired t-test show a significant difference between the participants' performance in identifying business ideas with and without LLM utilization. The mean difference in performance was −1.884, with a standard deviation of 1.264 and a standard error of the mean of 0.193. The 95% confidence interval of this difference was between −2.273 and −1.495, indicating that participants performed significantly better with LLM utilization than without. The t-value for this difference is −9.771, with 42 degrees of freedom, resulting in a one-sided p-value of <0.001 and a two-sided p-value of <0.001. These statistical results highlight the significant improvement in participants' performance in recognizing profitable business ideas with the support of LLMs. Effect sizes were calculated using Cohen's d and Hedges' correction to assess the strength of the effect of LLM support on performance. Cohen's d yielded a value of 1.264, with a 95% confidence interval between −1.490 and −1.922. Hedges' correction, which applies a correction factor for small sample sizes, yielded a slightly higher value of 1.287, with a 95% confidence interval between −1.463 and −1.887. These effect sizes indicate a strong positive effect of LLM utilization on students' performance in identifying business ideas, corresponding to a substantial improvement in OR capability and potentially increasing the likelihood of successfully identifying and pursuing viable ventures.
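The paired t-test figures above can be retraced from the reported summary statistics alone. The sketch below assumes that the difference was computed as "score without LLM minus score with LLM" (inferred from the negative sign), takes the critical t value for 42 degrees of freedom from a standard table rather than computing it, and uses the common approximation 1 − 3/(4·df − 1) for Hedges' correction factor. All variable names are illustrative.

```python
import math

n = 43               # size of the experimental group (from the text)
mean_diff = -1.884   # reported mean difference (assumed: without minus with LLM)
sd_diff = 1.264      # reported standard deviation of the paired differences

df = n - 1
se = sd_diff / math.sqrt(n)   # standard error of the mean difference
t_stat = mean_diff / se       # paired t statistic

# Two-sided 95% critical value of Student's t for 42 df, taken from a
# standard t table (hard-coded assumption, not computed here).
T_CRIT_42 = 2.018
ci_low = mean_diff - T_CRIT_42 * se
ci_high = mean_diff + T_CRIT_42 * se

# Standardized mean difference for paired data: mean / SD of the differences
d_z = mean_diff / sd_diff
# Hedges' small-sample correction factor (common approximation)
j = 1 - 3 / (4 * df - 1)
g_z = d_z * j

print(f"SE = {se:.3f}, t({df}) = {t_stat:.3f}")
print(f"95% CI = [{ci_low:.3f}, {ci_high:.3f}]")
print(f"d_z = {d_z:.3f}, g_z = {g_z:.3f}")
```

Under these assumptions, the standard error, t-value and confidence interval agree with the reported figures up to rounding. The standardized difference comes out at roughly −1.49 (about −1.46 after the correction), while 1.264 and 1.287 coincide with the standard deviation of the differences and that value divided by the correction factor. This is only a reading of the reported numbers; in either case the effect is large by conventional benchmarks.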
As visualized in Figure 3, and based on the experts' assessment, using LLMs led to a significantly higher OR capability score among the participants. Accordingly, H1 is supported by the data. In the second step, it was investigated how the participants self-assessed their OR capabilities with and without LLM utilization. In contrast to the expert assessment, there was no significant difference here, which was verified using a t-test. Accordingly, the participants did not recognize any significant improvement, even if this was the case regarding the independent and anonymized expert assessment.
OR capability with and without LLM based on expert evaluation and self-evaluation (box plots of OR capability scores on a scale from 1.00 to 7.00 for four conditions: without LLM, expert evaluation; with LLM, expert evaluation; without LLM, self-evaluation; with LLM, self-evaluation). Figure created by authors based on collected data
A comparison of the mean values of the experts' ratings (mean = 2.97) and the participants' self-evaluation (mean = 5.33) for the task without LLM utilization shows an average difference of 2.36 points on the rating scale of 1–7 (Figure 4).
Comparison of expert evaluation and self-evaluation of OR capabilities (box plots of OR capability scores on a scale from 1.00 to 7.00; Δ mean between expert evaluation and self-evaluation: 2.36 without LLM, 0.59 with LLM). Figure created by authors based on collected data
For the task completed with LLMs, the discrepancy between the average expert evaluation (mean = 4.85) and the participants' self-evaluation (mean = 5.44) is smaller (Δ mean = 0.59). The rating difference therefore changes between the tasks with and without LLM utilization. Even if the experts are fundamentally stricter in their evaluation due to their experience, this would produce a systematic deviation between the experts' average score and the participants' self-evaluation, and that deviation should be of similar size in both tasks. As the deviation changes (with LLM: Δ mean = 0.59; without LLM: Δ mean = 2.36), a possible bias can be assumed, either in the form of participants overestimating their own ideas without LLM support or underestimating the improvement achievable through AI assistance.
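The two rating gaps follow directly from the four reported means. The short sketch below recomputes them and their ratio; the variable names are illustrative, and the ratio is simply derived from the rounded means.

```python
# Reported mean ratings on the 1-7 scale
expert_without, self_without = 2.97, 5.33   # task without LLM
expert_with, self_with = 4.85, 5.44         # task with LLM

# Gap between self-evaluation and expert evaluation in each condition
gap_without = round(self_without - expert_without, 2)
gap_with = round(self_with - expert_with, 2)

# How many times larger the gap is without LLM support
gap_ratio = gap_without / gap_with

print(gap_without, gap_with, round(gap_ratio, 1))
```

Based on the rounded means, the gap between self-evaluation and expert evaluation is about four times larger without LLM support than with it.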
Next, it was investigated whether all participants benefited equally from using LLMs and were therefore able to increase their OR capability equally. The authors analyzed the performance results of all participants and calculated the mean OR capability score based on the expert evaluations. Participants who scored below the median value of 2.6 (in the task without LLM utilization) were categorized as “low performers”, while those who scored above the median were categorized as “high performers”. Based on this categorization, a t-test was conducted to compare the average increase or decrease in performance (between the task without LLM utilization and the task with LLM utilization) depending on the participant's performance group (low or high performer).
The results (Figure 5) show that the group with the initially lower performance without using LLMs (Group 1) achieved an average improvement of 2.58 points (SD = 1.10297), while the group with the initially higher performance in the task without LLM utilization (Group 2) recorded a lower average improvement of 1.42 points (SD = 1.16518).
Differences in improvement through the use of LLM depending on initial OR capabilities (bar chart of the individual improvement in OR capability score through LLM usage; low performer group, Group 1: 17 participants, mean improvement 2.58; high performer group, Group 2: 26 participants, mean improvement 1.42). Figure created by authors based on collected data
The Levene test for equality of variances resulted in a p-value of 0.924, indicating that equal variances between the groups can be assumed. Accordingly, the t-test for equality of means was applied under the assumption of equal variances. The t-test revealed a significant difference in mean improvement between the two groups (t(41) = −3.246, p = 0.001, one-sided), with a mean difference of −1.15543 points. The 95% confidence interval for this difference lies between −1.87435 and −0.43651 points and excludes zero, which is consistent with the statistical significance of the difference. Furthermore, the calculated effect sizes indicate that this difference is also large in magnitude. Cohen's d is 1.14131, which indicates a large effect. Hedges' g and Glass's delta, with values of 1.16273 and 1.10297, respectively, also confirm a strong effect of ChatGPT utilization on performance improvement, especially for participants with initially lower OR capabilities.
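The independent-samples test above can likewise be retraced from summary statistics. The sketch below assumes group sizes of 17 (low performers) and 26 (high performers), as shown in Figure 5, and that the reported mean difference is "high performers minus low performers" (inferred from its sign). The critical t value for 41 degrees of freedom is taken from a standard table. All variable names are illustrative.

```python
import math

n_low, sd_low = 17, 1.10297     # low performers: size (Figure 5), SD of improvement
n_high, sd_high = 26, 1.16518   # high performers: size (Figure 5), SD of improvement
mean_diff = -1.15543            # reported mean difference (assumed: high minus low)

df = n_low + n_high - 2
# Pooled variance under the equal-variance assumption supported by Levene's test
pooled_var = ((n_low - 1) * sd_low**2 + (n_high - 1) * sd_high**2) / df
pooled_sd = math.sqrt(pooled_var)
se = pooled_sd * math.sqrt(1 / n_low + 1 / n_high)
t_stat = mean_diff / se

# Two-sided 95% critical value of Student's t for 41 df, taken from a
# standard t table (hard-coded assumption, not computed here).
T_CRIT_41 = 2.0195
ci_low = mean_diff - T_CRIT_41 * se
ci_high = mean_diff + T_CRIT_41 * se

# Cohen's d for two independent groups: mean difference / pooled SD
cohens_d = mean_diff / pooled_sd

print(f"pooled SD = {pooled_sd:.5f}, t({df}) = {t_stat:.3f}")
print(f"95% CI = [{ci_low:.5f}, {ci_high:.5f}], d = {cohens_d:.3f}")
```

Under these assumptions, the t-value and confidence interval agree with the reported values up to rounding. The pooled standard deviation comes out at about 1.1413, which coincides with the figure reported as Cohen's d, and dividing the mean difference by it gives a standardized difference of roughly −1.01. This is only a reading of the reported numbers; an absolute value of about 1.0 would still count as a large effect by conventional benchmarks.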
In summary, these results suggest that the use of LLMs led to a significantly greater improvement in performance for participants who initially performed worse without LLM utilization compared to those who were already performing well without LLM utilization. These findings suggest that LLMs could be particularly valuable for learners or users needing improvement by providing targeted support where it is most needed. Consequently, H2 is supported.
4.3 Impact of individuals' LLM-skills on their performance with LLMs
In addition, further statistical tests were conducted to investigate a possible moderation effect of individuals' LLM skills on their OR capability in the task solved with LLM utilization. The analysis included the independent variables LLM skills and prior performance without LLMs (ORC_EGCG_noGPT_Exp) as well as the interaction term between these variables.
The results showed a moderate correlation (R = 0.308) between the independent variables (LLM skills and prior performance without LLM utilization) and the dependent variable (performance with LLM utilization). However, the model explained only 9.5% of the variance in performance with LLM utilization (R-squared = 0.095), and the adjusted R-squared of 0.049 indicates that the model does not fit the data well. When the interaction term was included, the model still explained 9.5% of the variance in performance with LLMs (R-squared = 0.095), but the adjusted R-squared dropped to 0.026, further indicating a poor fit.
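The drop in the adjusted value when the interaction term is added follows from the usual penalty for additional predictors, adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1). The sketch below assumes n = 43 (the experimental group) and k = 2 and k = 3 predictors for the two models; the function name is illustrative.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R-squared for a linear model with n observations and k predictors."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)


n = 43            # assumed: participants in the experimental group
r_squared = 0.095  # reported R-squared (identical for both models after rounding)

adj_main = adjusted_r_squared(r_squared, n, k=2)         # LLM skill + prior performance
adj_interaction = adjusted_r_squared(r_squared, n, k=3)  # plus the interaction term

print(round(adj_main, 3), round(adj_interaction, 3))
```

Because R-squared does not increase while a predictor is added, the adjusted value falls, which matches the reported values of 0.049 and 0.026 up to rounding of the inputs.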
To summarize, participants' self-rated LLM skills showed no significant correlation with their expert-rated OR score when using LLMs. In a regression including prior (non-LLM) performance, LLM skills and their interaction, neither the direct effect of LLM skills nor the interaction term was significant, suggesting that even those skilled with LLMs did not outperform others in the ideation task. Accordingly, H3 is not supported by the data.
4.4 Control variables
Finally, the results were controlled for demographic factors and AtS, considering that the sustainability theme of the experiment might influence results. In the model without LLM utilization, none of the demographic or AtS variables had a statistically significant effect on OR capability. However, in the LLM-assisted model, AtS showed a significant negative relationship with OR capability (Beta = −0.580, p < 0.001), indicating that participants with higher AtS scores tended to have lower OR capability scores when using LLMs to solve their task. All other variables, including gender, field of study, age and degree, remained non-significant in both contexts.
5. Concluding discussion
Based on a controlled experiment with short-term tasks, this study confirms a significant positive effect of LLM utilization on individuals' OR capabilities. By varying task order and including a control group, other influences were ruled out, establishing LLMs as promising tools for enhancing the OR capability of individuals. This finding builds on prior research (Giones and Brem, 2017; Davidson and Vaast, 2010) by showing that LLMs, such as ChatGPT, can act not merely as an “input”, but also as an “enabling” factor in entrepreneurship, enhancing OR capabilities. Furthermore, low-performing students showed greater improvement with LLMs, supporting the democratization potential of AI as discussed by Kaplan and Haenlein (2019).
Contrary to expectations, LLM skills did not significantly affect individuals' performance in the experiment. While prior studies highlight the benefits of prompt engineering in specialized tasks (Korzynski et al., 2023), the findings of this study suggest it may be less critical for general ideation. This aligns with Reynolds and McDonell (2021), who found prompt engineering more impactful in complex tasks than in brainstorming. Further research on individual-specific LLM user behavior could clarify success factors and the link between LLM skills and output quality in OR contexts.
Demographic factors and AtS were also analyzed. While demographics had no effect, a negative relationship between AtS and OR capability was found in the LLM-assisted task, possibly because sustainability-oriented participants prioritized ethical over profitable ideas. This difference highlights a potential mismatch between investor-driven perspectives prioritizing high returns (Guo and Wang, 2021) and sustainability-focused entrepreneurs (Dentchev et al., 2016) with high AtS, potentially influencing entrepreneurial decision-making.
Finally, comparing self-assessment with expert evaluations revealed no correlation, suggesting a potential self-assessment bias. Ratings aligned more closely in LLM-assisted tasks (mean difference = 0.59) than in the task solved without LLMs (mean difference = 2.36), possibly due to participants' emotional attachment to self-generated ideas or Not-Invented-Here (NIH) bias against AI-assisted ideas (Antons et al., 2017). This discrepancy calls into question the reliability of self-assessment (Brenner and DeLamater, 2016) when evaluating complex constructs like OR. It further supports a shift towards more objective assessment methods and highlights task-based measures, as used in the study, as a potentially more objective approach in this context.
6. Implications and contributions
6.1 Theoretical implications and contributions
To conclude, this study is among the first empirical investigations to demonstrate the impact of LLMs on a key entrepreneurial skill–OR. Conducted as a controlled experiment in the early ideation phase of venture creation, it provides concrete, data-based evidence that LLMs can enhance OR (H1).
Firstly, the study's findings question the traditional understanding of OR as an essential, innate capability of the entrepreneur. Historically, OR has been seen as a core entrepreneurial function, reliant on individual skills, knowledge and unique cognitive abilities (Shane and Venkataraman, 2000). However, given the proof of concept that LLMs can act as cognitive automation tools that compensate for gaps in these capabilities (H1), OR may no longer be seen as an exclusively personal factor. Instead, OR can be augmented through AI, implying that a well-designed digital tool, in line with the concept of technology affordances (Faraj and Azad, 2012), can democratize access for individuals lacking traditional entrepreneurial skills and thereby broaden the range of individuals who can potentially succeed in identifying viable business ideas. This insight calls into question the long-held view that OR is a unique capability and, at the same time, touches the heart of human capital theory in entrepreneurship by providing evidence that OR skills can be enhanced externally.
Additionally, the study suggests that AI, specifically LLMs, may reduce the traditional reliance on human resources and social networks within the entrepreneurial process. As LLMs provide access to vast knowledge and simulate cognitive networking functions, the importance of human social capital–historically critical for decision-making and information gathering in entrepreneurship–may diminish. This shift has significant implications for entrepreneurial theory, which often emphasizes the value of networks and social constructs in OR and decision-making (Ardichvili et al., 2003; Cooper and Park, 2008). With LLMs capable of delivering extensive knowledge and simulating the brainstorming and feedback typically found in human networks, the role of social connections in entrepreneurship could be redefined, particularly in the early stages of idea development and validation.
Furthermore, the study highlights a shift in the entrepreneur's role from a “creator of opportunities” to an “enabler of opportunities” (Ramoglou and McMullen, 2024). LLMs strengthen this facilitator role by assisting entrepreneurs in recognizing latent opportunities. This aligns with the perspective that digital technologies not only shape opportunities but also influence entrepreneurs' behavior and their working processes (Corvello, 2022), adding a new facet to the ongoing debate on the theoretical nature of OR, bridging elements of both the creation and discovery views.
In summary, the study extends the theoretical discourse on the role of digital technology in entrepreneurship by focusing specifically on LLMs. The findings suggest that LLMs not only enhance traditional entrepreneurial functions but also have the potential to reshape foundational concepts within the field, including OR and the role of entrepreneurs. This opens new avenues for research on how AI might redefine the skills, resources and team dynamics essential for successful entrepreneurship in the digital age, thereby transforming traditional pathways and redefining the landscape of digital entrepreneurship.
6.2 Practical implications and contributions
The results of the study suggest that integrating LLMs into entrepreneurship programs can support students in identifying profitable and sustainable business ideas. Because effectiveness varied across performance groups, the authors recommend integrating LLMs as a tool for generating business ideas into study programs at an early stage: people with limited experience and knowledge could benefit disproportionately from using LLMs to identify business opportunities (H2). In this way, LLMs can help generate ideas and thus facilitate students' entry into entrepreneurial activities.
The findings further offer valuable insights for decision-makers in the educational sector, particularly in the ongoing debate surrounding the permitted use of LLMs among students and within academic institutions. While many educational institutions remain critical of incorporating LLMs, the study advocates for a more nuanced perspective. Rather than focusing solely on whether LLM usage should be allowed or prohibited, the authors suggest shifting the conversation towards educating students on the appropriate and practical application of LLMs. By equipping students with guidance on the meaningful use of these tools, educational institutions can enhance learning outcomes and empower students to use LLMs as valuable aids for idea generation, critical thinking and problem-solving – skills essential for their future careers (e.g. as outlined in the EU's Digital Education Action Plan). One practical way to introduce students to the use of LLMs in OR would be to replicate the experimental setup of this study by providing a clearly defined scenario as a starting point, enabling students to employ LLMs to develop their own business ideas.
While our study presents promising results, it is also important to acknowledge concerns raised in the literature about the potential long-term cognitive effects of relying on LLMs for OR, such as a decline in independent OR capabilities or a general dependence on AI (Yu, 2023; Kasneci et al., 2023; Wang et al., 2023). Accordingly, these developments should also be monitored, with attention paid to a balanced and responsible integration of such advanced technology in education, thereby preparing students to navigate an increasingly AI-driven world.
For entrepreneurs themselves, the study's findings suggest that integrating LLMs into the idea generation process can be beneficial, as it helps uncover new perspectives and potentially identify business models that may have remained out of reach due to limited experience, lack of network access, or other constraints.
Lastly, from a methodological perspective, the research design (Figure 2), a randomized crossover experiment with a blinded expert panel, represents a relatively novel and rigorous approach in entrepreneurship research, which traditionally relies mostly on surveys. This design can serve as a blueprint for researchers, as it enables more objective measurements for complex cognitive constructs like OR.
7. Limitations and implications for further research
This study is subject to several limitations that warrant consideration. First, the findings are linked to the specific version of the LLM employed (ChatGPT 3.5). Given the rapid evolution of LLM technologies, subsequent model versions may yield different results, thereby limiting the generalizability of these conclusions across versions. Second, although an expert panel was utilized to assess the quality of the identified opportunities, the assumed objectivity of this evaluation is inherently limited. The aggregated ratings reflect a synthesis of individual subjective judgments, meaning that the “objective” assessment ultimately derives from the convergence of subjective perspectives. Third, a potential alternative explanation for the observed improvements in OR with LLM use may lie not in the superiority of the generated ideas themselves, but rather in their more refined and eloquent formulation. The enhanced linguistic quality provided by the LLM could have influenced expert evaluations, independent of the substantive quality of the underlying ideas. Moreover, the sample itself constitutes a limitation, as it consists exclusively of graduate students. The findings may therefore not be directly generalizable to experienced entrepreneurs. Lastly, the discussed democratizing potential of LLMs for entrepreneurship is contingent upon having reliable access to the underlying technology. Where such access is lacking, these tools may not only fail to close existing gaps but could even widen disparities between regions or populations with differing levels of technological infrastructure.
These limitations highlight important boundary conditions of the present findings. In combination with the results of this study, several promising avenues for future research can be identified, which are outlined below.
Research on comparing different LLM architectures' efficiency: Which models perform best for entrepreneurial ideation?
With the rapid evolution of LLMs – moving from versions 3.0 to 4.0 and beyond – and the introduction of new architectures such as DeepSeek (Ng et al., 2025), a pressing question arises: to what extent do different LLM architectures vary in their effectiveness for ideation and OR? While various studies have already demonstrated differing capabilities of LLMs in multiple domains (Frieder et al., 2023; Rossettini et al., 2024; Hochmair et al., 2024), little research has specifically investigated their performance in business idea generation and OR. Future research could compare various LLMs to determine their efficiency in producing high-quality business ideas.
Research on the cognitive side effects of LLM utilization on OR: The risk of cognitive dependence
While LLMs can enhance cognitive capabilities in the short term by facilitating idea generation and improving search efficiency, there is a risk that prolonged reliance on these models could lead to a decline in independent OR capabilities. Entrepreneurs who consistently depend on AI-driven insights may gradually lose the ability to identify opportunities without external assistance, potentially diminishing their entrepreneurial alertness and creativity over time. Therefore, a key avenue for future research is to conduct longitudinal studies that examine whether sustained LLM usage fosters cognitive dependence.
Research on the development of teaching concepts/programs for LLM utilization for ideation
Given the overall positive effect of LLM utilization on OR demonstrated in this study, there is a growing need to develop evidence-based teaching concepts and educational programs that equip entrepreneurs with the skills to effectively leverage these tools for ideation. Future research should focus on designing, implementing and evaluating training interventions that help users maximize the creative and strategic potential of LLMs in OR. Such work could investigate which pedagogical approaches – for example, experiential learning, case-based instruction, or simulation exercises – are most effective in fostering LLM literacy and enhancing entrepreneurial outcomes.
Research on the long-term economic impact of LLM-enhanced OR: From ideation to entrepreneurial action?
While existing research, including this study, has shown that LLMs can significantly improve OR, a crucial first step toward entrepreneurial activity, it remains unclear whether this improvement translates solely into higher-quality ideas or also leads to an increase in entrepreneurial action, such as the actual founding of new ventures. Future research should investigate whether LLM-driven enhancements in OR ultimately result in higher rates of firm creation and entrepreneurial engagement. This line of inquiry is essential to understand the broader economic implications of LLMs – namely, whether these technologies have the potential not only to enhance individual-level cognitive processes but also to stimulate entrepreneurial ecosystems and contribute to economic growth over time.
Ethical Statement
The anonymity of the survey ensures that data and participants cannot be matched. In combination with the voluntary nature of participation, no ethical review is therefore required.

