Purpose

This paper examines how the scientific system, particularly the educational sciences, observes and communicates the phenomenon of generative artificial intelligence (AI) in education. Drawing on Luhmann’s social systems theory, the purpose is to reconstruct how scientific communication stabilizes truth by framing AI through distinctions such as augmentation, automation, and hybrid models.

Design/methodology/approach

A systematic literature review was conducted following the PRISMA 2020 framework, using Scopus to cover the period 2020–2025. Out of 1,105 initial publications, 110 studies met the inclusion criteria. Quality was assessed with the Mixed Methods Appraisal Tool (MMAT).

Findings

Educational sciences primarily frame generative AI as augmentation (71.8%), with automation (18.2%) and hybrid models (3.6%) secondary. Higher education dominates as the site of observation (65.5%), while primary and vocational education remain marginal. Ethical concerns focus on privacy, bias, accountability, and teacher autonomy. The scientific system reduces uncertainty by coding AI as supportive rather than disruptive.

Research limitations/implications

The review relied on Scopus, English-only full-text sources, and citation-based selection, which may bias findings.

Practical implications

The study enriches systematic review methodology by integrating systems theory, showing how reviews can serve not only as syntheses but also as tools for second-order observation of scientific communication.

Originality/value

By combining systematic review with systems theory, the study demonstrates how science programs its communication about AI, offering a transferable model for analysing emerging technologies in other social systems.

Science is a social system whose primary function is to communicate truth (Luhmann, 1989, p. 76). It operates more through its own processes and structures of knowledge than through the intentions of individual scientists. By adopting this anti-humanist stance (which focuses on communication systems rather than individuals), we treat science as a social system rather than as a space in which diverse and hardly reducible individuals act. Only communication communicates (Luhmann, 2002a, p. 169; Luhmann, 1995, p. 143), and in this sense science reproduces itself through its operations and through the programs (theories and methods) whose fundamental aim is the production of truth.

In its own distinctive way, this approach is also embraced by Popper, who regards the value and objectivity of science as residing in the constant acceptance and refutation of facts presented as truth. The content of truth is always subject to reconsideration, yet the pursuit of truth remains the core of the scientific process (Popper, 2003). Truth constitutes the basic code of the scientific system (Luhmann, 1989, p. 76; Roth, 2024), and it is realized through programs (methods and theories) that ensure both the form and transparency of the process itself (Brezovec et al., 2025). However, a key question arises here: can we empirically trace this process, or at least record its form at a given moment? Based on the assumptions outlined thus far, it is empirically possible to temporally register the form of truth concerning a given phenomenon. One way to achieve this is through the method of systematic review.

The systematic review plays an important role within the scientific system. Although it is largely subordinated to the methodological imperative (Ježovita, 2021) and often detached from theoretical reflection (Chamberlain, 2000, p. 290), the systematic review constitutes a form of exploration of the process of arriving at truth. It does so by recording the dominant narratives, themes, and methods through which we approach the explanation of phenomena that are salient within the scientific system during a given period. In this way, the systematic review reduces the complexity of research processes: it proceeds from the assumption that the totality of knowledge on a given topic cannot be encompassed, and instead provides insight into the key aspects of a broad range of studies (Mulrow, 1994; Petticrew and Roberts, 2006). Or, in Luhmann’s words: in a narrower sense, one should speak of a reduction in complexity if the framework of relations, forming a complex nexus, is reconstructed by a second nexus having fewer relations. Only complexity can reduce complexity (Luhmann, 1995, p. 26). The complexity of the environment is thus reduced to meaning within the system.

Yet, the paradox of complexity reduction shows that by reducing complexity at one level, we inevitably increase it at another. This has also occurred with the method of systematic review itself, as evidenced by the growing number of studies devoted to explaining different methodological approaches to literature review (Pollock and Berge, 2017; Gough and Richardson, 2024). We thus arrive at a paradox in which a systematic review of systematic reviews (and so on ad infinitum) becomes necessary (Aromataris and Pearson, 2014). However, within the scientific system this paradox reaches its limit once theoretical purposiveness is ascribed to the systematic review. Only through theoretical reflection do the results of a systematic review acquire their full scientific significance. This paper is based precisely on that assumption: by moving away from the standard practices of systematic review, whose aim is typically to present the state of knowledge at a given moment, this study approaches systematic review as a method for demonstrating the autopoietic nature of the scientific system in the context of social systems theory.

Specifically, the focus is on the system of educational sciences and the ways in which it relates to the phenomenon of generative artificial intelligence and its application in education. What are the fundamental approaches, themes, and conclusions that the scientific system articulates regarding generative artificial intelligence in education? It is important to emphasize that by identifying these aspects we do not gain insight into the functioning of the education system itself (since it is a closed system), but rather into the ways in which the educational sciences, within the framework and observation of the scientific system, communicate the changes taking place in the field of education. In other words, this constitutes a second-order observation of second-order observation: we trace (observe) how the system of educational sciences observes the education system, which in turn observes generative AI technology and irritates itself with it.

This article thus pursues a dual aim: it introduces the systematic review as a methodological instrument for conducting second-order observation of scientific second-order observations and, simultaneously, reconstructs what these observations show about how the educational sciences observe and interpret generative AI in education.

The choice of the system of educational sciences and the topic of generative artificial intelligence in education arises from the very dynamics of scientific production. In recent years, many studies have been published on this subject, creating the impression of an intense production of knowledge, though without yet achieving stabilized dimensions of truth. Scientific communication on generative artificial intelligence in education is currently in a phase of high productivity but also epistemological fluidity, in which dominant narratives and methodological approaches are only beginning to take shape. For this very reason, this area represents a suitable field of analysis: the systematic review makes it possible to detect patterns within the abundance of communication, while theoretical reflection reveals how the scientific system, through its own operations, seeks to construct truth about this emerging social phenomenon.

As Luhmann’s theory is highly technical in its language, we found it appropriate to summarize the main theoretical points used in this work. We draw on a few core ideas from Luhmann’s social systems theory in a minimal and pragmatic way. First, Luhmann understands modern society as differentiated into relatively autonomous social systems (such as science, education, law, politics), each operating with its own guiding code and programs. Second, social systems do not consist of individuals but of communications; individuals belong to the environment of these systems, even though they participate in them. Third, each system reduces the complexity of its environment by using programs that specify how its binary code is applied in concrete cases. Finally, science is characterized by the code true/false and by second-order observation: it observes how other social systems observe the world. Our use of Luhmann is thus limited to this framework, which serves as a lens for analysing how the educational sciences communicate about generative AI.

Luhmann’s theory provides us with a powerful analytical framework for understanding the system of science. Through the basic epistemological assumptions of this theory, the system of science appears as a communicative system whose fundamental operational code is truth. However, to develop a more detailed elaboration of this system, it is necessary to highlight the core premises of Luhmann’s theory.

Luhmann’s systems theory rests on the concepts of distinction and self-referentiality. Beginning with the notion of distinction, it is important to emphasize the influence of George Spencer Brown (1972) and his conceptualization of systems not through their parts but through the boundary between the system and its environment (Luhmann, 1981, p. 143). The system, in other words, is not defined by its components but by the boundary that separates what the system is from what it is not. We cannot make an indication without drawing a distinction (Spencer Brown, 1972, p. 1). This claim is of great analytical importance, as it directs attention not to the internal logic of the system but to the ways in which the system maintains itself in a contingent relation with everything that it is not. Thus, the continuous reproduction of its own existence constitutes the very foundation of the system.

Luhmann extended this logic to social systems, systems that reproduce themselves through meaning. They operate in parallel with psychic systems, which also process meaning (Luhmann, 1995, p. 3). Meaning is reproduced through communication, which is the basic operational principle of social systems. As already stated, for Luhmann, only communication communicates (Luhmann, 2002a, p. 169; Luhmann, 1995, p. 143). This means that he detached communication, unlike most traditions in sociology, from dependence on the individual. Communication exists independently of our individual intentions and engagements; it exists as an operation of the social system, not of psychic systems. For example, the communication of the economic system (payment/non-payment) exists regardless of my will. I may choose whether to purchase, but the process of buying continues independently. Likewise, spoken words are not fully “mine” in the strict sense; they cannot be entirely attributed to me but rather to me as a participant in the communication of a given system. Communication consists of three elements: utterance, information, and understanding (Luhmann, 1995, p. 164). These elements exist beyond the individual who speaks or listens (Seidl, 2007, p. 201).

The anti-humanist approach, therefore, becomes a crucial element of Luhmann’s theory (Moeller, 2011). Sociology does not concern itself with psychic systems, their actions, or their processing of reality, but rather with the ways in which communication takes place and how different social systems reproduce themselves. In contemporary society these systems are independent of one another (we cannot assume a hierarchy of systems). In functionally differentiated societies, no single system, according to Luhmann (1989, p. 14), dominates another. Religion, politics, education, and science: each of these systems possesses autonomy. These systems may couple with one another, but they do not dissolve their respective boundaries. Thus, we can say that contemporary society is polycentric (Roth et al., 2024, p. 253) in the sense that there is no single centre directing communication, a fact evident, for example, in the inability to find a unified solution to the ecological crises of the present age.

One of the social systems of functionally differentiated society is the system of science. The system of science rests on the communication of truth. Its programs (theory and methodology) and operations are directed toward the determination of truth. What is distinctive about the scientific system, however, is that these operations are defined as the observation of other systems (which appear as its environment). Since those systems are themselves self-referential, science observes the ways in which other systems observe their environment. Thus, the scientific system operates on the basis of second-order observation (observation of how other systems observe phenomena) (Moeller, 2017, p. 31). This observation is tied to rules and forms that make possible the distinction between truth and non-truth, as well as the differentiation of which truth is truly true and which is not (and vice versa) (Roth, 2024, p. 265). Science is therefore a continuous process: a dynamic system of reproducing what is known to be true (and not merely a system in which knowledge resides). Knowledge in this social system is fluid and subordinated to the pursuit of truth. Whereas other systems operate on the basis of the certainty of knowledge, science operates through doubt (Popper, 1995, p. 14). Hence, the logic of second-order observation is both the logic of falsification within the boundaries of science (Popper, 2002) and the reproduction of truth. This constitutes the autopoietic system of science. The system of science is thus not defined by a single secure truth; such a logic is more characteristic of systems like religion. Science evolves through the uncertainty of content but the certainty of procedure. The legitimacy of the system is established through procedures (Luhmann, 1983, p. 30), through programs aimed at producing true truth at a given moment within the system itself. A single truth, therefore, is never complete. What was expected of science during the COVID-19 pandemic, for example (that it would provide certainty and accuracy in the moment), could not be fulfilled. Science cannot meet the expectation of a secure, unified, and immutable answer to the question of truth. Scientific truth is created through the condensation of observations of knowledge (Luhmann, 1990, p. 205). This means that scientific truth emerges through the accumulated observation of what has come before. In this way, the temporal continuity of scientific knowledge is produced. At every moment there exists the possibility of further development or falsification. In this sense, we can also turn to Kuhn, who emphasized that even radicals and revolutionaries within the scientific system require tradition (scientific method, procedure, theories) through which science is transformed (Kuhn, 1977).

Science is, therefore, processual, engaged in the continual re-production of true knowledge through processes of second-order observation. Although this type of communication is carried out by scientists, no individual scientist can maintain objectivity. Objectivity is procedural, achieved through peer review and through the continual confirmation or refutation of previous research. To grasp the nature of scientific communication’s relation to a given phenomenon, to determine how science carries out second-order observation of that phenomenon and on what basis, it is necessary to record the condensation of knowledge about the phenomenon over a defined period. This is the central idea of the present paper: to detect the process of producing true knowledge about the phenomenon of generative artificial intelligence in education. By identifying the main thematic units and methodologies, it becomes possible to determine how the scientific system, in its autopoietic nature, relates to the environment it describes.

In this sense, the systematic review itself can be regarded as a programmatic operation of the scientific system. As a program, it codifies the procedures through which knowledge is selected, ordered, and evaluated, thereby making the complexity of research processes communicable within the system. Its methodological rigour ensures that observations are not merely accumulated but also subjected to structured rules of inclusion, exclusion, and synthesis. In doing so, the systematic review provides a framework for the condensation and re-evaluation of knowledge within the horizon of truth. What appears as a technical procedure is, from the perspective of social systems theory, a way in which science stabilizes its own autopoiesis (self-reproduction of a system through its own operations): by organizing past communications into a coherent form, the system reproduces itself through renewed possibilities for future observation and falsification. Thus, the systematic review is not only a methodological tool but also a functional expression of the scientific system’s drive to continuously reconstruct truth.

Since the rise of generative AI models, the social sciences (e.g. sociology, pedagogy, psychology) have been trying to understand the impact these technologies will have on the education system. The education system, as a closed system in its own right that communicates in different terms than science, thus becomes a case for observation. As the education system is not itself the topic of this paper, it suffices to mention that its function is to prepare individuals (or rather, to reduce their complexity) so that they become persons able to participate in societal communication. Through education, individuals become aware of the social conditions of being in society. Although education does not lead to uniform results, it starts from standardized bodies of knowledge that are anticipated to be relevant for the future (Baraldi and Corsi, 2017, p. 49).

It is also worth noting that there is a tight structural coupling between the education system and the system of science. In this coupling, science produces truth claims that are observed and operationalized by the education system; science provides the content for the education system. In the same coupling, the education system irritates science through the production of new participants (future scientists) (Luhmann, 1990, p. 625). This coupling remains important for the observation of both education and science. In that sense, observations of education by a specific form of science, the educational sciences, remain important not only for understanding scientific communication but also as an opportunity to address future challenges to the education system’s own reproduction: its content, programs, and operations. However, Luhmann also notes that the coupling between the education system and the scientific system is structurally problematic, particularly regarding the question of learnability. Scientific knowledge may be oriented toward the criterion of truth within the system of science, but this does not automatically make it teachable. What becomes decisive for the education system is therefore not scientific validity as such, but didactics, the methods through which knowledge can be selected, simplified, and adapted for pedagogical purposes. Yet this didactic filtering cannot ensure effective coordination between education and science. As a result, the knowledge transmitted in schools tends to lag behind developments in the scientific system; it often consists of content that science has already moved beyond (Luhmann, 2002b, p. 129ff in Baraldi and Corsi, 2017, p. 46).

Herein lies the importance of this systematic review. It helps science understand its internal communication around the phenomenon of generative AI in education, and it also opens the possibility of understanding how the education system will communicate future educational content.

The systematic review was conducted in line with the PRISMA 2020 framework (Page et al., 2021) to ensure transparency, rigour, and replicability. The process included defining objectives and a protocol, conducting a literature search, screening, quality appraisal, data extraction, and synthesis. The review aimed to systematically analyse how the educational sciences observe and communicate the role of artificial intelligence (AI) in education, with particular attention to its framing as augmentation, automation, or hybrid models. The broader goal was to provide both an empirical synthesis of existing studies and a theoretical reconstruction, showing how the scientific system, through the operations of educational research, programs its communication about AI in education.

While systematic reviews are often presented as objective syntheses of existing knowledge, several scholars have questioned this assumption. Chamberlain (2000) warns against methodolatry or the uncritical belief that methodological rigour alone ensures validity. Similarly, Pollock and Berge (2017) acknowledge that, despite the use of formal frameworks such as PRISMA, systematic reviews still involve interpretive choices that challenge claims of full procedural neutrality. From a systems theory perspective, these critiques are especially relevant: every systematic review is itself a communicative operation that selects, excludes, and orders information according to its own code of truth. In this sense, the systematic review does not merely summarize previous findings but also reproduces a particular form of observation. Engaging with these critical perspectives allows us to see the review not as a neutral mirror of research reality, but as an autopoietic mechanism through which the scientific system organizes its own discourse about evidence.

The time frame (2020–2025) was chosen to capture a critical transition in scientific communication on AI in education. The period marks the shift from earlier applications of AI (such as adaptive learning systems, analytics, and machine learning) toward the rapid emergence and mainstream adoption of generative technologies (particularly large language models such as ChatGPT, released in late 2022). Restricting the review to this period enables the analysis to document both continuity with earlier AI paradigms and the discursive reorientation that generative AI introduced into the educational sciences.

A detailed protocol guided the process, specifying research questions, search strategy, inclusion/exclusion criteria, and procedures for data extraction and synthesis. Eligible studies were full-text, English-language works addressing the educational role of AI, including both empirical and theoretical analyses. The search was conducted exclusively in Scopus, selected for its comprehensive coverage of peer-reviewed journals in education and computer science. The decision to rely on a single database (Scopus) and to prioritize the 200 most-cited publications was guided by the aim of tracing stabilized forms of scientific communication. So, in selecting the 200 most cited publications, we do not aim to identify the best or newest studies in a substantive sense, but to reconstruct those communications that the scientific system has already marked as relevant reference points in the discourse on generative AI in education. Citations are treated here as a self-referential mechanism of the scientific system through which certain observations are stabilized and made visible for further communication. Our findings should therefore be read as an observation of stabilized truth-claims and dominant distinctions, rather than as an exhaustive mapping of all available research. The search (July 22–25, 2025) used keywords such as artificial intelligence, education, ChatGPT, large language model, and generative artificial intelligence. From 1,105 initial publications, the 200 most cited were screened; 134 were thematically relevant, but 24 lacked full-text access, leaving a final sample of 110 studies (see Table 1; Figure 1).
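The selection steps described above can be retraced with simple arithmetic. The following is a minimal sketch in Python, using only the counts reported in the text; the variable names and the `share` helper are illustrative assumptions and not part of the review protocol.

```python
# Counts as reported in the text; the variable names are illustrative only.
identified = 1105      # initial Scopus results
screened = 200         # most-cited publications taken forward to screening
relevant = 134         # judged thematically relevant
no_full_text = 24      # relevant, but full text not accessible

# Derived stages of the selection funnel.
excluded_as_irrelevant = screened - relevant
included = relevant - no_full_text


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


print("Excluded as not thematically relevant:", excluded_as_irrelevant)
print("Final sample:", included)
print("Screened share of identified records (%):", share(screened, identified))
print("Included share of screened records (%):", share(included, screened))
```

Read this way, the final sample corresponds to a little over half of the screened records, which is consistent with the authors' caveat that the findings describe stabilized, highly cited communication rather than the full body of available research.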

The quality of the included studies was assessed using the Mixed Methods Appraisal Tool (MMAT) (Hong et al., 2018), which allowed for a structured evaluation across qualitative, quantitative, and mixed-method designs. Assessment criteria included clarity of research aims, appropriateness of design, transparency of data collection and analysis, coherence between results and conclusions, and acknowledgment of limitations or biases. This ensured a consistent quality threshold and validated the final sample. Of the 110 studies, 75 were eligible for MMAT evaluation, while 35 (31.8%) were theoretical or review papers to which the tool does not apply. Among the 75 appraised studies, quantitative designs dominated: non-randomized designs (36.0%) and descriptive studies (30.7%) formed the largest share, followed by qualitative (25.3%) and mixed methods (6.7%); only one randomized experiment (1.3%) was identified. Most studies scored high on quality, with 53.3% rated 4/5 and 26.7% rated 5/5, while 20.0% scored 3/5, indicating some limitations but overall solid quality.

Data extraction followed a structured coding framework, covering bibliographic details (author, year, source, citations) and thematic categories: AI role (augmentation, automation, hybrid), application type, educational level, methodology, outcomes (positive, negative, neutral), and ethical concerns (e.g. bias, transparency, privacy, human agency). The data were organized into a database to enable both quantitative and qualitative synthesis. The codes were not only technical categories but also theoretical instruments of second-order observation: for example, the code “role of AI” reflects how science distinguishes augmentation from automation; “methodology” how it validates procedures; and “ethical concerns” how it anticipates external pressures from law, politics, or economics. Coding was first conducted by the primary researcher, then reviewed for consistency and depth. This provided the foundation for identifying key trends and patterns discussed in the results.
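For readers who wish to replicate the coding framework, one possible representation of a single coded study is sketched below in Python. The field names follow the categories listed above, but the class name, the allowed-value sets, the `tally` helper, and the placeholder record are assumptions made for illustration; the placeholder is not a study from the sample, and the authors' actual database structure is not described in the text.

```python
from collections import Counter
from dataclasses import dataclass

# Allowed values taken from the categories named in the text;
# "unspecified" is added because the results mention unspecified roles.
AI_ROLES = {"augmentation", "automation", "hybrid", "unspecified"}
OUTCOMES = {"positive", "negative", "neutral"}


@dataclass(frozen=True)
class CodedStudy:
    """One row of a hypothetical extraction database."""
    author: str
    year: int
    source: str
    citations: int
    ai_role: str
    application_type: str
    educational_level: str
    methodology: str
    outcome: str
    ethical_concerns: tuple = ()

    def __post_init__(self):
        # Reject codes that fall outside the agreed coding scheme.
        if self.ai_role not in AI_ROLES:
            raise ValueError(f"unknown AI role: {self.ai_role!r}")
        if self.outcome not in OUTCOMES:
            raise ValueError(f"unknown outcome: {self.outcome!r}")


def tally(records, attribute):
    """Count how often each value of one coded attribute occurs."""
    return Counter(getattr(record, attribute) for record in records)


# Placeholder record for illustration only; NOT taken from the reviewed sample.
placeholder = CodedStudy(
    author="Placeholder Author",
    year=2023,
    source="Placeholder Journal",
    citations=0,
    ai_role="augmentation",
    application_type="writing support",
    educational_level="higher education",
    methodology="qualitative",
    outcome="neutral",
    ethical_concerns=("privacy", "bias"),
)

print(tally([placeholder], "ai_role"))
```

Validating the codes at construction time is one way to keep a single-coder workflow consistent before a second pass reviews the entries.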

This section presents a systematic analysis of the results on the role of artificial intelligence (AI) in education, combining quantitative indicators (educational level, methodology, type of AI intervention, disciplinary focus) with qualitative interpretation, especially of ethical concerns. Of the 110 studies, 60 (55%) explicitly examined generative AI models such as ChatGPT, while 50 (45%) addressed AI in education more broadly. This distinction highlights both continuity with earlier AI paradigms and the emergence of generative systems.

Most studies focused on higher education (65.5%), reflecting universities' greater institutional capacity, digital literacy, and willingness to adopt new technologies. General education accounted for 22.7%, while primary and secondary levels together made up only about 10% (Table 2). This imbalance suggests that early and secondary education remain underexplored, opening avenues for further research. This dominance of higher education indicates that the educational sciences primarily observe contexts in which science and education are structurally coupled (universities), while earlier educational levels remain less observed.

Methodologically, quantitative approaches predominate. Experimental designs are most common (22.7%), followed by quantitative analyses (20.9%) and systematic reviews (15.5%). Qualitative studies make up 11.8%, while mixed methods are less frequent (4.5%). Nearly 13% of the sample consists of theoretical or philosophical analyses, reflecting a growing interest in AI as not only a technical tool but also an epistemological and social construct (Table 3). The predominance of quantitative approaches can be read as the way in which science stabilizes its own communication through formalized procedures and measurements, while reflective and theoretical analyses are less frequent, reflecting the secondary status of self-referential observations.

Most studies (71.8%) frame AI as augmentation, supporting teaching through personalized learning, intelligent tutoring, learning analytics, and teacher assistance in planning and assessment. In this role, AI empowers rather than replaces educators, emphasizing efficiency while preserving autonomy. This prevailing perception can be interpreted as the way in which science primarily codes AI as support for its own operations, while the possibilities of full automation remain secondary and less frequently articulated. Automation accounts for 18.2% of studies, covering tasks such as grading, administration, and resource allocation. While functionally useful, it is often viewed with ethical ambivalence, especially when linked to student evaluation. Only 3.6% of studies describe hybrid models, highlighting the need for frameworks that balance human oversight with algorithmic efficiency (Table 4). The pronounced role of augmentation shows how science stabilizes communication about AI by coding it as support rather than threat, thereby reducing the risk of irritations in relation to other social systems.
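The reported shares can be translated back into approximate study counts. The sketch below, in Python, assumes that all three percentages refer to the full sample of 110 studies; the counts it derives are implied by rounding and are not figures stated in the text.

```python
TOTAL_STUDIES = 110  # final sample size reported in the text

# Percentages as reported for each framing of AI.
reported_shares = {"augmentation": 71.8, "automation": 18.2, "hybrid": 3.6}

# Back-calculate the nearest whole number of studies for each share.
implied_counts = {
    role: round(TOTAL_STUDIES * pct / 100)
    for role, pct in reported_shares.items()
}

# Check that each implied count reproduces the reported percentage.
for role, count in implied_counts.items():
    recomputed = round(100 * count / TOTAL_STUDIES, 1)
    print(role, count, recomputed, recomputed == reported_shares[role])

# Studies not covered by the three named framings.
remainder = TOTAL_STUDIES - sum(implied_counts.values())
print("Not assigned to a named role:", remainder)
```

Under this assumption the three framings do not exhaust the sample, which is consistent with the later remark that unspecified roles occur sporadically.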

Regarding the temporal dimension, the results reveal a sharp rise in AI-in-education publications: from 2 (2020), 3 (2021), and 6 (2022) to 36 (2023) and 58 (2024), before dropping to 5 (2025), a figure that reflects an incomplete year. The peak in 2023–2024 marks a shift from an exploratory phase to broader functional applications. Augmentation dominates throughout, growing from 2–3 studies annually (2020–2022) to 31 (2023) and 38 (2024). Automation, initially marginal, rises notably in 2024 (from 2 in 2023 to 15), reflecting greater reliance on AI for assessment and routine tasks. Hybrid roles appear from 2022 but remain rare (1 per year), while unspecified roles occur sporadically. Overall, the trend suggests a move from assistance toward partial substitution, particularly in assessment and feedback (Figure 2).
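The yearly publication counts given above can be checked against the sample size and summarized in a few lines. This is a minimal sketch in Python using only the reported figures; the variable names are illustrative, and the 2025 value covers only part of the year.

```python
# Publications per year as reported in the text (2025 is a partial year).
publications = {2020: 2, 2021: 3, 2022: 6, 2023: 36, 2024: 58, 2025: 5}

# The yearly counts should add up to the final sample.
total = sum(publications.values())

# Share of the sample published in the 2023-2024 peak.
peak = publications[2023] + publications[2024]
peak_share = round(100 * peak / total, 1)

# Year-over-year growth factor for the complete years only.
growth = {
    year: publications[year] / publications[year - 1]
    for year in range(2021, 2025)
}

print("Total studies:", total)
print("Share published in 2023-2024 (%):", peak_share)
for year, factor in growth.items():
    print(year, "growth factor vs. previous year:", round(factor, 2))
```

The growth factors make the discontinuity visible: the largest jump falls between 2022 and 2023, the first full year after the release of ChatGPT.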

Thematic analysis of journals and conference proceedings revealed four dominant educational scientific areas, plus a small group of unclassified studies. The largest share came from educational technology and pedagogy (31.8%), focusing on digital innovations, intelligent systems, personalized learning, and the pedagogical impact of technology. Equally represented was computer science/AI/HCI (31.8%), covering algorithm design, performance optimization, user experience, and the integration of technical and usability concerns in education. The medical and health sciences (21.8%) contributed studies on AI in medical education, clinical simulations, digital teaching tools, and assessment systems – contexts marked by high risk and strict regulation, emphasizing the need for ethical and pedagogically sound integration. Multidisciplinary/general science (10%) provided broader perspectives, often highlighting ethics, law, sociology, or philosophy of education. Finally, a small group of unclassified studies (4.6%) reflected emerging research spaces that cross traditional boundaries (Table 5). Overall, the field is highly interdisciplinary, dominated by education and computer science but increasingly open to insights from medicine, psychology, the social sciences, and ethics. This breadth highlights both the complexity of AI in education and the need for collaboration across disciplines to develop sustainable communicative patterns.

The reviewed studies reveal several thematic patterns. A major focus is personalization of learning, where intelligent systems adapt content to students' needs, boosting engagement and autonomy. AI is also widely used to support teachers in assessment, identifying learning difficulties, and curriculum planning. Another theme is teacher professional development, with AI providing automated feedback on instructional practices. In higher education, AI is often applied in medical and STEM fields through simulations, modelling, and augmented reality. Alongside these benefits, many studies raise concerns about “black box” decision-making, where opaque AI processes may undermine transparency and trust. At the level of topical focus, academic writing and content generation dominates in 2023–2024 (11 studies in 2023; 25 in 2024; 2 in 2025), reflecting the explosion of generative tools and their rapid uptake for writing, planning, and content production. In parallel, themes such as ethics, academic integrity, and reflection (2 studies in 2023; 1 in 2024) and assessment and automated feedback (2 studies in 2023; 3 in 2024; 2 in 2025) gain ground, signalling a shift from initial enthusiasm toward reflection and standardization of assessment practices. Niche topics like adaptive tutoring/mentoring (7 studies in 2023; 2 in 2024) and applications in medical education (2 studies in 2023; 1 in 2024) illustrate AI’s penetration into specialized domains. Earlier years (2020–2022) showed a larger share of other/unclassified studies, reflecting an unstable vocabulary and search for descriptors, but this declined in 2023–2024 as discourse normalized and terminology became more precise. 
Overall, the trajectory suggests three phases: an exploratory stage (2020–2022) emphasizing augmentative benefits, an expansion phase (2023–2024) where generative AI became embedded in academic practices while ethical controls emerged, and a consolidation stage (2025) marked by fewer studies but stable attention to writing and assessment (Figure 3).
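The three-phase reading can be checked against the approximate counts given in the description of Figure 3. The following Python sketch is illustrative only: the counts are transcribed from that description (which notes that all values are approximated), and the names `THEME_COUNTS`, `PHASES`, and `phase_summary` are hypothetical helpers rather than part of the authors' analysis.

```python
# Approximate per-year counts transcribed from the Figure 3 description.
# Each list follows the order of YEARS; values are approximations, not raw data.
YEARS = [2020, 2021, 2022, 2023, 2024, 2025]
THEME_COUNTS = {
    "Academic writing and content generation": [0, 1, 1, 11, 25, 2],
    "Other or unclassified": [1, 1, 2, 8, 15, 0],
    "Adaptive tutoring or mentoring": [0, 0, 1, 7, 2, 0],
    "Personalization": [0, 0, 0, 2, 5, 1],
    "Assessment and automated feedback": [0, 0, 0, 2, 3, 2],
    "Teacher support": [0, 1, 1, 1, 4, 0],
    "Transformation of educational paradigm": [0, 0, 1, 1, 2, 0],
    "Applications in medical education": [0, 0, 0, 2, 1, 0],
    "Ethics, academic integrity and reflection": [0, 0, 0, 2, 1, 0],
    "Learning analytics": [1, 0, 0, 0, 0, 0],
}

# Phase boundaries as described in the text (inclusive year ranges).
PHASES = {
    "exploratory": (2020, 2022),
    "expansion": (2023, 2024),
    "consolidation": (2025, 2025),
}


def phase_total(counts, start, end):
    """Sum one theme's counts over the years from start to end (inclusive)."""
    return sum(n for year, n in zip(YEARS, counts) if start <= year <= end)


def phase_summary(theme):
    """Return {phase: (all studies in phase, theme's share of the phase in %)}."""
    summary = {}
    for phase, (start, end) in PHASES.items():
        total = sum(phase_total(c, start, end) for c in THEME_COUNTS.values())
        share = round(100 * phase_total(THEME_COUNTS[theme], start, end) / total, 1)
        summary[phase] = (total, share)
    return summary


if __name__ == "__main__":
    for theme in ("Other or unclassified", "Academic writing and content generation"):
        print(theme, phase_summary(theme))
```

Under these approximate counts, the three phases contain 11, 94, and 5 studies respectively. The share of other/unclassified studies falls from 36.4% in 2020–2022 to 24.5% in 2023–2024, while academic writing rises from 18.2% to 38.3%, which is consistent with the trajectory described above.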

A key part of the analysis addresses ethical concerns surrounding AI in education. Four main dimensions are emphasized. First, privacy and data protection: AI systems collect large amounts of student data, often without transparency, clear purpose, or proper consent, raising issues of ownership and governance. This concern reflects irritations originating from the legal system, as questions of data ownership and regulation become matters of legal communication, which science then translates into the form of ethical dilemmas. Second, algorithmic bias: non-representative datasets can reinforce inequalities, with serious implications for fairness and social justice. Science articulates this problem through ethics, but its foundation lies in irritations arising from the political and economic systems, since inequalities and the distribution of resources have systemic consequences. Third, dehumanization through automation: without clear pedagogical goals and human oversight, AI may prioritize efficiency over the social and emotional aspects of learning, challenging teachers' professional roles. This refers to irritations originating from the educational system itself, since the issue of teachers' professional role and self-understanding represents a form of the education system’s self-description. Fourth, responsibility and transparency: when AI influences student trajectories, accountability remains unclear – whether it lies with the algorithm, teacher, institution, or developer. This demonstrates that science imports and translates irritations originating from the legal, political, and economic systems into its own communication, whereby ethical issues become a form of coding the uncertainty and contingency of technology. 
Beyond these dimensions, an increasing number of authors point out that the implementation of AI has the potential to transform the traditional role of the teacher. However, they caution that such transformation must be carried out carefully and with a commitment to preserving the professional autonomy of educators; otherwise, there is a risk of eroding teacher agency and the intrinsic value of the teaching profession. In this way, ethical concerns are not merely isolated categories but traces of how science imports and translates irritations originating from the legal, political, economic, and educational systems into its own communication about artificial intelligence.

Taken together, the results indicate that scientific communication on AI in education is predominantly concentrated in higher education and largely framed in augmentative terms. This reflects a cautious yet optimistic orientation, where AI is described as a tool that supports and enriches teaching and learning rather than as a disruptive force that replaces human actors. Such a framing not only stabilizes the discourse but also makes the adoption of AI more acceptable within academic contexts, as it emphasizes teacher autonomy, learner empowerment, and the improvement of pedagogical processes. Methodologically, the predominance of quantitative approaches reveals a strong reliance on formalized procedures, measurements, and experimental designs to provide legitimacy and rigour. While this secures a degree of stability and comparability across studies, it also results in a relative scarcity of reflective, critical, and theoretical analyses. These perspectives, which could offer deeper second-order observations about the epistemological, social, and cultural implications of AI, remain underrepresented. At the same time, the recurring attention to ethical concerns highlights unresolved tensions at the boundaries between science and other social systems. Issues of privacy and data protection draw directly on legal discourses, while questions of bias and inequality resonate with political and economic struggles over resources and fairness. Concerns about dehumanization and teacher autonomy reflect education’s self-description, where professional roles and the relational aspects of teaching are at stake.

The educational sciences primarily frame artificial intelligence in terms of augmentation, with automation playing a secondary role and hybrid models only marginally represented. The results align in part with Wang et al. (2024), whose large-scale bibliometric and content analysis identified four dominant categories of AI applications in education: adaptive learning and personalized tutoring; intelligent assessment and management; profiling and prediction; and emerging technologies such as educational robotics and AR/VR. Their review highlights the predominance of adaptive and tutoring functions and the relative marginality of automation-oriented designs. However, whereas Wang et al. (2024) interpret this distribution primarily as an empirical trend within educational technology research, our analysis situates it within the logic of systemic communication. The preference for augmentation over automation, visible in both datasets, reflects not only a methodological or disciplinary bias but also the way the scientific system programs its own communication to minimize disruption and maintain continuity with established educational forms.

The predominance of augmentation is therefore not accidental. It shows how the scientific system seeks continuity while avoiding threats to professional roles and institutions. Science observes not only what is happening, but also how that happening can be communicated within its own operations.

In that sense, the review demonstrates the logic of second-order observation. What is captured here is not AI in education as such, but the way the scientific system of education observes the education system’s own observation of generative AI. Education is a site where the multiple roles and expectations of teachers, learners, institutions, and technologies interact and compete. Faced with contingency, education tends to describe AI first as a tool for augmentation: a way of supporting teachers, personalizing learning, and optimizing assessment without displacing the human core of pedagogy. The scientific system of education, in turn, records this orientation, stabilizing it in its own categories, programs, and operations.

The stabilization of AI as supportive also carries broader societal consequences. When scientific discourse consistently presents AI as a tool of assistance rather than disruption, it shapes how educational institutions, policymakers, and the public perceive technological change, that is, how irritation is produced in other systems. Such a framing can depoliticize debates about automation, diverting attention from questions of labour, equity, and democratic oversight. For instance, if AI is viewed mainly as a neutral enhancer of teaching, structural issues such as unequal access to technology, the precarization of academic labour, or the reinforcement of algorithmic bias may remain unaddressed.

To understand this process, however, we must also turn to the self-referential, autopoietic character of the educational sciences themselves. What the results reveal is the way in which the scientific system continuously reproduces its own operations over time. Between 2020 and 2022, the discourse was fragmented, marked by experimental descriptors and exploratory case studies. This early phase illustrates how the system searches for distinctions capable of making the new phenomenon communicable. By 2023 and 2024, however, the number of publications had increased, and distinctions such as augmentation, automation, and academic writing had stabilized as dominant categories. In 2025, the system’s programs appear to move the meaning of the phenomenon into a stage of consolidation, with fewer studies but more consistent communication. Science condenses complexity into durable forms: it first multiplies possibilities of description and then selectively reduces them.

The disciplinary differentiation of the scientific system can also be observed, as each discipline responds to AI in education in its own way. Pedagogy framed AI in terms of curriculum and learning outcomes, while computer science emphasized performance, usability, and algorithmic design. The medical sciences, in turn, introduced questions of risk, responsibility, and simulation in high-stakes contexts (which aligns with the study conducted by Salem et al., 2025). In this sense, the educational sciences are not passively reflecting external change but actively producing their own future horizons.

Research on AI in education has been overwhelmingly concentrated on higher education, while primary and secondary education remain comparatively marginal. This exclusion is not incidental but reflects how the scientific system tends to privilege contexts most closely coupled with its own operations, namely universities. The scientific system of education thus does not simply mirror technological change but transforms it into distinctions that secure its own continuity. What appears as a debate about augmentation, automation, or ethics is, in fact, an autopoietic process through which science observes education observing new technology. Generative AI is therefore not an external revolution. It is a medium through which the educational sciences reconstruct their own future, a future shaped neither by technology alone nor by education itself, but by their ongoing interaction. This reading is supported by recent systems-theoretical elaborations (Watson and Romic, 2024), which extend Luhmann’s framework by introducing a bimodal understanding of technology as both an instrumental and a semantic medium that mediates between thought, communication, and society. This distinction is highly relevant to the present analysis, as it can reveal how AI technologies such as ChatGPT function not only as operational tools but also as communicative structures that stabilize meaning.

This article had a dual aim. Beyond presenting an empirical synthesis of the dominant themes, distinctions, and methodological patterns in research on AI in education, the study demonstrated how a systematic review can itself serve as a second-order observation of scientific second-order observation. In other words, it shows the theoretical potential of systematic reviews when coupled with concepts from social systems theory. Instead of being limited to a descriptive synthesis of existing knowledge, the systematic review becomes a method through which the autopoietic nature of science can be observed. This opens new possibilities for sociological investigations of science and of other social phenomena. Systematic reviews, combined with theoretical reflection, can be used as instruments to detect how new technologies are translated into stabilized distinctions within science (and beyond, e.g. in education). Through systematic reviews, in other words, we can trace how science produces, through its own operations, true distinctions (see Roth et al., 2025). They show how disciplines selectively observe other systems, and how science extends its temporal horizon through the reproduction of truth. Generative AI, in this sense, serves less as an object of study than as a medium for observing the self-referential operations of science. In doing so, the present review contributes both to the substantive debate on AI in education and to the methodological and theoretical project of showing how systematic reviews can become tools for second-order observation of science itself.

The empirical investigation within this dual-aim approach has revealed the self-referentiality of science in the study of generative AI in education. The analysis shows that scientific communication on generative AI in education is characterized less by disruptive redefinitions of pedagogy than by stabilizing distinctions that frame AI primarily as a tool of augmentation. Higher education dominates as the central site of observation, not because it is the sole locus of transformation, but because it is most closely coupled with the system of science itself. By contrast, primary and secondary education remain marginal (at least in the most-cited research), signalling a structural asymmetry in how science selects its objects of communication.

The predominance of augmentation-oriented studies (71.8%) reveals a tendency to emphasize supportive rather than substitutive roles of AI, often reducing ethical concerns to procedural issues. Linking these quantitative patterns with qualitative insights, the review shows that ethical dilemmas, such as data privacy, bias, or teacher autonomy, emerge most clearly when automation becomes more pronounced.
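The reported shares can be recomputed from the per-year counts given in the description of Figure 2. The sketch below is a minimal illustration under that assumption: the counts are transcribed from the figure description (which states that values are approximated), and `ROLE_COUNTS` and `role_shares` are hypothetical names introduced here.

```python
# Approximate per-year counts (2020 to 2025) transcribed from the Figure 2
# description; these are approximations, not the authors' raw coding data.
ROLE_COUNTS = {
    "Augmentation": [2, 3, 3, 31, 38, 2],
    "Automatization": [0, 0, 2, 2, 15, 1],
    "Hybrid": [0, 0, 1, 1, 1, 1],
    "Unspecified": [0, 0, 0, 2, 4, 1],
}


def role_shares(counts):
    """Return ({role: share in % rounded to one decimal}, total number of studies)."""
    totals = {role: sum(per_year) for role, per_year in counts.items()}
    n = sum(totals.values())
    shares = {role: round(100 * total / n, 1) for role, total in totals.items()}
    return shares, n


if __name__ == "__main__":
    shares, n = role_shares(ROLE_COUNTS)
    print(n, shares)
```

With these counts the total is 110 studies, and the shares come out at 71.8% for augmentation, 18.2% for automation, and 3.6% for hybrid models, matching the reported figures; the remaining 6.4% are studies with an unspecified role.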

Beyond the academic domain, and through the synthesis of the two aims, the findings can serve as a new source of irritation for other social systems, particularly education itself. Rather than offering prescriptive guidance for policy or practice, the results invite the educational or political system to observe how science communicates about generative AI and to translate this observation into its own communicative code. In this sense, the findings do not demand uniform adaptation of scientific insights but stimulate reflexive processes within education, encouraging it to reinterpret the scientific discourse in ways compatible with its own operations, programs, and functions. Such systemic translation may lead to differentiated responses across educational subsystems, such as curriculum design, teacher training, or institutional governance, each of which can selectively internalize the irritation according to its own logic. The same logic applies to the political system. The findings (both the results and the theoretical conceptualization) provide politics with structured observations of how science frames the risks, potentials, and limits of generative AI, which can then be re-coded in terms of power, decision-making, and government/opposition dynamics.

Moreover, the finding that science predominantly codes AI as supportive rather than disruptive has implications for how AI literacy is framed within educational policy and teacher training. If the dominant narrative portrays AI as a benign assistant, literacy initiatives may focus narrowly on technical skills while overlooking critical awareness of automation, surveillance, and bias. What a systemic approach offers, in this sense, is a crucial reflective foundation upon which further critical endeavours and analyses can be grounded. We therefore do not regard critical and systemic thinking as mutually opposed, but as mutually reinforcing. Following Burawoy (2021), systemic analysis provides the professional grounding of sociology, enabling critical approaches that are both empirically informed and theoretically rigorous.

This review relied exclusively on the Scopus database, which represents a significant limitation, as relevant studies indexed elsewhere (e.g. Web of Science, ERIC, PsycINFO) may have been missed. The use of citation-based selection (the 200 most-cited papers) also introduces bias toward dominant and established narratives, potentially overlooking emerging or critical voices that challenge mainstream perspectives. Similarly, the English-only restriction limits cross-cultural variation in how generative AI is discussed and may obscure regionally specific discourses. The exclusion of editorials and opinion pieces, while theoretically consistent with the focus on stabilized scientific communication, narrows the analysis to already institutionalized forms of knowledge. From a systems-theoretical standpoint, these exclusions are not merely technical choices but analytical distinctions: they make visible how the scientific system maintains its boundaries by privileging codified, peer-reviewed communication over more fluid, boundary-spanning forms. Nonetheless, future research could complement this study by systematically including non-English sources, emerging databases, and boundary communications to trace how alternative observations of AI in education evolve at the margins of scientific discourse.
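How strongly each selection step narrows the corpus can be made explicit with a short calculation. The following Python sketch is illustrative: the stage counts are those reported for the selection process (1,105 identified, 200 most-cited, 134 thematically relevant, 110 included), while the stage labels, `FUNNEL`, and `retention` are hypothetical names introduced here.

```python
# Stage counts as reported for the study selection process (Figure 1).
FUNNEL = [
    ("Identified through Scopus (2020 to 2025)", 1105),
    ("Top 200 most-cited selected", 200),
    ("Thematically relevant", 134),
    ("Included after full-text access check", 110),
]


def retention(funnel):
    """For each stage after the first, return
    (label, n, % retained from previous stage, % retained from first stage)."""
    first = funnel[0][1]
    rows = []
    for (_, previous), (label, n) in zip(funnel, funnel[1:]):
        rows.append(
            (label, n, round(100 * n / previous, 1), round(100 * n / first, 1))
        )
    return rows


if __name__ == "__main__":
    for row in retention(FUNNEL):
        print(row)
```

On these figures, the citation-based cut retains about 18.1% of the identified records, and the final corpus represents roughly 10% of them, which illustrates why the citation filter is the step most likely to favour established narratives.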

In the preparation of this manuscript, the authors used ChatGPT-5, a generative AI tool developed by OpenAI, for translation and language correction when translating text from Croatian to English. The tool was used solely to improve linguistic clarity and ensure accurate translation (with the authors’ corrections afterwards), without generating or altering content. All intellectual and conceptual work remains the responsibility of the authors.

Aromataris, E. and Pearson, A. (2014), “The systematic review: an overview”, American Journal of Nursing, Vol. 114 No. 3, pp. 53-58.
Baraldi, C. and Corsi, G. (2017), Niklas Luhmann. Education as a Social System, Springer, Cham, Switzerland.
Brezovec, E., Ježovita, J. and Watson, S. (2025), “Theory and methodology in sociology–guiding distinction”, Systems Research and Behavioral Science, Vol. 42 No. 2, pp. 1-11.
Burawoy, M. (2021), Public Sociology: Between Utopia and Anti-utopia, Polity Press, Cambridge and Medford, MA.
Chamberlain, K. (2000), “Methodolatry and qualitative health research”, Journal of Health Psychology, Vol. 5 No. 3, pp. 285-296.
Gough, D. and Richardson, M. (2024), “Systematic reviews”, Menzies, A., Booth, D. and Payne, S. (Eds), Advanced Research Methods for Applied Psychology, Routledge, London, pp. 16-32.
Hong, Q.N., Pluye, P., Fàbregues, S., Bartlett, G., Boardman, F., Cargo, M., Dagenais, P., Gagnon, M.P., Griffiths, F., Nicolau, B., O’Cathain, A., Rousseau, M.-C. and Vedel, I. (2018), Mixed Methods Appraisal Tool (MMAT), McGill University, Department of Family Medicine, Montreal.
Ježovita, J. (2021), “Possibilities, potential and problems of research standardization: a case of bringing EVS and ESS together”, Revija za sociologiju, Vol. 51 No. 3, pp. 495-506.
Kuhn, T.S. (1977), Essential Tension, The University of Chicago Press, Chicago.
Luhmann, N. (1981), Teorija Sistema: Svrhovitost i Racionalnost, Globus, Zagreb.
Luhmann, N. (1983), Legitimation Durch Verfahren, Suhrkamp, Frankfurt am Main.
Luhmann, N. (1989), Ecological Communication, Polity Press, Cambridge.
Luhmann, N. (1990), Die Wissenschaft der Gesellschaft, Suhrkamp, Frankfurt am Main.
Luhmann, N. (1995), Social Systems, Stanford University Press, Redwood City, CA.
Luhmann, N. (2002a), Theories of Distinction: Redescribing the Descriptions of Modernity, Stanford University Press, Stanford, CA.
Luhmann, N. (2002b), Das Erziehungssystem der Gesellschaft, Suhrkamp, Frankfurt.
Moeller, H.G. (2011), The Radical Luhmann, Columbia University Press, New York.
Moeller, H.G. (2017), “On second-order observation and genuine pretending: coming to terms with society”, Thesis Eleven, Vol. 143 No. 1, pp. 28-43.
Mulrow, C.D. (1994), “Systematic reviews: rationale for systematic reviews”, British Medical Journal, Vol. 309 No. 6954, pp. 597-599.
Page, M.J., McKenzie, J.E., Bossuyt, P.M., Boutron, I., Hoffmann, T.C., Mulrow, C.D., Shamseer, L., Tetzlaff, J.M., Akl, E.A., Brennan, S.E., Chou, R., Glanville, J., Grimshaw, J.M., Hróbjartsson, A., Lalu, M.M., Li, T., Loder, E.W., Mayo-Wilson, E., McDonald, S., McGuinness, L.A., Stewart, L.A., Thomas, J., Tricco, A.C., Welch, V.A., Whiting, P. and Moher, D. (2021), “The PRISMA 2020 statement: an updated guideline for reporting systematic reviews”, British Medical Journal, Vol. 372, p. n71.
Petticrew, M. and Roberts, H. (2006), Systematic Reviews in the Social Sciences: a Practical Guide, Blackwell Publishing, Oxford.
Pollock, A. and Berge, E. (2017), “How to do a systematic review”, International Journal of Stroke, Vol. 13 No. 2, pp. 138-156.
Popper, K. (1995), In Search of a Better World: Lectures and Essays from Thirty Years, 1st ed., Routledge, London and New York, NY.
Popper, K. (2002), The Logic of Scientific Discovery, Routledge Classics, London.
Popper, K. (2003), The Open Society and its Enemies, Routledge Classics, London.
Roth, S. (2024), “Truth tables, true distinctions. Paradoxes of the source code of science”, Systemic Practice and Action Research, Vol. 37 No. 3, pp. 261-267.
Roth, S., Žažar, K., Stingl de Vasconcelos Guedes, T. and others (2024), “Scientific communication observed with social systems theory: an introduction and outlook to pure science for society”, Systemic Practice and Action Research, Vol. 37 No. 2, pp. 251-260.
Roth, S., Watson, S., Möller, S., Clausen, L., Žažar, K., Dahms, H., Sales, A. and Lien, V. (2025), “Guiding distinctions of social theory: results from two online brainstormings and one quantitative analysis of the ISA books of the XX century corpus”, Current Sociology, Vol. 73 No. 4, pp. 629-650.
Salem, M.A., Zakaria, O.M., Aldoughan, E.A., Khalil, Z.A. and Zakaria, H.M. (2025), “Bridging the AI gap in medical education: a study of competency, readiness, and ethical perspectives in developing nations”, Computers, Vol. 14 No. 6, p. 238.
Seidl, G. (2007), “General strategy concepts and the ecology of strategy discourses: a systemic-discursive perspective”, Organization Studies, Vol. 28 No. 2, pp. 197-218.
Spencer Brown, G. (1972), Laws of Form, The Julian Press, New York.
Wang, S., Wang, F., Zhu, Z., Wang, J., Tran, T. and Du, Z. (2024), “Artificial intelligence in education: a systematic literature review”, Expert Systems with Applications, Vol. 2, 124167.
Watson, S. and Romic, J. (2024), “ChatGPT and the entangled evolution of society, education, and technology: a systems theory perspective”, European Educational Research Journal, Vol. 24 No. 2, pp. 205-224.
Published by Emerald Publishing Limited. This article is published under the Creative Commons Attribution (CC BY 4.0) licence. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and non-commercial purposes), subject to full attribution to the original publication and authors. The full terms of this licence may be seen at Link to the terms of the CC BY 4.0 licence.

Data & Figures

Figure 1
A flowchart shows the stepwise selection of studies across four stages, connected by downward arrows. Identification: studies identified through Scopus (2020 to 2025), N = 1,105. Filtration: top 200 most-cited studies selected, N = 200. Eligibility: thematically relevant studies (AI role in education), N = 134; excluded because the full text was not accessible, N = 24. Included: final studies included in the analysis, N = 110.

The study selection process. Source: Authors’ own work

Figure 2
A stacked bar chart titled “AI Roles by Year” shows the number of papers (vertical axis, 0 to 60) for each year from 2020 to 2025 (horizontal axis), stacked by four categories: Augmentation, Automatization, Hybrid, and Unspecified. Approximate values:
2020: Augmentation 2, Automatization 0, Hybrid 0, Unspecified 0
2021: Augmentation 3, Automatization 0, Hybrid 0, Unspecified 0
2022: Augmentation 3, Automatization 2, Hybrid 1, Unspecified 0
2023: Augmentation 31, Automatization 2, Hybrid 1, Unspecified 2
2024: Augmentation 38, Automatization 15, Hybrid 1, Unspecified 4
2025: Augmentation 2, Automatization 1, Hybrid 1, Unspecified 1
Note: All numerical data values are approximated.

AI roles by year. Source: Authors’ own work

Figure 3
A stacked bar chart titled “Primary Theme per Paper by Year” shows the number of papers (vertical axis, 0 to 50) for each year from 2020 to 2025 (horizontal axis), stacked by ten thematic categories. Approximate values:
2020: Academic Writing and Content Generation 0; Other or Unclassified 1; Adaptive Tutoring or Mentoring 0; Personalization 0; Assessment and Automated Feedback 0; Teacher Support 0; Transformation of Educational Paradigm 0; Applications in Medical Education 0; Ethics, Academic Integrity and Reflection 0; Learning Analytics 1
2021: Academic Writing and Content Generation 1; Other or Unclassified 1; Adaptive Tutoring or Mentoring 0; Personalization 0; Assessment and Automated Feedback 0; Teacher Support 1; Transformation of Educational Paradigm 0; Applications in Medical Education 0; Ethics, Academic Integrity and Reflection 0; Learning Analytics 0
2022: Academic Writing and Content Generation 1; Other or Unclassified 2; Adaptive Tutoring or Mentoring 1; Personalization 0; Assessment and Automated Feedback 0; Teacher Support 1; Transformation of Educational Paradigm 1; Applications in Medical Education 0; Ethics, Academic Integrity and Reflection 0; Learning Analytics 0
2023: Academic Writing and Content Generation 11; Other or Unclassified 8; Adaptive Tutoring or Mentoring 7; Personalization 2; Assessment and Automated Feedback 2; Teacher Support 1; Transformation of Educational Paradigm 1; Applications in Medical Education 2; Ethics, Academic Integrity and Reflection 2; Learning Analytics 0
2024: Academic Writing and Content Generation 25; Other or Unclassified 15; Adaptive Tutoring or Mentoring 2; Personalization 5; Assessment and Automated Feedback 3; Teacher Support 4; Transformation of Educational Paradigm 2; Applications in Medical Education 1; Ethics, Academic Integrity and Reflection 1; Learning Analytics 0
2025: Academic Writing and Content Generation 2; Other or Unclassified 0; Adaptive Tutoring or Mentoring 0; Personalization 1; Assessment and Automated Feedback 2; Teacher Support 0; Transformation of Educational Paradigm 0; Applications in Medical Education 0; Ethics, Academic Integrity and Reflection 0; Learning Analytics 0
Note: All numerical data values are approximated.

Primary theme per paper by year. Source: Authors’ own work

Table 1

Inclusion and exclusion criteria for article selection

Criterion | Inclusion | Exclusion
Article Topic | Studies addressing the role of AI in education (augmentation, automation, or hybrid applications) | Studies not focused on the role of AI in education
Article Type | Empirical studies and theoretical analyses | Editorials, commentaries, opinion papers (a)
Article Availability | Available as full text | Not available in full text
Article Language | English | Non-English

Note(s): (a) Editorials and opinion pieces were excluded because, within a systems-theoretical framework, they represent forms of public discourse rather than stabilized scientific communication. The review was limited to empirical and theoretical studies that constitute formalized knowledge production within the scientific system.

Source(s): Authors’ own work
Table 2

Level of education

Level of education | Frequency | Percent
General education | 25 | 22.7
Primary and secondary education | 3 | 2.7
Secondary and higher education | 5 | 4.5
Secondary education | 4 | 3.6
Vocational education | 1 | 0.9
Higher education | 72 | 65.5
Total | 110 | 100.0
Source(s): Authors’ own work
Table 3

Methodological approach

Methodological approach | Frequency | Percent
Action research | 1 | 0.9
Bibliometric review | 3 | 2.7
Experiment | 25 | 22.7
Philosophical reflection | 2 | 1.8
Qualitative analysis | 13 | 11.8
Quantitative analysis | 23 | 20.9
Quasi-experiment | 3 | 2.7
Mixed methods | 5 | 4.5
Description of tools and development methods | 1 | 0.9
Systematic review | 17 | 15.5
Case study | 5 | 4.5
Theoretical analysis | 12 | 10.9
Total | 110 | 100.0
Source(s): Authors’ own work
Table 4

AI's role in education

AI's role in education | Frequency | Percent
Augmentation | 79 | 71.8
Automation | 20 | 18.2
Hybrid | 4 | 3.6
Unspecified / Undefined | 7 | 6.4
Total | 110 | 100.0
Source(s): Authors’ own work
Table 5

Distribution of publications by scientific focus of journals

Scientific focus of journal | Frequency | Percent
Educational Technology/Pedagogy | 35 | 31.8
Computer Science/AI/Human-Computer Interaction | 35 | 31.8
Medical/Health Sciences | 24 | 21.8
Multidisciplinary/General Science | 11 | 10.0
Unknown | 5 | 4.6
Total | 110 | 100.0
Source(s): Authors’ own work

References

Aromataris, E. and Pearson, A. (2014), “The systematic review: an overview”, American Journal of Nursing, Vol. 114 No. 3, pp. 53-58.
Baraldi, C. and Corsi, G. (2017), Niklas Luhmann. Education as a Social System, Springer, Cham, Switzerland.
Brezovec, E., Ježovita, J. and Watson, S. (2025), “Theory and methodology in sociology–guiding distinction”, Systems Research and Behavioral Science, Vol. 42 No. 2, pp. 1-11.
Burawoy, M. (2021), Public Sociology: Between Utopia and Anti-utopia, Polity Press, Cambridge and Medford, MA.
Chamberlain, K. (2000), “Methodolatry and qualitative health research”, Journal of Health Psychology, Vol. 5 No. 3, pp. 285-296.
Gough, D. and Richardson, M. (2024), “Systematic reviews”, Menzies, A., Booth, D. and Payne, S. (Eds), Advanced Research Methods for Applied Psychology, Routledge, London, pp. 16-32.
Hong, Q.N., Pluye, P., Fàbregues, S., Bartlett, G., Boardman, F., Cargo, M., Dagenais, P., Gagnon, M.P., Griffiths, F., Nicolau, B., O’Cathain, A., Rousseau, M.-C. and Vedel, I. (2018), Mixed Methods Appraisal Tool (MMAT), McGill University, Department of Family Medicine, Montreal.
Ježovita, J. (2021), “Possibilities, potential and problems of research standardization: a case of bringing EVS and ESS together”, Revija za sociologiju, Vol. 51 No. 3, pp. 495-506.
Kuhn, T.S. (1977), Essential Tension, The University of Chicago Press, Chicago.
Luhmann, N. (1981), Teorija Sistema: Svrhovitost i Racionalnost, Globus, Zagreb.
Luhmann, N. (1983), Legitimation Durch Verfahren, Suhrkamp, Frankfurt am Main.
Luhmann, N. (1989), Ecological Communication, Polity Press, Cambridge.
Luhmann, N. (1990), Die Wissenschaft der Gesellschaft, Suhrkamp, Frankfurt am Main.
Luhmann, N. (1995), Social Systems, Stanford University Press, Redwood City, CA.
Luhmann, N. (2002a), Theories of Distinction: Redescribing the Descriptions of Modernity, Stanford University Press, Stanford, CA.
Luhmann, N. (2002b), Das Erziehungssystem der Gesellschaft, Suhrkamp, Frankfurt.
Moeller, H.G. (2011), The Radical Luhmann, Columbia University Press, New York.
Moeller, H.G. (2017), “On second-order observation and genuine pretending: coming to terms with society”, Thesis Eleven, Vol. 143 No. 1, pp. 28-43.
Mulrow, C.D. (1994), “Systematic reviews: rationale for systematic reviews”, British Medical Journal, Vol. 309 No. 6954, pp. 597-599.
Page, M.J., McKenzie, J.E., Bossuyt, P.M., Boutron, I., Hoffmann, T.C., Mulrow, C.D., Shamseer, L., Tetzlaff, J.M., Akl, E.A., Brennan, S.E., Chou, R., Glanville, J., Grimshaw, J.M., Hróbjartsson, A., Lalu, M.M., Li, T., Loder, E.W., Mayo-Wilson, E., McDonald, S., McGuinness, L.A., Stewart, L.A., Thomas, J., Tricco, A.C., Welch, V.A., Whiting, P. and Moher, D. (2021), “The PRISMA 2020 statement: an updated guideline for reporting systematic reviews”, British Medical Journal, Vol. 372, p. n71.
Petticrew, M. and Roberts, H. (2006), Systematic Reviews in the Social Sciences: a Practical Guide, Blackwell Publishing, Oxford.
Pollock, A. and Berge, E. (2017), “How to do a systematic review”, International Journal of Stroke, Vol. 13 No. 2, pp. 138-156.
Popper, K. (1995), In Search of a Better World: Lectures and Essays from Thirty Years, 1st ed., Routledge, London and New York, NY.
Popper, K. (2002), The Logic of Scientific Discovery, Routledge Classics, London.
Popper, K. (2003), The Open Society and its Enemies, Routledge Classics, London.
Roth, S. (2024), “Truth tables, true distinctions. Paradoxes of the source code of science”, Systemic Practice and Action Research, Vol. 37 No. 3, pp. 261-267.
Roth, S., Žažar, K., Stingl de Vasconcelos Guedes, T. and others (2024), “Scientific communication observed with social systems theory: an introduction and outlook to pure science for society”, Systemic Practice and Action Research, Vol. 37 No. 2, pp. 251-260.
Roth, S., Watson, S., Möller, S., Clausen, L., Žažar, K., Dahms, H., Sales, A. and Lien, V. (2025), “Guiding distinctions of social theory: results from two online brainstormings and one quantitative analysis of the ISA books of the XX century corpus”, Current Sociology, Vol. 73 No. 4, pp. 629-650.
Salem, M.A., Zakaria, O.M., Aldoughan, E.A., Khalil, Z.A. and Zakaria, H.M. (2025), “Bridging the AI gap in medical education: a study of competency, readiness, and ethical perspectives in developing nations”, Computers, Vol. 14 No. 6, p. 238.
Seidl, G. (2007), “General strategy concepts and the ecology of strategy discourses: a systemic-discursive perspective”, Organization Studies, Vol. 28 No. 2, pp. 197-218.
Spencer Brown, G. (1972), Laws of Form, The Julian Press, New York.
Wang, S., Wang, F., Zhu, Z., Wang, J., Tran, T. and Du, Z. (2024), “Artificial intelligence in education: a systematic literature review”, Expert Systems with Applications, Vol. 2, 124167.
Watson, S. and Romic, J. (2024), “ChatGPT and the entangled evolution of society, education, and technology: a systems theory perspective”, European Educational Research Journal, Vol. 24 No. 2, pp. 205-224.
