We conduct this systematic literature review to propose a conceptual structure for the Artificial Intelligence in Education (AIED) field. By analyzing how research themes organize, interconnect, and evolve, we overcome the keyword limitations of past studies to clarify the field's current state and managerial implications.
Following the PRISMA protocol, we analyzed 127 articles from Web of Science and Scopus. We extracted bigrams directly from abstracts and used Bibliometrix for a thematic mapping analysis, classifying them into four quadrants by their centrality and density.
We identified ten thematic clusters constituting the AIED conceptual structure. The field has rapidly expanded since 2020, significantly influenced by the public release of ChatGPT. Our results highlight the tensions between technological efficiency and pedagogical concerns.
Our review is limited to articles indexed in Web of Science and Scopus.
We propose managerial implications for defining policies on adoption, ethical use and maintenance of AI systems in schools and educational environments. This includes establishing institutional AIED management policies with a multidisciplinary committee for supervising implementation and conducting “AI literacy” programs for managers and faculty.
Our study offers a unique and more comprehensive understanding of the AIED conceptual structure by moving beyond traditional keyword analysis. We apply a novel thematic mapping approach that reveals implicit conceptual relationships within the literature. This allows for a deeper analysis of the field's evolution and interconnected themes, leading to suggested managerial implications regarding AIED adoption and use.
1. Introduction
Artificial intelligence (AI) has impacted the daily life of society, as it has substantially altered human interactions with technology in both personal and professional activities (Lee and Kwon, 2024). AI has advanced in various fields, such as industries, business, education, design (art) and science in general (Ng et al., 2021). More specifically, artificial intelligence in education (AIED) is no longer a theoretical promise or a pilot project from science laboratories; it is now an expanding field with solutions capable of holding realistic conversations with humans and encompassing practical applications that directly influence how students learn, how teachers instruct and how institutions manage educational processes (Bozkurt and Bae, 2024; Cheng and Wang, 2023; Chung and Jeong, 2024; Cumming, 1998; Wang and Cheng, 2021). From personalized learning platforms and intelligent tutoring systems to the automation of assessments and administrative streamlining, the AIED ecosystem is growing in scope and complexity (Bozkurt, 2023; Fu et al., 2024; Sun et al., 2021).
However, this transformation presents conceptual, ethical and cultural challenges; for instance, AI tools have exhibited several issues regarding ingrained bias and discrimination (Baker and Hawn, 2022; Gauthier et al., 2022). Although AI's capacity to process vast amounts of data and deliver nearly instantaneous and adaptive responses is widely acknowledged, concerns persist about how these tools operate at different educational levels (Adams et al., 2023; Chung and Jeong, 2024). Concern also exists regarding how AI will be integrated into public policies directed at education – the recent executive order in the United States of America (USA) serves as a prime example, as it promotes AI integration in education, including task forces and training recommendations for educators (The White House, 2025).
Literature reviews in AIED are undergoing substantial development, as this subject is revolutionizing educational systems and their management methods. Therefore, this systematic literature review aims to identify articles on the AIED field and propose a conceptual structure from them. This objective will be achieved through the thematic mapping of bigrams extracted from the article abstracts. The review is thus expected to describe how AIED research themes are organized, interconnected and developed within the field, thereby providing a structured view of its current state. In this article, the conceptual structure is defined as the cognitive organization of the relationships among the field's topics – specifically “what science talks about” regarding main themes and trends – which is modeled by transforming a keyword co-occurrence network into a two-dimensional thematic map of the domain's typological themes (Aria et al., 2022; Aria and Cuccurullo, 2017; Cobo et al., 2011).
However, distinctions exist between types of literature reviews, which vary according to the research scope, review protocol and data analysis. To summarize these methodological and thematic differences, a brief survey of recent AIED literature reviews reveals key studies and their relation to the work presented in this article.
Gauthier et al. (2022) showed that very few AIED tools that assist teachers in the classroom (only two) genuinely seek to reduce data biases; that is, numerous discussions note that biases should be considered and addressed, but few applications actually focus on this aspect. The review by Crompton and Burke (2023) focused solely on higher education, and the work by Mustafa et al. (2024) evaluated other review articles, presenting an overview of secondary data generated by other authors. The study by Fu et al. (2024) concentrated on educational levels – PreK-12 (basic education) and higher education – and examined the differences and similarities among the three main themes identified: teaching, learning and administration. However, this evaluation is descriptive and based on the keywords provided by the authors, as well as on the intellectual structure of the articles.
While the work of Wang et al. (2024) is also a systematic review of the AIED landscape, it differs from the present study in methodological scope and analytical focus. In that study, the search was limited to the Web of Science database, and the analysis focused on the theoretical aspects of tool development. In contrast, the present study combines publications from both Web of Science and Scopus and analyzes the broader thematic clusters derived from abstract bigrams.
Distinct from previous reviews, this study introduces a novel approach: bigrams are extracted from the article abstracts and subjected to a four-quadrant thematic mapping, based on the centrality and density of each term to assess the entire AIED field and its implicit conceptual relationships. This method goes beyond the limitations of author-assigned keywords or those automatically generated by the Web of Science database, offering a more comprehensive assessment of the AIED field's implicit relationships. Rather than categorizing the field into broad functional domains (e.g., administration and teaching), this analysis reveals a network of ten interconnected clusters, capturing emerging tensions between generative technologies and pedagogical ethics that remain obscured in keyword-centric analyses.
2. Methods
This study employs a systematic literature review (SLR) that collects and synthesizes scientific research to identify potential AIED categories and their connections, emphasizing the structure and discussion topics. Although full-text analysis offers specific pedagogical and cognitive details, abstracts were selected to capture the field's broad themes and evolution rather than technical specifics. The PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) protocol was implemented: it is a guideline that supports all stages of the review process, ensuring transparency and providing methods for authors to report the research motivations, the search process and the findings (Page et al., 2021a, b).
The PRISMA protocol is accompanied by a standardized four-phase flow diagram – encompassing the steps of Identification, Screening, Eligibility and Inclusion – which maps the number of records retrieved, assessed, excluded (along with their respective reasons) and, finally, included in the review, thereby providing an easy-to-understand visual complement of the article selection process (Page et al., 2021a, b). One of the tools recently developed for this purpose is the “PRISMA 2020 Checklist”, whose 27 items provide recommendations ranging from the introduction and methods to the presentation of results and discussion, thereby detailing how topics should be organized and addressed (Page et al., 2021a, b).
The bibliometric analyses were conducted in April 2025 using the R programming language (version 4.4.3) and the Bibliometrix package (version 4.3.3). This widely known, open-source R tool supports data import from multiple databases, such as Scopus, Web of Science and PubMed. The Bibliometrix web interface, “biblioshiny,” was also used; it is an exploration and visualization extension that requires no programming and is designed to make bibliometric analyses and their visualization accessible to the scientific community (Aria and Cuccurullo, 2017).
While Bibliometrix is a consolidated tool in the field, we acknowledge that other solutions such as VOSviewer, CiteSpace or the AI-integrated pyBibX (Pereira et al., 2025) offer complementary perspectives. However, Bibliometrix was selected as the sole instrument because it uniquely integrates the thematic map analysis based on abstract bigrams – essential for mapping the conceptual structure targeted by this study – within a transparent and intuitive, code-based data processing pipeline. Unlike VOSviewer, which excels in network visualization quality but lacks robust functions for harmonizing and removing duplicates in heterogeneous datasets (e.g., combining Web of Science and Scopus metadata), Bibliometrix offers a comprehensive data preprocessing pipeline. Furthermore, while CiteSpace provides powerful metrics, its GUI-based workflow presents a steeper learning curve and offers less transparency than the script-based reproducibility of R. We recognize that Bibliometrix currently lacks integration with advanced AI features, such as the embedding vectors and Large Language Models (LLMs) found in pyBibX. This represents a limitation of the current study and a promising avenue for future triangulation.
Based on literature coverage, compatibility and access, two main databases were used: Web of Science (Clarivate) and Scopus (Elsevier). The following keyword combination was created with Boolean operators (“AND” and “OR”) to achieve the study's objective: (“ai” OR “artificial intelligence”) AND educat* AND AIED. The inclusion of the acronym “AIED” as a search term was determined by its established usage as the standard designation for this domain, distinguishing dedicated educational intelligence research from general AI studies with incidental educational mentions. The inclusion and exclusion criteria were defined as follows: (1) articles only, excluding books and conference publications; (2) articles must be published in English; (3) search terms must appear in the title, abstract and/or keywords (Web of Science Topic and Scopus TITLE-ABS-KEY). The publication year was not limited due to the interest in assessing the evolution of AIED since its inception. Table 1 presents all the aforementioned search information.
Table 1. Databases, search dates and search strings
| Databases | Search dates | Search strings |
|---|---|---|
| Web of Science | 03-31-2025 | (“ai” OR “artificial intelligence”) AND educat* AND AIED (Topic) and Article (Document Types) and English (Languages) |
| Scopus | 03-31-2025 | TITLE-ABS-KEY ((“ai” OR “artificial intelligence”) AND educat* AND aied) AND (LIMIT-TO (DOCTYPE, “ar”)) AND (LIMIT-TO (LANGUAGE, “English”)) |
The identification phase, including the definition of search strings and inclusion/exclusion criteria, was conducted jointly by the authors based on objective parameters and domain expertise. Subsequently, to ensure reliability, the screening of titles and abstracts and the full-text eligibility assessment were performed independently by the authors. Any discrepancies regarding article selection were resolved through discussion until full consensus was reached for each phase.
2.1 Data analysis
Descriptive analyses were conducted to identify key information, including the publication period, the number of articles, the sources (journals), the growth rate and authorship patterns. Annual scientific production is also reported to show how the number of articles evolved over the years. Finally, within the descriptive category, the total number of article citations is presented by country, indicating where the most relevant authors on this topic are located.
Thematic mapping is a method that combines a clustering algorithm with a keyword network. It uses extracted terms, classifies them into groups, and positions them according to the degree of relevance (centrality on the X-axis) and degree of development (density on the Y-axis) to identify domain themes (Alkhammash, 2023; Aria and Cuccurullo, 2017). Centrality measures the intensity of external links between a cluster and other themes in the network, reflecting its importance to the entire research field. Density quantifies the strength of internal links among the terms within a cluster, indicating the theme's level of development.
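To make these two measures concrete, the following minimal sketch computes them for two toy clusters. It assumes Callon's formulation as used in co-word analysis (Cobo et al., 2011): link strength is the equivalence index e_ij = c_ij² / (c_i · c_j), centrality is 10 times the sum of a cluster's external link strengths, and density is 100 times the sum of its internal link strengths divided by the number of terms in the cluster. The terms, counts and function names are hypothetical, and the internal computation in Bibliometrix may differ from this sketch.

```python
from itertools import combinations


def equivalence_index(cooc, occ, a, b):
    """Equivalence index e_ab = c_ab^2 / (c_a * c_b) for terms a and b."""
    c_ab = cooc.get(frozenset((a, b)), 0)
    return (c_ab ** 2) / (occ[a] * occ[b])


def centrality_density(clusters, cooc, occ):
    """Return {cluster name: (centrality, density)} under Callon's measures."""
    term_cluster = {t: name for name, terms in clusters.items() for t in terms}
    result = {}
    for name, terms in clusters.items():
        # Internal links: pairs of terms inside the same cluster.
        internal = sum(
            equivalence_index(cooc, occ, a, b) for a, b in combinations(terms, 2)
        )
        # External links: pairs joining this cluster to terms of other clusters.
        external = sum(
            equivalence_index(cooc, occ, a, b)
            for a in terms
            for b, other in term_cluster.items()
            if other != name
        )
        result[name] = (10 * external, 100 * internal / len(terms))
    return result


# Hypothetical toy data: term occurrences and pairwise co-occurrences.
occ = {
    "generative ai": 4,
    "critical thinking": 2,
    "learning analytics": 4,
    "learning environment": 2,
}
cooc = {
    frozenset(("generative ai", "critical thinking")): 2,
    frozenset(("learning analytics", "learning environment")): 2,
    frozenset(("generative ai", "learning analytics")): 2,
}
clusters = {
    "A": ["generative ai", "critical thinking"],
    "B": ["learning analytics", "learning environment"],
}

measures = centrality_density(clusters, cooc, occ)
```

In this toy network each cluster has one strong internal link and shares a single weaker link with the other cluster, so both clusters receive the same centrality and the same density.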
The map is interpreted following Cobo et al. (2011), who divide the graph into four quadrants, each corresponding to a different type of theme:
Motor themes (upper right quadrant): They have high density and centrality, are well-developed topics, and remain important to the research area.
Niche themes (upper left quadrant): They have high density and low centrality, are well-developed subjects but with limited importance, and are relevant only in specific discussions.
Emerging or declining themes (lower left quadrant): They have low centrality and low density, indicating underdeveloped and marginal themes, which represent early-stage emerging themes or themes that are in the process of disappearing.
Basic themes (lower right quadrant): They have high centrality and low density, representing important themes for the area that have not yet been developed.
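The quadrant logic described above can be sketched as follows. The cut-off between "high" and "low" is assumed here to be the median of each measure across themes; this threshold, the function name and the toy values are illustrative assumptions and not necessarily the rule implemented in Bibliometrix.

```python
from statistics import median


def classify_quadrants(themes):
    """Map {theme: (centrality, density)} to a quadrant label per theme."""
    c_cut = median(c for c, _ in themes.values())
    d_cut = median(d for _, d in themes.values())
    labels = {}
    for name, (centrality, density) in themes.items():
        high_c = centrality >= c_cut  # right half of the map
        high_d = density >= d_cut     # upper half of the map
        if high_c and high_d:
            labels[name] = "motor"
        elif high_d:
            labels[name] = "niche"
        elif high_c:
            labels[name] = "basic"
        else:
            labels[name] = "emerging or declining"
    return labels


# Hypothetical (centrality, density) values for four themes.
themes = {"T1": (9, 8), "T2": (8, 2), "T3": (2, 9), "Tk": (1, 1)}
quadrants = classify_quadrants(themes)
```

With these toy values, T1 is a motor theme, T2 a basic theme, T3 a niche theme and Tk an emerging or declining theme.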
Figure 1 provides a graphical representation of how the co-occurrence network is transformed into a thematic map of its respective themes, enhancing the understanding of thematic mapping.
Figure 1. Construction of the thematic map from the words' co-occurrence graph. The left panel shows a “words' co-occurrence graph” whose colored clusters of nodes are mapped onto topics T1, T2, T3 and Tk in the right panel. The right panel is a “strategic diagram” with centrality (relevance degree) on the horizontal axis and density (development degree) on the vertical axis, divided into four quadrants: “NICHE TOPICS” (upper left, T3), “HOT TOPICS” (upper right, T1), “PERIPHERAL TOPICS” (lower left, Tk) and “BASIC TOPICS” (lower right, T2). Source(s): Aria et al. (2022)
The thematic mapping analysis presented in this article utilizes bigrams from the abstracts of the selected articles to identify the conceptual structure. Prior to extraction, the text was preprocessed by converting all characters to lowercase and removing punctuation, nonalphanumeric characters and common English stop words. Following this, the standard term extraction routine in Bibliometrix was applied, without the use of word stemming. For this analysis, the network was generated using the Walktrap algorithm, which automatically determined the number of clusters. It included the top 200 bigrams that met a minimum frequency threshold of 2 (per thousand), ensuring the exclusion of less significant elements.
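The preprocessing and extraction steps described above can be illustrated with a short, self-contained sketch. It does not reproduce the Bibliometrix routine: the stop-word list is a tiny hypothetical sample, the two abstracts are invented, bigrams are formed after stop-word removal (an assumption), and the frequency threshold is applied to raw counts rather than to a per-thousand rate.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real analysis would use a full list.
STOP_WORDS = {"the", "of", "in", "and", "a", "to", "for", "is", "are", "on", "with"}


def extract_bigrams(abstract):
    """Lowercase, strip non-alphanumeric characters, drop stop words, pair tokens."""
    text = abstract.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return [" ".join(pair) for pair in zip(tokens, tokens[1:])]


# Hypothetical abstracts used only to exercise the function.
abstracts = [
    "Generative AI supports critical thinking in the classroom.",
    "Critical thinking and generative AI: a review.",
]

counts = Counter(b for text in abstracts for b in extract_bigrams(text))

# Keep at most the 200 most frequent bigrams that occur at least twice.
top_bigrams = {b: n for b, n in counts.most_common(200) if n >= 2}
```

No stemming is applied, so "thinking" and "think" would be counted as different tokens, in line with the choice reported above.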
3. Results and discussion
3.1 PRISMA diagram
The database search initially retrieved 711 records, of which 467 were obtained from Scopus and 244 from Web of Science. Before the screening phase, 398 records that were not journal articles were excluded, one record was removed for not being in English, and an additional 119 duplicate entries were excluded. Therefore, 193 distinct articles advanced to the initial title and abstract screening, of which 54 were considered irrelevant to the scope of the SLR and were excluded; these articles contained the search keywords but pursued other objectives and did not focus on AIED. For example, articles primarily discussing general AI advancements with only shallow mentions of educational applications, or those detailing the historical development of educational theories without discussing AI's uses or integration, were deemed irrelevant. The remaining 139 articles underwent a full-text eligibility assessment, and 12 studies were excluded: 10 for being outside the scope of artificial intelligence in education, and 2 because their discussion of AIED was superficial and their results concerned another topic. Finally, 127 studies met all inclusion criteria and were included in the final review. Figure 2 shows the database record identification, application of inclusion and exclusion criteria, screening and final inclusion process, according to the guidelines of the PRISMA flow diagram (Page et al., 2021a, b).
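The record counts reported in this paragraph can be checked with a few lines of arithmetic. The sketch below only restates the numbers given in the text; the variable names are illustrative.

```python
# Counts as reported in the text for each PRISMA stage.
identified = {"Scopus": 467, "Web of Science": 244}
removed_before_screening = {
    "not articles": 398,
    "not in English": 1,
    "duplicates": 119,
}
excluded_title_abstract = 54
excluded_full_text = {"out of AIED scope": 10, "insufficient discussion": 2}

total = sum(identified.values())
screened = total - sum(removed_before_screening.values())
assessed = screened - excluded_title_abstract
included = assessed - sum(excluded_full_text.values())

print(total, screened, assessed, included)
```

Each intermediate value matches the figure stated in the paragraph: 711 records identified, 193 screened, 139 assessed for eligibility and 127 included.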
Figure 2. PRISMA flow diagram, titled “Identification of studies via databases”, with three stages. Identification: records identified from databases (n = 711; Scopus, n = 467; Web of Science, n = 244); records removed before screening: not articles (n = 398), not written in English (n = 1), duplicates (n = 119). Screening: articles screened (n = 193); articles excluded after title and abstract reading (n = 54); articles assessed for eligibility (n = 139); articles excluded after full reading (n = 12: out of AIED scope, n = 10; insufficient discussion, n = 2). Included: articles included in review (n = 127). Source(s): Adapted by the authors (2025) according to Page et al. (2021a)
3.2 Descriptive analyses
Figure 3 shows that the AIED field is not recent, as articles dating back to 1998 already mention its applications, features and main limitations. The articles came from 72 different journals (sources), among which the most prominent are: International Journal of Artificial Intelligence in Education (20), Education and Information Technologies (8) and Computers and Education: Artificial Intelligence (7). Approximately 20% of the articles were written by a single author (26 out of 127), and a low rate of international co-authorship, only 15.75%, was observed.
Figure 3. Article's main information (Bibliometrix dashboard). Timespan: 1998:2025; Sources: 72; Documents: 127; Annual Growth Rate: 10.27%; Authors: 392; Authors of single-authored docs: 26; International Co-Authorship: 15.75%; Co-Authors per Doc: 3.76; Author's Keywords (DE): 473; References: 5269; Document Average Age: 3.03; Average citations per doc: 31.72. Source(s): Developed by the authors (2025) from information generated by the Bibliometrix package
The publication timespan information in Figure 3 should be interpreted carefully: although the annual growth rate is 10.27%, the document average age is only 3 years, indicating that most articles were published recently. Figure 4 details this information and shows the actual scenario: the annual rate is diluted over the assessment period, with substantial growth starting in 2020/2021 and the number of articles nearly doubling between 2023 (24 articles) and 2024 (45 articles). The decline in production in 2025, as shown in Figure 4, occurred because the database search for the systematic review was completed at the end of March 2025.
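The dilution effect can be illustrated with a compound annual growth rate computed between the first and last years of the timespan. The sketch below assumes one article in 1998 and 14 in 2025, values read from Figure 4, which are approximate; that Bibliometrix derives its rate in exactly this way is also an assumption.

```python
def compound_annual_growth_rate(first_count, last_count, n_years):
    """Percentage growth per year that turns first_count into last_count."""
    return ((last_count / first_count) ** (1 / n_years) - 1) * 100


# Assumed endpoints: 1 article in 1998, 14 articles in 2025.
rate = compound_annual_growth_rate(1, 14, 2025 - 1998)
print(f"{rate:.2f}%")
```

Under these assumptions the result rounds to the 10.27% shown in Figure 3. Because the rate depends only on the two endpoints, it says nothing about when the growth took place, which is why the year-by-year counts are needed.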
Figure 4. Annual scientific production. Line chart with “Year” (1998 to 2025) on the horizontal axis and “Number of Articles” (0 to 50) on the vertical axis. Approximate values: 0 to 1 articles per year from 1998 to about 2015; 6 in 2016; 0 in 2017; 0 to 3 per year from 2018 to 2020; 11 in 2021; 16 in 2022 (marked with a “ChatGPT” callout); 24 in 2023; 45 in 2024; 14 in 2025. Source(s): Developed by the authors (2025), with information generated by the Bibliometrix package
Discussions about AI as we witness it today have existed since 2015, but it was only in 2022 that the unbridled leap occurred with the public launch of ChatGPT, which was free to use. It is worth mentioning that basic AI techniques – such as natural language processing and neural networks – have been discussed since before the year 2000, but have only recently become popular and accessible (Bishop, 1994; Lund and Wang, 2023; Manning and Schütze, 1999). The oldest article included in this review is by Cumming (1998), which mentions the potential of AI in education, its difficulties and categorization issues; several parts emphasize concerns about the appropriateness of the label “artificial intelligence,” which referred to the difference between a technical viewpoint and actual cognitive thought, as well as discussions to change it to terms like “learning systems” or “intelligent learning systems.” Today, it can be stated that the term AIED is appropriate for what it offers and that these initial ideas were correct in recognizing that it is a field with numerous educational applications, both in direct educational roles and in the administration of the educational sector itself.
Regarding the relevance of scientific production by country (Figure 5), the dominance of China (616 citations) and the United States (549) is notable, both leading by a wide margin over the United Kingdom (354) and South Korea (298). It may also seem surprising that developing countries, such as Kazakhstan (85), Brazil (85), India (43) and Serbia (41), are on the list of the most relevant.
Figure 5. Most cited countries. Horizontal bar chart of the number of citations per country: China, 616; USA, 549; United Kingdom, 354; Korea, 298; Australia, 186; Finland, 185; Canada, 145; Spain, 100; Brazil, 85; Kazakhstan, 85; Sweden, 82; Hong Kong, 60; Switzerland, 56; India, 43; Serbia, 41; Germany, 40. Source(s): Developed by the authors (2025), with information generated by the Bibliometrix package
3.3 Thematic mapping
Figure 6 shows the thematic map, classifying the clusters according to the degree of relevance (centrality) and the degree of development (density). Ten clusters are proposed, which are detailed in Table 2, containing the classification, identification and main terms with their respective frequencies.
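As an illustration of how such a four-quadrant classification can be derived, the sketch below assigns each cluster to a quadrant by comparing its centrality and density with the average across all clusters. It is a simplified sketch rather than the Bibliometrix implementation: the function name `classify_themes`, the use of the mean as the cut-off and the numeric values are assumptions made only for illustration, and they are not taken from the data of this review.

```python
from statistics import mean


def classify_themes(clusters):
    """Map each cluster name to a quadrant label.

    `clusters` maps a cluster name to a (centrality, density) pair.
    The cut-offs are the mean centrality and mean density (an assumption).
    """
    centrality_cut = mean(c for c, _ in clusters.values())
    density_cut = mean(d for _, d in clusters.values())

    labels = {}
    for name, (centrality, density) in clusters.items():
        relevant = centrality >= centrality_cut   # right-hand side of the map
        developed = density >= density_cut        # upper half of the map
        if relevant and developed:
            labels[name] = "Motor"
        elif developed:
            labels[name] = "Niche"
        elif relevant:
            labels[name] = "Basic"
        else:
            labels[name] = "Emerging or Declining"
    return labels


# Hypothetical (centrality, density) values, for illustration only.
example = {
    "generative ai": (0.9, 0.8),
    "learning analytics": (0.7, 0.2),
    "image recognition": (0.1, 0.9),
    "digital literacy": (0.2, 0.1),
}
labels = classify_themes(example)
for name, quadrant in labels.items():
    print(f"{name}: {quadrant}")
```

Under this scheme, a cluster lying close to both cut-offs can change quadrant with small variations in the data, which is consistent with the clusters described as near the center of the map.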
Figure 6. Thematic map. The horizontal axis shows the relevance degree (centrality) and the vertical axis the development degree (density); dashed lines divide the map into four quadrants. Niche themes (upper left): the light pink cluster (“image recognition”, “recognition technology”, “sensor technology”) and the brown cluster (“deep learning”, “language context”, “language processing”). Near the center, above the horizontal line and slightly left of the vertical line: the dark pink cluster (“educational AI”, “AI learning”, “educational policies”). Near the center, right of the vertical line and slightly above the horizontal line: the grey cluster (“education research”, “learning sciences”, “integrating AI”). Motor themes (upper right): the orange cluster (“generative AI”, “learning experiences”, “critical thinking”). Basic themes (lower right): the pink cluster (“artificial intelligence”, “education AIED”, “AIED research”), the blue cluster (“learning analytics”, “learning environment”, “learning technologies”) and, slightly above these but below the horizontal line, the purple cluster (“intelligent tutoring”, “tutoring systems”, “educational data”). Emerging or declining themes (lower left): the green cluster (“language model”, “AI chatbot”) and the light green cluster (“digital literacy”). Source(s): Developed by the authors (2025), with information generated by the Bibliometrix package
Table 2. Thematic map clusters
| Theme classification | Cluster identification | Main terms (occurrences) |
|---|---|---|
| Motor | Cluster 1 (Orange) | Generative AI (14); learning experiences (11); critical thinking (8); educational landscape (8); ethical considerations (8); data privacy (7); educational settings (6); English language (6); academic integrity (5); educational contexts (5); professional development (5); personalize learning (4); teacher education (3); teaching practice (3); technology integration (3); (3); language teaching (3); teachers perspectives (3) |
| Basic/Motor | Cluster 2 (Red) | Artificial intelligence (106); education AIED (73); AIED research (11); personalized learning (9); students learning (8); collaborative learning (6); AI ethics (5); AIED systems (6); ethical principles (4); AIED applications (5); educational practices (5); educational systems (5) |
| Motor (Close to Center) | Cluster 3 (Gray) | Education research (4); learning sciences (4); integrating AI (4); learning processes (4); transformative potential (4); human intelligence (3) |
| Basic | Cluster 4 (Blue) | Learning analytics (7); learning environment (7); learning technologies (4); lifelong learning (3) |
| Basic | Cluster 5 (Purple) | Intelligent tutoring (6); tutoring systems (6); learning process (5); educational data (5); adaptive learning (5); data mining (5); learning systems (5); learning mode (2); mobile learning (2) |
| Niche | Cluster 6 (Pink) | Educational AI (7); AI learning (3); educational policies (3); AIED incorporation (2); learning data (2); instructional design (3) |
| Niche | Cluster 7 (Brown) | Deep learning (3); language context (3); language processing (3); learning settings (3); natural language (3); sentiment analysis (2); evaluation matrix (2); Google Bert (2) |
| Niche | Cluster 8 (Light Orange) | Image recognition (2); recognition technology (2); sensor technology (2) |
| Emerging or Declining | Cluster 9 (Light Blue) | Digital literacy (2) |
| Emerging or Declining | Cluster 10 (Light Green) | Language model (4); AI chatbot (4) |
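The occurrence counts in the table refer to bigrams extracted from the abstracts. A minimal sketch of such a count is given below, assuming plain lower-cased tokenization with no stop-word removal or stemming; the function `bigram_counts` and the two sample abstracts are hypothetical, and the preprocessing performed by Bibliometrix may differ.

```python
import re
from collections import Counter


def bigram_counts(abstracts):
    """Count adjacent word pairs (bigrams) across a list of abstracts."""
    counts = Counter()
    for text in abstracts:
        # Simplification: keep alphabetic tokens only, ignoring punctuation,
        # so bigrams may span sentence boundaries.
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(" ".join(pair) for pair in zip(tokens, tokens[1:]))
    return counts


# Hypothetical abstracts, for illustration only.
sample_abstracts = [
    "Generative AI supports critical thinking.",
    "Teachers debate generative AI and critical thinking in class.",
]
counts = bigram_counts(sample_abstracts)
for term, n in counts.most_common(2):
    print(f"{term} ({n})")
```

In practice, uninformative pairs such as those containing stop words would normally be filtered out before the remaining bigrams are clustered.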
The main theme is cluster 1 (orange color). The high frequency of “generative ai” and “critical thinking,” paired with strong centrality and density, marks this as a motor theme, the field's most developed and relevant cluster. It discusses the use of generative AI to assist the learning process of both students and teachers and to develop critical thinking. This cluster focuses on the teaching process and on the impacts that generative AI can have, for example, how the educational structure (“educational landscape”, “educational settings”, “educational contexts,” among others) will be affected or can be adapted to incorporate these innovations. The discussions include the general impact on civilization, the crossroads that education faces regarding the present and future, pedagogical beliefs, acceptability, ethics, privacy and the practical use of AI in daily educational activities. There is also a strong presence of educators' opinions on how their work will be affected, their acceptance and the critical process of implementing these tools.
Cluster 2 (red color) contains the two main terms of the systematic literature review: “artificial intelligence” and “education AIED”. However, according to the analysis, it sits between the basic and motor themes, as it is not developed enough to be considered a fully motor theme. A possible explanation is that the main terms are mentioned very often without forming knowledge construction patterns that are sufficiently relevant in relation to the other terms in the cluster. Several similarities with the previous cluster can be observed, but the terms are notably more general: the broad area of “artificial intelligence” appears instead of its generative subtype, and AIED branches (“education AIED,” “AIED systems” and “AIED applications”) are present, which highlights the use of the acronym for this type of tool, whereas cluster 1 shows more concern with educational competencies.
Cluster 3 (gray color) discusses future expectations for AIED, both at levels closer to reality, with AI tutors working alongside traditional teaching, and in currently utopian cybernetic environments that would integrate futuristic aspects of AI, highlighting that humans need to acquire skills to deal with these scenarios (Baillifard et al., 2025; Burleson and Lewis, 2016). A middle ground of hybrid teaching between humans and machines would also need to be introduced into the educational context to empower and prepare humans for this mixed future (Cukurova, 2025). Similarly, cluster 4 (blue color) is a branch of cluster 3, focused on the historical differences of the last 25 years and on how they are evolving toward the future. It encompasses more specific variables of this evolution: the market shifting toward an AIED application service delivery ecosystem, concerns about user data ethics, and professional goals focused on skills development (Dillenbourg, 2016; Nye, 2016; Pammer-Schindler and Rosé, 2022).
AI can operate in traditional teaching by automatically generating interdisciplinary content, combining basic information for the proposed context and adapting it for each student in a fully personalized manner (Bozkurt, 2023). The teaching of new languages stands out as a sector, as generative AI can provide instantaneous responses, in text and voice, on students' writing and speech, an indispensable aspect when learning a new language; it can also help adapt assessment methods (Chung and Jeong, 2024; Sun et al., 2021). From an educator's perspective, it can help alleviate the workload of creating repetitive assessment materials, graphics adapted to the subject, grammatical correction of essays and even summarize themes from documents or long texts related to educational bureaucracy (Chung and Jeong, 2024; Estaiteyeh and McQuirter, 2024; Koraishi and Karatepe, 2025). A different branch of education that deserves attention is the integration of AIED with motor skill-related learning, where AI solutions already have technological potential but require research and integration with tools that promote instantaneous and adapted tactile responses – vibrations, forces or movements – to the student (Santos, 2016).
Regarding societal shifts, Bozkurt and Bae (2024) argue that the major impact on civilization stems from AI's ability to hold conversations like a human being, as the very act of communication through manipulating and generating words is one of the foundations of human culture. Concern arises from the change in the origin of educators, shifting from something human and organic to something mechanical and synthetic that will affect future generations: how will students raised in this environment be impacted (Bozkurt, 2023)? The comparison made by Koraishi and Karatepe (2025) illustrates this paradigm: machines summarize information in a fraction of the time it takes humans for the same activity; however, they cannot capture the nuances of the pedagogical classroom context, such as pedagogical judgment. Although AI is very efficient in delivering requests, the same cannot be said about its effectiveness concerning the objective of education, as it does not comprehend human values and contexts beyond the data obtained from the Internet, struggling to assess pedagogical needs in a classroom, and may even provide a distorted view of reality (Bozkurt et al., 2024). Efficiency or effectiveness metrics may be insufficient, as there is also the issue of human acceptance of generative AI use by educators themselves with their pedagogical beliefs, with attempts to predict acceptability through demographic variables and teaching characteristics, indicating that constructivist profiles are more prone to incorporate AIED in the day-to-day classroom (Cabero-Almenara et al., 2024a, b).
A prominent topic mentioned in both clusters is the issue of ethics. It is necessary to identify and highlight the main ethical points to be addressed in AI in Education (AIED), along with their differences depending on the educational level; a notable example is proposed by Adams et al. (2023), who recommend four core principles for AI solutions in K-12 education: pedagogical adaptation, children's rights, literacy and educator well-being. There may also be classifications to shape how AI should be viewed in the teaching process, ranging from “the student as a total receiver of information” to “being responsible or a leader in the learning process” (Ouyang and Jiao, 2021). In the same context, there are concerns regarding the lack of educational policies and stances on AIED (Bozkurt et al., 2024); society must question what type of education is being encouraged by the governmental regulatory structure: AI to assist in already existing and globally well-known education or to adapt education to encompass the use of AI (Schiff, 2022)? Furthermore, concerning the ethics and prejudices of AIED tools, how could different societies assess something so subjective and ingrained in human history? AI tools are expected to possess these unwanted characteristics – prejudices and biases – because they are fed with information available on the Internet provided by users; therefore, it is an inherent matter of human nature that must be managed to prevent various types of discrimination (Baker and Hawn, 2022).
The comparison itself between AI applications is complicated, since AIED performance evaluations tend to centralize, generalize or standardize behaviors – of the tool, the tool creators or the tool targets – to analyze the educational setting and, ultimately, allow extrapolation to other cases that may not have been represented in experimental or quasi-experimental tests (Blanchard, 2015; Ocumpaugh et al., 2024; Rahm and Rahm-Skågeby, 2023). Lai et al. (2024) assessed whether AIED tools affect students' reaction times in perceiving the emotional context – positive or negative – of sentences or scenarios presented in photographs; the experiment shows correlation but cannot affirm causality, besides representing a specific case from the sample, thus requiring much more research in similar situations around the world to confirm the correlation in different cultural contexts. In this area, experiments that assess – or attempt to predict – the influences of AIED tools on teaching are varied and demand research focused solely on possible groupings and conclusions in this field (Ferguson et al., 2022; Lai et al., 2024; Lin, 2022; Shrivastava, 2023).
Furthermore, curriculum development on the topic plays a notable role in this comparison from the AI perspective, with existing proposals on how these intelligent agents should be studied and developed (Bellas et al., 2023; Bittencourt et al., 2009; Ghnemat et al., 2022; José-García et al., 2023; Ramadevi et al., 2023). The learning process can be structured around problem-based situations anchored in reality, which are designed with progressively increasing difficulty and adaptations tailored to the educational level (Bellas et al., 2023). It is also pointed out that topics related to students' social and emotional learning should be included, since dealing with AI tools also implies understanding the potential prejudices of the agents regarding global demographic characteristics, such as gender, race or ethnicity, nationality, culture, among others (Baker and Hawn, 2022; Bellas et al., 2023). A prime example is the creation of tools to assist in career decision-making, which is already focused on technology jobs: students must understand the tendency already implicit in this type of application, with a clear focus on information technology and the exclusion of other areas, identifying the advantages and disadvantages of this developmental approach (José-García et al., 2023).
Cluster 5 (purple color) focuses on the concept of intelligent tutoring systems (ITS), including their methodologies and the technical characteristics of the proposed tools, such as speech recognition, data mining, multi-agent systems (MAS) and text pattern recognition to create domain ontologies (Hamal et al., 2021; Jiang et al., 2024; Zouaq and Nkambou, 2008). In contrast, Cluster 6 (pink color) emphasizes the specifications for the use of AI in creating educational tools for teaching, indicating the directions of AIED and the applicability of the main tools; this can be considered a niche aspect of Cluster 5 related to intelligent tutoring. Within the broader area of technical subjects related to the functioning of AIED tools, Clusters 7 (brown color) and 8 (light orange color) can be grouped, discussing deep learning, language processing, matrices, Google BERT, image recognition and the use of sensors for data collection.
According to the guidelines, Cheng and Wang (2023) and Wang and Cheng (2021) propose three perspectives on teaching with AI: (1) learning from AI, which includes intelligent teaching systems and platforms that guide students toward learning; (2) learning about AI, which refers to AI literacy and the educational process for teaching humans to work with and benefit from the possibilities offered by AI; and (3) learning with AI, which utilizes AI as a support tool in teaching activities, acting as an assistant that enhances educational activities. These applications can have philosophical dimensions regarding how teaching will be oriented and assessed, utilizing, for example, positive psychology (p-AIED) or careful assessment with adaptations for each student (Bittencourt et al., 2024; Sparks et al., 2024).
In this context, open learner models (OLMs) merit attention; these are interfaces that allow students to visualize their learning performance within the group to which they belong, presenting metrics and enabling personal critical thinking by the users themselves. The benefits include engagement, self-assessment and general learning gains, as well as transparency and scrutiny to understand how metrics are created and assessed; they can even serve as a communication bridge between educators and students (Kay et al., 2022). OLMs can be framed within the three principles of Cheng and Wang (2023) and Wang and Cheng (2021), particularly in learning with AI and learning from AI, since the tool assesses teaching under the educator's supervision, thus representing learning as the outcome of OLMs; the principle of learning about AI is variable, as it depends on the focus of the lesson's subject matter, which may or may not be incorporated (Kay et al., 2022).
The emerging or declining clusters are 9 (light blue color) and 10 (light green color). The former discusses digital literacy, a topic previously mentioned in the form of AI literacy, but which may be evolving into a broad issue that encompasses all these concepts, or losing focus exclusively on AI. The latter discusses the specific use of chatbot solutions in the teaching process, a type of application that may already be included in previous discussions but may be emerging as an area relevant enough to be debated separately.
Figure 7 presents a mind map with a visual summary of the previous results and discussions.
Figure 7. Summarization mind map. The map is centered on a node labeled “AIED”, from which seven pathways extend. Framework and Technical Aspects (magenta): Intelligent Tutoring Systems; AIED Components (Image Recognition, Deep Learning, Natural Language Processing); Open Learner Models (Self-assessment, Transparency, Engagement). Emerging or Declining (orange): Chatbots, Digital Literacy. Curriculum (red): Problem-Based, Social-Emotional, “To or From AI?”. Learning Perspectives (pink): From AI, With AI, About AI. Future Expectation and Evolution (purple): Market Changes, Hybrid Teaching, Utopia. Ethics and Policy (yellow): Data Privacy, Bias, Policy Gaps, Knowledge Owners, Pedagogical Adaptation. AI Potential (green): Pedagogical Beliefs and Acceptance, Personalized Learning, Applications, Workload Reduction, Landscape Changes. Applications branch into Language Teaching, Motor Skills, Career Guidance and Educator Support; Educator Support branches into Summarization, Grammar and Assessment.
Source(s): Created by the authors (2025) with the app Mermaidchart
Finally, Table 3, below, presents the summary of the content addressed by each cluster, including the theme and identification of the cluster, along with its main characteristics discussed in the work, and possible managerial implications.
Table 3. AIED thematic clusters summary
| Themes classification | Clusters identification | Main characteristics | Managerial implications |
|---|---|---|---|
| Motor | Cluster 1 (Orange) | Discusses generative AI's role in learning processes for students and teachers and critical thinking development. Addresses impacts on educational structure, civilization, pedagogical beliefs, acceptance, ethics, privacy, and practical AI use in educational activities | Investing in generative […]; defining internal data governance and privacy guidelines, ensuring compliance with data protection regulations |
| Basic/Motor | Cluster 2 (Red) | Contains the primary terms “artificial intelligence” and “education AIED” but with more general terminology than Cluster 1. Focuses on AIED branches (systems, applications) and broad educational implications rather than specific competencies | Establishing institutional AIED management policies, with a multidisciplinary committee for supervising implementation and related projects; conducting “AI literacy” programs for managers and faculty |
| Motor (centrality and density slightly above average) | Cluster 3 (Gray) | Explores future expectations of AIED, from realistic AI tutors working alongside traditional teaching to futuristic cyber environments. Highlights the need for hybrid teaching approaches to prepare humans for a mixed future | Design and test hybrid teaching models ([…]); develop educational forecasting models to anticipate competencies required in the labor market |
| Basic | Cluster 4 (Blue) | Focuses on learning analytics, environments, technologies, and lifelong learning. Examines historical differences over 25 years and evolution toward the future, including market shifts toward AIED service delivery and ethics of user data | Implement learning analytics platforms to continuously monitor key educational indicators (retention, performance, potential ethical deviations, among others) |
| Basic | Cluster 5 (Purple) | Focuses on intelligent tutoring systems (ITS), emphasizing their design, data, adaptive features and technical components like users' device type, speech recognition and agent-based modeling | Select and implement […]; establish an evaluation and feedback cycle, integrating […] |
| Niche | Cluster 6 (Pink) | Emphasizes specifications for using AI in creating educational tools, indicating AIED directions and applicability. Represents a niche aspect of Cluster 5, focused on general educational tool development beyond ITS | Define standards and functional requirements for the development of new AIED tools, in partnership with suppliers and/or developers; document AIED tool incorporation processes, creating repositories of best practices and lessons learned |
| Niche | Cluster 7 (Brown) | Technical focus on deep learning, language processing, matrices, and Google BERT. Part of the broader technical subjects regarding the operation of AIED tools | Encourage research and development to improve existing techniques and/or to develop new theoretical and practical methodologies |
| Niche | Cluster 8 (Light Orange) | Focuses on image recognition and sensor technology for data collection. Grouped with Cluster 7 as part of the technical aspects of AIED tools | |
| Emerging or Declining | Cluster 9 (Light Blue) | Covers digital literacy, which may be broadening to include AI-related topics or fading as a distinct theme in AIED, leaning toward general digital competencies rather than AI specifically | Monitor the evolution of the subject matter to understand the relationship between […] |
| Emerging or Declining | Cluster 10 (Light Green) | Discusses specific use of chatbot solutions in teaching processes, which may be emerging as a sufficiently relevant area to be debated separately | Monitor the evolution of the topic to understand whether chatbots have relevant applicability in AIED |
4. Conclusion
The systematic literature review (SLR) successfully mapped the conceptual structure of artificial intelligence in education (AIED), providing an updated discussion on the characteristics and applications of AIED. Regarding the evolution of the AIED subject, publications have rapidly increased since 2020, especially after ChatGPT became freely available to the public; however, initial mentions date back to 1998, establishing the foundations of AIED research. The most relevant articles are from China and the United States, although some developing countries, such as Kazakhstan, Brazil, India and Serbia, also contribute considerably to the discussion.
The thematic mapping analysis identified ten thematic clusters that show the conceptual structure of AIED research and how it is organized. Tensions exist between the technological efficiency of AI and pedagogical concerns, notably ethics, bias, privacy, social-emotional learning, superficial learning, curriculum development and gaps in governmental policies. The scientific production spans widely different views, ranging from planning for humans to deal with practically utopian AI-dominated scenarios to concerns about whether AIED tools meet expectations of efficiency and effectiveness in education. The central concern is whether these tools are truly beneficial for the objective of satisfactorily educating students, be they children, adolescents or adults. However, not enough is known about how these technologies affect student learning or teacher workload in different countries and cultures, since research mostly comprises case studies of limited scope whose findings cannot be extrapolated to the general population. Ethics remains a noteworthy concern across all research clusters, especially concerning human values and the risk that AIED might reinforce existing discrimination owing to its inherent reliance on AI databases.
This study contributes to the literature by unveiling evidence on how technical aspects relate to pedagogical concerns, suggesting increasingly interdisciplinary research trends. From a methodological standpoint, an innovative approach was applied, with four-quadrant thematic mapping based on the centrality and density of each term, to assess the entire AIED field. For educational management, recommendations are suggested for defining policies on adoption, ethical use and maintenance of AI systems in schools and/or educational environments. It also contributes to a holistic reflection that involves different stakeholders to participate in the ongoing debate, acting both as proponents and as the subjects central to these developments: teachers, students, educational managers, information technology developers and public policymakers, to name but a few. As a result, more and better AIED solutions are expected to emerge, strengthening their adherence to different contexts and, ultimately, the pedagogical quality itself.
Future research calls for more long-term studies that track how AIED tools affect teaching and learning over time. Studies comparing how different countries and cultures implement AIED are also suggested, as they could identify key contextual factors that influence adoption and effectiveness across diverse educational systems and cultural settings. This approach would go beyond isolated case studies, contributing to a broader understanding of the global implications of AIED.

