This study aims to examine whether increasing AI literacy necessarily leads to safer and more responsible engagement with generative AI in higher education, and to introduce the concept of cognitive AI safety.
An explanatory sequential mixed-methods design was used. In Phase 1, survey data were collected from 311 university students to develop and test the cognitive AI safety scale (CAISS) and to examine whether AI literacy predicts cognitive AI safety. In Phase 2, an independent sample of 241 participants was used for confirmatory factor analysis. A qualitative phase then involved cognitive-audit interviews with 12 highly AI-literate participants to explore how verification, trust and decision-making were enacted in academic AI use.
The findings provide initial support for a revised three-factor structure of cognitive AI safety comprising epistemic vigilance, cognitive agency and failure resilience. Quantitative results show that AI literacy is positively associated with cognitive AI safety, although the model explained only a modest share of the variance. The exploratory factor analysis supported the theorised three-dimensional structure, while the independent confirmatory factor analysis supported a revised 11-item version of the CAISS. Qualitative findings revealed that highly AI-literate users did not always verify AI outputs. Instead, they checked selectively, guided by task stakes, domain familiarity, perceived effort and personal accountability.
This study introduces cognitive AI safety as a more focused, education-specific construct for understanding safe engagement with generative AI and provides initial validation of the CAISS. By integrating insights from AI literacy, epistemic vigilance, trust calibration and cognitive offloading, the authors reframe AI-related risk in education as a cognitive and pedagogical issue, as well as a technical and ethical one.
Introduction
Generative artificial intelligence (GenAI) is rapidly becoming embedded in academic work in higher education. It is changing how students search for information, draft text and construct arguments. Unlike earlier digital tools that primarily supported retrieval or presentation, systems such as ChatGPT generate ready-made responses that often appear polished, complete and authoritative. This is reflected in recent empirical research. For example, Chia and Frattarola (2025), in a study of a GenAI-enabled brainstorming app for undergraduate writing, found that students used the tool to generate ideas, explore different perspectives and support early-stage ideation, while also expressing reservations about mistrust, relevance of feedback and the limits of GenAI for argumentative writing. In a similar vein, Kohnke et al. (2025) reported that 66.7% of first-year university students regularly used GenAI-related tools and generally perceived them as helpful for writing, grammar, vocabulary and reading, although concerns about overreliance were also noted.
This shift matters because academic work has traditionally involved forms of learning friction, including searching for sources, comparing viewpoints and checking evidence and citations. Such effort often supports deeper analytical engagement. By contrast, GenAI can compress many of these steps into a single interaction, producing persuasive responses with relatively little immediate effort from the user. For this reason, scholars have argued that GenAI may reshape not only academic practices but also the conditions under which students learn, evaluate and write (Chiu, 2023; Dwivedi et al., 2023; Ratten and Jones, 2023).
At the same time, generative AI introduces not only cognitive risks but also potential benefits. In some contexts, it may reduce unproductive effort, scaffold reflection and help students engage with ideas more iteratively, thereby enhancing their overall educational experience. Thus, the key educational question is not whether GenAI is inherently beneficial or harmful, but under what conditions it supports learning without weakening independent judgement. In response, universities and policymakers have expanded efforts to promote AI literacy, with initiatives that typically emphasise a combination of technical skill, ethical awareness and effective tool use (Yusoff et al., 2025).
Recent research on AI-assisted academic writing also shows that responsible use is already a growing concern, particularly regarding over-reliance, originality, transparency, critical thinking and academic integrity. For example, Subaveerapandiyan et al. (2025), in a study of PhD scholars, found that AI tools were valued for improving clarity, revision and efficiency, while also raising concerns about dependence and the possible weakening of independent learning. At the same time, many AI literacy initiatives still tend to assume that increased knowledge and skill will strengthen users’ ability to engage responsibly and critically with AI outputs (Bai, 2025; Wang et al., 2023). However, the expansion of GenAI raises a further concern for education: the risk may not be incorrect tool use alone, but a gradual weakening of independent thinking when fluent machine-generated outputs are accepted with insufficient scrutiny.
Why competence may not automatically lead to safety
How users process information and decide when to trust technology fundamentally shapes human judgement in AI-supported work. One useful perspective is dual-process theory, which distinguishes between relatively fast, intuitive processing and slower, more analytical evaluation (Groves and Thompson, 1970). GenAI systems often encourage intuitive acceptance by presenting outputs in fluent, confident and apparently coherent forms. This can contribute to automation bias, in which users favour machine suggestions over their own judgement even when the system is mistaken (Hoff and Bashir, 2015; Parasuraman and Manzey, 2010). At the same time, appropriate reliance depends on trust calibration, that is, adjusting trust to the demands and uncertainty of the task rather than trusting the system uniformly (Lee and See, 2004).
Current research suggests that the assumption that more skill means more safety is insufficient. Users may overestimate their understanding (Langer et al., 2023), disengage when tasks are easy (Sweller, 1994) and activate checking only in specific contexts. Technical competence therefore does not guarantee safe engagement with AI: verification depends not only on knowledge, but also on how users allocate attention, trust and effort in context. AI literacy may enable safer use, but when that capacity is enacted in practice remains an open question.
What the literature still cannot explain
Most AI safety discussions focus on bias, security and regulation, while educational discussions emphasise ethics and misconduct. A significant risk in education is that AI-generated content often appears confident and convincing even when it is incorrect, leading students to trust it too easily. Although many AI literacy programmes emphasise ethics and critical evaluation, there is currently limited evidence that these efforts lead to sustained verification practices or to safer use of generative AI. According to Hauswald (2025), there is growing concern that AI systems are being accepted as epistemic authorities despite these challenges.
Although many contemporary AI literacy frameworks already include ethical awareness, critical evaluation and responsible use, empirical evidence remains limited on whether such literacy reliably translates into sustained verification and safe cognitive behaviour during actual interaction with generative AI. Measures of AI literacy have improved, but it remains unclear whether higher literacy improves vigilance (Wang et al., 2023). Research on trust shows that competence supports judgement, but that reliance also depends on many other factors, including the situation and the user's psychological state (Hoff and Bashir, 2015; Lee and See, 2004; Schaefer et al., 2016). The main question is therefore whether AI literacy ensures safer behaviour or whether context and motivation matter more. This uncertainty has direct implications for instruction. It raises questions about how to go beyond superficial awareness and how to teach or assess continual verification, and educators must decide whether AI literacy alone is sufficient or whether vigilance needs to be cultivated more directly.
There is also a methodological gap in the study of safe AI use. Many studies measure perceptions, such as usefulness, attitudes or self-reported checking (Bergdahl and Sjöberg, 2025; Chiu et al., 2024; Chiu, 2025). Few observe actual verification behaviour, such as double-checking or fact-checking AI outputs. Because automation bias and trust miscalibration may operate without conscious awareness, self-reports may overstate vigilance. The field needs research that directly measures cognitive safety and examines when users verify or skip checking during real tasks.
To address these gaps, this study advances the construct of cognitive AI safety, defined as the psychological capacity to maintain epistemic vigilance (verification and scepticism), cognitive agency (resistance to unreflective offloading) and failure resilience (confidence in human judgement and continuity when AI fails) during interaction with generative AI. Cognitive AI safety is not proposed as a replacement for broader constructs such as critical thinking or trust calibration. Rather, it is positioned as a more focused, education-specific construct. Its contribution lies less in proposing a wholly new psychological phenomenon than in integrating established concerns from AI literacy, trust calibration, epistemic vigilance and cognitive offloading into a context-specific construct for GenAI-supported academic work. Critical thinking refers broadly to the evaluation of claims, reasoning and evidence across contexts, whereas trust calibration concerns the appropriate adjustment of reliance in relation to a system’s actual capabilities (Holland et al., 2024). Cognitive AI Safety brings these concerns together at the point of AI-supported academic work by focusing on whether users can sustain scrutiny, retain cognitive control and rely on their own judgement when GenAI outputs are fluent but fallible. In this sense, Cognitive AI Safety refers to a psychological capacity rather than to guaranteed protection against error; safe behaviour depends on when and in what contexts this capacity is activated.
Recent evidence from generative AI research further suggests that awareness does not translate into a single or uniform behavioural response. Dahabiyeh et al. (2026), for example, showed that privacy awareness in ChatGPT use is multidimensional and can either heighten or alleviate perceived risks and concerns depending on what users are aware of, while different dimensions of awareness relate differently to continued use. Although their focus is privacy rather than cognitive safety, their findings reinforce the view that awareness is an enabling condition for safer judgement, but not a guarantee of sustained vigilance across contexts.
This framing also aligns with human factors research showing that safe reliance depends on calibrated trust rather than maximal trust and that reliance decisions are shaped by personal, situational and system factors (Lee and See, 2004; Hoff and Bashir, 2015). It is also consistent with cognitive load theory, which suggests that low-friction environments may alter cognitive engagement by reducing the need for monitoring and selective attention (Sweller, 1994).
The present study moves beyond conceptual critique by providing an initial empirical examination of this relationship. Rather than examining whether AI literacy leads to safety or overconfidence, the study tests its predictive relationship with cognitive AI safety and then investigates the conditions under which verification is activated or deferred during academic AI use. In this sense, safety is treated not as a fixed trait or constant behaviour, but as a capability enacted selectively in context.
Theories, research questions and hypotheses
This study draws on four complementary theoretical perspectives, each explaining a different aspect of safe and unsafe engagement with generative AI. Epistemic vigilance theory (Sperber et al., 2010) explains why users must actively evaluate the plausibility, consistency and credibility of communicated information before accepting it as true. In the context of GenAI, this is especially important because outputs are often fluent and authoritative even when inaccurate. Cognitive load theory (Sweller, 1994) helps explain how AI-supported environments may alter cognitive engagement. The concern is not that all reductions in effort are harmful, but that when generative systems reduce productive engagement with reasoning tasks, users may monitor less carefully or disengage from evaluative processing. Bounded rationality (Simon, 1972) explains why users may choose not to verify every output, even when they can, especially under time pressure or when a “good enough” response appears sufficient. Dual-process theory (Groves and Thompson, 1970) helps interpret the shift between relatively intuitive acceptance and more effortful analytical checking. Taken together, these perspectives suggest that cognitive AI safety depends not only on what users know about AI, but also on how they allocate attention, effort and trust under specific task conditions. The study was guided by the following research questions:
RQ1. What dimensions constitute cognitive AI safety in the context of generative AI use, and can these dimensions be measured reliably using a validated scale?
RQ2. To what extent does AI literacy predict cognitive AI safety among higher-education users of generative AI?
(Null hypothesis). AI literacy and cognitive AI safety are not significantly related.
(Over-reliance hypothesis). Higher AI literacy predicts lower cognitive AI safety due to less verification and more reliance on AI outputs.
RQ3. How do users with high levels of AI literacy explain their decisions to accept generative AI outputs without verification?
RQ4. What factors trigger users to stop trusting generative AI outputs and initiate verification during real-world academic tasks?
The conceptual framework guiding this study positions AI literacy as a key predictor of cognitive AI safety, defined as a multidimensional construct comprising epistemic vigilance, cognitive agency and failure resilience. The framework first assumes that cognitive AI safety can be measured as a reliable psychological capability (RQ1). It then tests whether AI literacy, as measured by technical proficiency and conceptual knowledge, predicts this capability (RQ2), while allowing for competing theoretical expectations regarding over-reliance. Recognising that statistical relationships alone cannot fully explain user behaviour, the framework also incorporates a qualitative component to examine how highly literate users decide when to accept AI outputs without verification (RQ3) and what triggers them to shift from trust to scepticism during real academic tasks (RQ4). In this way, cognitive AI safety is conceptualised not as a constant behaviour but as a capacity enacted selectively in response to context, perceived risk and task demands.
Methodology
This study used an explanatory sequential mixed-methods design (QUAN → QUAL) (Ivankova et al., 2006) to examine the relationship between AI literacy and Cognitive AI Safety in higher education. This was appropriate, as the link between technical competence and safe cognitive behaviour is complex and cannot be understood with a single method. The quantitative phase tested predictive relationships, while the qualitative phase explained how and why these patterns appeared. Integration occurred at the interpretation stage, using qualitative insights to provide context to quantitative findings, especially regarding the relationship between AI literacy and Cognitive AI Safety.
Quantitative study
The quantitative phase was conducted across four higher education institutions in Indonesia and Malaysia, two countries experiencing rapid digital transformation and increasing use of generative AI in academic contexts. These countries were selected for both their proactive national policies promoting digital technology in education and their diverse yet comparable tertiary education systems. The chosen institutions represented a mix of major public research universities and leading private universities, ensuring coverage of a broad spectrum of academic settings and student populations within each country. This selection provided contextual relevance aligned with the research objectives while allowing for comparison across similar but distinct educational environments.
In the first round of data collection, a stratified random sampling approach was used to ensure representation across disciplines (STEM and social sciences), gender and academic level. A total of 311 valid responses were retained for analysis (Indonesia: n = 192; Malaysia: n = 119). This sample size exceeded common minimum requirements for factor analysis and hierarchical regression and provided adequate statistical power.
To strengthen the psychometric evaluation of the cognitive AI safety scale (CAISS), a second data collection round was conducted with an independent sample (N = 241). The first sample (N = 311) was used for exploratory factor analysis and hypothesis testing, whereas the second sample (N = 241) was used for confirmatory factor analysis (CFA) to examine the stability of the three-factor structure.
The survey instrument consisted of two main components and was administered bilingually. AI literacy was measured using a validated scale adapted from Wang et al. (2023) that captures two dimensions: technical AI proficiency (e.g. operational and prompt-related skills) and conceptual AI knowledge (e.g. understanding of how AI systems generate outputs). Both dimensions demonstrated acceptable internal consistency in the present study.
Because no existing instrument directly measures user-side cognitive safety in AI interaction, this study developed the CAISS using established scale development procedures (DeVellis and Thorpe, 2021). Its construction was guided by established principles of measurement development in educational and psychological research, including deductive item generation, expert review, pilot testing and separate exploratory and confirmatory analyses across independent samples. The scale was designed to capture three theoretically grounded dimensions. These three dimensions were selected because they represent the theoretically central components of user-side cognitive safety in GenAI-supported academic work in this initial operationalisation; they are not claimed to be the only possible dimensions of the construct.
Epistemic vigilance (verification and scepticism towards AI outputs).
Epistemic vigilance refers to the tendency to question, verify and evaluate information before accepting it as true. Research in cognitive and social psychology suggests that people typically use cues such as consistency, source credibility and plausibility to guard against misinformation (Sperber et al., 2010). However, generative AI systems often present information with fluency and confidence, which can weaken these natural safeguards and increase the risk of automation bias (Parasuraman and Manzey, 2010). In educational settings, epistemic vigilance is therefore essential to reducing uncritical acceptance of AI-generated content. This is particularly important when AI outputs appear authoritative but may contain subtle errors or hallucinations (Floridi et al., 2025), linking vigilance to users’ broader ability to manage interactions with AI tools and maintain cognitive agency.
Cognitive agency (resistance to unreflective cognitive offloading).
Cognitive agency refers to users’ active control over their thinking processes rather than delegating them entirely to external tools. Research on cognitive offloading shows that, while tools can reduce mental effort, excessive reliance may weaken engagement with underlying reasoning and problem-solving processes (Risko and Gilbert, 2016). In AI-supported learning, unreflective offloading may lead students to accept AI-generated ideas without developing their own understanding. Maintaining cognitive agency through practices such as drafting independently, questioning AI suggestions or using AI for critique rather than generation may support deeper learning and align with concerns raised in cognitive load and educational psychology research (Sweller, 1994).
Failure resilience (confidence in human judgement when AI fails).
Failure resilience refers to users’ confidence in their own judgement when AI systems produce errors or are unavailable. Research in human–automation interaction indicates that over-reliance on automated systems can reduce users’ confidence and ability to act independently when automation fails (Endsley, 2016). In contrast, users with greater resilience maintain trust in their expertise, treat AI as support rather than authority and recover more effectively from system errors. In educational contexts, this resilience supports learner autonomy and reduces dependence on AI systems as epistemic authorities (Lee and See, 2004).
Items were generated from work in cognitive psychology, epistemic vigilance theory and research on automation bias and were subsequently refined through expert review and pilot testing. Quantitative analyses were conducted using IBM SPSS Statistics (Version 29). Firstly, exploratory factor analysis (EFA) with principal axis factoring and Promax rotation was used to examine the factorial structure of the CAISS. Sampling adequacy was assessed using the Kaiser–Meyer–Olkin (KMO) measure and Bartlett’s test of sphericity. Secondly, descriptive statistics and independent-samples t-tests were conducted to examine differences across countries and disciplines. Thirdly, hierarchical multiple regression was used to address RQ2. Demographic variables (age, gender and discipline) were entered as controls in Step 1, followed by the AI literacy dimensions in Step 2. This approach allowed the unique contribution of AI literacy to Cognitive AI Safety to be examined while controlling for demographic factors. Collinearity diagnostics were also checked to assess statistical robustness.
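The two-step logic of the hierarchical regression can be illustrated with a minimal sketch. It uses synthetic data and a small hand-written least-squares routine rather than SPSS; every variable name, coefficient and sample value below is hypothetical and is not taken from the study's data. The point is only to show how the change in R² between Step 1 (controls) and Step 2 (controls plus AI literacy) isolates the contribution of the literacy dimensions.

```python
import random


def ols_r2(X, y):
    """Return R^2 of an ordinary least-squares fit of y on X (intercept added)."""
    n = len(y)
    A = [[1.0] + list(row) for row in X]  # design matrix with intercept column
    k = len(A[0])
    # Augmented normal equations [A'A | A'y]
    M = []
    for i in range(k):
        row = [sum(A[r][i] * A[r][j] for r in range(n)) for j in range(k)]
        row.append(sum(A[r][i] * y[r] for r in range(n)))
        M.append(row)
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(k):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    beta = [M[i][k] / M[i][i] for i in range(k)]
    fitted = [sum(b * a for b, a in zip(beta, row)) for row in A]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Synthetic, purely illustrative data (hypothetical effect sizes).
rng = random.Random(0)
controls, literacy, safety = [], [], []
for _ in range(200):
    age = rng.gauss(21, 2)
    gender = rng.choice([0, 1])
    discipline = rng.choice([0, 1])
    technical = rng.gauss(0, 1)   # technical AI proficiency (standardised)
    conceptual = rng.gauss(0, 1)  # conceptual AI knowledge (standardised)
    controls.append([age, gender, discipline])
    literacy.append([technical, conceptual])
    safety.append(0.3 * technical + 0.2 * conceptual + rng.gauss(0, 1))

r2_step1 = ols_r2(controls, safety)                                      # Step 1: controls only
r2_step2 = ols_r2([c + l for c, l in zip(controls, literacy)], safety)   # Step 2: add literacy
delta_r2 = r2_step2 - r2_step1
print(f"Step 1 R2 = {r2_step1:.3f}, Step 2 R2 = {r2_step2:.3f}, delta R2 = {delta_r2:.3f}")
```

Because the Step 2 model nests the Step 1 model, R² cannot decrease between steps; the size of the increase is what indicates the unique contribution of the added predictors.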
The independent second sample (N = 241) was used in a CFA in IBM SPSS AMOS to test the stability of the three-factor structure identified in the exploratory phase. Because the study relied on self-reported survey data, procedural steps were taken to reduce common method bias (CMB): anonymous participation, the use of established measures when available and clear separation of construct wording across instrument sections.
Qualitative study
The qualitative phase explored how highly AI-literate users decided when to accept, delay or verify AI-generated outputs during academic tasks. Participants were purposively selected from the quantitative sample based on relatively high levels of AI literacy. The study aimed to understand decision-making processes among users who are prepared to recognise when AI use requires scrutiny. Rather than comparing high- and low-literacy users, the research examined how users well-equipped to use AI safely described selective verification, delayed checking and episodic trust in real academic contexts.
The interviews used a cognitive audit approach to examine decision-making during AI-supported academic work. Rather than focusing only on general views about AI, the interviews probed concrete episodes in which participants accepted, questioned or checked AI-generated outputs. Participants were asked what kinds of tasks they used AI for, when checking was necessary and what cues prompted a shift from routine acceptance to more active verification.
Qualitative data were analysed using a reflexive thematic approach (Braun and Clarke, 2019). Analysis involved repeated reading of the transcripts, initial memo-writing and iterative coding. A provisional coding framework was developed from the study’s theoretical concerns while remaining open to inductive themes emerging from participants’ accounts. Codes were refined through constant comparison and broader themes were developed by grouping related codes into higher-order patterns. Analytic memos were used throughout to document interpretive decisions and maintain reflexive awareness. Given the focused explanatory purpose of this phase, the sample of 12 participants was considered sufficient for thematic depth rather than demonstrating saturation in a universal sense.
Findings
Psychometric validation of the cognitive AI safety scale (RQ1)
To address RQ1, the structural validity of the newly developed CAISS was examined using an exploratory factor analytic approach. The data met all assumptions for factorability. The KMO measure of sampling adequacy was 0.767, exceeding the recommended minimum of 0.60. Bartlett’s test of sphericity was statistically significant (χ² = 4768.454, df = 66, p < 0.001), indicating that item correlations were sufficient for factor analysis.
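For readers checking the reported degrees of freedom, the standard form of Bartlett's test statistic is given below, where n is the sample size, p the number of items and R the item correlation matrix. These symbols are notation introduced here for illustration and do not appear in the study. With the 12 CAISS items, the formula reproduces the reported df of 66.

```latex
\chi^2 = -\left[(n - 1) - \frac{2p + 5}{6}\right]\ln\lvert R\rvert,
\qquad
df = \frac{p(p-1)}{2} = \frac{12 \times 11}{2} = 66
```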
Factor extraction based on eigenvalues greater than 1.0 revealed a clear three-factor solution, consistent with the theoretical conceptualisation of cognitive AI safety. Together, the three factors accounted for 87.99% of the total variance, suggesting a clear structure; however, this relatively high proportion of explained variance should be interpreted cautiously, given the possibility of item overlap or shared-method variance. Items showed high primary loadings on their intended factors and negligible cross-loadings, supporting factorial clarity.
The three factors were interpreted as follows. Failure resilience (Items 9–12) accounted for 23.92% of the variance (eigenvalue = 2.870) and captured users’ confidence in human judgement and their ability to function effectively when AI systems fail. Cognitive agency (Items 1–4) accounted for 29.32% of the variance (eigenvalue = 3.518) and reflected resistance to unreflective cognitive offloading and the deliberate maintenance of independent thinking. Epistemic vigilance (Items 5–8) accounted for 34.75% of the variance (eigenvalue = 4.170) and represented active verification behaviours and scepticism towards fluent AI-generated outputs.
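The reported percentages can be reproduced from the eigenvalues. The short sketch below assumes that each percentage is the factor's eigenvalue divided by the number of items in the analysis (12); the variable names are illustrative only.

```python
N_ITEMS = 12  # items entered into the EFA

# Eigenvalues as reported for the three factors
eigenvalues = {
    "Failure resilience": 2.870,
    "Cognitive agency": 3.518,
    "Epistemic vigilance": 4.170,
}

# Share of total variance per factor, in percent, rounded to two decimals
pct_variance = {name: round(100 * ev / N_ITEMS, 2) for name, ev in eigenvalues.items()}
total_pct = round(sum(pct_variance.values()), 2)

for name, pct in pct_variance.items():
    print(f"{name}: {pct:.2f}%")
print(f"Total: {total_pct:.2f}%")
```

Under this assumption the three shares come out at 23.92%, 29.32% and 34.75%, and their sum matches the reported total of 87.99%.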
All items demonstrated high communalities (0.842–0.925), indicating that a substantial proportion of variance was explained by the factor structure (see Table 1). Inter-factor correlations were modest (r < 0.20), suggesting that the dimensions were related but distinct. Internal consistency was strong across all three subscales (Cronbach’s α = 0.82–0.88). These findings provide initial support for the CAISS’s multidimensional structure and internal reliability, although further longitudinal and cross-context validation remains necessary.
Table 1. Factor loadings and reliability for the cognitive AI safety scale (CAISS)
| Item code | Item description | Failure resilience | Cognitive agency | Epistemic vigilance | Communalities |
|---|---|---|---|---|---|
| Item 12 | Assume problems are solvable by humans | 0.960 | | | 0.925 |
| Item 11 | Confidence in human intelligence | 0.945 | | | 0.898 |
| Item 10 | Trust domain knowledge to correct AI | 0.933 | | | 0.863 |
| Item 9 | No anxiety when AI is unavailable | 0.918 | | | 0.843 |
| Item 3 | Use AI to critique, not generate | | 0.949 | | 0.905 |
| Item 1 | Force manual first drafts | | 0.943 | | 0.888 |
| Item 4 | Avoid AI for simple tasks | | 0.937 | | 0.879 |
| Item 2 | Feel “lazy” if accepting blindly | | 0.932 | | 0.867 |
| Item 7 | Suspect logical errors in fluent text | | | 0.941 | 0.896 |
| Item 5 | Verify facts even if professional | | | 0.940 | 0.889 |
| Item 6 | Verify citations before use | | | 0.936 | 0.863 |
| Item 8 | Double-check if AI confirms opinion | | | 0.918 | 0.842 |
| Eigenvalue | | 2.870 | 3.518 | 4.170 | |
| % Variance | | 23.92% | 29.32% | 34.75% | |
| α | | 0.82 | 0.85 | 0.88 | |
Rotation method: Promax with Kaiser normalisation. Total variance explained = 87.99%. N = 311.
A CFA was then conducted in AMOS using an independent sample (N = 241) to test the stability of this three-factor structure. The initial 12-item model showed inadequate fit: χ²/df = 11.619, comparative fit index (CFI) = 0.862, Tucker-Lewis index (TLI) = 0.821, incremental fit index (IFI) = 0.862 and root mean square error of approximation (RMSEA) = 0.210. Refinement was therefore undertaken based on modification indices and theoretical considerations. Item 8 was removed because of poor performance, and a correlated error term was specified between Items 1 and 2 because of their shared residual variance and conceptual similarity (Yu et al., 2025; Yockey and Kralowec, 2015). The revised 11-item, three-factor measurement model was retained and showed acceptable fit: χ²/df = 2.447, CFI = 0.983, TLI = 0.976, IFI = 0.983 and RMSEA = 0.078.
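The reported RMSEA values can be cross-checked against the corresponding χ²/df ratios. The sketch below assumes the common point-estimate formula RMSEA = sqrt(max(χ²/df − 1, 0) / (N − 1)) with N = 241; the function name is illustrative, and the exact computation in AMOS may differ in detail. The last two ratios are those reported for the single-factor and ULMC models in the common method bias analysis.

```python
import math


def rmsea_from_ratio(chi2_over_df: float, n: int) -> float:
    """Point estimate of RMSEA from the chi-square/df ratio and sample size n."""
    return math.sqrt(max(chi2_over_df - 1.0, 0.0) / (n - 1))


N_CFA = 241  # independent CFA sample

reported_ratios = {
    "initial 12-item model": 11.619,
    "revised 11-item model": 2.447,
    "single-factor model": 50.671,
    "ULMC model": 1.951,
}

for label, ratio in reported_ratios.items():
    print(f"{label}: RMSEA = {rmsea_from_ratio(ratio, N_CFA):.3f}")
```

Under this assumption the four ratios give 0.210, 0.078, 0.455 and 0.063, in line with the values reported in the text.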
The revised model showed satisfactory reliability and validity (Table 2). Cronbach’s α ranged from 0.944 to 0.967 and composite reliability (CR) ranged from 0.946 to 0.968, exceeding the recommended threshold of 0.70 (Bagozzi and Yi, 1988). Standardised factor loadings ranged from 0.816 to 0.997 and average variance extracted (AVE) ranged from 0.814 to 0.883, exceeding the recommended threshold of 0.50 (Fornell and Larcker, 1981). Discriminant validity was also supported. The square roots of the AVE exceeded the corresponding inter-construct correlations and the Heterotrait-Monotrait ratio (HTMT) values ranged from 0.034 to 0.229, all below the recommended cut-off (Henseler et al., 2014) (Table 3).
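As an illustration of how CR and AVE are obtained from standardised loadings, the sketch below applies the usual Fornell and Larcker formulas. Item-level loadings are not reported individually in this section, so the loadings used here are hypothetical placeholders and the function names are illustrative; the output is not a reproduction of the study's values.

```python
def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    sum_sq = sum(loadings) ** 2
    error_var = sum(1.0 - l ** 2 for l in loadings)  # error variance per standardised item
    return sum_sq / (sum_sq + error_var)


def average_variance_extracted(loadings):
    """AVE = mean of the squared standardised loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)


# Hypothetical standardised loadings for a three-item construct (not study data)
hypothetical_loadings = [0.9, 0.8, 0.7]

cr = composite_reliability(hypothetical_loadings)
ave = average_variance_extracted(hypothetical_loadings)
print(f"CR = {cr:.3f} (threshold 0.70), AVE = {ave:.3f} (threshold 0.50)")
```

Both indices rise as loadings approach 1, which is consistent with the high CR and AVE values reported for constructs whose loadings all exceed 0.80.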
Table 2. Reliability and validity
| Construct | Mean | SD | Cronbach’s α | CR | AVE | Std. loading range |
|---|---|---|---|---|---|---|
| Cognitive agency (CA) | 4.266 | 0.682 | 0.956 | 0.946 | 0.814 | 0.816–0.982 |
| Epistemic vigilance (EV) | 3.148 | 0.737 | 0.944 | 0.949 | 0.862 | 0.831–0.997 |
| Failure resilience (FR) | 3.990 | 0.356 | 0.967 | 0.968 | 0.883 | 0.918–0.968 |
Table 3. HTMT ratios and Fornell–Larcker analyses
| Factors | HTMT: CA | HTMT: EV | HTMT: FR | Fornell–Larcker: CA | Fornell–Larcker: EV | Fornell–Larcker: FR |
|---|---|---|---|---|---|---|
| CA | | 0.036 | 0.034 | 0.902 | 0.006 | −0.052 |
| EV | 0.036 | | 0.229 | 0.006 | 0.928 | 0.239 |
| FR | 0.034 | 0.229 | | −0.052 | 0.239 | 0.940 |
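The Fornell–Larcker criterion can be verified directly from the reported AVE values and inter-construct correlations. The sketch below is a minimal check under that criterion; the dictionary and function names are illustrative.

```python
import math

# AVE per construct and inter-construct correlations, as reported
ave = {"CA": 0.814, "EV": 0.862, "FR": 0.883}
correlations = {("CA", "EV"): 0.006, ("CA", "FR"): -0.052, ("EV", "FR"): 0.239}

# Square roots of the AVE (the diagonal of a Fornell-Larcker matrix)
sqrt_ave = {name: math.sqrt(value) for name, value in ave.items()}


def fornell_larcker_ok(sqrt_ave, correlations):
    """True if every |correlation| is below the sqrt(AVE) of both constructs involved."""
    return all(
        abs(r) < min(sqrt_ave[a], sqrt_ave[b])
        for (a, b), r in correlations.items()
    )


for name, value in sqrt_ave.items():
    print(f"sqrt(AVE) {name} = {value:.3f}")
print("Criterion satisfied:", fornell_larcker_ok(sqrt_ave, correlations))
```

The square roots come out at 0.902, 0.928 and 0.940, matching the diagonal Fornell–Larcker entries, and each exceeds every correlation involving that construct.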
CMB was assessed using Harman’s single-factor test, CFA model comparisons and an unmeasured latent method construct (ULMC) approach (Richardson et al., 2009). Harman’s single-factor test showed that the first factor accounted for 36.49% of the total variance, below the commonly used 40% threshold. The single-factor CFA model fit the data poorly, χ2/df = 50.671, RMSEA = 0.455, CFI = 0.364, TLI = 0.186 and IFI = 0.366, whereas the revised three-factor measurement model showed substantially better fit. The ULMC model also fit the data well, χ2/df = 1.951, RMSEA = 0.063, CFI = 0.992, TLI = 0.984, IFI = 0.992, goodness-of-fit index = 0.960 and adjusted goodness-of-fit index = 0.909, but did not materially improve on the revised three-factor model (Figure 1). Taken together, these results suggest that CMB is unlikely to materially affect the findings. These findings provide support for the revised three-factor structure of cognitive AI safety and indicate that the CAISS has satisfactory psychometric properties in this independent validation sample. The EFA supported the theorised three-dimensional structure, while the independent CFA supported a revised 11-item version of the CAISS.
The structural equation model displays relationships among three latent variables labelled CA, EV and FR, represented by ellipses connected to multiple observed indicators labelled CAISS1 through CAISS12. Rectangular boxes represent the observed variables, and circular nodes labelled e1 through e12 represent error terms associated with each indicator. Factor loadings are shown along the directional arrows connecting latent variables to observed indicators, with values ranging from 0.82 to 1.00. Curved double-headed arrows between the latent variables indicate correlations of −0.05, 0.01 and 0.24. The layout groups indicators CAISS1 to CAISS4 under CA, CAISS5 to CAISS7 under EV and CAISS9 to CAISS12 under FR.
CFA measurement model
Source: Authors’ own work
Impact of AI literacy on cognitive safety (RQ2)
To address RQ2, a hierarchical multiple regression analysis was conducted to examine whether AI literacy predicts cognitive AI safety after controlling for demographic factors. Age, gender and field of study were entered in Step 1, followed by the two dimensions of AI literacy, namely, technical AI proficiency and conceptual AI knowledge, in Step 2.
In Step 1, demographic variables did not significantly predict cognitive AI safety. They accounted for only 1.2% of the variance [R2 = 0.012, F(3, 307) = 1.281, p = 0.281]. The introduction of AI literacy variables in Step 2 led to a significant increase in explained variance (ΔR2 = 0.132, F change = 23.443, p < 0.001). The final model was significant, F(5, 305) = 10.258, p < 0.001, accounting for 14.4% of the total variance in cognitive AI safety (R2 = 0.144). Although the model was statistically significant, its explanatory power was modest. This indicates that AI literacy accounts for an important but limited portion of variation in cognitive AI safety.
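The F statistics above follow from the R2 values and degrees of freedom. As a hedged sketch, the code below applies the standard F test for an R2 increment, F = (ΔR2 / k_added) / ((1 − R2_full) / (N − k_full − 1)), to the rounded values reported in the text; the function name `f_change` is hypothetical, and because the inputs are rounded to three decimals the results only approximate the reported statistics.

```python
def f_change(r2_reduced, r2_full, k_added, n, k_full):
    """F statistic for the R^2 increase when k_added predictors are added.

    df1 = k_added, df2 = n - k_full - 1 (residual df of the full model).
    """
    df_resid = n - k_full - 1
    return ((r2_full - r2_reduced) / k_added) / ((1.0 - r2_full) / df_resid)


# Rounded values from the text: N = 311, three controls, then two AI literacy predictors.
f_step2 = f_change(0.012, 0.144, k_added=2, n=311, k_full=5)  # F change at Step 2
f_model = f_change(0.0, 0.144, k_added=5, n=311, k_full=5)    # overall F, final model
print(round(f_step2, 2), round(f_model, 2))
```

With these rounded inputs, the sketch gives roughly 23.5 for the Step 2 F change and 10.3 for the final model, close to the reported 23.443 and 10.258; the small gaps are what rounding of R2 would be expected to produce.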
After accounting for AI literacy, gender was a statistically significant predictor of cognitive AI safety (β = 0.117, p = 0.031), though its effect was small. As gender was not a theorised variable and no interactions were tested, this significant result is highlighted for transparency but not interpreted in detail.
Analysis of individual predictors showed a proficiency–protection effect. Technical AI proficiency was the strongest positive predictor (β = 0.241, t = 3.054, p = 0.002), with conceptual AI knowledge also significant (β = 0.174, t = 2.362, p = 0.019). Thus, higher AI literacy is associated with stronger epistemic vigilance, cognitive agency and resilience to generative AI. Of the control variables, age and field of study were not significant, whereas gender became significant in the final model (β = 0.117, p = 0.031); as noted above, this is reported descriptively because interaction effects were not tested. Collinearity diagnostics indicated no problematic multicollinearity, with variance inflation factor (VIF) values ranging from 1.005 to 2.216, below conservative thresholds (see Table 4). In summary, the results show a positive association between AI literacy and cognitive AI safety, leading to the rejection of H1, while the hypothesised negative relationship in H2 was not supported.
Hierarchical regression analysis predicting cognitive AI safety (N = 311)
| Predictor | Model 1 (β) | Model 2 (β) | t | p | VIF |
|---|---|---|---|---|---|
| Step 1: Controls | | | | | |
| Age | −0.008 | 0.000 | 0.009 | 0.993 | 1.005 |
| Gender | 0.099 | 0.117 | 2.172 | 0.031* | 1.026 |
| Field of study | −0.066 | 0.049 | 0.841 | 0.401 | 1.224 |
| Step 2: AI literacy | | | | | |
| Technical AI proficiency | | 0.241 | 3.054 | 0.002* | 2.216 |
| Conceptual AI knowledge | | 0.174 | 2.362 | 0.019* | 1.944 |
| R2 | 0.012 | 0.144 | | | |
| ΔR2 | | 0.132 | | | |
| Sig. F change | | <0.001 | | | |
Dependent variable: CAISS_Total. Standardised beta coefficients (β) are reported
*Significant at p < 0.05
Although the quantitative findings demonstrate a positive association between AI literacy and cognitive AI safety, these results do not explain how cognitive safety is enacted during actual interactions with generative AI. Statistical relationships indicate that literacy matters, but not when, why or under what conditions users choose to verify, question or rely on AI outputs. Prior research suggests that safety-related behaviour is often situational and influenced by task demands, perceived stakes and cognitive effort. Therefore, the qualitative phase was designed to examine the decision-making processes underlying AI use among high-literacy users, focusing on how they rationalise accepting AI outputs without verification (RQ3) and what specific cues trigger a shift from trust to scepticism (RQ4). This approach provides a more detailed understanding of cognitive AI safety as a context-dependent capability rather than a constant behaviour.
Qualitative findings: conditional enactment of cognitive AI safety (RQ3 and RQ4)
The qualitative phase examined how highly AI-literate users explained when and why they accepted, delayed or verified AI-generated outputs during academic tasks. The quantitative results showed that AI literacy was positively associated with cognitive AI safety, but interviews revealed that participants did not enact this safety consistently. Participants described verification as selective, influenced by context, task requirements and perceived outcomes. Two broad processes were identified:
reasons for delaying or reducing verification (RQ3); and
conditions that prompted active checking (RQ4).
RQ3: Why highly AI-literate users do not always verify AI outputs
Participants did not describe skipping verification as simple carelessness. Rather, they appeared to make deliberate decisions about when checking was necessary and when it could be postponed. Three related themes were identified.
Checking depended on how important the task seemed.
Participants approached AI-supported tasks with different levels of caution. They were more likely to accept AI-generated output without checking it closely for low-stakes tasks such as brainstorming or early drafting. Because mistakes could be fixed later, they were deemed acceptable in these circumstances. “I don’t feel the need to double-check everything if it is only an early draft or just ideas. I can always correct it afterwards.” (S3) This suggests that verification was often delayed rather than ignored. Based on the task’s importance, participants seemed to determine whether immediate checking was worthwhile.
AI literacy sometimes increased confidence in delayed checking.
A second pattern showed that higher AI literacy sometimes gave participants the confidence to spot errors later. This made them more comfortable working quickly at first, as they believed they knew how AI works and where it might fail. “I trust it enough to move fast, but I also trust myself to catch mistakes if they show up.” (S6) In this way, AI literacy acted as a source of confidence, but this did not always lead to immediate scrutiny; instead, it sometimes led participants to delay a thorough review until later.
Verification was weighed against time and mental effort.
Participants often described checking as a practical decision that involved efficiency, time and effort. Some questioned whether using AI was worthwhile if it took as much work to verify the output as it did to complete the task on their own. “Verifying everything is not always necessary. The purpose of the tool is to save time.” (S8) This implies that participants used mental effort sparingly and viewed it as a finite resource. As a result, verification was influenced by both knowledge and the perceived value of taking the time to check.
Taken together, these themes suggest that highly AI-literate users did not reduce verification because they lacked awareness. Rather, they appeared to prioritise their checking strategically. AI literacy gave them the capacity to evaluate outputs, but that capacity was not applied uniformly across situations.
RQ4: What prompts users to begin checking AI outputs
Participants described routinely accepting AI output but identified circumstances that led them to adopt a more cautious, critical stance. Three primary triggers emerged.
Familiarity with the topic increased doubt.
The most common trigger was background knowledge. Participants were much more likely to notice problems when the AI produced content in an area they already knew well. In those cases, even a small error or an unusual phrasing could prompt immediate doubt. As one participant put it, “When it touched something I really know well, I stopped trusting it straight away.” (S9) This suggests that domain knowledge plays an important role in enabling cognitive AI safety. When users have enough prior understanding of the topic, they are better able to detect possible inaccuracies.
Overly confident answers created suspicion.
Participants also became suspicious when the AI provided responses that seemed overly neat or highly confident in areas they considered complex or uncertain. For some, this fluency was not reassuring. Instead, it raised concern that the response might be overly simplified or unreliable. “If it shows a very clean way to solve a messy problem, alarms go off.” (S12) This suggests that trust was not always improved by fluent language. In some cases, highly polished output itself prompted closer examination, especially when the subject matter was expected to be ambiguous or contentious.
Accountability increased verification.
Participants were more likely to verify outputs when they felt personally accountable for the final result. This was especially clear in high-stakes contexts such as graded work or publication. “If my name is on it, I will verify everything” (S1). This suggests that verification behaviour was strongly shaped by accountability. Participants were more inclined to shift from routine acceptance of AI output to active verification when the consequences of error were greater.
In general, the qualitative findings indicate that cognitive AI safety was implemented as a capability that could be activated under specific circumstances, rather than as a fixed behaviour that was consistently present. Participants reported alternating between routine acceptance and active checking, contingent on the topic, the AI response and the degree of personal responsibility. This helps explain why high AI literacy may not always translate into consistently safe use: even highly proficient users appeared to apply scrutiny selectively depending on the situation. The qualitative themes and the evidence that substantiates them are summarised in Table 5.
Summary of qualitative themes and illustrative evidence
| Research question | Theme | Description | Illustrative quote | Interpretation |
|---|---|---|---|---|
| RQ3 | Task importance shaped checking | Participants checked less in low-stakes tasks such as brainstorming or drafting | “If it’s just an early draft or ideas, I don’t feel the need to check everything.” (S3) | Verification was often delayed rather than ignored |
| RQ3 | Confidence supported delayed checking | High AI literacy increased confidence in noticing errors later | “I trust it enough to move fast, but I also trust myself to catch mistakes if they show up.” (S6) | Literacy supported confidence, but not always immediate scrutiny |
| RQ3 | Effort–time trade-off | Participants weighed the value of checking against the effort required | “Checking everything defeats the purpose.” (S8) | Verification was shaped by efficiency considerations |
| RQ4 | Topic familiarity triggered doubt | Domain knowledge helped participants detect problems quickly | “When it touched something I really know well, I stopped trusting it straight away.” (S9) | Background knowledge supported critical evaluation |
| RQ4 | Overconfident fluency triggered caution | Very polished answers to complex issues raised suspicion | “If it gives a very clean answer to a messy problem, that’s when alarms go off.” (S12) | Fluency sometimes prompted scrutiny rather than trust |
| RQ4 | Accountability triggered verification | High-stakes tasks increased checking behaviour | “If my name is on it, I check everything.” (S1) | Personal responsibility influenced active verification |
Discussion
The quantitative findings suggested a positive association between AI literacy and cognitive AI safety. At the same time, the modest effect size indicates that this relationship is only part of a broader pattern shaped by task demands, motivation, accountability and situational factors. This quantitative pattern is also broadly consistent with emerging evidence that AI literacy may support adjacent academic capabilities. For example, Bui et al. (2026), in a study of university students, reported a statistically significant positive relationship between AI literacy and research skills, with AI application emerging as the strongest predictor. Although research skills are not identical to cognitive AI safety, their convergence suggests that AI literacy may serve as a supportive resource for higher-order academic capabilities rather than simply increasing dependence on automated systems.
Building on these quantitative insights, the qualitative findings suggested a more complex pattern. While technical competence appeared to enable vigilance, it did not lead to continuous verification in practice. Highly AI-literate users often described delaying or reducing their checking because they believed they could identify problems later if necessary. In this sense, AI literacy sometimes served as a source of confidence, making selective offloading feel acceptable. This pattern suggests that AI literacy does not function as a constant safeguard. Rather, it appears to be a resource that users deploy selectively depending on the perceived stakes, the effort required and the likelihood of error, and one they may draw on more in high-stakes situations where the consequences of mistakes are significant.
Strategic suspension of verification
The findings suggest that technical and conceptual AI knowledge may support safer use, but do not ensure that users will continuously verify outputs. Highly AI-literate participants often described using their expertise to justify delayed checking, particularly when the task was low stakes or when immediate speed was prioritised. In these situations, participants appeared willing to accept a “good enough” response rather than invest effort in immediate verification. This pattern is consistent with bounded rationality (Simon, 1972), in which individuals make decisions that are satisfactory for the immediate task rather than optimising for complete accuracy or full verification.
In educational contexts, this suggests that greater AI literacy may sometimes make selective offloading feel more acceptable. Participants often implied that, because they understood AI’s limitations, they would be able to detect errors later if needed. While this confidence may help users work efficiently, it may also reduce immediate engagement with the reasoning process itself. Three related implications follow from this pattern. Firstly, when AI makes tasks easier, users might not think as deeply about the reasoning behind learning. Secondly, the apparent efficiency of AI may encourage students to prioritise speed and polished output over the effortful processes that often support deeper understanding. Thirdly, the value of AI may increasingly be judged by convenience, even when that convenience reduces opportunities for active verification.
These findings do not imply that efficiency is undesirable. Rather, the educational value of AI depends on whether convenience supports learning or replaces the cognitive work needed for learning. Future research can examine this directly by measuring perceived workload, time allocation and error-detection performance during AI-supported tasks, using tools such as the NASA Task Load Index (Hart, 2006).
Triggers of reactive verification
The qualitative findings suggest that even highly AI-literate users did not continuously monitor AI outputs. Instead, verification tended to be activated reactively when particular cues made the risk of error more salient. This pattern helps explain why cognitive AI safety was not enacted as a constant form of scrutiny but as a selective response shaped by context.
Recent evidence from generative AI research also supports the view that awareness does not translate into a single or stable behavioural response. Dahabiyeh et al. (2026), in a study of ChatGPT privacy awareness, found that different dimensions of awareness were associated with continued use in different ways and could either heighten or reduce perceived risk depending on what users were attending to. Although that study focused on privacy rather than cognitive safety, it supports a broader point relevant here: awareness may be necessary for safer engagement, but it does not guarantee sustained vigilance unless specific risks become salient.
In the present study, participants described several such triggers. Verification was more likely when they had strong background knowledge of the topic, when the AI produced an answer that seemed overly confident about a complex issue or when the task involved clear personal accountability, such as graded work or publication. These patterns are consistent with dual-process theory (Groves and Thompson, 1970), which distinguishes between relatively rapid, intuitive acceptance and slower, more analytical checking. In routine or low-stakes situations, participants often appeared to remain in a more efficient, low-monitoring mode. More effortful checking was usually triggered only when a cue signalled that relying on the output might be risky.
This interpretation is also broadly consistent with prior work on epistemic vigilance (Sperber et al., 2010), which argues that people rely on cues such as inconsistency, uncertainty or implausibility when evaluating information. It also aligns with research on trust calibration, which suggests that safe use of automated systems depends on adjusting reliance to the actual performance and uncertainty of the system rather than simply trusting or distrusting it in general (Balfe et al., 2018; Holland et al., 2024; Lebiere et al., 2021; Wischnewski et al., 2023; Lee and See, 2004).
A particularly important finding was the role of domain knowledge. Participants who were already familiar with a topic appeared better able to detect possible inaccuracies or oversimplifications in AI-generated responses. This is consistent with Endsley (2016), who argued that users without sufficient background knowledge may lack the reference points needed to recognise subtle errors in automated systems. In this sense, safe AI use does not seem to depend on general distrust of the technology, but rather on having sufficient knowledge to judge when its outputs are plausible, useful or in need of closer scrutiny.
Effort allocation and verification in AI-supported learning
From the perspective of cognitive load theory, AI-supported environments may alter how students allocate effort during academic work. When generative AI provides fluent, ready-made responses with little delay, users may invest less effort in the reasoning processes that would otherwise support deeper engagement with the task. This interpretation aligns with Sweller (1994), who contended that learning may be diminished when technology reduces productive cognitive engagement rather than merely alleviating unnecessary difficulty.
The present findings suggest that many students treated mental effort as a limited resource and made selective decisions about when to verify, weighing the cost against the benefit. Under conditions of time pressure or heavy workloads, speed and convenience are often prioritised over immediate verification. This pattern is also consistent with research on cognitive offloading, which indicates that people are willing to delegate mental work to external tools when doing so appears efficient and sufficiently low risk (Risko and Gilbert, 2016). In this study, even highly AI-literate participants often described fact-checking as a cost–benefit decision, choosing not to verify when the effort required seemed disproportionate to the perceived value of being certain.
At the same time, accountability appeared to change this calculation. Participants reported investing more effort in checking when the task had visible consequences, such as a graded assignment, publication or professional report. This suggests that verification was shaped not only by competence, but also by the perceived stakes of being wrong. When academic environments reward speed and polished output without attending to the process by which that output is produced, students may have fewer incentives to maintain continuous scrutiny. This interpretation aligns with Selwyn (2024), who argued that educational practice should make the uncertainty and fallibility of AI more visible within teaching and assessment.
Taken together, these findings suggest that cognitive AI safety is not best understood as a fixed habit but as a context-dependent capability shaped by effort, perceived risk and accountability. From a pedagogical perspective, this implies value in designing tasks that preserve opportunities for independent thinking rather than allowing AI to replace the cognitive work through which learning develops.
Implications for pedagogy, institutions and policy
According to this study, AI literacy supports cognitive AI safety, but only under specific circumstances. In other words, students may be able to challenge and validate AI outputs but may not always do so. Safety is influenced by context, task demands and perceived consequences. This finding has significant implications for smart education, because merely teaching AI skills is insufficient. It underscores the importance of establishing learning environments that prompt critical thinking at the appropriate time.
Pedagogical implications
Most current AI literacy initiatives focus on helping students use AI tools effectively and efficiently. While these skills are important, the findings suggest that efficiency can unintentionally discourage verification. Students might think that checking is pointless or a waste of time when AI quickly produces fluent and polished outputs. To address these issues, smart education may benefit from shifting its focus from producing efficient “power users” to developing critically engaged users who know when to slow down, question outputs and take responsibility for decisions. Cognitive AI safety may be treated as a learning outcome alongside technical proficiency.
A key pedagogical shift suggested by this study is moving from training students as power users to preparing them as adversarial auditors. Adversarial auditors do not take AI outputs at face value; instead, they treat them as starting points to be tested. For example, in a science, technology, engineering, and mathematics or social science course, students could be given an AI-generated solution containing a subtle mistake and asked to identify and explain the error. Assessment plays a critical role here. Rather than grading only the final polished product, which AI can easily generate, educators should reward the verification process, such as evidence of fact-checking, comparison with trusted sources or reflection on why an AI output was accepted or rejected.
The Cognitive Economy model also highlights a central risk of AI-supported learning: when tasks become too easy, students may invest less mental effort. To sustain cognitive agency, learning environments may benefit from incorporating forms of desirable friction, such as intentional pauses or constraints that require independent thinking. Two practical strategies are especially useful. Firstly, students can complete manual first drafts before using AI for critique or improvement. This helps build domain knowledge and makes them more sensitive to errors in AI outputs. Secondly, educators can use honey-trap activities by introducing AI-generated content with small but meaningful errors for students to detect and correct. These approaches do not reject AI use; instead, they ensure that AI supports learning without replacing human judgement.
AI education should teach not only how to use AI, but also when trust is appropriate. Students should learn common patterns of AI failure, such as confident hallucinations, incorrect citations and oversimplified explanations. Instructors can explicitly discuss why overly confident answers to complex questions should raise suspicion, why AI-generated citations require verification and why agreement with one’s own opinion does not guarantee correctness. Making these risks visible can help students activate epistemic vigilance more reliably.
Institutional implications
At the institutional level, responsible AI use requires more than student-facing guidance. Universities may need structures that support verification, reflection and appropriate human oversight. This includes workshops or training activities that help students and staff recognise when AI confidence should prompt scepticism, as well as course-level expectations that make checking, reflection and accountability visible parts of academic practice.
Institutional responses should also extend to professional and support roles across the university. Adewojo et al. (2026), examining AI-driven tools in academic libraries, found that AI improved efficiency and reduced routine workload but also highlighted its limitations in handling complex inquiries, the continued need for human oversight and the importance of ongoing professional development in digital literacy and AI management. This reinforces the argument that responsible AI integration in higher education depends on institutional capacity-building, including staff training, hybrid oversight structures and clear expectations for when human judgement must remain central.
Subject to further validation, the CAISS may also prove useful in future research and educational design contexts to examine variation in students’ cognitive safety tendencies. However, broader diagnostic or institutional application should remain cautious until further longitudinal and cross-context validation is established.
Policy implications
AI policies in higher education usually focus on academic integrity or technical ethics, like bias and transparency. These findings support adding a third focus: cognitive AI safety. Policy should not only decide whether AI use is allowed, but also ensure that students check outputs, maintain cognitive control and remain accountable for AI-driven decisions. In practice, institutions should explicitly embed clear expectations for students to verify and reflect on AI-generated content within course guidelines, assessment criteria and institutional AI policies. Policies may urge faculty and administrators to position AI use as an opportunity to promote responsible human–AI collaboration, explicitly stating that AI outputs must be checked and interpreted by students. By adopting these measures, higher education can foster a culture of accountability and thoughtful engagement with AI tools.
Building a culture of responsible human–AI collaboration
Ultimately, advancing cognitive AI safety requires a cultural shift in how generative AI is framed in education: not as an authority or substitute for thought, but as a tool whose value depends on critical use. For example, instead of instructing students to use AI to complete assignments, educators should first have students generate their own ideas, then have them use AI to critique or enhance their work. Assessment can require students to reflect on when they accepted or rejected AI suggestions and why. When pedagogy, support and policy reinforce the view that AI is a tool, not a decision-maker, students are more likely to trust their own judgement while benefiting from AI. Smart education is not about perfect AI integration, but about preparing learners to think with AI and retain control.
Limitations and future directions
This study should be interpreted in light of several limitations. Firstly, although the CAISS was supported through both EFA and CFA across independent samples, its validation remains preliminary. Further research is needed to examine the stability and generalisability of the three-factor structure using larger and more diverse samples, as well as longitudinal and cross-cultural designs.
Secondly, the study relied on self-reported measures of both AI literacy and cognitive AI safety. As a result, the findings may be affected by social desirability bias or by an “illusion of explanatory depth,” in which participants overestimate their actual level of vigilance. Because both constructs were measured via self-report within the same survey context, common method bias (CMB) cannot be ruled out. Although the use of separate constructs and independent CFA validation strengthens confidence in the measurement model, future research should incorporate additional statistical checks and multi-method designs, such as longitudinal studies or experimental approaches, to provide a more comprehensive understanding of the constructs involved.
Thirdly, the present study examined only direct effects. Although the theoretical framework suggests that the relationship between AI literacy and cognitive AI safety may be mediated or moderated, these processes were not tested here. Future research should therefore examine whether factors such as task stakes, accountability, domain expertise, motivational orientation or time pressure mediate or moderate this relationship. In addition, future research could compare how users with lower and higher levels of AI literacy enact cognitive AI safety.
Fourthly, the qualitative phase focused on participants from Southeast Asian contexts, specifically Indonesia and Malaysia. While this strengthens the study’s contextual relevance, it limits the ability to account for broader cultural variation in trust towards technology and perceptions of authority. Future cross-cultural research is needed to examine whether cognitive AI safety operates similarly across different educational and cultural settings. Longitudinal work would also be valuable in examining how cognitive AI safety develops over time as users become more familiar with generative AI.
Finally, the study focused on general academic tasks rather than subject-specific contexts. Future research should investigate whether verification behaviour varies across different disciplines, particularly between fields that rely more heavily on objective accuracy, such as STEM, and those in which interpretation and argumentation play a larger role, such as the humanities. In addition, although gender emerged as a small but statistically significant predictor of cognitive AI safety after controlling for AI literacy, the present study did not theorise gender differences or examine their underlying mechanisms. This finding should therefore be interpreted cautiously. Future research could explore whether gender-related differences in trust calibration, risk perception or accountability expectations shape how cognitive AI safety is enacted in different task contexts.
Conclusion
This study provides initial evidence that AI literacy is positively associated with cognitive AI safety, but that this relationship is conditional rather than automatic. Although technical proficiency and conceptual understanding seem to facilitate safer interaction with generative AI, the qualitative results indicate that verification is performed selectively based on task requirements, perceived risks, domain expertise and accountability. In this sense, cognitive AI safety is best understood not as a constant habit but as a context-dependent capability. The study contributes by introducing cognitive AI safety as a more focused, education-specific construct and by providing initial validation of the CAISS. More broadly, the findings suggest that the educational challenge of generative AI is not only about whether students can use these tools effectively, but also about whether they can do so while maintaining independent judgement. As AI becomes increasingly embedded in academic work, higher education may benefit from approaches that support not only technical skill but also verification, reflection and appropriate reliance.

