
In the rapidly evolving landscape of scholarly communication, the intersection of artificial intelligence (AI) and open science presents both unprecedented opportunities and complex challenges. Elena Giglia, a prominent voice in the Italian and international open science community, offers her reflections on this dynamic field. As a member of the Opportunity Area Group on Open Scholarly Communication of the EOSC Association and a key contributor to several Horizon Europe open science projects, Elena is at the forefront of shaping the future of research. Her extensive involvement in international conferences and workshops on FAIR data and her dedication to training and promoting open science principles make her a leading authority on the subject. This interview explores Elena’s perspective on how AI is transforming knowledge creation and dissemination within the open science paradigm. We delve into the potential impact of AI on academic integrity and discuss the necessary regulations to guide its responsible use by students and educators.

We are still at the beginning of the AI era, despite its widespread adoption and prevalence as a buzzword across various domains. I would like to focus here on generative AI and large language models, as they are the most relevant for research.

To answer this question, I would start by inverting it: how can open science influence AI? People tend to perceive AI as a “huge search engine” that delivers instant results. This is incorrect. To function effectively, AI requires training. The principle of “garbage in, garbage out” is crucial here: if you train AI on low-quality or biased data, the output will inevitably be flawed. Therefore, the primary interaction I see between AI and open science is the necessity of high-quality, FAIR Open data for training AI models. This forms the foundation for any sound application of AI in research. Data need to be FAIR, which not only stands for Findable, Accessible, Interoperable and Reusable but can also be interpreted as Fully AI Ready. FAIR data are structured and well-described by machine-readable metadata to facilitate the work of algorithms and help prevent hallucinations. Data and text must be Open with appropriate licenses to ensure a critical mass of training material for AI (and avoid copyright issues). Relying on large quantities of data will lead to better results in AI training, minimizing errors and over- and under-prediction.
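To make “well-described by machine-readable metadata” a little more concrete, here is a minimal Python sketch of a dataset description that an algorithm can read and check. The field names, the required-field list and the placeholder identifier are assumptions chosen for illustration; they are not taken from any particular metadata standard, and a real FAIR assessment relies on richer, community-defined criteria.

```python
import json

# Hypothetical minimal set of descriptive fields (an assumption for this sketch,
# not an official FAIR checklist).
REQUIRED_FIELDS = ("identifier", "title", "license", "format", "description")


def missing_fields(record):
    """Return the required fields that are absent or empty in a metadata record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Illustrative record; the identifier is a placeholder, not a real DOI.
record = {
    "identifier": "doi:10.xxxx/placeholder",
    "title": "Example survey responses",
    "license": "CC-BY-4.0",
    "format": "text/csv",
    "description": "Anonymised responses to an example questionnaire.",
}

# Serialising to JSON is what makes the description readable by machines.
serialised = json.dumps(record, indent=2)
gaps = missing_fields(record)
```

A record with an explicit license and format tells a training pipeline both whether it may reuse the data and how to parse them, which is the practical sense of “AI ready” here.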

Conversely, AI models and applications themselves need to be FAIR and Open to enable reproducibility and ensure trusted provenance.

I see a significant opportunity for open science practitioners to provide trusted, FAIR materials for AI training.

On the other hand, I do not see specific benefits of AI for open science but rather for science in general, provided that the use of AI is transparent and declared.

The challenges I foresee with the massive spread and adoption of AI relate to its resource-intensive nature, environmental impact and high costs, which could lead to inequities and thus contradict open science principles.

Here again, we should first consider the reverse: FAIR Open data is the bedrock of trustworthy AI.

Regarding the benefits of AI application to research, I would highlight AI’s capability to process vast amounts of information and extract patterns or links that are unpredictable to the human eye, thereby generating new knowledge. There is immense potential here, provided that the information is structured semantically and is machine-readable (e.g. the so-called “nanopublications”).

Another advantage is the use of AI itself for massive and automated metadata extraction from unstructured data, making them FAIR.

These are examples of the advantages AI can offer to research in general. Open science is a methodology; it involves conducting research collaboratively, co-creating knowledge, being responsive to societal needs and operating transparently by providing access to all components of research “as early and open as possible” (and I would add, “as FAIR as possible”). I see no specific advantages of AI for open science, but rather for science itself.

The key is to be open and transparent. Even in the application of AI, we must avoid “black boxes” – which is currently the case with large commercial companies competing in the AI market – and strive for reproducibility at every level: data, software and algorithms.

The use of AI in any research – precisely because of its potential power – must be disclosed and included in the “Methodology” section of any publication.

FAIR principles must be adapted and adopted for AI models and applications as well, thereby ensuring their compliance with the open science paradigm. Several AI researchers are already working in this direction.

Referring to these as “challenges” is an understatement. As recent articles have acknowledged, AI is significantly amplifying research misconduct. The scientific blog Retraction Watch has observed a sharp increase in the number of papers retracted due to being written by AI, to the point where they have created a new dedicated section. Where were the reviewers? Astonishingly, even the authors did not notice sentences like “unfortunately I am only an AI language model” (as published in an Elsevier journal). Guillaume Cabanac [1], a young researcher, developed algorithms to detect suspicious sentences and has found AI-generated papers throughout the scientific literature (ironically, using AI to detect AI).
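As a rough illustration of what screening for suspicious sentences can look like, the following Python sketch flags sentences that contain telltale phrases. It is a simple phrase-matching example, not the method used by the Problematic Paper Screener; the phrase list and the function name are hypothetical.

```python
import re

# Hypothetical telltale phrases, lower-cased; illustrative only.
TELLTALE_PHRASES = [
    "as an ai language model",
    "i am only an ai language model",
    "regenerate response",
]


def flag_suspicious_sentences(text):
    """Return the sentences that contain any telltale phrase (case-insensitive)."""
    # Naive sentence split: break on whitespace that follows '.', '!' or '?'.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [
        sentence
        for sentence in sentences
        if any(phrase in sentence.lower() for phrase in TELLTALE_PHRASES)
    ]


sample = (
    "The results confirm our hypothesis. "
    "Unfortunately I am only an AI language model and cannot access the data. "
    "Further work is needed."
)
flagged = flag_suspicious_sentences(sample)
```

A match is only a prompt for human inspection: a paper that discusses AI may quote such phrases legitimately, so a flagged sentence is not proof of misconduct.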

The core of the problem lies in the current research assessment system, which is primarily based on quantitative indicators and the use of rankings. In their pursuit of another line on their CV, many researchers have resorted to scientific misconduct, ranging from data fabrication and falsification to, at the very least, “data make-up.” It is important to note that we are not discussing so-called “predatory” Open Access journals here; all the aforementioned examples come from journals considered “prestigious,” with the highest impact factors (there is extensive literature on this topic). If research is reduced to the number of papers published per year, a system in which papers are generated by AI – undoubtedly the worst possible use of AI in science – is almost inevitable.

Research integrity, and consequently public trust, is at stake. According to Elisabeth Bik [2], an expert in the field, “we need to slow down.”

If we do not change the research assessment criteria, the “publish or perish” culture will undermine science, especially with the powerful assistance of AI.

This is why initiatives like CoARA, the Coalition for Advancing Research Assessment [3], whose agreement had been signed by 832 institutions in Europe as of March 2025, are so crucial. CoARA’s commitments focus on prioritizing quality over quantity and abandoning the inappropriate use of impact factor and ranking-based metrics.

The reform of research assessment should include considering all research outputs, not just publications, and valuing diverse tasks within academia (peer review, mentorship, teamwork […]), while respecting the specificities of individual disciplines.

CoARA presents a unique opportunity to shape a different research culture – one that is collaborative and transparent, rather than hypercompetitive. In such an environment, we will likely see a more ethical use of AI, rather than its prevalent application for fraudulent purposes.

Firstly, by reforming the assessment and reward system. Shifting the focus away from the number of publications and the “prestige” of the journal in which they appear can help alleviate the pressure of the “publish or perish” culture and foster a less toxic environment. AI is merely one of the tools exploited in scientific misconduct; the fundamental issue (again, extensively documented in the literature) is research assessment.

By adopting CoARA principles and broadening the scope of evaluation to include data sets, methods, protocols and software, institutions can reduce the pressure on researchers and simultaneously cultivate a healthier and more robust research environment.

By mandating (through policies) that all components of the research cycle are FAIR and Open, institutions can take the initial step towards properly assessing research integrity. How can we evaluate the integrity of the entire research process or determine whether outputs are sound or simply AI-generated content if we lack access to the data, software, methods and so on? To assess the integrity of the scientific process, we need to examine everything from the initial hypotheses to the final outputs (which, of course, extend beyond publications).

Reproducibility practices should also be mandatory. To ensure reproducibility, researchers need to make all methods and tools available, thereby reducing the risk of fraud.

Overall, if institutions adopted Open, FAIR and reproducible policies, all unified under the principle of “transparency,” research integrity would be strengthened.

Institutions should also revise their Codes of Ethics to explicitly state that any use of AI in research must be disclosed.

Again, the issue is not solely about AI. AI, when misused, is simply a tool. The underlying problem is the pressure to publish or perish, which drives researchers towards scientific misconduct.

I see the role of librarians as first and foremost raising awareness about the true nature and functioning of AI and then about the potential for misuse of AI tools, the growing problem of retractions and the unintended consequences of hypercompetitive evaluation. They also have a crucial role in promoting awareness of initiatives like CoARA and in supporting CoARA signatory institutions in identifying viable alternatives using open tools such as OpenAlex or OpenCitations.

Librarians also play a role in disseminating open science practices, such as preregistration of research, the use of electronic lab notebooks to document the entire research process and the adoption of community-based new publication workflows like Publish Review Curate (preprints + open peer review), where the key principle is again “transparency.”

The adoption of these practices can significantly enhance research integrity.

My suggestion would be to introduce mandatory courses on FAIR data management and open science, where students and faculty can learn about the “good” uses of AI and be cautioned against its improper application in research.

AI in teaching and learning is incredibly powerful (e.g., in creating personalized learning paths), so it is essential that students and faculty are exposed to the benefits of this tool from the outset.

This is a tautological question. Open science, by definition, means providing access to all research components and tools “as early and open as possible,” and I would add, “as FAIR as possible.” As Philip Stark [4] once said, science should be about “show me” and not “trust me.” If the entire process were transparent, we would not face research integrity issues. The use of (honest) AI should be disclosed in the methodologies, which in turn should be deposited in Open repositories. We should aim for FAIR AI, thereby ensuring the reproducibility and reusability of AI models and tools as well.

I hope to see a widespread application of AI for faster extraction of patterns and generation of new knowledge, addressing societal needs and global challenges. However, AI needs to be FAIR and Open, just like any other component of the research cycle, to foster research integrity and promote more robust science.

Open science can serve as a tool to cultivate trustworthy AI, as AI will be trained on high-quality data. When FAIR and Open principles are applied to AI, we will have a more equitable environment, moving away from the “black boxes” that currently exist.

Academics should transform their research culture – starting with a change in research assessment criteria – adopt a new mindset, and be open to embracing the novelty of AI, which, if used responsibly, can be a truly powerful asset.

Be critical thinkers. Never accept what is presented to you as “the way we have always done things” without questioning it. Stand up for your values. Advocate for changes in research assessment criteria, as you are often the primary victims of the current system (there is extensive literature on PhD mental health issues as well).

Be open-minded and curious about new developments. Never stop learning. Strive to understand the potential benefits of powerful tools like AI. Be creative and think outside the box – which is precisely what the current research assessment system often discourages.

Adopt Openness as a modus operandi: share your findings and be collaborative. By moving away from hypercompetition and building bridges between research teams and disciplines, you can contribute to making the world a better place.

1. Guillaume Cabanac is an Associate Professor of Computer Science at the University of Toulouse, France. His current work on the Problematic Paper Screener contributes to the identification and reporting of algorithmically generated and fraudulent papers published by academic publishers: www.irit.fr/~Guillaume.Cabanac/problematic-paper-screener

2. Elisabeth Bik is the author of the blog Science Integrity Digest: https://scienceintegritydigest.com/about/

3. CoARA is a collective of organisations committed to reforming the methods and processes by which research, researchers, and research organisations are evaluated: https://coara.eu/

4. Guest post by Philip B. Stark on the blog of BITSS (Berkeley Initiative for Transparency in the Social Sciences): www.bitss.org/science-is-show-me-not-trust-me/
