This study tested three prompts that instruct an artificial intelligence (AI) large language model to determine whether citations are correct, in order to identify which prompt was the most effective. A second purpose was to analyze the prompts’ quality using the concise, logical, explicit, adaptive and reflective (CLEAR) framework.
Forty-five altered citations, derived from 15 American Psychologist citations, were tested across three prompt strategies (Verify, Correct and Fit), resulting in 135 prompts submitted to Microsoft CoPilot. A chi-square analysis and the CLEAR framework were used to analyze the results.
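As a rough illustration of the kind of chi-square analysis described above, the following Python sketch computes a chi-square test of independence for a table with three prompt strategies as rows. It assumes, purely for illustration, that each response is coded as either correct or incorrect; the function name `chi_square_independence` and all counts are hypothetical and are not taken from the study.

```python
import math


def chi_square_independence(table):
    """Return (statistic, degrees_of_freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of prompt and outcome.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# HYPOTHETICAL placeholder counts (correct, incorrect) per prompt.
# These are NOT the study's results; each row sums to 45 prompts.
placeholder_counts = [
    [15, 30],  # Verify
    [15, 30],  # Correct
    [30, 15],  # Fit
]

statistic, dof = chi_square_independence(placeholder_counts)

# For exactly 2 degrees of freedom, the chi-square upper-tail
# probability has the closed form exp(-x / 2).
p_value = math.exp(-statistic / 2) if dof == 2 else None

print(f"chi-square = {statistic:.2f}, dof = {dof}, p = {p_value}")
```

The closed-form p-value applies only to the 3 x 2 case assumed here; for other table shapes a statistics library would be the more practical choice.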
The Fit prompt (“find the closest fitting citation for”) was the most effective at providing correct citations. Only the Fit prompt was concise, logical, explicit, adaptive and reflective. The Correct prompt (“amend the citation to be factually accurate”) provoked some “corrective hallucinations,” in which CoPilot hallucinated incorrect citations while providing corrections to the altered citations.
To the best of the authors’ knowledge, no one has developed a prompt similar to the “Fit” prompt. This prompt, although not perfect, has some practical value for identifying incorrect citations.
