Purpose

This study evaluates the information quality of tourism-related responses generated by leading generative artificial intelligence (GenAI) platforms, addressing the limited empirical understanding of their reliability in travel-planning contexts.

Design/methodology/approach

A data set of 4,800 responses was generated across three advanced GenAI platforms using varying prompting strategies and question types. Adopting a large language model (LLM)-as-evaluator framework, each response was rated across six information quality dimensions – accuracy, objectivity, relevance, completeness, timeliness and understandability. Nonparametric statistical analyses were used to examine performance variations.

Findings

While all AI platforms exhibited strong overall competence, information quality varied significantly across platforms, prompting strategies and question types. Reasoning-based prompts (e.g. Chain-of-Thought) generally outperformed simple or role-based prompts, and evaluative questions tended to produce higher-quality outputs than factual queries.

Originality/value

This study represents one of the first large-scale empirical assessments of AI-generated tourism information. It extends information quality theory to the GenAI context by introducing the concept of conditional trustworthiness, demonstrating that AI reliability is dynamic and dependent on platform, prompt design and task characteristics. Furthermore, it establishes empirical benchmarks for prompt engineering in tourism and validates the LLM-as-evaluator paradigm for scalable, consistent content evaluation.

Licensed re-use rights only