This study compares how official and user-generated TikTok videos construct a flagship Peruvian gastronomic fair through divergent visual topics, addressing the scarcity of quantitative analysis of video content in event research.
A corpus of 120 TikTok videos (72 official, 48 user-generated) was analyzed through a large language model (LLM)-guided pipeline that transforms videos into textual narratives via frame extraction and AI-assisted description. Latent Dirichlet Allocation identified three visual topics, which were validated through Sentence-BERT embeddings. Term Frequency–Inverse Document Frequency analysis and network mapping revealed lexical and structural divergences between content sources.
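The lexical comparison between sources can be illustrated with a minimal sketch. It assumes each source's frame descriptions are pooled into one document and scored with a plain TF-IDF formula (relative term frequency times log inverse document frequency). The sample descriptions, the stopword list and the function name `tfidf_by_source` are hypothetical placeholders, not the study's data or implementation.

```python
import math
from collections import Counter

# Hypothetical frame descriptions standing in for the AI-generated narratives.
descriptions = {
    "official": [
        "chef presents regional dish with heritage ingredients",
        "chef explains plating of traditional dish",
    ],
    "user": [
        "crowd walks through pavilion at the venue",
        "visitors queue at pavilion and film the dish",
    ],
}

# Assumed, minimal stopword list for this toy example only.
STOPWORDS = {"at", "the", "and", "of", "with", "through"}


def tfidf_by_source(docs_by_source):
    """Pool each source into one document and return {source: {term: tf-idf}}."""
    counts = {
        src: Counter(
            word for text in texts for word in text.split() if word not in STOPWORDS
        )
        for src, texts in docs_by_source.items()
    }
    n_docs = len(counts)
    # Number of sources in which each term appears at least once.
    doc_freq = Counter(term for c in counts.values() for term in c)
    scores = {}
    for src, c in counts.items():
        total = sum(c.values())
        scores[src] = {
            term: (freq / total) * math.log(n_docs / doc_freq[term])
            for term, freq in c.items()
        }
    return scores


scores = tfidf_by_source(descriptions)
top_official = max(scores["official"], key=scores["official"].get)
top_user = max(scores["user"], key=scores["user"].get)
print(top_official, top_user)
```

Terms shared by both sources (here "dish") receive a score of zero, so the ranking surfaces only the vocabulary that distinguishes one source from the other.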
Three themes emerged: Plating and visual presentation, Professional identity and regional heritage, and Spatial branding and performance. They represent differential emphases within a shared semantic space and vocabulary about events, rather than discrete categorical boundaries. Official accounts dominate professional legitimation, emphasizing chefs’ experience and contextualizing heritage. User-generated content dominates spatial and performative documentation, prioritizing venues, pavilions and crowds of visitors. Network analysis reveals that both sources employ a radial architecture but allocate centrality differently: official content places professional culinary markers at the center while marginalizing spatial elements, whereas user-generated content elevates experiential keywords to core positions.
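The idea of divergent centrality allocation within a radial network can be sketched with a toy keyword co-occurrence graph. This is an illustration under stated assumptions: the keyword sets are invented, the helper `degree_centrality` is hypothetical, and normalized degree centrality is used as a stand-in because the abstract does not specify which centrality measure the study applied.

```python
from collections import defaultdict
from itertools import combinations


def degree_centrality(keyword_sets):
    """Link keywords that co-occur in a video; return normalized degree per keyword."""
    neighbors = defaultdict(set)
    for keywords in keyword_sets:
        for a, b in combinations(sorted(set(keywords)), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {}
    # Normalize by the maximum possible degree (n - 1).
    return {term: len(linked) / (n - 1) for term, linked in neighbors.items()}


# Hypothetical keyword sets, one per video, for each source.
official_videos = [
    {"chef", "dish", "heritage"},
    {"chef", "plating"},
    {"chef", "region"},
    {"chef", "pavilion"},
]
user_videos = [
    {"pavilion", "crowd", "venue"},
    {"pavilion", "dish"},
    {"pavilion", "chef"},
    {"pavilion", "queue"},
]

official_centrality = degree_centrality(official_videos)
user_centrality = degree_centrality(user_videos)
print(official_centrality["chef"], official_centrality["pavilion"])
print(user_centrality["pavilion"], user_centrality["chef"])
```

In this toy data both graphs are hub-and-spoke shaped, yet the hub differs: "chef" is central and "pavilion" peripheral in the official graph, and the roles are reversed in the user graph.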
Festival organizers might benefit from integrating visitor-prioritized spatial elements into their branding strategies to better reflect experiential priorities. The LLM-guided pipeline offers a systematic, scalable method for auditing visual content and identifying alignment gaps, extending the reach of traditional manual coding and engagement metrics.
This study pioneers LLM-guided video-to-text transformation in event research, validating embedding-based techniques in settings where semantic overlap reflects shared contexts rather than methodological limitations. It proposes the concept of “compressed authenticity”, in which platform affordances produce core-periphery networks with bounded agency exercised through centrality reallocation.
