This paper explores whether artificial intelligence (AI) can accurately measure teacher and child behaviours during applied behaviour analysis (ABA) instruction. Specifically, it examines whether an AI model can correctly identify antecedents and child responses from video-recorded teaching sessions.
A multimodal large language model (Google Gemini Pro 002) analysed video-recorded sessions from three instructional programmes: following directions, gross motor imitation and colour matching. The model was prompted to detect the teacher’s instruction and the child’s response, and to classify each trial as correct or incorrect. The AI-generated data were then compared with human observer records to estimate accuracy, sensitivity and specificity.
The model achieved moderate to high sensitivity (61%–100%) but lower specificity (18%–40%). It was more accurate with simple, clearly cued tasks and less reliable for more complex ones.
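To make the three metrics concrete, the following is a minimal sketch of how trial-level agreement between the model and a human observer could be scored. It assumes the human record is treated as the reference standard and that a trial scored "correct" is the positive class; neither choice is stated in the abstract, and the function name `agreement_metrics` and the sample data are hypothetical.

```python
from typing import Optional, Sequence


def agreement_metrics(
    human: Sequence[bool], model: Sequence[bool]
) -> dict[str, Optional[float]]:
    """Score model trial labels against human observer labels.

    Assumption: True means the trial was scored "correct" (the positive
    class) and the human record is the reference standard.
    """
    if len(human) != len(model):
        raise ValueError("human and model records must cover the same trials")

    pairs = list(zip(human, model))
    tp = sum(1 for h, m in pairs if h and m)          # both scored correct
    tn = sum(1 for h, m in pairs if not h and not m)  # both scored incorrect
    fp = sum(1 for h, m in pairs if not h and m)      # model correct, human incorrect
    fn = sum(1 for h, m in pairs if h and not m)      # model incorrect, human correct

    def ratio(numerator: int, denominator: int) -> Optional[float]:
        # Return None when the metric is undefined (no trials of that kind).
        return numerator / denominator if denominator else None

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


# Hypothetical five-trial session, for illustration only.
human_labels = [True, True, True, False, False]
model_labels = [True, True, False, True, False]
metrics = agreement_metrics(human_labels, model_labels)
```

Under these assumptions, high sensitivity with low specificity would mean the model usually agrees when the observer scores a trial as correct but often also scores as correct the trials the observer marked incorrect.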
With AI-assisted data collection, behaviour analysts could devote more time to direct teaching with the children they serve. Automated data collection could also serve as a reliable foundation for timely feedback and data-based decision-making.
To the best of the authors’ knowledge, this is the first study to evaluate a large language model’s ability to measure ABA teaching interactions from video. The findings highlight AI’s potential to collect accurate data during ABA instruction.
