For each model (Gemma-3-1b-it, Gemma-3-27b-it, Gemini-2.5-Flash-Lite, Gemini-2.5-Flash, and Gemini-3.1-Flash-Lite) and each benchmarking user prompt (basic, intermediate, advanced), the number of tries until the first successful syntactical AI workflow execution is recorded. The light-gray and gray numbers indicate results from the first and second attempts, respectively. An “x” indicates more than ten tries without success.