Error decomposition rates for different ChatGPT applications
| Model | Total error | Within-group | Cross-group | O → S | S → O |
|---|---|---|---|---|---|
| GPT-4.0 | 35 | 24 (27.0%) | 11 (12.4%) | 1 (1.1%) | 10 (11.2%) |
| GPT-4.5 | 39 | 30 (33.7%) | 9 (10.1%) | 3 (3.4%) | 6 (6.7%) |
| GPT-4.5 DR | 20 | 14 (15.7%) | 6 (6.7%) | 0 (0.0%) | 6 (6.7%) |
Note: Within-group errors are misclassifications within the same category level (i.e., O1–O5 misclassified as another O category, or S1–S5 misclassified as another S category). Cross-group errors are misclassifications between the two levels: O categories incorrectly classified as S categories (O → S) or S categories incorrectly classified as O categories (S → O).
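The arithmetic behind the decomposition can be checked with a short sketch. The counts are copied from the table. The denominator of 89 items is an assumption: it is not stated in the table, but it is the value that reproduces the reported percentages. The function name `decompose` and the constant `N_ITEMS` are hypothetical, chosen only for illustration.

```python
# Counts copied from the table: (within-group, O->S, S->O) for each model.
COUNTS = {
    "GPT-4.0": (24, 1, 10),
    "GPT-4.5": (30, 3, 6),
    "GPT-4.5 DR": (14, 0, 6),
}

# ASSUMPTION: 89 classified items per model. This number is inferred from
# the reported percentages (e.g. 24 / 89 = 27.0%), not stated in the table.
N_ITEMS = 89


def decompose(within, o_to_s, s_to_o, n_items=N_ITEMS):
    """Rebuild the table's derived columns from the three raw counts."""
    cross = o_to_s + s_to_o   # cross-group = O->S plus S->O
    total = within + cross    # total error = within-group plus cross-group

    def pct(count):
        # Share of all items, rounded to one decimal place as in the table.
        return round(100 * count / n_items, 1)

    return {
        "total": total,
        "within": (within, pct(within)),
        "cross": (cross, pct(cross)),
        "o_to_s": (o_to_s, pct(o_to_s)),
        "s_to_o": (s_to_o, pct(s_to_o)),
    }


if __name__ == "__main__":
    for model, counts in COUNTS.items():
        print(model, decompose(*counts))
```

Under this assumption, each row satisfies total = within-group + cross-group and cross-group = (O → S) + (S → O), so the percentages in a row sum to that model's overall error rate.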