Inconsistency and intrinsic difficulty of different diffusion-based generative AI models. Classification error rates obtained with the Dual Vision Transformer (DaViT) model.
| Testing/Training | Kandinsky 2.2 | JPEG Compression 90 | JPEG Compression 80 | JPEG Compression 70 | JPEG Compression 60 | Upsampling 10% | Downsampling 10% | Denoising | Sharpening | Denoising then sharpening |
|---|---|---|---|---|---|---|---|---|---|---|
| Kandinsky 2.2 | 1.27% | 16.29% | 22.27% | 25.59% | 27.98% | 4.08% | 6.15% | 12.39% | 4.16% | 16.86% |