Training hyperparameters and performance figures for the benchmark CNN models
| Parameter | ResNet-18 | Deep Baseline CNN |
|---|---|---|
| Input size | 2 × 500 × 500 | 2 × 500 × 500 |
| Parameter count | 12.5 M | 11 M |
| Batch size | 64 | 64 |
| Learning rate | 1 × 10⁻³ | 1 × 10⁻³ |
| Optimizer | Adam | Adam |
| Weight decay | 1 × 10⁻⁵ | 1 × 10⁻⁵ |
| Data augmentation | Random flips, crops | Random flips, crops |
| Epochs | 100 | 110 |
| Training time (wall-clock) | 6 h | 6.5 h |
| GPU memory peak | 8 GB | 8 GB |
| Inference latency per sample | 45 ms | 50 ms |
| List of channels in conv layers | [ 64, 128, 256, 512 ] | [ 64, 128, 256, 512 ] |
| Sizes of FC hidden layers | [ 512, 1024, 256 ] | [ 512, 1024, 256 ] |
| Dropout rate | 0.3 | 0.3 |
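
The shared settings in the table can be captured in one configuration object so that the two models differ only where the table says they do. The following is a minimal sketch, not the authors' code: the class name `TrainConfig`, its field names, and the helper `values_per_batch` are hypothetical, and the input size is assumed to be ordered as (channels, height, width).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters copied from the table; all names here are illustrative."""
    name: str
    epochs: int
    # Assumed ordering: (channels, height, width).
    input_shape: Tuple[int, int, int] = (2, 500, 500)
    batch_size: int = 64
    learning_rate: float = 1e-3
    optimizer: str = "Adam"
    weight_decay: float = 1e-5
    conv_channels: Tuple[int, ...] = (64, 128, 256, 512)
    fc_hidden_sizes: Tuple[int, ...] = (512, 1024, 256)
    dropout: float = 0.3

    def values_per_batch(self) -> int:
        """Number of scalar input values in one batch."""
        channels, height, width = self.input_shape
        return self.batch_size * channels * height * width


# The two models share every listed setting except the number of epochs.
RESNET18 = TrainConfig(name="ResNet-18", epochs=100)
DEEP_BASELINE = TrainConfig(name="Deep Baseline CNN", epochs=110)

print(RESNET18.values_per_batch())  # 64 * 2 * 500 * 500 = 32000000
```

Keeping the defaults in one place makes the comparison explicit: any difference in results between the two models cannot come from the listed optimizer, regularization, or batch settings, since those are identical.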