A summary of semi-supervised learning methods for action recognition. The “Performance” column reports the top-1 accuracy of the best model for each method. The percentage (%) after each dataset name is the fraction of labeled data used for training. * denotes methods that were re-implemented for the video domain by [46].
| Method | Description | Network | Performance | Code |
|---|---|---|---|---|
| VideoSSL [46] | Uses a network pre-trained on ImageNet to guide the training of the 3D CNN. | 3D ResNet-18 | 47.6 (Kinetics100 - 5%); 32.4 (UCF101 - 5%); 32.7 (HMDB51 - 40%) | None |
| TCL [95] | Proposes two losses: Maximize Instance Agreement and Maximize Group Agreement. | TSM ResNet-18 | 29.81 (SS-V2 - 5%); 30.28 (Kinetics400 - 5%); 93.29 (Jester - 5%) | Link |
| FixMatch* [97] | Pseudo-labels predicted on weakly augmented data guide training on a strongly augmented version of the same data. | 3D ResNet-18 | 40.5 (Kinetics100 - 5%); 27.1 (UCF101 - 5%); 32.9 (HMDB51 - 40%) | None |
| S4L* [121] | Combines self-supervised and semi-supervised learning. | 3D ResNet-18 | 33.0 (Kinetics100 - 5%); 22.7 (UCF101 - 5%); 29.8 (HMDB51 - 40%) | None |
| MT* [6] | Averages model weights over training steps, which yields a more robust model than using the final weights. | 3D ResNet-18 | 27.8 (Kinetics100 - 5%); 17.5 (UCF101 - 5%); 27.2 (HMDB51 - 40%) | None |
| PL* [62] | The model's own prediction on an unlabeled sample is reused as the training target for that sample. | 3D ResNet-18 | 27.8 (Kinetics100 - 5%); 17.6 (UCF101 - 5%); 27.3 (HMDB51 - 40%) | None |
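
Two of the mechanisms summarized in the table can be illustrated with a minimal sketch: the weak-to-strong pseudo-labeling described for [97] and the weight averaging described for MT [6]. The code below is not taken from any of the cited implementations. The function names, the confidence threshold of 0.95, and the decay of 0.99 are assumptions chosen for illustration, and plain Python lists stand in for network outputs and parameters.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pseudo_label_consistency_loss(weak_logits, strong_logits, threshold=0.95):
    """Cross-entropy on strongly augmented clips against pseudo-labels
    taken from the weakly augmented view of the same clips.

    Only samples whose weak-view confidence reaches `threshold` contribute;
    the sum is divided by the full batch size. `threshold` is an assumed value.
    """
    if not weak_logits:
        return 0.0
    total = 0.0
    for weak, strong in zip(weak_logits, strong_logits):
        weak_probs = softmax(weak)
        confidence = max(weak_probs)
        if confidence < threshold:
            continue  # prediction too uncertain to serve as a label
        pseudo_label = weak_probs.index(confidence)
        strong_probs = softmax(strong)
        total += -math.log(strong_probs[pseudo_label])
    return total / len(weak_logits)


def ema_update(teacher_weights, student_weights, decay=0.99):
    """Exponential moving average of weights, one common way to average
    model weights over training steps. `decay` is an assumed value."""
    return [
        decay * t + (1.0 - decay) * s
        for t, s in zip(teacher_weights, student_weights)
    ]


if __name__ == "__main__":
    # Hypothetical logits for two unlabeled clips over three classes.
    weak = [[10.0, 0.0, 0.0], [0.1, 0.0, 0.0]]
    strong = [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    print(pseudo_label_consistency_loss(weak, strong))
    print(ema_update([1.0, 2.0], [0.0, 4.0], decay=0.9))
```

In this toy batch only the first clip passes the confidence threshold, so the second clip contributes nothing to the loss. In a real pipeline the logits would come from a video backbone such as the 3D ResNet-18 listed in the table, and the averaging would run over all network parameters after each optimizer step.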