Table 2

A summary of semi-supervised learning methods for action recognition. The “Performance” column reports the top-1 accuracy of the best model for each method. The percentage (%) after each dataset name denotes the share of labeled data used for training. * denotes methods that were re-implemented for the video domain by [46].

Method	Description	Network	Performance	Code
VideoSSL [46]	Uses a network pre-trained on ImageNet to guide the training of a 3D CNN.	3D ResNet-18	47.6 (Kinetics100 - 5%); 32.4 (UCF101 - 5%); 32.7 (HMDB51 - 40%)	None
TCL [95]	Proposes two losses: Maximize Instance Agreement and Maximize Group Agreement.	TSM ResNet-18	29.81 (SS-V2 - 5%); 30.28 (Kinetics400 - 5%); 93.29 (Jester - 5%)	Link
FixMatch* [97]	Pseudo-labels predicted on weakly augmented data guide training on a strongly augmented version of the same data.	3D ResNet-18	40.5 (Kinetics100 - 5%); 27.1 (UCF101 - 5%); 32.9 (HMDB51 - 40%)	None
S4L* [121]	Combines self-supervised and semi-supervised learning.	3D ResNet-18	33.0 (Kinetics100 - 5%); 22.7 (UCF101 - 5%); 29.8 (HMDB51 - 40%)	None
MT* [6]	Averages model weights over training steps, which yields a more robust model than using the final weights.	3D ResNet-18	27.8 (Kinetics100 - 5%); 17.5 (UCF101 - 5%); 27.2 (HMDB51 - 40%)	None
PL* [62]	The model's prediction on an unlabeled sample is reused as a pseudo-label to supervise that same sample.	3D ResNet-18	27.8 (Kinetics100 - 5%); 17.6 (UCF101 - 5%); 27.3 (HMDB51 - 40%)	None
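
Two of the mechanisms summarized in the table can be illustrated with a short sketch: the weak-to-strong pseudo-labeling of [97] and the weight averaging of MT [6]. The code below is a minimal, framework-free illustration and not the implementation used in [97], [6], or [46]. The function names, the confidence threshold of 0.95, and the decay of 0.99 are assumptions chosen for illustration, and the consistency loss that accompanies weight averaging is omitted.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weak_to_strong_pseudo_label_loss(weak_logits, strong_logits, threshold=0.95):
    """Cross-entropy of strong-view predictions against pseudo-labels
    taken from the weak view of the same samples.

    Samples whose weak-view confidence is below `threshold` contribute
    zero loss. `threshold` is an assumed hyperparameter.
    """
    if not weak_logits:
        return 0.0
    total = 0.0
    for weak, strong in zip(weak_logits, strong_logits):
        probs = softmax(weak)
        confidence = max(probs)
        if confidence < threshold:
            continue  # prediction too uncertain to serve as a label
        pseudo_label = probs.index(confidence)
        total += -math.log(softmax(strong)[pseudo_label])
    # Average over the whole batch, masked samples included.
    return total / len(weak_logits)


def ema_update(teacher_weights, student_weights, decay=0.99):
    """Exponential moving average of weights over training steps.

    Weights are flat lists of floats here; `decay` is an assumed value.
    """
    return [
        decay * t + (1.0 - decay) * s
        for t, s in zip(teacher_weights, student_weights)
    ]
```

In a training loop, `ema_update` would be called after every optimizer step so that the averaged weights track the trained model, while `weak_to_strong_pseudo_label_loss` would be added to the supervised loss computed on the labeled subset.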

