Comparison of classification accuracy (%) and number of trainable parameters (N) for methods without and with additional few-shot unsupervised domain adaptation (UDA), where the number of shots is set to 4 or 8 (given in parentheses after the method name). The soft-prompt length is fixed at 10. MoE-Tr [42] is included for comparison. Notably, MoE-Tr requires a large-scale trainable model and multiple PLMs, whereas MSP is parameter-efficient and involves only one PLM. Results are reported on several domains of the Amazon review dataset.
| Domain | MoE-Tr | SP | MSP | SP (4) | MSP (4) | MSP (8) |
|---|---|---|---|---|---|---|
| Books | 90.0 | 87.5 | 88.6 | 87.9 | 88.7 | 89.0 |
| DVD | 89.3 | 86.2 | 86.9 | 87.0 | 88.5 | 88.1 |
| Electronics | 90.6 | 87.4 | 88.4 | 87.9 | 89.2 | 90.3 |
| Kitchen | 90.8 | 88.5 | 89.8 | 89.2 | 90.5 | 90.7 |
| UDA | yes | no | no | yes | yes | yes |
| N | 264M | 7.68K | 7.68K | 7.68K | 7.68K | 7.68K |
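
To make the accuracy versus parameter trade-off in the table easier to read, the following minimal sketch averages the per-domain accuracies for each method and computes the parameter ratio between MoE-Tr and the soft-prompt methods. The accuracy values are copied from the table; the names `ACCURACY`, `PARAMS`, `mean_accuracy`, and `param_ratio` are hypothetical, the unweighted mean over domains is an illustrative summary rather than a figure reported in the source, and "M" and "K" are assumed to mean 10^6 and 10^3.

```python
# Accuracy (%) per domain, copied from the table above.
# Order of values in each list: Books, DVD, Electronics, Kitchen.
ACCURACY = {
    "MoE-Tr": [90.0, 89.3, 90.6, 90.8],
    "SP": [87.5, 86.2, 87.4, 88.5],
    "MSP": [88.6, 86.9, 88.4, 89.8],
    "SP (4)": [87.9, 87.0, 87.9, 89.2],
    "MSP (4)": [88.7, 88.5, 89.2, 90.5],
    "MSP (8)": [89.0, 88.1, 90.3, 90.7],
}

# Trainable parameters per method.
# Assumption: "264M" means 264 * 10**6 and "7.68K" means 7.68 * 10**3.
PARAMS = {name: 7_680 for name in ACCURACY}
PARAMS["MoE-Tr"] = 264_000_000


def mean_accuracy(method: str) -> float:
    """Unweighted mean accuracy over the four domains (illustrative summary)."""
    scores = ACCURACY[method]
    return sum(scores) / len(scores)


def param_ratio(larger: str, smaller: str) -> float:
    """How many times more trainable parameters `larger` has than `smaller`."""
    return PARAMS[larger] / PARAMS[smaller]


if __name__ == "__main__":
    for method in ACCURACY:
        print(f"{method:8s} mean={mean_accuracy(method):.3f} N={PARAMS[method]}")
    print("MoE-Tr / MSP parameter ratio:", param_ratio("MoE-Tr", "MSP"))
```

Under these assumptions, the unweighted mean for MSP (8) comes out about 0.65 points below that of MoE-Tr, while MoE-Tr uses roughly 34,000 times more trainable parameters.
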