Table 3

Performance of the conventional and proposed SER models on the evaluation part of IEMOCAP and on the preprocessed BC2013 dataset (annotated part only) with three emotions (angry, neutral, sad). The conventional models used Text alone or Text and PSD (prosodic factors) as input, while the proposed SER model used Text, PSD, and PRM (prominence).

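| Dataset | Input | Precision | Recall | F1 |
|---|---|---|---|---|
| IEMOCAP | Text | 0.551 | 0.562 | 0.554 |
| | Text + PSD [29] | 0.621 | 0.618 | 0.619 |
| | Text + PSD + PRM | 0.642 | 0.623 | 0.632 |
| BC2013 | Text | 0.535 | 0.480 | 0.486 |
| | Text + PSD [29] | 0.552 | 0.518 | 0.523 |
| | Text + PSD + PRM | 0.562 | 0.536 | 0.543 |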
