Table 6

Performance results of models (GPT-4o, GPT-4o-finetuning, DeepSeek-V3 and DeepSeek-R1)

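| Tasks | GPT-4o Base P | GPT-4o Base R | GPT-4o Base F1 | GPT-4o-finetuning P | GPT-4o-finetuning R | GPT-4o-finetuning F1 | DeepSeek-V3 base P | DeepSeek-V3 base R | DeepSeek-V3 base F1 | DeepSeek-R1 base P | DeepSeek-R1 base R | DeepSeek-R1 base F1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MNR | 0.798 | 0.637 | 0.709 | 0.762 | 0.714 | 0.737 | 0.833 | 0.751 | 0.790 | 0.769 | 0.755 | 0.762 |
| DIA | 0.830 | 0.672 | 0.743 | 0.851 | 0.690 | 0.762 | 0.757 | 0.914 | 0.828 | 0.746 | 0.707 | 0.726 |
| ETI | 0.385 | 0.500 | 0.435 | 0.471 | 0.800 | 0.593 | 0.667 | 0.400 | 0.500 | 0.467 | 0.700 | 0.560 |
| GMI | 0.500 | 0.438 | 0.467 | 0.471 | 0.500 | 0.485 | 0.692 | 0.563 | 0.621 | 0.500 | 0.563 | 0.529 |
| PROG | 0.429 | 0.177 | 0.250 | 0.308 | 0.235 | 0.267 | 0.455 | 0.294 | 0.357 | 0.625 | 0.294 | 0.400 |
| TREAT | 0.876 | 0.698 | 0.777 | 0.833 | 0.785 | 0.808 | 0.918 | 0.779 | 0.843 | 0.837 | 0.837 | 0.837 |
| ENR | 0.471 | 0.367 | 0.412 | 0.502 | 0.557 | 0.528 | 0.373 | 0.398 | 0.385 | 0.500 | 0.430 | 0.462 |
| FEEL | 0.504 | 0.545 | 0.523 | 0.553 | 0.594 | 0.573 | 0.393 | 0.463 | 0.425 | 0.542 | 0.520 | 0.531 |
| VIEW | 0.359 | 0.143 | 0.204 | 0.443 | 0.510 | 0.474 | 0.341 | 0.316 | 0.328 | 0.431 | 0.316 | 0.365 |
| MNCX | 0.707 | 0.608 | 0.654 | 0.753 | 0.704 | 0.727 | 0.706 | 0.612 | 0.656 | 0.701 | 0.777 | 0.737 |
| BACK | 0.609 | 0.506 | 0.553 | 0.697 | 0.639 | 0.667 | 0.658 | 0.526 | 0.585 | 0.694 | 0.761 | 0.726 |
| CON | 0.867 | 0.289 | 0.433 | 0.550 | 0.468 | 0.506 | 0.537 | 0.449 | 0.489 | 0.574 | 0.700 | 0.631 |
| ELA | 0.779 | 0.815 | 0.796 | 0.856 | 0.836 | 0.846 | 0.790 | 0.750 | 0.770 | 0.749 | 0.817 | 0.781 |
| ENCX (CAUSE) | 0.707 | 0.716 | 0.712 | 0.846 | 0.846 | 0.846 | 0.671 | 0.671 | 0.671 | 0.790 | 0.790 | 0.790 |
| Overall | 0.681 | 0.569 | 0.619 | 0.706 | 0.692 | 0.699 | 0.652 | 0.608 | 0.629 | 0.690 | 0.696 | 0.693 |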
Source(s): Authors’ own work
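
As a reading aid, the sketch below shows how an F1 column of this kind relates to the P and R columns. It assumes that P, R and F1 in Table 6 denote precision, recall and their harmonic mean, which the table itself does not state. The function name `f1_score` and the choice of rows are illustrative only, and because the table reports values rounded to three decimals, a recomputed F1 may differ from the reported one in the last digit.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R, reported F1) copied from Table 6; the row selection is arbitrary.
rows = {
    "DIA / GPT-4o Base": (0.830, 0.672, 0.743),
    "TREAT / DeepSeek-R1 base": (0.837, 0.837, 0.837),
    "ENCX (CAUSE) / GPT-4o-finetuning": (0.846, 0.846, 0.846),
}

for name, (p, r, reported) in rows.items():
    computed = f1_score(p, r)
    # Compare the value recomputed from rounded P and R with the reported F1.
    print(f"{name}: computed {computed:.3f}, reported {reported:.3f}")
```

When P and R are equal, as in the TREAT and ENCX (CAUSE) rows chosen here, the harmonic mean equals that shared value, which gives a quick consistency check on those cells.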
