Performance results of the models (Qwen2.5-72B, Qwen2.5-72B-finetuning, Qwen2.5-max, and Qwen3-235B) on each task. P = precision, R = recall.
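| Task | Qwen2.5-72B base (P) | Qwen2.5-72B base (R) | Qwen2.5-72B base (F1) | Qwen2.5-72B-finetuning (P) | Qwen2.5-72B-finetuning (R) | Qwen2.5-72B-finetuning (F1) | Qwen2.5-max base (P) | Qwen2.5-max base (R) | Qwen2.5-max base (F1) | Qwen3-235B-A22B base (P) | Qwen3-235B-A22B base (R) | Qwen3-235B-A22B base (F1) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|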
| MNR | 0.691 | 0.681 | 0.686 | 0.741 | 0.714 | 0.728 | 0.649 | 0.718 | 0.682 | 0.646 | 0.656 | 0.651 |
| DIA | 0.539 | 0.724 | 0.618 | 0.796 | 0.603 | 0.686 | 0.563 | 0.776 | 0.652 | 0.667 | 0.759 | 0.710 |
| ETI | 0.429 | 0.600 | 0.500 | 0.391 | 0.900 | 0.546 | 0.348 | 0.800 | 0.485 | 0.350 | 0.700 | 0.467 |
| GMI | 0.471 | 0.500 | 0.485 | 0.438 | 0.438 | 0.438 | 0.533 | 0.500 | 0.516 | 0.300 | 0.375 | 0.333 |
| PROG | 0.333 | 0.294 | 0.313 | 0.222 | 0.235 | 0.229 | 0.200 | 0.294 | 0.238 | 0.222 | 0.353 | 0.273 |
| TREAT | 0.862 | 0.727 | 0.789 | 0.864 | 0.814 | 0.838 | 0.818 | 0.756 | 0.786 | 0.806 | 0.674 | 0.734 |
| ENR | 0.382 | 0.462 | 0.418 | 0.437 | 0.453 | 0.444 | 0.405 | 0.493 | 0.445 | 0.494 | 0.520 | 0.507 |
| FEEL | 0.460 | 0.512 | 0.485 | 0.430 | 0.422 | 0.426 | 0.474 | 0.520 | 0.496 | 0.584 | 0.594 | 0.589 |
| VIEW | 0.300 | 0.398 | 0.342 | 0.444 | 0.490 | 0.466 | 0.336 | 0.460 | 0.388 | 0.389 | 0.429 | 0.408 |
| MNCX | 0.644 | 0.648 | 0.646 | 0.710 | 0.672 | 0.690 | 0.709 | 0.711 | 0.710 | 0.658 | 0.726 | 0.690 |
| BACK | 0.593 | 0.509 | 0.549 | 0.729 | 0.556 | 0.631 | 0.702 | 0.656 | 0.678 | 0.632 | 0.655 | 0.643 |
| CON | 0.431 | 0.500 | 0.463 | 0.500 | 0.467 | 0.483 | 0.419 | 0.605 | 0.495 | 0.467 | 0.636 | 0.539 |
| ELA | 0.745 | 0.835 | 0.787 | 0.742 | 0.868 | 0.800 | 0.824 | 0.795 | 0.810 | 0.750 | 0.827 | 0.787 |
| ENCX (CAUSE) | 0.705 | 0.733 | 0.718 | 0.838 | 0.795 | 0.816 | 0.771 | 0.771 | 0.771 | 0.795 | 0.809 | 0.802 |
| Overall | 0.595 | 0.624 | 0.609 | 0.634 | 0.624 | 0.629 | 0.623 | 0.671 | 0.646 | 0.633 | 0.669 | 0.651 |
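The F1 columns above can be cross-checked against the P and R columns. The following is a minimal sketch, assuming F1 is the standard harmonic mean of precision and recall (the table does not state the formula explicitly). The helper name `f1_score` and the `rows` dictionary are illustrative; the numbers are copied from the Qwen2.5-72B base columns. Because the tabulated P and R are already rounded to three decimals, a recomputed F1 may differ from the reported one in the last digit.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the Qwen2.5-72B base columns.
rows = {
    "MNR": (0.691, 0.681, 0.686),
    "DIA": (0.539, 0.724, 0.618),
    "TREAT": (0.862, 0.727, 0.789),
}

for task, (p, r, reported) in rows.items():
    computed = f1_score(p, r)
    print(f"{task}: computed F1 = {computed:.3f}, reported F1 = {reported:.3f}")
```

The same check can be run on any other row or model column by extending `rows`.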