Supervised learning evaluation metrics utilised in the reviewed literature
| Reference | Indicator | Definition |
|---|---|---|
| Akanbi and Zhang (2021), Hong et al. (2021), Jeon et al. (2021b), Zhao et al. (2020), Moon et al. (2022) | Precision | Fraction of retrieved (predicted-positive) instances that are relevant: TP / (TP + FP), where TP, FP, TN and FN denote true positives, false positives, true negatives and false negatives |
| | Recall | Fraction of relevant instances that are retrieved: TP / (TP + FN) |
| | Accuracy | Fraction of all measurements that are classified correctly: (TP + TN) / (TP + TN + FP + FN). It states how often the model is correct |
| | F1 score | Harmonic mean of precision and recall: 2 × Precision × Recall / (Precision + Recall). It summarises the performance of a binary classification system, which classifies examples as “positive” or “negative” |
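
The four indicators in the table can all be derived from the counts of a binary confusion matrix. The following is a minimal sketch under stated assumptions, not the procedure used in any of the cited studies: the `ConfusionCounts` class, the function names, and the example counts are hypothetical and chosen only for illustration, and a zero denominator is mapped to 0.0 as a convention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical container for binary confusion-matrix counts."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def _safe_div(numerator: float, denominator: float) -> float:
    # Assumption: return 0.0 when the denominator is zero.
    return numerator / denominator if denominator else 0.0


def precision(c: ConfusionCounts) -> float:
    # Relevant instances among all retrieved instances.
    return _safe_div(c.tp, c.tp + c.fp)


def recall(c: ConfusionCounts) -> float:
    # Retrieved relevant instances among all relevant instances.
    return _safe_div(c.tp, c.tp + c.fn)


def accuracy(c: ConfusionCounts) -> float:
    # Correct predictions (TP + TN) among all measurements.
    return _safe_div(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)


def f1_score(c: ConfusionCounts) -> float:
    # Harmonic mean of precision and recall.
    p, r = precision(c), recall(c)
    return _safe_div(2 * p * r, p + r)


if __name__ == "__main__":
    # Illustrative counts only; not taken from the reviewed literature.
    counts = ConfusionCounts(tp=8, fp=2, tn=85, fn=5)
    print(f"precision = {precision(counts):.3f}")
    print(f"recall    = {recall(counts):.3f}")
    print(f"accuracy  = {accuracy(counts):.3f}")
    print(f"f1        = {f1_score(counts):.3f}")
```

With these illustrative counts, accuracy is noticeably higher than recall because negatives dominate the sample, which is one reason the four indicators are usually reported together.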