Table 4. Description of ML limitations

| Limitation | Description of limitation | Source | Qualitative accounting objective |
| --- | --- | --- | --- |
| Poor interpretability | Users cannot understand how information is generated by the ML technology owing to the complexity of the ML model | Ayodele (2010a), Sainani (2014), The Royal Society (2017) | Verifiability and understandability |
| Overfitting | The risk is that input features with little modelling benefit are included in the training data. These features may increase the sensitivity of the technology to changes in the inputs, even though they could be excluded with no disadvantages. In such instances, the ML model may be too closely linked to the data used to train it and unable to classify other data sets appropriately. This increases the risk of misleading representations | Hawkins (2004), Sculley et al. (2015), Witten et al. (2016) | Relevance |
| It takes a long time to train | The risk of increased training or learning times for ML models as the size and complexity of the data sets increase | Ghanem (2012) | Timeliness |
| Complex, which makes it slow | The risk of increased processing time due to the complexity of the ML model. In the case of a classification technique, for example, the model will be slow to classify data | Kotsiantis (2007), Witten et al. (2016) | Timeliness |
| Training rate may be slow depending on available labelled data | The training rate depends on how much labelled data is available to train the ML model. The ML model is trained using a labelled data set consisting of examples of input data and labels indicating predicted targets or output data. This type of data is not as prevalent as unstructured data | Ayodele (2010b), Castle (2018), Larsson and Segerås (2016), Marsland (2009), SMACC (2017), Zheng et al. (2017) | Timeliness |
| Requires independent variables | An ML technique such as Naïve Bayes requires independent variables in the data, which implies that the values of the different features of each variable do not influence one another. However, as each variable has a high number of features, it is unlikely that there are no dependencies among them. This may result in incorrect processing and outputs of the ML model | Marsland (2009), Samoil (2015) | Faithful representation |
| Training set sensitive | If a feature has a category that was not observed in the training data set, a zero probability is assigned to that category, so the ML model is unable to make a prediction. This is known as zero frequency | Witten et al. (2016), Samoil (2015), Larsson and Segerås (2016) | Materiality and faithful representation |
| Computing intensive | The risk of costs exceeding the financial benefits to the business, as the ML model requires advanced data integration tools and infrastructure that may be costly for the business to acquire | Gillion (2017), Sapp (2017) | Cost-saving |
| Excessive output | In the case of association rules ML models, the number of rules discovered may be excessive, which may impact the relevance of the output | Kaur (2014) | Relevance |
| Requires lots of time | ML models may take a lot of time to produce outputs, which may be due to how many times the algorithm needs to run to achieve an accurate result | Witten et al. (2016), Kaur (2014), Ayodele (2010b) | Timeliness |
| Requires adequate data | The risk is that the ML model is inaccurate owing to insufficient data for training | Burrell (2016) | Materiality and faithful representation |
| Trade-off between accuracy, which requires memory, and overfitting | A limitation of some ML models is that training them using large feature data sets results in more accurate predictions but requires more memory to store and has an increased risk of overfitting | Witten et al. (2016), Sutton and McCallum (2011) | Relevance and faithful representation |

Notes: We have inserted the table here for ease of review. We have kept to the standard JFRA convention in the “clean copy” version of the manuscript.
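
The "Requires independent variables" and "Training set sensitive" limitations in Table 4 can be illustrated with a small sketch. The code below is not taken from any of the cited sources; it is a minimal, hand-rolled Naïve Bayes scorer written for illustration only. The transaction data, the feature names, the function names `train_counts` and `class_scores`, and the `alpha` parameter are all assumptions introduced here. The score for each class is the class prior multiplied by one conditional probability per feature (the independence assumption), so a single category that never appeared in training drives the whole product to zero.

```python
from collections import Counter, defaultdict


def train_counts(rows, labels):
    """Count class frequencies and per-class category frequencies."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # key: (label, feature index)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[(label, i)][value] += 1
    return class_counts, feature_counts


def class_scores(row, class_counts, feature_counts, categories, alpha=0.0):
    """Unnormalised Naive Bayes scores: P(class) * product of P(value | class).

    alpha is a hypothetical add-alpha smoothing constant; alpha=0.0 means
    plain frequency estimates, which is where zero frequency appears.
    """
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = n / total  # class prior
        for i, value in enumerate(row):
            count = feature_counts[(label, i)][value]
            # Features are treated as independent given the class.
            score *= (count + alpha) / (n + alpha * len(categories[i]))
        scores[label] = score
    return scores


# Hypothetical training data: (payment method, region) -> review decision.
rows = [
    ("card", "north"),
    ("card", "north"),
    ("card", "south"),
    ("cash", "south"),
    ("cash", "south"),
]
labels = ["approve", "approve", "approve", "review", "review"]
categories = [{"card", "cash", "cheque"}, {"north", "south"}]

class_counts, feature_counts = train_counts(rows, labels)

# "cheque" never occurs in the training rows.
unseen = ("cheque", "north")
raw = class_scores(unseen, class_counts, feature_counts, categories)
smoothed = class_scores(unseen, class_counts, feature_counts, categories, alpha=1.0)

print("without smoothing:", raw)
print("with add-one smoothing:", smoothed)
```

With plain frequency estimates every class receives a score of zero for the unseen category, so the model has no basis for choosing between them. Setting `alpha` to a positive value (commonly called Laplace or add-alpha smoothing, a remedy that Table 4 itself does not discuss) is one widely used way to keep the scores non-zero.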
