Accounting problem types and available ML techniques
| Description of the learning problem | Solutions to learning problem | ML techniques | Source |
|---|---|---|---|
| **Process 1: Translation of manual and electronic documents into accounting information** | | | |
| **Task 1.3 Document features extraction** | | | |
| Feature extraction is an important process of obtaining relevant data before the classification of images. This process can be improved using ML to perform deep feature extraction | Classification | Deep convolutional neural network | Goussies, Ubalde, Fernandez, et al. (2014), Tarawneh et al. (2019) |
| **Task 1.4 Document type recognition and classification** | | | |
| Image classification can detect the document type, which can be enhanced using ML | Classification | Convolutional neural networks; New document class: k-nearest neighbour; Similar known documents: support vector machine | ABBYY Technologies (2017), Oquab, Bottou, Laptev, et al. (2014), Sorio (2013), Sorio, Bartoli, Davanzo, et al. (2010), Witten, Frank, Hall, et al. (2016), Khan (2019), Tarawneh et al. (2019) |
| Irregular document layouts can be classified using NLP combined with ML, which trains the system to process flexible or irregular layouts | Classification | Convolutional neural networks | Chen et al. (2015) |
| Text classification is used to classify text using statistical and semantic text analysis | Classification and clustering | Classification: Naïve Bayes; Parallelisation (MapReduce); k-nearest neighbour. Clustering: Semi-supervised clustering | Zhang et al. (2015), ABBYY Technologies (2017), Du (2017), Desai et al. (2021) |
| **Task 1.6 Validation of document data** | | | |
| Validation of document information can use ML to determine whether the data extracted from the document is correctly classified | Classification | Naïve Bayes; Support vector machine | Larsson and Segerås (2016) |
| Removing duplicate entries and linking documents may be achieved using approximate string matching and ML for string classification | Classification | Naïve Bayes; Decision trees; Support vector machine; Artificial neural network | Amtrup, Thompson, Kilby, et al. (2015), Larsson and Segerås (2016), De Leone and Minnetti (2015), Samoil (2015), Winkler (2014) |
| **Process 2: Reconciliation of financial information** | | | |
| **Task 2.3 Matching** | | | |
| Record matching (record linkage) has been performed using various ML techniques | Classification | Naïve Bayes; Decision trees; Support vector machine; Artificial neural network | Chew and Robinson (2012), Samoil (2015) |
| **Process 3: Preparation of management accounts** | | | |
| **Task 3.3 Account allocation** | | | |
| Account allocation may be performed by incorporating ML, which learns to predict the account allocation based on probability and can recommend which accounts to post to | Classification and clustering | Classification: Naïve Bayes; Clustering: K-means clustering; Random forests | Bengtsson and Jansson (2015), Brady et al. (2017), SMACC (2017), Takaki and Ericson (2018) |
| **Task 3.7 Report generation** | | | |
| Error detection in financial data and fraud detection can be performed by incorporating ML to identify irregularities in data sets | Classification; outlier detection and clustering | Classification: Bayesian belief network and a decision table/Naïve Bayes hybrid model. Outlier detection: Association rules. Clustering: K-means clustering; Self-organising maps | Ahmed et al. (2016), Alpar and Winkelsträter (2014), Hajek and Henriques (2017), Kokina and Davenport (2017) |
| **Task 3.8 Report descriptions** | | | |
| Report descriptions may incorporate ML techniques in natural language generation technologies to enable a reasoning process to be applied to the reported data to produce required explanations in natural language | Prediction | Conditional random fields | Gardent and Perez-Beltrachini (2017), Lafferty et al. (2001), Yseop (2017) |
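
As an illustration of the Task 1.6 row on duplicate removal, the approximate string matching step can be sketched with Python's standard-library `difflib`. This is a minimal sketch under stated assumptions, not the method of any cited source: the helper names, the sample invoice strings, and the 0.85 threshold are all hypothetical, and the fixed threshold stands in for the ML string classifier that the table pairs with the matching step.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio in [0, 1] after normalising case and whitespace."""
    norm_a = " ".join(a.lower().split())
    norm_b = " ".join(b.lower().split())
    return SequenceMatcher(None, norm_a, norm_b).ratio()


def find_duplicates(records: List[str], threshold: float = 0.85) -> List[Tuple[int, int, float]]:
    """Return (i, j, score) for every pair whose score meets the threshold.

    The threshold is an assumed value for illustration; in the setting the
    table describes, a trained classifier would make this decision instead.
    """
    pairs = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            score = similarity(records[i], records[j])
            if score >= threshold:
                pairs.append((i, j, score))
    return pairs


# Hypothetical extracted document lines: the first two differ only in
# letter case and spacing, the third is an unrelated record.
invoices = [
    "Acme Supplies Ltd  INV-1001  450.00",
    "ACME Supplies Ltd INV-1001 450.00",
    "Borealis Freight INV-2040 1200.50",
]
duplicates = find_duplicates(invoices)
for i, j, score in duplicates:
    print(f"records {i} and {j} look like duplicates (score {score:.2f})")
```

The pairwise loop is quadratic in the number of records, which is acceptable for a sketch; larger data sets would normally group candidate records first and compare only within each group.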