Table A2.

Data mining functionalities

| Id | Type | Name | Description | Average score of interest (range 0 to 1) |
|---|---|---|---|---|
| 1 | preana | import_data | Functionality to import raw data from a file system into the tool | 0.95 |
| 2 | | index | Document extraction and indexing functionalities to retrieve metadata from raw file system data | 1.00 |
| 3 | | import_plan | Functionality to import a file plan into the tool | 0.95 |
| 4 | | import_records | Functionality to import the records (ISO 15489) into the tool | 0.95 |
| 5 | ana | content_anonym | Functionality to anonymize sensitive content (name, email) | 0.30 |
| 6 | | meta_sign | Functionality to recognize a digital signature from the metadata of a file | 0.75 |
| 7 | | file_list_id_title | File path analysis functionality: breaks the path down into “title-identifier” when possible. This would make it possible to detect whether a classification framework is in place and, if so, to link records (ISO 15489) folders to the file plan section | 0.95 |
| 8 | | content_ner | Named entity recognition (NER) functionality to identify dates, names, locations and emails in text content | 0.80 |
| 9 | | content_class_image | Image classification functionality to detect handwritten signatures, official stamps, etc. | 0.75 |
| 10 | | content_class_text | Text classification functionality to identify certain types of content such as minutes, copyright, etc. | 0.85 |
| 11 | | content_detect_lang | Language detection functionality | 0.65 |
| 12 | | content_summary | Automatic summarization functionality | 0.60 |
| 13 | | content_readability | Functionality to assign a readability score (from easy to hard to read, such as Flesch-Kincaid or Gunning fog) | 0.30 |
| 14 | | content_link_record_plan | Functionality to create a link between the file plan and records | 0.80 |
| 15 | | combo | Functionality to compose combinations of data metrics | 0.95 |
| 16 | | metric_archiv | Functionality to compute archival metrics | 0.90 |
| 17 | | ocr | Optical character recognition functionality for scanned text documents, so that they can be processed with all the text mining approaches | 1.00 |
| 18 | | trans_image | Functionality to describe an image automatically | 0.65 |
| 19 | | trans_sound | Speech-to-text functionality for audio and video content | 0.75 |
| 20 | | name_rules | Functionality to detect the naming rules of files and directories | 0.80 |
| 21 | stat | count | Functionality to count any metric over a set of documents | 0.90 |
| 22 | | words | Functionality to extract frequent and relevant words | 0.80 |
| 23 | | time | Functionality to see the number of documents created and/or modified over time | 0.95 |
| 24 | | size | Functionality to compute the total size of a document group | 0.85 |
| 25 | search | engine | Functionalities to search the documents for any words in the text and the metadata | 1.00 |
| 26 | | filter | Functionalities for filtering a search by any of the tool's metrics | 1.00 |
| 27 | | sim | Functionality to search a collection for documents similar to a given document or group | 0.90 |
| 28 | | cluster | Functionality to create several clusters of documents from a query (unsupervised classification). Could be used to identify groups of documents without any prior information | 0.80 |
| 29 | learnauto | text_gen | Text generation functionality: can generate titles and records file descriptions when they are missing | 0.75 |
| 30 | | class | Classification functionality for metrics that may be missing, for example proposing a final state, type of sampling, etc. | 0.75 |
| 31 | admin | sample | Functionality to prepare a sample for archival purposes | 0.90 |
| 32 | | profile | Profile management functionality. This feature offers a more personalized tool linked to certain user preferences | 0.90 |
| 33 | | combo | Functionality for managing metric combinations and mapping them to the archival model | 0.95 |
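
As a reading aid, the scores in Table A2 can be summarized with a few lines of Python. This is only a minimal sketch and not part of the study: the `SCORES` list is a manual transcription of the Id, Name and score columns, the short names assume the Type/Name split implied by the table header (for example, `count` under type `stat`), and the helper names `at_least` and `below` as well as the 0.50 cut-off are hypothetical choices made for illustration.

```python
from statistics import mean

# (id, name, average score of interest), transcribed by hand from Table A2.
# Assumption: names follow the Type/Name split implied by the table header.
SCORES = [
    (1, "import_data", 0.95), (2, "index", 1.00), (3, "import_plan", 0.95),
    (4, "import_records", 0.95), (5, "content_anonym", 0.30),
    (6, "meta_sign", 0.75), (7, "file_list_id_title", 0.95),
    (8, "content_ner", 0.80), (9, "content_class_image", 0.75),
    (10, "content_class_text", 0.85), (11, "content_detect_lang", 0.65),
    (12, "content_summary", 0.60), (13, "content_readability", 0.30),
    (14, "content_link_record_plan", 0.80), (15, "combo", 0.95),
    (16, "metric_archiv", 0.90), (17, "ocr", 1.00), (18, "trans_image", 0.65),
    (19, "trans_sound", 0.75), (20, "name_rules", 0.80), (21, "count", 0.90),
    (22, "words", 0.80), (23, "time", 0.95), (24, "size", 0.85),
    (25, "engine", 1.00), (26, "filter", 1.00), (27, "sim", 0.90),
    (28, "cluster", 0.80), (29, "text_gen", 0.75), (30, "class", 0.75),
    (31, "sample", 0.90), (32, "profile", 0.90), (33, "combo", 0.95),
]


def at_least(rows, threshold):
    """Names of functionalities scoring >= threshold, in table order."""
    return [name for _, name, score in rows if score >= threshold]


def below(rows, threshold):
    """Names of functionalities scoring < threshold, in table order."""
    return [name for _, name, score in rows if score < threshold]


top = at_least(SCORES, 1.00)   # functionalities with the maximum score
low = below(SCORES, 0.50)      # hypothetical cut-off for low interest
overall = mean(score for _, _, score in SCORES)

print("Maximum interest:", top)
print("Low interest:", low)
print(f"Overall mean score: {overall:.2f}")
```

With the values as transcribed, the maximum score of 1.00 is reached by `index`, `ocr`, `engine` and `filter`, while `content_anonym` and `content_readability` are the only functionalities below 0.50; the overall mean is about 0.82.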
