Purpose

This paper examines the semantic enrichment that Artificial Intelligence (AI) can apply to scanned library documents. It also proposes an alternative, complementary enrichment approach: a crowdsourcing platform with specific characteristics, intended for challenging scanned texts that AI cannot handle effectively.

Design/methodology/approach

The proposed platform is a web-based tool built on simple, familiar JavaScript in its web pages and on CGI-Bin functionality served by the BusyBox web server, with user-level security configurations. The system is portable in copy-paste form, incorporates LLM-based AI technologies for error correction, and uses optical character recognition (the Tesseract engine and Google Vision) for text recognition. It supports quick and easy creation of Tables of Contents (TOCs) from thematic headers mapped to selected excerpts of the scanned documents, as well as the addition of metadata.

Findings

The TOCs generated automatically from the thematic headings form a navigational map of the document, helping the user locate information quickly. The TOCs can be searched and further edited by LLMs. TOCs can also be created for hyperlinks added to the document and for images in the document that have been annotated.
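
As an illustration of how thematic headings mapped to excerpts could yield a searchable TOC, the following is a minimal sketch and not the platform's actual code. The record fields (`heading`, `page`, `offset`, `excerpt`), the function names `buildToc` and `searchToc`, and the sample data are all assumptions introduced for this example.

```javascript
// Hypothetical data model (an assumption, not taken from the platform):
// each annotation maps a thematic heading to an excerpt of a scanned page.
const annotations = [
  { heading: " Founding of the library ", page: 3, offset: 120, excerpt: "In that year the library opened..." },
  { heading: "Early manuscripts", page: 1, offset: 40, excerpt: "The oldest manuscripts held..." },
  { heading: "Library catalogue of 1890", page: 3, offset: 10, excerpt: "The catalogue lists..." },
];

// Build a TOC by ordering annotations in reading order:
// first by page number, then by position within the page.
function buildToc(items) {
  return [...items]
    .sort((a, b) => a.page - b.page || a.offset - b.offset)
    .map((item, i) => ({
      index: i + 1,
      heading: item.heading.trim(),
      page: item.page,
      excerpt: item.excerpt,
    }));
}

// Case-insensitive substring search over the TOC headings.
function searchToc(toc, query) {
  const q = query.toLowerCase();
  return toc.filter((entry) => entry.heading.toLowerCase().includes(q));
}

const toc = buildToc(annotations);
const hits = searchToc(toc, "library");
```

In this sketch the TOC is a plain array of objects, so it can be serialized to JSON and handed to other components, for example a search box or an LLM-based editing step.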

Originality/value

The article explores a cost-effective crowdsourcing solution that serves as an alternative or complement to AI techniques for the semantic enrichment of scanned documents. This solution can help libraries showcase their scanned documents, and it helps users navigate these documents and identify areas of interest through enhanced search capabilities.
