A STEMMING ALGORITHM FOR LATIN TEXT DATABASES

SCHINKE, ROBYN; GREENGRASS, MARK; ROBERTSON, ALEXANDER M.; WILLETT, PETER

doi:10.1108/eb026966

Review Article | February 01 1996

ROBYN SCHINKE

Humanities Research Institute and Departments of History, University of Sheffield, Western Bank, Sheffield S10 2TN

MARK GREENGRASS

Humanities Research Institute and Departments of History, University of Sheffield, Western Bank, Sheffield S10 2TN

ALEXANDER M. ROBERTSON

Information Studies, University of Sheffield, Western Bank, Sheffield S10 2TN

PETER WILLETT

Information Studies, University of Sheffield, Western Bank, Sheffield S10 2TN

Publisher: Emerald Publishing

Online ISSN: 1758-7379

Print ISSN: 0022-0418

Journal of Documentation (1996) 52 (2): 172–187.

https://doi.org/10.1108/eb026966

This paper describes the design of a stemming algorithm for searching databases of Latin text. The algorithm uses a simple longest‐match approach with some recoding but differs from most stemmers in its use of two separate suffix dictionaries (one for nouns and adjectives and one for verbs) for processing query and database words. These dictionaries and the associated stemming rules are arranged in such a way that the stemmer does not need to know the grammatical category of the word that is being stemmed. It is very easy to overstem in Latin: the stemmer developed here tends, rather, towards understemming, leaving sufficient grammatical information attached to the stems resulting from its use to enable users to pursue very specific searches for single grammatical forms of individual words.
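The dual-dictionary idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the suffix lists below are small illustrative subsets chosen for this example (the paper's actual dictionaries and rules are not reproduced on this page), the `j`/`v` normalisation is only an assumed example of "recoding", and the names `NOUN_SUFFIXES`, `VERB_SUFFIXES`, `MIN_STEM`, `strip_longest` and `stem` are hypothetical. The sketch strips the longest matching suffix from each dictionary and returns both candidate stems, so the caller never has to know whether the word is a noun, an adjective or a verb.

```python
# Illustrative suffix subsets only; these are assumptions for the sketch,
# NOT the dictionaries published in the paper.
NOUN_SUFFIXES = ["ibus", "orum", "arum", "ius", "ae", "am", "as", "em",
                 "es", "is", "os", "um", "us", "a", "e", "i", "o", "u"]
VERB_SUFFIXES = ["untur", "antur", "entur", "mini", "mur", "tur", "ris",
                 "mus", "tis", "nt", "t", "s", "o", "m"]

# Assumed minimum stem length; keeping stems long biases the sketch
# towards understemming rather than overstemming.
MIN_STEM = 2


def strip_longest(word, suffixes, min_stem=MIN_STEM):
    """Remove the longest suffix that matches, if enough of the word remains."""
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word


def stem(word):
    """Return (noun_stem, verb_stem) without knowing the word's category."""
    # Assumed recoding step: fold j/i and v/u spelling variants together.
    w = word.lower().replace("j", "i").replace("v", "u")
    return strip_longest(w, NOUN_SUFFIXES), strip_longest(w, VERB_SUFFIXES)


print(stem("puellarum"))  # ('puell', 'puellaru')
print(stem("amant"))      # ('amant', 'ama')
```

In a retrieval setting, one plausible use is to index both stems for every database word and to stem query words the same way: `puellae` and `puellarum` then share the noun stem `puell`, while their verb-dictionary stems stay distinct and still carry grammatical information.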
