Smart combination of web measures for solving semantic similarity problems

Martinez‐Gil, Jorge; Aldana‐Montes, José F.

doi:10.1108/14684521211276000

Research Article | September 21, 2012

Jorge Martinez‐Gil

Department of Computer Languages and Computing Sciences, University of Málaga, Málaga, Spain

José F. Aldana‐Montes

Department of Computer Languages and Computing Sciences, University of Málaga, Málaga, Spain

Author & Article Information

Publisher: Emerald Publishing

Online ISSN: 1468-4535

Print ISSN: 1468-4527

Online Information Review (2012) 36 (5): 724–738.

https://doi.org/10.1108/14684521211276000

Purpose

Semantic similarity measures are important in many computer‐related fields. Applications such as data integration, query expansion, tag refactoring and text clustering have relied on them. Despite their usefulness in these applications, measuring the similarity between two text expressions remains a key challenge. This paper aims to address this issue.

Design/methodology/approach

In this article, the authors propose an optimization environment to improve existing techniques that use the notion of co‐occurrence and the information available on the web to measure similarity between terms.
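
As an illustration of what a co‐occurrence measure based on web page counts can look like, the sketch below implements the Normalized Google Distance, one well-known measure of this family. The abstract does not name the specific measures the authors optimize, so the choice of measure, the function names and the conversion from distance to similarity are all assumptions made for illustration; the page counts would come from a search engine and are passed in as plain numbers here.

```python
import math


def normalized_google_distance(fx, fy, fxy, n):
    """Normalized Google Distance computed from page counts.

    fx, fy : hits for each term alone
    fxy    : hits for pages containing both terms
    n      : assumed total number of indexed pages (must exceed fx and fy)
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        return float("inf")  # no co-occurrence evidence at all
    if n <= min(fx, fy):
        raise ValueError("n must be larger than the individual page counts")
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(n) - min(lx, ly))


def ngd_similarity(fx, fy, fxy, n):
    """Hypothetical helper: map the distance to a similarity score in [0, 1]."""
    return max(0.0, 1.0 - normalized_google_distance(fx, fy, fxy, n))
```

Two terms that always appear together get distance 0 (similarity 1), while terms that never co‐occur get an infinite distance (similarity 0).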

Findings

The experimental results using the Miller and Charles and Gracia and Mena benchmark datasets show that the proposed approach is able to outperform classic probabilistic web‐based algorithms by a wide margin.

Originality/value

This paper presents two main contributions. First, the authors propose a novel technique that outperforms classic probabilistic techniques for measuring semantic similarity between terms. The technique computes web page counts not with a single search engine but with a smart combination of several popular web search engines. Second, the approach is evaluated on the Miller and Charles and Gracia and Mena benchmark datasets and compared with existing probabilistic web extraction techniques.
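
The idea of combining several search engines can be sketched as a weighted average of per‐engine similarity scores, with the weights tuned against human ratings from a benchmark. The abstract does not describe the authors' optimization procedure, so the sketch below swaps in a plain grid search that maximizes Pearson correlation; the function names, the grid resolution and the use of correlation as the objective are assumptions, not the published method.

```python
import math
from itertools import product


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # correlation is undefined for constant input
    return cov / (sx * sy)


def combine(scores, weights):
    """Weighted average of one term pair's per-engine similarity scores."""
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


def fit_weights(per_engine_scores, human_ratings, steps=5):
    """Grid-search engine weights that best match the human ratings.

    per_engine_scores : one tuple of per-engine scores per term pair
    human_ratings     : benchmark similarity rating for each term pair
    """
    n_engines = len(per_engine_scores[0])
    grid = [i / steps for i in range(steps + 1)]
    best_w, best_r = None, -2.0
    for w in product(grid, repeat=n_engines):
        if sum(w) == 0:
            continue  # all-zero weights carry no information
        combined = [combine(s, w) for s in per_engine_scores]
        r = pearson(combined, human_ratings)
        if r > best_r:
            best_w, best_r = w, r
    return best_w, best_r
```

With real data, each tuple in `per_engine_scores` would hold the scores that different search engines yield for the same term pair, and `human_ratings` would hold the corresponding benchmark judgments.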
