Predictable by publication: discovery of early highly cited academic papers based on their own features

Tang, Xiaobo; Zhou, Heshen; Li, Shixuan

doi:10.1108/LHT-06-2022-0305


Research Article | February 06 2023


Xiaobo Tang (ORCID: 0000-0001-5885-4509)

School of Information Management, Wuhan University, Wuhan, China

Center for Studies of Information System, Wuhan University, Wuhan, China


Heshen Zhou (ORCID: 0000-0003-1133-2812)

School of Information Management, Wuhan University, Wuhan, China

Heshen Zhou can be contacted at: zhouheshen_lw@163.com


Shixuan Li (ORCID: 0000-0002-1879-4895)

School of Safety Science and Emergency Management, Wuhan University of Technology, Wuhan, China


Author & Article Information


Publisher: Emerald Publishing

Received: June 22 2022

Revision Received: December 18 2022

Revision Received: January 09 2023

Accepted: January 14 2023

Online ISSN: 2054-166X

Print ISSN: 0737-8831

2023 Emerald Publishing Limited. Licensed re-use rights only.

Library Hi Tech (2024) 42 (4): 1366–1384.

https://doi.org/10.1108/LHT-06-2022-0305

Purpose

Predicting highly cited papers makes it possible to evaluate a paper's potential and to detect the value of academic achievements early. However, most studies that predict highly cited papers rely on early citation information, so prediction at the time of publication remains challenging. The authors therefore propose a method for predicting early highly cited papers based on the papers' own features.

Design/methodology/approach

This research analyzed academic papers published in the Journal of the Association for Computing Machinery (ACM) from 2000 to 2013. Five types of features were extracted: paper features, journal features, author features, reference features and semantic features. The authors then applied a deep neural network (DNN), support vector machine (SVM), decision tree (DT) and logistic regression (LGR) to predict which papers would be highly cited 1–3 years after publication.
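The abstract does not state how "highly cited" was operationalized. As a minimal sketch, assuming the label is assigned to the top fraction of papers by citations accumulated within a fixed window after publication, the target variable for such classifiers could be built as follows. The function name `label_highly_cited`, the `top_fraction` parameter and the citation counts are all hypothetical and are not taken from the paper.

```python
from typing import Dict


def label_highly_cited(citations: Dict[str, int],
                       top_fraction: float = 0.1) -> Dict[str, bool]:
    """Mark the top `top_fraction` of papers by citation count as highly cited.

    Assumption: the threshold rule is illustrative, not the authors' definition.
    """
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    if not citations:
        return {}
    ranked = sorted(citations.values(), reverse=True)
    # Size of the positive class; always at least one paper.
    k = max(1, int(len(ranked) * top_fraction))
    threshold = ranked[k - 1]
    # Papers tied with the threshold count are all labeled positive.
    return {pid: count >= threshold for pid, count in citations.items()}


# Hypothetical citation counts within a fixed window after publication.
counts = {"p1": 40, "p2": 3, "p3": 0, "p4": 12, "p5": 7,
          "p6": 1, "p7": 55, "p8": 2, "p9": 9, "p10": 4}
labels = label_highly_cited(counts, top_fraction=0.2)
```

With these made-up counts and `top_fraction=0.2`, only `p1` and `p7` receive the positive label. Labels of this kind would be the prediction target, with the five feature groups as inputs.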

Findings

Experimental results showed that early highly cited academic papers are predictable at the time of publication. The authors' prediction models performed well. This study further confirmed that the features of references and authors play an important role in predicting early highly cited papers. In addition, the proportion of high-quality journal references has a more significant impact on prediction.
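To make the reference feature concrete, here is a hedged sketch of how a "proportion of high-quality journal references" could be computed for one paper. The abstract does not say how journal quality was determined, so the set `high_quality_venues`, the function name and the example venues are assumptions for illustration only.

```python
from typing import Iterable, Set


def high_quality_reference_ratio(reference_venues: Iterable[str],
                                 high_quality_venues: Set[str]) -> float:
    """Return the share of a paper's references published in listed venues.

    Assumption: `high_quality_venues` is a hypothetical, externally supplied list.
    """
    venues = list(reference_venues)
    if not venues:
        # A paper without references gets a ratio of zero by convention here.
        return 0.0
    hits = sum(1 for venue in venues if venue in high_quality_venues)
    return hits / len(venues)


# Hypothetical reference list: two of four references are in a listed venue.
refs = ["Journal A", "Journal B", "Workshop C", "Journal A"]
ratio = high_quality_reference_ratio(refs, {"Journal A"})
```

Because the ratio depends only on the paper's reference list, it is available on the day of publication, which fits the study's constraint of using no post-publication citation data.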

Originality/value

Using only the information available at the time of publication, this study proposes an effective model for predicting early highly cited papers. The study facilitates the early discovery and realization of the value of scientific and technological achievements.

