Purpose

A sentiment lexicon is an essential resource for sentiment analysis of user reviews. To date, there is still no large-scale, high-accuracy domain sentiment lexicon for Chinese book reviews. This paper aims to construct a large-scale sentiment lexicon based on ultrashort reviews of Chinese books.

Design/methodology/approach

First, a large collection of ultrashort reviews of Chinese books, each no longer than six Chinese characters, is gathered and preprocessed to serve as candidate sentiment words. Second, non-sentiment words are filtered out through a set of rules, including part-of-speech rules, context rules, feature word rules and user behaviour rules. Third, relative frequency is used to select sentiment words and determine their polarity. Finally, the performance of the sentiment lexicon is evaluated through experiments.
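
The first and third steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define how relative frequency is computed, so the sketch assumes each review carries a coarse "pos" or "neg" label (for example, one derived from a star rating) and treats the share of positive occurrences as the relative frequency. The names `is_ultrashort` and `build_lexicon`, the `min_count` cut-off and the `threshold` value are all hypothetical, and the rule-based filtering of the second step is omitted.

```python
import re
from collections import Counter

MAX_CHARS = 6  # length limit stated in the abstract
HAN_ONLY = re.compile(r"^[\u4e00-\u9fff]+$")  # basic CJK Unified Ideographs only


def is_ultrashort(review: str) -> bool:
    """True if the review consists of 1 to MAX_CHARS Chinese characters."""
    return len(review) <= MAX_CHARS and bool(HAN_ONLY.match(review))


def build_lexicon(reviews, min_count=2, threshold=0.7):
    """Build a word -> polarity mapping from (text, label) pairs.

    Assumption: label is "pos" or "neg". The abstract does not say
    where polarity evidence comes from; this is for illustration only.
    """
    pos, neg = Counter(), Counter()
    for text, label in reviews:
        text = text.strip()
        if not is_ultrashort(text):
            continue  # keep whole ultrashort reviews as candidates, no segmentation
        (pos if label == "pos" else neg)[text] += 1

    lexicon = {}
    for word in set(pos) | set(neg):
        total = pos[word] + neg[word]
        if total < min_count:
            continue  # too rare to judge (hypothetical cut-off)
        rel_pos = pos[word] / total  # relative frequency in positive reviews
        if rel_pos >= threshold:
            lexicon[word] = "positive"
        elif rel_pos <= 1 - threshold:
            lexicon[word] = "negative"
        # otherwise the candidate is ambiguous and is left out
    return lexicon


sample = [
    ("好看", "pos"), ("好看", "pos"), ("好看", "pos"), ("好看", "neg"),
    ("无聊", "neg"), ("无聊", "neg"),
    ("一般", "pos"), ("一般", "neg"),
    ("这本书真的非常好看", "pos"),  # longer than six characters, skipped
]
lexicon = build_lexicon(sample)
```

Because each ultrashort review is kept whole as a candidate entry, the sketch never calls a word segmenter, which mirrors the motivation given under Originality/value.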

Findings

This paper proposes a method of sentiment lexicon construction based on ultrashort reviews and uses it to build a lexicon of nearly 40,000 words for Chinese books from Douban book reviews.

Originality/value

Compared with approaches that construct a sentiment lexicon from a small number of reviews, the proposed method takes full advantage of data scale when building the corpus. Moreover, unlike methods that rely on automatic word segmentation, it avoids the problems caused by immature segmentation technology and an imperfect N-gram language model.
