Purpose

As the second-largest city in Indonesia, Surabaya is characterized by rapid economic growth and a dynamic real estate market. It faces challenges in property valuation, including data limitations, subjectivity and reliance on traditional methods, which lead to inaccurate property pricing. To address these issues, machine learning (ML)-based methods offer the potential to provide more accurate predictions by leveraging historical data and identifying complex patterns. This study aims to analyze and evaluate the accuracy of various ML algorithms in assessing residential property prices in Surabaya.

Design/methodology/approach

An extensive data set of house prices was collected by Web scraping, using the PHP (Hypertext Preprocessor) language, from a property marketplace called Rumah123, covering Surabaya, the capital of East Java. These data were used to train and test a multiple linear regression model and three popular ML models, i.e. artificial neural network (ANN), support vector machine (SVM) and classification and regression tree (CART), to predict house prices from 16 different features.

Findings

The models' performance was evaluated using the linear correlation, mean absolute error, mean absolute percentage error and root mean squared error. The results showed that the ANN performed better than the other models in both the larger and the smaller clusters. In contrast, SVM is not recommended for predicting house prices in Surabaya because of its poor accuracy.
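
As an illustration of the four evaluation measures named above, the following minimal Python sketch computes them from paired actual and predicted prices. The function names and the three sample prices are hypothetical and are not taken from the study's data or code.

```python
import math


def pearson_r(y_true, y_pred):
    """Linear (Pearson) correlation between actual and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def mae(y_true, y_pred):
    """Mean absolute error, in the same unit as the prices."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error; assumes no actual price is zero."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error; penalizes large misses more than MAE."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Made-up prices (arbitrary currency units), for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

for name, metric in [("r", pearson_r), ("MAE", mae), ("MAPE", mape), ("RMSE", rmse)]:
    print(f"{name}: {metric(actual, predicted):.4f}")
```

A higher correlation and lower MAE, MAPE and RMSE indicate a better fit. Because RMSE squares each error, a model with a few large misses can rank worse on RMSE than on MAE.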

Research limitations/implications

The predictor importance of the ANN in both clusters shows that the subdistrict has relatively little impact on house prices. As a result, the ANN predicts different prices for some listings that have the same actual price, probably because of a lack of data.

Practical implications

The ease of use of the proposed model will allow future users to predict house prices with different models and data sets. Alternatively, further research may implement a different neural network-based model, given that this type of model works better for this kind of task.

Originality/value

To the best of the authors’ knowledge, this is the first comparison of the three ML models (ANN, SVM and CART) and linear regression when predicting house prices, and all parameters are tuned with the grid search method.
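
The grid search method mentioned above can be sketched in a few lines of Python: every combination of candidate hyperparameter values is tried, and the one with the lowest validation error is kept. This is a minimal illustration under stated assumptions, not the authors' implementation. The tiny k-nearest-neighbour regressor is a stand-in that is not one of the models compared in the study, and the parameter names (`k`, `weighted`) and all data points are hypothetical.

```python
from itertools import product


def knn_predict(train, x, k, weighted):
    """Predict y at x from the k nearest (x, y) training pairs."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    if weighted:
        # Inverse-distance weights; the small constant avoids division by zero.
        weights = [1.0 / (abs(xi - x) + 1e-9) for xi, _ in nearest]
    else:
        weights = [1.0] * len(nearest)
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)


def validation_mae(train, valid, **params):
    """Mean absolute error on the validation pairs for one parameter setting."""
    errors = [abs(y - knn_predict(train, x, **params)) for x, y in valid]
    return sum(errors) / len(errors)


def grid_search(train, valid, grid):
    """Exhaustively score every combination in grid; return the best one."""
    names = list(grid)
    best_params, best_score = None, float("inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = validation_mae(train, valid, **params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Made-up (floor area, price) pairs, for illustration only.
train = [(50, 500), (80, 800), (100, 1000), (150, 1500), (200, 2000)]
valid = [(90, 900), (120, 1200)]
grid = {"k": [1, 2, 3], "weighted": [False, True]}

best_params, best_score = grid_search(train, valid, grid)
print(best_params, round(best_score, 3))
```

The cost of an exhaustive search grows with the product of the grid sizes (here 3 × 2 = 6 fits), which is why grids are usually kept coarse. In practice, cross-validation is commonly used in place of the single hold-out split shown here.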
