Machine Learning-based Prediction of House Sale Prices in Hulu Langat


Kath Moon Yap Choo
Sew Lai Ng
Da An

Abstract

House price prediction remains a complex task due to the interplay of structural, locational, and economic factors. This study proposes a machine learning-based predictive framework tailored to the residential real estate market in Hulu Langat, Malaysia, a district undergoing rapid urbanization. Using open-source data, three structured dataset variations are developed: one containing only housing and locational attributes, another incorporating regional-level macroeconomic data, and a third combining both regional and national macroeconomic indicators. The proposed model (Ensemble Weighted Average) is trained, evaluated, and compared with Random Forest, XGBoost, LightGBM, and Ensemble Stacking. The proposed model trained solely on housing and locational data achieved the highest accuracy, outperforming all other models across most evaluation metrics, while XGBoost achieved the fastest computation time. Models trained with macroeconomic indicators consistently underperformed, suggesting that these indicators added noise to the predictions, potentially due to spatial resolution mismatches or multicollinearity. The best-performing model was then interpreted with SHapley Additive exPlanations (SHAP). The SHAP analysis reveals that land parcel area, property type, and local housing supply are the features contributing most to the model's predictions. These findings validate the effectiveness of ensemble models for localized price prediction and highlight the importance of house attributes over broader economic trends. The proposed framework offers a practical and interpretable approach to house price prediction and may assist policymakers, developers, and planners in making informed decisions.
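As a rough illustration of what a weighted-average ensemble does, the sketch below combines the predictions of several base regressors into one price estimate. The abstract does not state how the weights are chosen, so the inverse-RMSE weighting here is an assumption, and the function names (`rmse`, `inverse_rmse_weights`, `weighted_average`) and the synthetic prices are hypothetical stand-ins rather than the authors' implementation or data.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between two 1-D arrays."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def inverse_rmse_weights(y_val, val_preds):
    """Assumed weighting scheme: weight each model by 1 / validation RMSE,
    then normalise so the weights sum to 1."""
    inverse = {name: 1.0 / max(rmse(y_val, pred), 1e-12)
               for name, pred in val_preds.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def weighted_average(preds, weights):
    """Combine per-model predictions into a single weighted prediction."""
    return sum(weights[name] * np.asarray(pred, dtype=float)
               for name, pred in preds.items())


# Synthetic stand-in data: "true" prices plus model-specific noise.
# In practice these would be validation predictions from fitted models.
rng = np.random.default_rng(0)
y_val = rng.uniform(200_000, 900_000, size=200)
val_preds = {
    "random_forest": y_val + rng.normal(0, 60_000, size=200),
    "xgboost": y_val + rng.normal(0, 40_000, size=200),
    "lightgbm": y_val + rng.normal(0, 50_000, size=200),
}

weights = inverse_rmse_weights(y_val, val_preds)
ensemble_pred = weighted_average(val_preds, weights)

for name, pred in val_preds.items():
    print(f"{name:>14}: weight={weights[name]:.3f}  RMSE={rmse(y_val, pred):,.0f}")
print(f"{'ensemble':>14}: RMSE={rmse(y_val, ensemble_pred):,.0f}")
```

Because the weights are non-negative and sum to 1, the ensemble's RMSE can never exceed that of its worst base model. Whether it beats the best base model depends on how correlated the base models' errors are.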

Article Details

How to Cite
Yap Choo, K. M., Ng, S. L., & An, D. (2026). Machine Learning-based Prediction of House Sale Prices in Hulu Langat. Journal of Informatics and Web Engineering, 5(2), 110–119. https://doi.org/10.33093/jiwe.2026.5.2.7
Section
Regular issue
