An Empirical Evaluation of Machine Learning Methods and Text Classifiers for Sentiment Analysis of Online Consumer Reviews

Pei Qin Lo
Sew Lai Ng
Li-xian Jiao

Abstract

This study aims to identify the best predictive model for analysing online product reviews (OPRs) in the electronics industry, with a secondary focus on leveraging unstructured customer feedback to support product improvement. Using a dataset of 9,675 Oppo mobile phone reviews, the study pairs three classification models (Random Forest, Support Vector Machine (SVM), and Logistic Regression) with either Term Frequency-Inverse Document Frequency (TF-IDF) or Bidirectional Encoder Representations from Transformers (BERT) as the text representation to analyse customer sentiment and derive actionable insights. The analysis pipeline comprises text preprocessing with the Natural Language Toolkit (NLTK), feature extraction using TF-IDF vectorization and BERT embeddings, and sentiment prediction with each classifier. The results indicate that Random Forest paired with TF-IDF was the most effective combination, achieving the highest accuracy, precision, recall, and F1-score. This superior performance stems from Random Forest's ability to handle high-dimensional, sparse data and to exploit the weighted word importance provided by TF-IDF, which makes it particularly well suited to sentiment classification tasks involving structured text representations. The study contributes an effective framework for analysing online reviews, which can help businesses understand customer needs, refine product offerings, and lay the groundwork for future applications across other product categories.
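
To make the "weighted word importance" idea concrete, the following is a minimal sketch of TF-IDF weighting in plain Python. It is not the authors' implementation: the three toy reviews, the tiny stop-word set (standing in for NLTK preprocessing), and the helper names `tokenize` and `tfidf` are all assumptions introduced for illustration, and the IDF formula shown is only one common variant.

```python
import math
import re
from collections import Counter

# Hypothetical toy reviews; NOT taken from the study's dataset.
REVIEWS = [
    "Battery life is great and the camera is great",
    "Battery drains fast and the screen cracked",
    "Great camera but the battery drains fast",
]

# Tiny illustrative stop-word list (an assumption standing in for NLTK's list).
STOP_WORDS = {"is", "and", "the", "but"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def tfidf(docs):
    """Return one {term: weight} dict per document.

    tf  = term count / document length
    idf = ln(N / df) + 1   (one common variant; others exist)
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * (math.log(n_docs / doc_freq[term]) + 1.0)
            for term, count in counts.items()
        })
    return vectors


vectors = tfidf(REVIEWS)
for review, vec in zip(REVIEWS, vectors):
    # Show the three highest-weighted terms of each review.
    top = sorted(vec.items(), key=lambda kv: (-kv[1], kv[0]))[:3]
    print(review, "->", [(term, round(w, 3)) for term, w in top])
```

In this sketch a word that occurs in every review, such as "battery", receives a lower weight than a rarer word such as "life", even when both appear once in the same review. Each review stores only the terms it contains, which mirrors the sparse, high-dimensional representation that a classifier would then consume.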

How to Cite
Lo, P. Q., Ng, S. L., & Jiao, L.-x. (2026). An Empirical Evaluation of Machine Learning Methods and Text Classifiers for Sentiment Analysis of Online Consumer Reviews. Journal of Informatics and Web Engineering, 5(1), 204–217. https://doi.org/10.33093/jiwe.2026.5.1.13
Section
Regular issue
