Developing A Predictive Model for Football Players’ Market Value Using Machine Learning

Muhammad Afif Jazimin Idris
Sew Lai Ng

Abstract

Football is the world’s most popular sport, and evaluating players’ market value is crucial for clubs and managers making informed decisions about transfers, contracts, and financial planning. This study develops a predictive model that estimates the market value of football players using machine learning (ML) algorithms and real-world performance statistics from the top five European leagues (English Premier League, Italian Serie A, Spanish La Liga, German Bundesliga, and French Ligue 1) for the 2017/18 to 2019/20 seasons. Following a review of past research, several ML models are developed, including Random Forest, LightGBM, XGBoost, and Gradient Boosting Decision Tree (GBDT). Data preprocessing steps, including data cleaning, feature selection, feature encoding, data splitting, and standardization, are applied to ensure data quality and consistency. The models’ hyperparameters are tuned using RandomizedSearchCV with cross-validation. The models are evaluated using regression metrics, including mean absolute error (MAE), root mean squared error (RMSE), and the coefficient of determination (R²), to determine the most accurate model. The best-performing model is then used to analyse the correlation between the features and market value, offering insights into the key features that significantly affect market value for each playing position.
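The three evaluation metrics named in the abstract can be illustrated with a minimal, standard-library sketch. The function names and the sample values below are assumptions made for illustration only; they are not taken from the study's code or data.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: average size of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: penalises large errors more than MAE."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 minus residual / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical market values in millions of euros (illustrative only).
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

print(f"MAE:  {mae(actual, predicted):.3f}")
print(f"RMSE: {rmse(actual, predicted):.3f}")
print(f"R2:   {r2(actual, predicted):.3f}")
```

Lower MAE and RMSE and a higher R² indicate a better fit, so a model comparison of the kind described would typically rank candidates on these values computed on held-out data.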

Article Details

How to Cite
Idris, M. A. J., & Ng, S. L. (2025). Developing A Predictive Model for Football Players’ Market Value Using Machine Learning. Journal of Informatics and Web Engineering, 4(3), 203–214. https://doi.org/10.33093/jiwe.2025.4.3.12
Section
Regular issue
