Position Prediction and Talent Discovery in Football Leagues Using Performance Data
Keywords:
Machine learning, Player recruitment analytics, Football position prediction model, Similarity-based player search, 3D models, Football talent discovery
Abstract
Football clubs have traditionally relied on the subjective judgment of scouts and coaches to identify and recruit players. Although these methods work to some extent, they are limited by human bias, inconsistency, and the sheer volume of football data. As more player performance data becomes available, data analytics and machine learning offer a way to bring objectivity, consistency, and scalability to the recruitment process. This study proposes a machine learning-based classification model, together with a clustering model, that assigns football players to their main positional roles using statistical performance features. The models are developed to distinguish defenders, midfielders, and attackers based on passing efficiency, defensive contributions, duels won, and attacking indicators. FBref was used as the data source: player-level data for the 2023-24 season of the top five European leagues was extracted using Python. The data covers statistical categories spanning all areas of performance. Position labels were merged with the scraped tables to ensure accurate role mapping, producing a complete dataset containing both performance and position features. The dataset was cleaned and prepared using data preprocessing techniques, and selected features were then used for training. K-Means clustering was applied to the PCA-transformed data to group similar players by playing profile. Several supervised learning algorithms were also applied: Logistic Regression, Decision Tree, K-Nearest Neighbors (KNN), Random Forest, and a Voting Classifier. Standard evaluation metrics were used to assess predictive performance in detail.
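The clustering step described above (K-Means on PCA-transformed data) can be illustrated with a minimal sketch. It is not the authors' code: the data is synthetic rather than scraped from FBref, and the four feature columns, the number of principal components, and the choice of three clusters are all assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for per-player statistics (NOT FBref data).
# Hypothetical columns: pass completion %, tackles, duels won, shots.
profile_centers = np.array([
    [85.0, 3.0, 6.0, 0.3],  # defender-like profile (assumed values)
    [82.0, 2.0, 4.0, 1.0],  # midfielder-like profile (assumed values)
    [72.0, 0.5, 3.0, 3.0],  # attacker-like profile (assumed values)
])
noise_scale = [3.0, 0.4, 0.8, 0.3]
X = np.vstack([
    center + rng.normal(scale=noise_scale, size=(60, 4))
    for center in profile_centers
])

# Standardise, project onto two principal components, then cluster.
# n_components=2 and n_clusters=3 are illustrative choices only.
pipeline = make_pipeline(
    StandardScaler(),
    PCA(n_components=2),
    KMeans(n_clusters=3, n_init=10, random_state=0),
)
cluster_ids = pipeline.fit_predict(X)

explained = pipeline.named_steps["pca"].explained_variance_ratio_.sum()
print("players per cluster:", np.bincount(cluster_ids))
print("variance kept by 2 components:", round(float(explained), 3))
```

Standardising before PCA keeps a feature measured on a large scale, such as a percentage, from dominating the components. On real data the number of components and clusters would need to be chosen from the data itself, for example from explained variance and a cluster-quality measure.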
The results show that ensemble algorithms, in particular Random Forest and the Voting Classifier, outperformed the single baseline models and were more robust and reliable in positional classification. These results suggest that machine learning models can support player recruitment in football clubs and complement expert judgment. This research establishes a systematic, data-driven framework that helps clubs screen large numbers of players efficiently and objectively.
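A comparison of single models against ensembles, as summarised above, could be set up along the following lines. This is a hedged sketch on synthetic three-class data, not a reproduction of the study: the hyperparameters, the soft-voting scheme, the set of models combined by the Voting Classifier, and the choice of accuracy and macro F1 as metrics are assumptions, since the abstract does not specify them.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic three-class problem standing in for
# defender / midfielder / attacker labels (NOT real player data).
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=6,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The four single models named in the abstract; settings are assumed.
models = {
    "logreg": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "tree": DecisionTreeClassifier(random_state=0),
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "forest": RandomForestClassifier(n_estimators=200, random_state=0),
}
# Soft voting over all four models is an assumption; VotingClassifier
# clones its estimators, so the originals above are fitted separately.
models["voting"] = VotingClassifier(
    estimators=list(models.items()), voting="soft"
)

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    scores[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "macro_f1": f1_score(y_test, pred, average="macro"),
    }

for name, s in scores.items():
    print(f"{name:>7}: accuracy={s['accuracy']:.3f}  macro_f1={s['macro_f1']:.3f}")
```

The ranking on synthetic data says nothing about the ranking on real player statistics; the sketch only shows how the models can be compared under the same train/test split and metrics.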
Copyright (c) 2025 50SEA

This work is licensed under a Creative Commons Attribution 4.0 International License.


















