Data mining technology for monitoring and physiological and biochemical indicators of football players in different training periods

  • Shijie Zhao, College of Sports, Shenyang Normal University, Shenyang 110034, Liaoning, China
  • Xueqin Wang, College of Sports and Health, Linyi University, Linyi 276000, Shandong, China
Keywords: data mining technology; football player; physiological and biochemical indicators monitoring; extreme gradient boosting; Shapley additive explanations; adaptive learning rate mechanism; dynamic monitoring system
Article ID: 985

Abstract

In monitoring and analyzing athletes’ physiological and biochemical indicators, traditional data mining (DM) techniques struggle to extract informative features and patterns from high-dimensional, complex multivariate data, so the accuracy of the analysis results is low. The lack of real-time monitoring of dynamically changing physiological states also makes it impossible to detect overtraining or fatigue in time, which harms both the training effect and athletes’ health. This paper constructs an improved XGBoost (eXtreme Gradient Boosting) model. The collected physiological and biochemical data are first cleaned and normalized, with outliers removed and missing values filled in, and a variable set representing the characteristics of different training periods is constructed to provide high-quality input data for subsequent model analysis. The SHAP (SHapley Additive exPlanations) method is then used to quantify the importance of each feature and to select the variables that contribute most to recognizing the training state, which optimizes the model input, reduces model complexity, and improves computational efficiency. Building on the original XGBoost model, the loss function is adjusted and an adaptive learning rate mechanism is added so that the model better captures the dynamic changes of physiological and biochemical indicators and improves prediction accuracy. Based on the prediction results of the improved model, a real-time monitoring system was designed to track changes in athletes’ physiological state during different training periods and to issue an alarm when abnormal trends are detected, assisting coaches in adjusting training plans. The experimental results show that feature evaluation extracted three key physiological indicators, namely blood oxygen saturation, blood lactate concentration, and heart rate, which reduces the computational complexity of the subsequent model. In the four training stages (basic period, load period, high-intensity period, and recovery period), the loss values of the XGBoost model were approximately 0.5, 0.42, 0.4, and 0.35, respectively. On monitoring data from 4 batches of football players, with 100 players in each batch, the accuracy remained above 0.83 and the response time was below 2 s. The experiments demonstrate the effectiveness of the proposed model in monitoring and analyzing physiological and biochemical indicators.
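The preprocessing and alarm steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how missing values are filled or how abnormal trends are defined, so median imputation, a z-score threshold of 3, and the function names `fill_missing`, `z_scores`, and `detect_abnormal` are all assumptions chosen for illustration. The XGBoost and SHAP stages are deliberately left out.

```python
from statistics import mean, median, pstdev


def fill_missing(values):
    """Replace None entries with the median of the observed values (assumed strategy)."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def z_scores(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def detect_abnormal(readings, baseline, z_limit=3.0):
    """Return indices of readings more than z_limit baseline standard deviations
    away from the baseline mean (hypothetical alarm rule)."""
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        return [i for i, r in enumerate(readings) if r != mu]
    return [i for i, r in enumerate(readings) if abs(r - mu) / sigma > z_limit]


# Illustrative resting heart-rate samples (made-up values, one reading missing).
baseline = fill_missing([62, 64, None, 63, 65])
normalized = z_scores(baseline)
alarms = detect_abnormal([64, 66, 95], baseline)
```

In a monitoring loop, `detect_abnormal` would be applied to each new batch of readings against a per-player baseline from the current training period, and a non-empty result would trigger the alert to the coach.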

Published
2025-01-07
How to Cite
Zhao, S., & Wang, X. (2025). Data mining technology for monitoring and physiological and biochemical indicators of football players in different training periods. Molecular & Cellular Biomechanics, 22(1), 985. https://doi.org/10.62617/mcb985