Construction of tennis pose estimation and action recognition model based on improved ST-GCN

  • Yang Yu, Sports Institute, Baotou Teacher’s College, Baotou 014030, China
Keywords: Spatial Temporal Graph Convolutional Network (ST-GCN); tennis; pose estimation; action recognition; multi-scale dilated convolution module
Article ID: 605

Abstract

With the rapid growth of computer vision and deep learning technologies, pose estimation and action recognition are increasingly applied in sports training. However, tennis poses significant challenges for both tasks because of its complex movements, high speed, and frequent limb occlusion. This study therefore first introduces selective dropout and a pyramid region-of-interest pooling layer into the fast region-based convolutional neural network. Secondly, a pose estimation algorithm based on a multi-scale fusion residual network 50 (ResNet-50) is designed. Finally, a spatio-temporal graph convolutional network model is constructed by fusing a channel attention module with a multi-scale dilated convolution module. The results showed that the improved ResNet-50 pose estimation network achieved an average detection accuracy of 70.4%, with detection accuracies of 57.4%, 69.3%, and 79.2% for small, medium, and large objects, respectively. The improved spatio-temporal graph convolutional network achieved a continuous action recognition accuracy of 93.8% and an inter-action fluency detection time of 19.2 ms. At a sample size of 1000, its memory usage was 1378 MB and its running time was 32.7 ms. Experiments show that the improved model achieves high accuracy and robustness in tennis action recognition, with clear advantages in complex scenes and under limb occlusion. This study aims to provide an efficient and accurate action recognition technique for tennis posture analysis and intelligent training.
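To make the architectural idea in the abstract concrete, the following is a minimal, illustrative PyTorch sketch of one spatio-temporal graph convolution block extended with a channel attention module and a multi-scale dilated temporal convolution, as the abstract describes. It is not the paper's exact design: the SE-style attention, the dilation rates (1, 2, 3), the kernel size, the identity adjacency placeholder, and the 18-joint skeleton are all assumptions made for the sketch.

```python
# Illustrative sketch only (assumed layer sizes and dilation rates, not the paper's exact model).
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """SE-style channel attention over skeleton features shaped (N, C, T, V)."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                      # x: (N, C, T, V)
        w = x.mean(dim=(2, 3))                 # global average pool over frames and joints -> (N, C)
        w = self.fc(w).unsqueeze(-1).unsqueeze(-1)
        return x * w                           # re-weight channels


class MultiScaleDilatedTCN(nn.Module):
    """Parallel temporal convolutions with different dilation rates, fused by summation."""
    def __init__(self, channels: int, kernel_size: int = 9, dilations=(1, 2, 3)):
        super().__init__()
        self.branches = nn.ModuleList()
        for d in dilations:
            pad = (kernel_size - 1) * d // 2   # keep the temporal length unchanged
            self.branches.append(nn.Sequential(
                nn.Conv2d(channels, channels, (kernel_size, 1), padding=(pad, 0), dilation=(d, 1)),
                nn.BatchNorm2d(channels),
            ))
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):                      # x: (N, C, T, V)
        return self.relu(sum(branch(x) for branch in self.branches))


class ImprovedSTGCNBlock(nn.Module):
    """Spatial graph conv (A: V x V normalized adjacency) + channel attention + multi-scale TCN."""
    def __init__(self, in_channels: int, out_channels: int, A: torch.Tensor):
        super().__init__()
        self.register_buffer("A", A)
        self.gcn = nn.Conv2d(in_channels, out_channels, kernel_size=1)
        self.attn = ChannelAttention(out_channels)
        self.tcn = MultiScaleDilatedTCN(out_channels)

    def forward(self, x):                      # x: (N, C, T, V)
        x = self.gcn(x)
        x = torch.einsum("nctv,vw->nctw", x, self.A)  # aggregate features over the skeleton graph
        x = self.attn(x)
        return self.tcn(x)


if __name__ == "__main__":
    V = 18                                     # assumed joint count
    A = torch.eye(V)                           # placeholder adjacency; a real model uses the skeleton graph
    block = ImprovedSTGCNBlock(3, 64, A)
    out = block(torch.randn(2, 3, 30, V))      # (batch, channels, frames, joints)
    print(out.shape)                           # torch.Size([2, 64, 30, 18])
```

In this sketch the attention module re-weights feature channels before the temporal stage, and the parallel dilated branches enlarge the temporal receptive field without extra pooling; both choices follow the general pattern named in the abstract rather than a published specification.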

References

1. Mokayed H, Quan TZ, Alkhaled L, Sivakumar V. Real-time human detection and counting system using deep learning computer vision techniques. Artif. Intell. Appl. 2023;1(4):221-229.

2. Zhang J, Gong K, Wang X, Feng J. Learning to augment poses for 3D human pose estimation in images and videos. IEEE Trans. Pattern Anal. Mach. Intell. 2023;45(8):10012-10026.

3. Zhang S, Qiang B, Yang X, Zhou M, Chen R, Chen L. Efficient pose estimation via a lightweight single-branch pose distillation network. IEEE Sens. J. 2023;23(22):27709-27719.

4. Zhang M, Zhou Z, Deng M. Cascaded hierarchical CNN for 2D hand pose estimation from a single color image. Multimed. Tools Appl. 2022;81(18):25745-25763.

5. Kong L, Pei D, He R, Huang D, Wang Y. Spatio-temporal player relation modeling for tactic recognition in sports videos. IEEE Trans. Circuits Syst. Video Technol. 2022;32(9):6086-6099.

6. Liu C, Li X, Li Q, Xue Y, Liu H, Gao Y. Robot recognizing humans intention and interacting with humans based on a multi-task model combining ST-GCN-LSTM model and YOLO model. Neurocomputing. 2021;430:174-184.

7. Yan S, Xiong Y, Lin D. Spatial temporal graph convolutional networks for skeleton-based action recognition. Proc. AAAI Conf. Artif. Intell. 2018;32(1). doi:10.1609/aaai.v32i1.12328.

8. Li M, Chen S, Chen X, Zhang Y, Wang Y, Tian Q. Actional-structural graph convolutional networks for skeleton-based action recognition. Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR). 2019:3595-3603. doi:10.1109/CVPR.2019.00371.

9. Zhu J, Zhu C. Enhancing action recognition with channel attention modules in GCN. Proc. 3rd Int. Conf. Electron. Inf. Eng. Comput. Sci. (EIECS). 2023:625-628. doi:10.1109/EIECS59936.2023.10435515.

10. Keskes O, Noumeir R. Vision-based fall detection using ST-GCN. IEEE Access. 2021;9:28224-28236. doi:10.1109/ACCESS.2021.3058219.

11. Tong J, Wang F. Basketball sports posture recognition technology based on improved graph convolutional neural network. J. Adv. Comput. Intell. Intell. Inform. 2024;28(3):552-561.

12. Lovanshi M, Tiwari V. Human skeleton pose and spatio-temporal feature-based activity recognition using ST-GCN. Multimed. Tools Appl. 2024;83(5):12705-12730.

13. Li Q, Wan J, Zhang W, Kweh QL. Spatial-temporal graph neural network based on node attention. Appl. Math. Nonlinear Sci. 2022;7(2):703-712.

14. Zhang J, Ye G, Tu Z, Qin Y, Qin Q, Zhang J, Liu J. A spatial attentive and temporal dilated (SATD) GCN for skeleton-based action recognition. CAAI Trans. Intell. Technol. 2022;7(1):46-55.

15. Sabo A, Mehdizadeh S, Iaboni A, Taati B. Estimating parkinsonism severity in natural gait videos of older adults with dementia. IEEE J. Biomed. Health Inform. 2022;26(5):2288-2298.

16. Wu S. Image recognition of standard actions in sports videos based on feature fusion. Trait. Signal. 2021;38(6):1801-1807.

17. Sun J, Lu L. Action recognition method in sports video shear based on fish swarm algorithm. J. Inf. Process. Syst. 2023;19(4):554-562.

18. Ren Y, Sun K. Application effect of human-computer interactive gymnastic sports action recognition system based on PTP-CNN algorithm. Int. J. Adv. Comput. Sci. Appl. 2024;15(1):136-145.

19. Barbon Junior S, Pinto A, Barroso JV, Caetano FG, Moura FA, Cunha SA, et al. Sport action mining: Dribbling recognition in soccer. Multimed. Tools Appl. 2022;81(3):4341-4364.

20. Shan S, Sun S, Dong P. Data driven intelligent action recognition and correction in sports training and teaching. Evol. Intell. 2023;16(5):1679-1687.

21. Jin G. Player target tracking and detection in football game video using edge computing and deep learning. J. Supercomput. 2022;78(7):9475-9491.

22. Niknejad N, Caro JL, Bidese-Puhl R, Bao Y, Staiger EA. Equine kinematic gait analysis using stereo videography and deep learning: stride length and stance duration estimation. J. ASABE. 2023;66(4):865-877.

23. Kolekar S, Gite S, Pradhan B, Alamril A. Predicting vehicle pose in six degrees of freedom from single image in real-world traffic environments using deep pretrained convolutional networks and modified Centernet. Int. J. Smart Sens. Intell. Syst. 2024;17(1):384-406.

24. Tej B, Bouaafia S, Hajjaji MA, Mtibaa A. AI-based smart agriculture 4.0 system for plant diseases detection in Tunisia. Signal Image Video Process. 2024;18(1):97-111.

25. Yang P, Liu Q, Wang B, Li W, Li Z, Sun M. An empirical study of fault diagnosis methods of a dissolved oxygen sensor based on ResNet-50. Int. J. Sens. Netw. 2022;39(3):205-214.

26. Xu X, Guo Y, Wang X. Human pose estimation model based on DiracNets and integral pose regression. Multimed. Tools Appl. 2023;82(23):36019-36039.

27. Ma X, Li Z, Zhang L. An improved ResNet-50 for garbage image classification. Tech. Gaz. 2022;29(5):1552-1559.

28. Dewi C, Chen RC. Combination of ResNet and spatial pyramid pooling for musical instrument identification. Cybern. Inf. Technol. 2022;22(1):104-116.

29. Mu T, Zhang C, Huang M, Ning B, Wang J. Partitioning leakage detection in water distribution systems: a specialized deep learning framework enhanced by spatial-temporal graph convolutional networks. ACS ES&T Water. 2024;4(8):3453-3463.

30. Yang S, Li Z, Wang J, He D, Li Q, Li D. ST-GCN human action recognition based on new partition strategy. Comput. Integr. Manuf. Syst. 2023;29(12):4040-4059.

31. Wang H, Zhang R, Cheng X, Yang L. Hierarchical traffic flow prediction based on spatial-temporal graph convolutional network. IEEE Trans. Intell. Transp. Syst. 2022;23(9):16137-16147.

32. Alsawadi MS, Rio M. Skeleton split strategies for spatial temporal graph convolution networks. arXiv. 2021;23:4643-4658.

33. Tsai MF, Huang SH. Enhancing accuracy of human action recognition system using skeleton point correction method. Multimed. Tools Appl. 2022;81(5):7439-7459.

34. Wang Y, Wang W, Li Y, Jia Y, Xu Y, Ling Y, et al. An attention mechanism module with spatial perception and channel information interaction. Complex Intell. Syst. 2024;10(4):5427-5444.

Published: 2024-12-06

How to cite: Yu, Y. (2024). Construction of tennis pose estimation and action recognition model based on improved ST-GCN. Molecular & Cellular Biomechanics, 21(4), 605. https://doi.org/10.62617/mcb605