A Chinese sign language recognition system combining attention mechanism and acoustic sensing

  • Yuepeng Shi School of Energy and Intelligent Engineering, Henan University of Animal Husbandry and Economy, Zhengzhou 450011, China
  • Yansheng Wu School of Energy and Intelligent Engineering, Henan University of Animal Husbandry and Economy, Zhengzhou 450011, China
  • Qian Li School of Energy and Intelligent Engineering, Henan University of Animal Husbandry and Economy, Zhengzhou 450011, China
  • Junyi Zhang School of Energy and Intelligent Engineering, Henan University of Animal Husbandry and Economy, Zhengzhou 450011, China
Keywords: channel impulse response; sign language gesture recognition; attention mechanism; acoustic sensing
Article ID: 793

Abstract

In recent years, with the widespread adoption of smart devices and rapid advances in communication and artificial intelligence technologies, sign language gesture recognition, which can break down communication barriers between hearing people and people with speech or hearing impairments, has received considerable attention. However, existing human gesture recognition methods are based on wearable devices, computer vision, or Radio Frequency (RF) signals, and these approaches suffer, respectively, from cumbersome deployment, intrusion on user privacy, and (for vision) sensitivity to ambient light. In contrast, sensing sign language gestures with ultrasonic signals neither intrudes on user privacy nor is affected by ambient light. We therefore use the built-in speaker and microphone of a smartphone to transmit and receive ultrasonic signals for sign language gesture recognition. To capture fine-grained gestures, we estimate the Channel Impulse Response (CIR) induced by the signing motion and use it as the gesture feature. We then compute first-order differences along the time dimension of the CIR matrix to eliminate static-path interference. Finally, the differenced CIR matrix is fed into a convolutional neural network that combines convolutional layers with channel attention and spatial attention modules to recognize sign language gestures. Experimental results show that the scheme achieves a recognition accuracy of 95.2% on 12 sign language interaction gestures.
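
The abstract's signal-processing pipeline (CIR estimation followed by first-order time differencing) can be sketched as follows. The paper's exact estimator is not given here, so this Python/NumPy sketch assumes a simple matched-filter (cross-correlation) CIR estimate against a known transmitted probe sequence; `estimate_cir`, `frame_len`, and `num_taps` are illustrative names and defaults, not the authors' implementation.

```python
import numpy as np

def estimate_cir(received, probe, frame_len=480, num_taps=64):
    """Matched-filter CIR estimate, one row per transmitted frame.

    Assumption: `probe` is one period of the known ultrasonic signal and
    `received` is the time-aligned microphone recording. A real system
    also needs synchronization and band-pass filtering, omitted here.
    """
    num_frames = len(received) // frame_len
    H = np.empty((num_frames, num_taps))
    zero_lag = len(probe) - 1                      # index of lag 0 in 'full' output
    for t in range(num_frames):
        frame = received[t * frame_len:(t + 1) * frame_len]
        corr = np.correlate(frame, probe, mode="full")
        H[t] = corr[zero_lag:zero_lag + num_taps]  # keep the first num_taps delay taps
    return H

def remove_static_paths(H):
    """First-order difference along the time (frame) axis.

    Static multipath (direct speaker-to-microphone leakage, reflections
    from walls and furniture) is constant across frames, so differencing
    cancels it and keeps only the gesture-induced dynamic component.
    """
    return np.diff(H, n=1, axis=0)  # shape: (num_frames - 1, num_taps)
```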
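For the classifier, the abstract describes a CNN combining convolutional layers with channel and spatial attention, which matches the CBAM design (Woo et al., ECCV 2018). The PyTorch sketch below interleaves convolutional blocks with CBAM-style attention modules; the layer widths, reduction ratio, and the name `CIRGestureNet` are assumptions for illustration, not the published architecture.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """CBAM channel attention: avg- and max-pooled descriptors share an MLP."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False))

    def forward(self, x):
        gate = self.mlp(x.mean(dim=(2, 3), keepdim=True)) \
             + self.mlp(x.amax(dim=(2, 3), keepdim=True))
        return x * torch.sigmoid(gate)

class SpatialAttention(nn.Module):
    """CBAM spatial attention: channel-wise avg/max maps convolved into a 2-D mask."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)
        mx, _ = x.max(dim=1, keepdim=True)
        return x * torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))

class CIRGestureNet(nn.Module):
    """Toy CNN over the differenced CIR matrix, input shape (B, 1, frames, taps)."""
    def __init__(self, num_classes=12):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(inplace=True),
            ChannelAttention(32), SpatialAttention(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(inplace=True),
            ChannelAttention(64), SpatialAttention(), nn.AdaptiveAvgPool2d(1))
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))
```

For example, `CIRGestureNet()(torch.randn(8, 1, 128, 64))` returns an 8 x 12 logit tensor for a batch of eight differenced CIR matrices of 128 frames by 64 taps, one logit per gesture class.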

Published
2024-12-11
How to Cite
Shi, Y., Wu, Y., Li, Q., & Zhang, J. (2024). A Chinese sign language recognition system combining attention mechanism and acoustic sensing. Molecular & Cellular Biomechanics, 21(4), 793. https://doi.org/10.62617/mcb793
Section
Article